nlp - Best tokenization method for dealing with informal English text data?


In natural language processing, are there tokenization tools designed to accurately tokenize informal English text into sentences and tokens? E.g. informal sources such as Reddit comments or forum data.

I've tried the Stanford tokenizer, but it does not seem to handle informal text sources like the ones mentioned above.

With the influx of informal text data from social media, I'm hoping there is a more accurate way of tokenizing such data for further processing.
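One option worth looking at is NLTK's `TweetTokenizer`, which was built specifically for social-media text. To illustrate what "informal-aware" tokenization involves, here is a minimal, dependency-free sketch using only Python's `re` module; the patterns below are illustrative assumptions, not an exhaustive rule set:

```python
import re

# A few patterns typical of informal text. Ordering matters:
# the most specific patterns must come first. Illustrative only.
TOKEN_RE = re.compile(r"""
    https?://\S+                 # URLs, kept as a single token
  | [@#]\w+                      # @mentions and #hashtags
  | [:;=8][\-o]?[)(\]\[dDpP]    # simple ASCII emoticons like :) ;-( :D
  | \w+(?:'\w+)?                 # words, keeping contractions like don't
  | [^\w\s]                      # any remaining single punctuation mark
""", re.VERBOSE)

def tokenize(text):
    """Tokenize informal English text, keeping URLs, emoticons,
    hashtags, and @mentions intact instead of splitting them."""
    return TOKEN_RE.findall(text)

print(tokenize("lol :) check out https://t.co/abc #nlp @user don't"))
# A standard newswire-trained tokenizer would typically split
# ":)" and "#nlp" into separate punctuation/word tokens.
```

The design point is that tokenizers trained on edited text (newswire, books) treat emoticons, hashtags, and URLs as punctuation noise, whereas for social-media data you usually want them preserved as single meaningful tokens.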

