nlp - Best tokenization method for dealing with informal English text data?
In natural language processing, are there tokenization tools designed to accurately tokenize sentences from informal English text, e.g. informal sources such as Reddit comments or forum data?

I've tried the Stanford tokenizer, but it does not seem to handle informal text sources like the ones mentioned.

With the influx of informal text data from social media, I'm hoping there is a more accurate way of tokenizing such data for further processing.
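To illustrate what "informal-text-aware" tokenization means in practice, here is a minimal sketch of the kind of regex-based tokenizer that tweet-aware tools use: emoticons, @-mentions, hashtags, and URLs are matched as single tokens before falling back to words and punctuation. The `TOKEN_RE` pattern and `tokenize` function are hypothetical names for this sketch, not part of any specific library.

```python
import re

# Hypothetical sketch of an informal-text tokenizer: multi-character
# tokens (emoticons, @-mentions, hashtags, URLs) are tried first so
# they are not split apart by the generic word/punctuation rules.
TOKEN_RE = re.compile(r"""
    (?:[:;=8][\-o\*']?[\)\](\[dDpP/\\|@3]) |  # emoticons like :-) or ;P
    (?:@\w+)                               |  # @-mentions
    (?:\#\w+)                              |  # hashtags
    (?:https?://\S+)                       |  # URLs kept as one token
    (?:\w+(?:'\w+)?)                       |  # words, incl. contractions
    (?:[^\w\s])                               # any other single symbol
""", re.VERBOSE)

def tokenize(text):
    """Lowercase the input and return a flat list of tokens."""
    return TOKEN_RE.findall(text.lower())

print(tokenize("@user that's sooo cool!!! :-) #nlp"))
# → ['@user', "that's", 'sooo', 'cool', '!', '!', '!', ':-)', '#nlp']
```

A standard newswire-trained tokenizer would typically shred `:-)` and `#nlp` into separate punctuation characters; keeping them whole is the main point of tokenizers aimed at social-media text.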