- Tokenization of raw text is a standard pre-processing step for many NLP tasks. For English, tokenization usually involves punctuation splitting and separation of some affixes like possessives. Other languages require more extensive token pre-processing, which is usually called segmentation.
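The punctuation splitting and possessive separation described above can be sketched in a few lines. This is an illustrative toy tokenizer only, not the Stanford tokenizer's actual rules:

```python
import re

def tokenize(text):
    """Minimal English tokenizer sketch: split off possessive 's,
    then separate common punctuation, then split on whitespace."""
    # Separate the possessive clitic: "cat's" -> "cat 's"
    text = re.sub(r"(\w)('s)\b", r"\1 \2", text)
    # Pad punctuation with spaces so it becomes its own token
    text = re.sub(r"([.,!?;:\"()])", r" \1 ", text)
    return text.split()

print(tokenize("The cat's toy, it seems, is lost."))
```

Languages without whitespace between words (e.g. Chinese) cannot be handled this way, which is why dedicated segmenters such as Stanford.NLP.Segmenter exist.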
- View on NuGet: http://www.nuget.org/packages/Stanford.NLP.Segmenter
Dependencies

Here are the packages that version 126.96.36.199 of Stanford.NLP.Segmenter depends on.

- net.sf.mpxj-ikvm : (7.3.0)
Installing with NuGet
PM> Install-Package Stanford.NLP.Segmenter -Version 188.8.131.52
Packages that Depend on Stanford.NLP.Segmenter
No packages were found that depend on version 184.108.40.206.