Morphology and Stemming | searchivarius.org

A fast morphological algorithm with unknown word guessing induced by a dictionary for a web search engine  Ilya Segalovich
ChaSen   - Word segmentation, POS tagging and morphological analysis in Japanese.
Dutch lemmatizer  
Information Retrieval Based on Context Distance and Morphology  Hongyan Jing, Evelyne Tzoukermann - In some cases, stemming improves retrieval accuracy because the morphological variants also happen to be semantically related. In other cases, it reduces accuracy because the variants are not semantically related, so stemming introduces noise into the statistical counts. The authors propose taking semantic closeness into account to make stemming more useful.
LemmaGen   - Extremely efficient, high-throughput lemmatization.
OpenNLP   - The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.
RiTa   - An easy-to-use toolkit for experiments in natural language and generative literature. It includes text generation via CFGs and Markov chains, and taggers for syllables, phonemes, stress, and POS. RiTa also provides a Java API for WordNet and morphology modules.
Snowball   - Open-source stemmer library for English, French, Finnish, Russian and some other languages...
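
As a rough illustration of the trade-off noted in the Jing and Tzoukermann entry above, the sketch below shows how stemming merges related variants but can also merge semantically unrelated words. It is a toy suffix stripper only: the suffix list and the names `toy_stem` and `conflation_classes` are made up for illustration, and this is not the Porter or Snowball algorithm, nor the method of any paper listed here.

```python
from collections import defaultdict

# Hypothetical suffix list for illustration only; real stemmers use
# ordered rule sets with conditions on the remaining stem.
SUFFIXES = ("ities", "ity", "ing", "es", "ed", "s", "e")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def conflation_classes(words):
    """Group words by their toy stem, preserving input order within a group."""
    classes = defaultdict(list)
    for w in words:
        classes[toy_stem(w)].append(w)
    return dict(classes)


if __name__ == "__main__":
    vocab = ["connected", "connecting", "universe", "university"]
    for stem, members in conflation_classes(vocab).items():
        print(stem, members)
```

Here "connected" and "connecting" fall into one class, which helps recall, while "universe" and "university" also share a class even though they are not close in meaning. That second kind of merge is the noise the entry above refers to.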