Component management systems that aggregate NLP components such as part-of-speech taggers, named entity taggers, semantic role labelers, and syntactic parsers.

(Curator) Illinois NLP - An analog of UIMA and GATE.

AllenNLP

ClearTK - ClearTK provides a framework for developing statistical natural language processing (NLP) components in Java and is built on top of Apache UIMA. It includes a common interface and wrappers for popular machine learning libraries such as SVMlight, LIBSVM, OpenNLP MaxEnt, and Mallet.

FLAIR - A very simple framework for state-of-the-art NLP, developed by Zalando Research.

GATE - General Architecture for Text Engineering - the Eclipse of Natural Language Engineering, the Lucene of Information Extraction, the leading toolkit for Text Mining

MLlib - Apache Spark's scalable machine learning library.

Nakala - a text mining framework inspired by UIMA. It contains a number of core classes organized in a framework that allows for rapid prototyping and maximizes code reuse, all resulting in quicker deployment to production.

Pattern - A web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, an HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and visualization.

PyText - A natural language modeling framework based on PyTorch, developed by Facebook.

Textacy - higher-level NLP built on spaCy

The (CSE) Configuration Space Exploration Framework - The CSE is a core component of the OAQA project and is a framework built on top of UIMA.

UIMA - Unstructured Information Management Architecture - A powerful framework that was used in IBM Watson.
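
The common idea behind the frameworks listed here, a component manager that chains independent NLP components over a shared document, can be sketched in a few lines of Python. This is only an illustrative toy and not the API of any framework above: `Document`, `Pipeline`, `tokenizer`, `capitalized_tagger`, and `analyze` are all hypothetical names, and the "entity tagger" is a crude capitalization heuristic standing in for a real model.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Document:
    """Shared container: each component reads and writes annotation layers."""
    text: str
    annotations: Dict[str, list] = field(default_factory=dict)


# A component is any callable that enriches a Document in place.
Component = Callable[[Document], None]


def tokenizer(doc: Document) -> None:
    # Split into word tokens and single punctuation marks.
    doc.annotations["tokens"] = re.findall(r"\w+|[^\w\s]", doc.text)


def capitalized_tagger(doc: Document) -> None:
    # Toy stand-in for a named entity tagger (an assumption, not a real model):
    # keep capitalized tokens that are not the first token of the text.
    tokens = doc.annotations["tokens"]
    doc.annotations["entities"] = [
        tok for i, tok in enumerate(tokens) if i > 0 and tok[0].isupper()
    ]


class Pipeline:
    """Runs registered components in order over one shared Document."""

    def __init__(self) -> None:
        self.components: List[Tuple[str, Component]] = []

    def add(self, name: str, component: Component) -> "Pipeline":
        self.components.append((name, component))
        return self  # allow chained registration

    def run(self, text: str) -> Document:
        doc = Document(text)
        for _name, component in self.components:
            component(doc)
        return doc


def analyze(text: str) -> Document:
    pipeline = Pipeline().add("tokenizer", tokenizer).add("entities", capitalized_tagger)
    return pipeline.run(text)
```

Calling `analyze("The framework was used in IBM Watson.")` returns a `Document` whose `annotations` hold a `tokens` layer and an `entities` layer. Real frameworks add typed annotation schemas, component descriptors, and distributed execution on top of this basic pattern.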