Natural Language Processing & Information Extraction | searchivarius.org

Annotation tools

Conversational agents (dialog systems and chatbots)

Coreference resolution

Question Answering (QA), Catalogs/lists, Sentiment Analysis and Opinion Mining...

Distributional semantics

Document Classification & Categorization

Document Parsers & Cleaners

Extraction & Summarization
Temporal taggers


Knowledge Bases & Knowledge Base Completion

Language Modelling/Generation/Detection

Machine Translation
Statistical systems, Rule-based systems, Example-based systems...

Morphology and Stemming

Ontologies/Encyclopedias & Semantic Web


Parsing & Tagging
Constituency and Dependency Parsers, Named Entity Recognizers (NER), Part of Speech (POS) tagging...

Question Answering (QA)
Slot filling, Question generation

Reasoning & Inference & Rule Engines

Search Engines
Crawlers, Forward & Graph Indices, Indri & Lemur...

Sentiment analysis

Toolkits & Frameworks

Topic Modelling

Various "banks"
WordNet, FrameNet

Word and document embeddings

Word segmentation/tokenization

Word Sense Disambiguation (WSD)


A curated list of machine learning and NLP software  
A Survey of Text Mining Architectures and the UIMA Standard.  Mathias Bank, Martin Schierle
A what-to-use diagram (and a blog post) for open-source NLP software
Deep learning for NLP in PyTorch  
DeepNLP-models-Pytorch   - PyTorch implementations of various deep NLP models from CS224n (Stanford University: NLP with Deep Learning).
DKPro   - a set of useful open-source UIMA components.
English Parser Evaluation Corpus   - dependency-parser evaluation data
GramLab   - free and open-source linguistic tools for processing textual information.
Implementation of the Brown hierarchical word clustering algorithm.  Percy Liang
Intelligent Archive  
Practical PyTorch  
Semantic Matching   - Semantic matching is a type of ontology matching technique that relies on semantic information encoded in lightweight ontologies to identify nodes that are semantically related in graph-like structures.
Stanford's Tregex   - a utility for matching patterns in trees, based on tree relationships and regular expression matches on nodes (the name is short for "tree regular expressions").
TextCat   - a language guesser and text categorizer.
UW SPF - The University of Washington Semantic Parsing Framework  
warp-ctc   - A fast parallel implementation of Connectionist Temporal Classification (CTC), on both CPU and GPU.