Language Modelling/Generation/Detection

Directory

CMU Sphinx - Speech Recognition Toolkit

AWD-LSTM / AWD-QRNN Language Model

berkeleylm - an n-gram language model library from UC Berkeley.

Bow (or libbow) - a C library for language modeling.

Compact Language Detector 2 - a very fast and compact library that detects over 80 languages in UTF-8 text.

Continuous Space Language Model toolkit

Downloadable NLG systems

EasyGen - a visual user interface to help set up simple neural network generation tasks.

faster-rnnlm (Yandex) - a faster recurrent neural network language modeling toolkit with noise contrastive estimation and hierarchical softmax.

gensim - topic modelling for humans

GloVe: Global Vectors for Word Representation

IRST LM Toolkit - a free, open-source language modelling toolkit that can be used with Moses in place of SRILM (which is not free).

KenLM Language Model Toolkit - memory-efficient language modeling.

NPLM - a toolkit for training and using feedforward neural language models of the kind introduced by Bengio et al. (2003).

Open Source Speech Software from Carnegie Mellon University

RandLM - space-efficient n-gram language models built using randomized representations (David Talbot, Miles Osborne, Oliver Wilson).

Recurrent Neural Network (RNN) Language Modeling by Microsoft Research

RiTa - an easy-to-use toolkit for experiments in natural language and generative literature. It includes text generation via CFGs and Markov chains, and taggers for syllables, phonemes, stress, and POS. RiTa also provides a Java API for WordNet and morphology modules.

RNNLM Toolkit (Tomas Mikolov) - neural-network-based language models.

SimpleNLG - Java API for Natural Language Generation.

SRILM - The SRI Language Modeling Toolkit

The CMU-Cambridge Statistical Language Modeling toolkit (R. Rosenfeld, Philip Clarkson)

The MIT Language Modeling (MITLM) toolkit

The Word Vector Tool - a flexible Java library for statistical language modelling. It especially supports creating word vector representations of text documents in the vector space model (each document is represented by the terms it contains). The vector space model is the starting point for many text processing applications (e.g. web mining, text classification, or information retrieval).

word2vec - vector space representations of words, trained with the continuous bag-of-words and skip-gram architectures.