DeepSpeech - A TensorFlow implementation of Baidu's DeepSpeech architecture.

deepspeech.pytorch - Speech recognition using DeepSpeech2 and the CTC loss function.

Eesen - an end-to-end speech recognition toolkit

End-to-end automatic speech recognition from scratch in TensorFlow.

espnet - an end-to-end speech processing toolkit that focuses mainly on end-to-end speech recognition and end-to-end text-to-speech.

Espresso - A fast end-to-end neural automatic speech recognition (ASR) toolkit, based on fairseq and PyTorch.

OpenSeq2Seq - Toolkit for efficient experimentation with various sequence-to-sequence models.

SPEECH3 - CMU's Deep Learning Research Toolkit. SPEECH3 supports Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), Connectionist Temporal Classification (CTC), and Asynchronous Stochastic Gradient Descent (ASGD) across multiple GPUs.

wav2letter - a simple and efficient end-to-end Automatic Speech Recognition (ASR) system from Facebook AI Research, by Ronan Collobert, Christian Puhrsch, Gabriel Synnaeve, Neil Zeghidour, and Vitaliy Liptchinsky.