Multimedia Processing

BoofCV   - an open source Java library for real-time computer vision and robotics applications. Written from scratch for ease of use and high performance.
Bulkr   - A tool to download Flickr images
Convolutional Shape Encoder  Dan Fischetti
Deep Learning Video Classifier/Editor with Caffe for Obscene Shots  
Dlib   - a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++.
Eesen   - an end-to-end speech recognition toolkit
End-to-end automatic speech recognition from scratch in Tensorflow.  
Essentia   - C++ library of algorithms to extract features from audio files, including Python bindings.
face_classification  Octavio Arriaga - Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.
Festival   - a free software multi-lingual speech synthesis workbench.
ImageHash   - image hashing library written in Python.
imutils   - A series of convenience functions to make basic image processing functions such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.
jsfeat   - JavaScript Computer Vision library
Kaldi   - a toolkit for speech recognition written in C++.
LIRE: Lucene Image Retrieval  
MelodyShape   - an open source Java library and tool to compute the melodic similarity between monophonic music pieces.
MPEG-7 Feature Extraction Library  
Neural Doodle  Alex J. Champandard - Use a deep neural network to borrow the skills of real artists and turn your two-bit doodles into masterpieces!
ocropy   - Python-based OCR package using recurrent neural networks.
OpenCV (Open Source Computer Vision)   - a library of programming functions for real-time computer vision. It also includes statistical learning software implementing several methods, including naive Bayes, SVM, and gradient boosting.
OpenFace  Brandon Amos - Face recognition with Google's FaceNet deep neural network.
OpenFace   - a state-of-the-art open source tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
openMVG   - Open Multiple View Geometry library. Basis for 3D computer vision and Structure from Motion.
OpenPose   - A Real-Time Multi-Person Keypoint Detection And Multi-Threading C++ Library
OpenSMILE   - The Munich Versatile and Fast Open-Source Audio Feature Extractor
OverFeat   - an image recognizer and feature extractor built around a convolutional network.
Praat   - phonetics software package
Pretrained ConvNets for pytorch   - ResNeXt101, ResNet152, InceptionV4, InceptionResnetV2, etc.
Speech Kitchen   - virtual machines for speech recognition experiments
SPEECH3 CMU's Deep Learning Research Toolkit   - SPEECH3 supports Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), Connectionist Temporal Classification (CTC), and Asynchronous Stochastic Gradient Descent (ASGD) across multiple GPUs.
Sphinx-4   - a speech recognizer written entirely in Java
STAIR Vision Library   - codenamed lasik, it contains computer vision research code initially developed to support the STanford AI Robot project.
TLK: The transLectures-UPV Toolkit  
Torch implementation of DeepMask and SharpMask  
VLFeat   - an open source library that implements popular computer vision algorithms including HOG, SIFT, MSER, k-means, hierarchical k-means, agglomerative information bottleneck, SLIC superpixels, and quick shift.
Wide residual networks   - This code was used for experiments with Wide Residual Networks by Sergey Zagoruyko and Nikos Komodakis.
Yolo   - a fast, state-of-the-art real-time object detection system