gnumpy - a simple Python module with an interface almost identical to numpy's that runs its computations on your computer's GPU.
Horovod - Distributed training framework for TensorFlow (claimed to be faster than traditional Distributed TensorFlow).
Hummingbird - a library for compiling trained traditional ML models into tensor computations. Hummingbird allows users to seamlessly leverage neural network frameworks (such as PyTorch) to accelerate traditional ML models.
Keras.js - Run trained Keras models in your browser, GPU-powered using WebGL. Models are serialized directly from the Keras JSON-format configuration file and associated HDF5 weights.
NNPACK - Acceleration package for neural networks on multi-core CPUs.
OverFeat - an image recognizer and feature extractor built around a convolutional network.
RAPIDS - a suite of open-source software libraries for running end-to-end data science and analytics pipelines entirely on GPUs.
SqueezeNet - AlexNet-level accuracy with 50x fewer parameters.
TorchCraft - an interface between StarCraft: Brood War and Torch, the deep learning environment.