Ever had to train a complex model using the limited resources available at home? Perhaps you thought about using GPUs or something. Yet even the largest state-of-the-art neural network, which requires as many as 16 expensive GPUs, has only 10 billion (10^10) connections. At the same time, a small computational device called Felis catus (FC) has up to 10^13 synapses, which makes it a system three orders of magnitude more complex. This computational power is put to good use. For example, it permits solving motion-related differential equations in real time.

Deep neural networks demonstrate good performance in the task of speech recognition, so it was quite natural to apply an FC to this problem. However, we decided to take it one step further and trained the FC to recognize visual cues in addition to voice signals. It was a challenging task because of the FC's proclivity to overtrain and the lack of theoretical guarantees for convergence. The model took more than a year to converge, but it was worth the wait. We achieved an almost perfect recognition rate, and our results are statistically robust. A demo is available online.

This post is co-authored with Anna Belova.