This is the personal web page/blog of Leonid Boytsov. He is currently a Sr. Research Scientist at Amazon AWS AI Labs. Overall, he has been a professional computer scientist since 1996 (full time since 1997). Leonid remembers dependency parsing & rotary phones. He started out as a full-stack developer but has been gradually drifting towards the land of computer science research. This drift began with an interest in search applications and algorithms.

An important by-product of Leonid's research is an efficient and flexible library for k-NN search codenamed NMSLIB, created in collaboration with several other folks. Thanks to the contribution of Yury Malkov, the library was adopted by Amazon. The core retrieval method, HNSW, contributed by Yury, was also reimplemented in the Facebook library FAISS. A brief description of this collaboration can be found on this LTI news page. This work was discussed in a March 2018 podcast with Radim Řehůřek (author of Gensim). Feel free to check our code on GitHub.

NMSLIB integrates with another retrieval toolkit called FlexNeuART, which in August, November, and December 2020 produced the best neural and traditional submissions on the MS MARCO document ranking leaderboard. Notably, the strongest traditional run outperformed a number of neural systems.

Leonid also co-authored an extremely efficient algorithm for lightweight compression of sorted integers. We show that this algorithm can decompress at the speed of reading from memory. You can find the software on GitHub. This software grew out of the now-popular FastPFor library. FastPFor has Python bindings.

Leonid was a graduate research assistant (aka PhD student) at the Language Technologies Institute at Carnegie Mellon University (under the supervision of Professor Eric Nyberg). In his thesis, "Efficient and Accurate Non-Metric k-NN Search with Applications to Text Matching", he explored how various linguistic, neural, and lexical features can be incorporated directly into a candidate generation component (via k-NN search). He assisted in teaching the following courses & seminars: Algorithms for NLP (11-711) in 2013, and Software Engineering I (11-791) and Data Science Seminar (11-631) in 2014.

Featured blog posts:

Additional information: