k-NN for question answering (QA) and information retrieval (IR). Written by me with the help of David Novak and Yury Malkov.
Non-Metric Space Library (NMSLIB). Bilegsaikhan Naidan, me, and other contributors.
Accurate BM25 similarity for Lucene.
SOLR Annographix: a prototype engine for indexing and retrieving annotation graphs, built on top of Apache SOLR.
A C++ library to compress and intersect sorted lists of integers using SIMD instructions [library][data generation]. Daniel Lemire, me, Nathan Kurz, and several other contributors.
Permutation/randomization algorithms for pairwise significance testing, both unadjusted and with an adjustment for multiple comparisons.
FastPFor: software for very efficient decoding of sorted integer arrays. Daniel Lemire, me, and several other contributors. I contributed to the development of the SIMD-based methods.
Source code (and datasets) for my papers on approximate dictionary searching.