Selected presentations:

  1. Data-efficient and Explainable Ranking with BERT models
    Talks @ Google, Glasgow University, and University of New Hampshire. Fall 2021-Spring 2022
    [Slides]

  2. Leonid Boytsov on k-NN search and information retrieval:
    A podcast with Radim Řehůřek (author of Gensim). March 2018.

  3. Off the Beaten Path: Let’s Replace Term-Based Retrieval with k-NN Search
    Talks @ Allen AI, Bloomberg, Amazon, & Visa Research. Fall 2017
    [Video] [Slides]

Thesis:

Efficient and Accurate Non-Metric k-NN Search with Applications to Text Matching, 2018.
[BLOG] [Software: (1)(2)]

Peer-reviewed papers and key tech-reports:

  1. I Mokrii, L Boytsov, P Braslavski, 2021.
    A Systematic Evaluation of Transfer Learning and Pseudo-labeling with BERT-based Ranking Models.

    In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021).
    [BLOG] [PDF] [Software (a straightforward use of FlexNeuART)]

  2. Boytsov, L., Kolter, Z., 2021.
    Exploring Classic and Neural Lexical Translation Models for Information Retrieval: Interpretability, Effectiveness, and Efficiency Benefits.

    In Proceedings of the 43rd European Conference on Information Retrieval (ECIR 2021).
    [BLOG] [PDF]

  3. Boytsov, L., 2020.
    Traditional IR rivals neural models on the MS MARCO Document Ranking Leaderboard.

    Tech report.
    [BLOG] [PDF] [Software]

  4. Boytsov, L., Nyberg, E., 2020.
    Flexible retrieval with NMSLIB and FlexNeuART.

    In Proceedings of the 2nd EMNLP Workshop for Natural Language Processing Open Source Software (NLP-OSS), 2020.
    [PDF] [Software] [Slides & Video]

  5. C Kamphuis, AP de Vries, L Boytsov, J Lin, 2020.
    Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants.

    In Proceedings of the European Conference on Information Retrieval (ECIR 2020).
    [BLOG] [PDF]

  6. P Efimov, A Chertok, L Boytsov, P Braslavski, 2020.
    SberQuAD--Russian Reading Comprehension Dataset: Description and Analysis.

    In Proceedings of CLEF 2020: Experimental IR Meets Multilinguality, Multimodality, and Interaction.
    [BLOG] [PDF]

  7. Boytsov, L., Nyberg, E., 2019.
    Pruning Algorithms for Low-Dimensional Non-metric k-NN Search: A Case Study.

    In Proceedings of the 12th International Conference on Similarity Search and Applications (SISAP 2019).
    [BLOG] [PDF] [Software] [Slides]
    The definitive version is available at www.springerlink.com

  8. Boytsov, L., Nyberg, E., 2019.
    Accurate and Fast Retrieval for Complex Non-metric Data via Neighborhood Graphs.

    In Proceedings of the 12th International Conference on Similarity Search and Applications (SISAP 2019).
    [BLOG] [PDF] [Software] [Slides]
    The definitive version is available at www.springerlink.com

  9. Boytsov, L., Novak, D., Malkov, Y., Nyberg, E., 2016.
    Off the Beaten Path: Let’s Replace Term-Based Retrieval with k-NN Search.

    In Proceedings of CIKM 2016.
    [BLOG & Talk] [PDF] [ Software: (1)(2) ] [Slides]

  10. Lemire, D., Boytsov, L., Kurz, N., 2016.
    SIMD Compression and the Intersection of Sorted Integers.

    Software: Practice and Experience.
    [BLOG] [PDF] [ Software: (1)(2) ]
    The definitive version is available on the publisher's site

  11. Bilegsaikhan, N., Boytsov, L., Nyberg, E., 2015.
    Permutation Search Methods are Efficient, Yet Faster Search is Possible.

    In Proceedings of the VLDB Endowment.
    [PDF] [Software] [Slides] [BLOG]

  12. Lemire, D., Boytsov, L., 2015.
    Decoding billions of integers per second through vectorization.
    Software: Practice and Experience.
    [PDF] [Software] [Slides]
    The definitive version is available on the publisher's site

  13. Tsvetkov, Y., Boytsov, L., Gershman, A., Nyberg, E., Dyer, C. 2014.
    Metaphor Detection with Cross-Lingual Model Transfer.
    In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL'2014).
    [PDF]

  14. Ponomarenko, A., Avrelin, N., Bilegsaikhan, N., Boytsov, L., 2014.
    Comparative Analysis of Data Structures for Approximate Nearest Neighbor Search.

    In Proceedings of The Third International Conference on Data Analytics.
    [PDF] [Software] [BLOG]

  15. Boytsov, L., Bilegsaikhan, N., 2013.
    Learning to Prune in Metric and Non-Metric Spaces.

    In Advances in Neural Information Processing Systems 2013.
    [PDF] [Poster] [Supplemental] [Software]

  16. Boytsov, L., Bilegsaikhan, N., 2013.
    Engineering Efficient and Effective Non-Metric Space Library.

    In Proceedings of the 6th International Conference on Similarity Search and Applications (SISAP 2013).
    [BLOG] [PDF] [Software] [Slides] [BLOG]
    The definitive version is available at www.springerlink.com

  17. Boytsov, L., Belova, A., Westfall, P., 2013.
    Deciding on an Adjustment for Multiplicity in IR Experiments.

    In Proceedings of SIGIR 2013.
    [BLOG] [PDF] [Software] [Slides]

  18. Boytsov, L., 2012.
    Super-linear Indices for Approximate Dictionary Searching.

    In Proceedings of the 5th International Conference on Similarity Search and Applications (SISAP 2012).
    [PDF] [Software] [Slides]
    The definitive version is available at www.springerlink.com (in my version a couple of errors are fixed)

  19. Boytsov, L., 2011.
    Indexing methods for approximate dictionary searching: Comparative analysis.

    Journal of Experimental Algorithmics (JEA).
    [PDF] [Software]

Technical reports:

  1. Boytsov, L., 2017.
    A Simple Derivation of the Heap's Law from the Generalized Zipf's Law

    [PDF]

  2. Boytsov, L., 2015.
    Structured Retrieval using SOLR.

    [PDF] [Software]

  3. Wang, D., Boytsov, L., Araki, J., Patel, A., Gee, J., Liu, Z., Nyberg, E., Mitamura, T., 2014.
    CMU Multiple-choice Question Answering System at NTCIR-11 QA-Lab.
    In Proceedings of the 11th NTCIR Conference.
    [PDF]

  4. Boytsov, L., Belova, A., 2012.
    Does Category A Anchor Text Improve Category B Results?
    In TREC-21: Proceedings of the Twenty-First Text REtrieval Conference.
    [PDF]

  5. Boytsov, L., Belova, A., 2011.
    Evaluating Learning-to-Rank Methods in the Web Track Adhoc Task.

    In TREC-20: Proceedings of the Twentieth Text REtrieval Conference.
    [PDF]

  6. Boytsov, L., Belova, A., 2010.
    Lessons Learned from Indexing Close Word Pairs.

    In TREC-19: Proceedings of the Nineteenth Text REtrieval Conference.
    [PDF]