Invited and Job Talks:

  1. Information Retrieval Science.
    A discussion on information retrieval & unsupervised training of IR models. Weaviate Podcast, January 2023.

  2. Understanding Performance of Long-Document Ranking Models through Comprehensive Evaluation and Leaderboarding.
    Talk @ Seminar @ Naver Labs Europe, July 2022

  3. Data-efficient and Explainable Ranking with BERT models
    Talks @ Google, Glasgow University, and University of New Hampshire. Fall 2021-Spring 2022
    [Slides]

  4. Leonid Boytsov on k-NN search and information retrieval:
    A podcast with Radim Řehůřek (author of Gensim), March 2018.

  5. Off the Beaten Path: Let’s Replace Term-Based Retrieval with k-NN Search
    Talks @ Allen AI, Bloomberg, Amazon, & Visa Research. Fall 2017
    [Video] [Slides]

Key peer-reviewed papers and tech-reports:

  1. L Boytsov, P Patel, V Sourabh, R Nisar, S Kundu, R Ramanathan, E Nyberg, 2024.
    InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers.
    Transactions on Machine Learning Research (TMLR), 2024.
    [PDF] [Software]

  2. P Efimov, L Boytsov, E Arslanova, P Braslavski, 2021.
    The impact of cross-lingual adjustment of contextual word representations on zero-shot transfer.

    In European Conference on Information Retrieval 2023.
    [PDF]

  3. I Mokrii, L Boytsov, P Braslavski, 2021.
    A Systematic Evaluation of Transfer Learning and Pseudo-labeling with BERT-based Ranking Models.

    In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021).
    [BLOG] [PDF] [Software (a straightforward use of FlexNeuART)]

  4. Boytsov, L., Kolter, Z., 2021.
    Exploring Classic and Neural Lexical Translation Models for Information Retrieval: Interpretability, Effectiveness, and Efficiency Benefits.

    In Proceedings of the 43rd European Conference on Information Retrieval (ECIR 2021).
    [BLOG] [PDF]

  5. Boytsov, L., 2020.
    Traditional IR rivals neural models on the MS MARCO Document Ranking Leaderboard.

    Tech report.
    [BLOG] [PDF] [Software]

  6. Boytsov, L., Nyberg, E., 2020.
    Flexible retrieval with NMSLIB and FlexNeuART.

    In Proceedings of the 2nd EMNLP Workshop for Natural Language Processing Open Source Software (NLP-OSS), 2020.
    [PDF] [Software] [Slides & Video]

  7. C Kamphuis, AP de Vries, L Boytsov, J Lin, 2020.
    Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants.

    In Proceedings of the European Conference on Information Retrieval (ECIR 2020).
    [BLOG] [PDF]

  8. P Efimov, A Chertok, L Boytsov, P Braslavski, 2020.
    SberQuAD--Russian Reading Comprehension Dataset: Description and Analysis.

    In Proceedings of CLEF 2020: Experimental IR Meets Multilinguality, Multimodality, and Interaction.
    [BLOG] [PDF]

  9. Boytsov, L., Nyberg, E., 2019.
    Pruning Algorithms for Low-Dimensional Non-metric k-NN Search: A Case Study.

    In Proceedings of the 12th International Conference on Similarity Search and Applications (SISAP 2019).
    [BLOG] [PDF] [Software] [Slides]
    The definitive version is available at www.springerlink.com

  10. Boytsov, L., Nyberg, E., 2019.
    Accurate and Fast Retrieval for Complex Non-metric Data via Neighborhood Graphs.

    In Proceedings of the 12th International Conference on Similarity Search and Applications (SISAP 2019).
    [BLOG] [PDF] [Software] [Slides]
    The definitive version is available at www.springerlink.com

  11. Boytsov, L., 2017.
    A Simple Derivation of the Heap's Law from the Generalized Zipf's Law

    [PDF]

  12. Boytsov, L., Novak, D., Malkov, Y., Nyberg, E., 2016.
    Off the Beaten Path: Let’s Replace Term-Based Retrieval with k-NN Search

    In Proceedings of CIKM 2016.
    [BLOG & Talk] [PDF] [ Software: (1)(2) ] [Slides]

  13. Lemire, D., Boytsov, L., Kurz, N., 2016.
    SIMD Compression and the Intersection of Sorted Integers.

    Software: Practice and Experience.
    [BLOG] [PDF] [ Software: (1)(2) ] The definitive version is available on the publisher's site

  14. Bilegsaikhan, N., Boytsov, L., Nyberg, E., 2015.
    Permutation Search Methods are Efficient, Yet Faster Search is Possible.

    In Proceedings of the VLDB Endowment.
    [PDF] [Software] [Slides] [BLOG]

  15. Lemire, D., Boytsov, L., 2015.
    Decoding billions of integers per second through vectorization.
    Software: Practice and Experience.
    [PDF] [Software] [Slides]
    The definitive version is available on the publisher's site

  16. Tsvetkov, Y., Boytsov, L., Gershman, A., Nyberg, E., Dyer, C., 2014.
    Metaphor Detection with Cross-Lingual Model Transfer.
    In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL'2014).
    [PDF]

  17. Ponomarenko, A., Avrelin, N., Bilegsaikhan, N., Boytsov, L., 2014.
    Comparative Analysis of Data Structures for Approximate Nearest Neighbor Search.

    In Proceedings of The Third International Conference on Data Analytics.
    [PDF] [Software] [BLOG]

  18. Boytsov, L., Bilegsaikhan, N., 2013.
    Learning to Prune in Metric and Non-Metric Spaces.

    In Advances in Neural Information Processing Systems 2013.
    [PDF] [Poster] [Supplemental] [Software]

  19. Boytsov, L., Bilegsaikhan, N., 2013.
    Engineering Efficient and Effective Non-Metric Space Library.

    In Proceedings of the 6th International Conference on Similarity Search and Applications (SISAP 2013).
    [BLOG] [PDF] [Software] [Slides] [BLOG]
    The definitive version is available at www.springerlink.com

  20. Boytsov, L., Belova, A., Westfall, P., 2013.
    Deciding on an Adjustment for Multiplicity in IR Experiments.

    In Proceedings of SIGIR 2013.
    [BLOG] [PDF] [Software] [Slides]

  21. Boytsov, L., 2012.
    Super-linear Indices for Approximate Dictionary Searching.

    In Proceedings of the 5th International Conference on Similarity Search and Applications (SISAP 2012).
    [PDF] [Software] [Slides]
    The definitive version is available at www.springerlink.com (in my version a couple of errors are fixed)

  22. Boytsov, L., 2011.
    Indexing methods for approximate dictionary searching: Comparative analysis.

    Journal of Experimental Algorithmics (JEA).
    [PDF] [Software]

  23. Boytsov, L., Belova, A., 2011.
    Evaluating Learning-to-Rank Methods in the Web Track Adhoc Task.

    In TREC-20: Proceedings of the Twentieth Text REtrieval Conference.
    [PDF]

Thesis:

Efficient and Accurate Non-Metric k-NN Search with Applications to Text Matching, 2018.
[BLOG] [Software: (1)(2)]