A few days ago I launched a traditional IR system into (lower layers of) the Transformer cloud. Although inferior to most BERT-based models, it outperformed several neural submissions (as well as all non-neural ones), including two submissions that used a large pretrained Transformer model for re-ranking.
My objectives were:
To provide a stronger traditional baseline;
To develop an effective first-stage retrieval system,
one that stays efficient without expensive index-time precomputation.
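The post doesn't show the system itself, but traditional first-stage retrieval of this kind is typically BM25-style lexical scoring over an inverted index, which needs no learned precomputation at indexing time. Below is a minimal, pure-Python sketch of BM25 scoring (illustrative only; the function name, parameters, and toy corpus are my own, not the submitted system):

```python
import math
from collections import Counter

def bm25_scores(query_tokens, corpus_tokens, k1=1.2, b=0.75):
    """Score each document against the query with the BM25 formula.

    corpus_tokens: list of tokenized documents (lists of strings).
    Returns one score per document; higher means more relevant.
    """
    n_docs = len(corpus_tokens)
    avgdl = sum(len(d) for d in corpus_tokens) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in corpus_tokens:
        df.update(set(doc))

    def idf(term):
        # BM25 idf, with +1 inside the log to keep it non-negative.
        return math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))

    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Term-frequency saturation plus length normalization.
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            )
            score += idf(term) * norm
        scores.append(score)
    return scores

# Toy example: the first document should rank highest for "cat mat".
corpus = [
    "the cat sat on the mat".split(),
    "dogs chase birds in the yard".split(),
    "transformers rerank candidate documents".split(),
]
print(bm25_scores("cat mat".split(), corpus))
```

In practice an inverted index lets this run over millions of documents without touching most of them, which is what makes such a system attractive as a cheap candidate generator ahead of a Transformer re-ranker.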