CLEF tracks - "Ten Years of CLEF Test Data": a data set for system benchmarking and research purposes on the DIRECT system.
ClueWeb09 - A data set used in the Text Retrieval Conference (TREC). It is distributed as two sets, A and B. Set A contains approximately 500 million pages in 10 languages. Set B is a subset of A that contains 50 million pages.
ClueWeb12 (Jamie Callan et al.) - The successor to ClueWeb09.