Data Sets and State-of-the-art (SOTA): Question Answering (QA) | searchivarius.org

AQUA-RAT (Algebra Question Answering with Rationales)   - algebraic word problems with rationales.
CMU Question-Answer Dataset   - by Noah Smith et al.
Community QA data set  
FigureQA   - an annotated figure dataset for visual reasoning.
Great Auk knowledge master web-site  
Jeopardy! games  
Jimmy Lin's collections   - Various software and data sets, including MIT Aranea, MIT 109 (a reusable collection for TREC 2002), and the Pourpre scoring script for automatically evaluating answers to complex questions.
MS MARCO   - A Reading Comprehension Dataset for the Artificial Intelligence research community.
Open Question Answering Over Curated and Extracted Knowledge Bases from KDD 2014  
Question answering dataset featured in "Teaching Machines to Read and Comprehend"  
Question-Generation Corpus (from Microsoft Research)   - This corpus contains candidate fill-in-the-blank questions and answers generated from sentences in articles on Wikipedia's lists of vital articles and popular pages, along with question-quality ratings from multiple judges and unique judge IDs.
Quiz Ball data  
Quiz-Zone   - quality quiz questions.
SQuAD   - The Stanford Question Answering Dataset (about 100K QA pairs based on Wikipedia passages)
Text REtrieval Conference (TREC) QA Track  
TriviaQA   - A Large Scale Dataset for Reading Comprehension and Question Answering
Visual Genome   - an ongoing effort to connect structured image concepts to language
WikiSQL   - A large annotated semantic parsing corpus for developing natural language interfaces.