AQUA-RAT (Algebra Question Answering with Rationales)   - algebraic word problems with rationales.
ARC, the AI2 Reasoning Challenge  
CMU Question-Answer Dataset   - Noah Smith et al.
Community QA data set  
FigureQA   - an annotated figure dataset for visual reasoning.
Great Auk knowledge master web-site  
Jeopardy! games  
Jimmy Lin's collections   - Various software and data sets, including MIT Aranea, MIT 109 (reusable collection for TREC 2002), and Pourpre scoring script for automatically evaluating complex questions.
LC-QuAD   - a corpus for complex question answering over knowledge graphs (a data set of natural language queries with corresponding SPARQL queries).
MovieQA   - aims to evaluate automatic story comprehension from both video and text. The data set consists of almost 15,000 multiple-choice questions with answers, drawn from over 400 movies, and features high semantic diversity.
MS MARCO   - A Reading Comprehension Dataset for the Artificial Intelligence research community.
Open Question Answering Over Curated and Extracted Knowledge Bases from KDD 2014  
Question answering dataset featured in "Teaching Machines to Read and Comprehend"  
Question-Generation Corpus (from Microsoft Research)   - This corpus contains candidate fill-in-the-blank questions and answers generated from sentences taken from articles on Wikipedia's listing of vital articles and popular pages, along with ratings of the question quality from multiple judges, as well as unique judge IDs.
Quiz Ball data  
Quiz-Zone   - quality quiz questions.
SQuAD   - The Stanford Question Answering Dataset (about 100K QA pairs based on Wikipedia passages).
Text REtrieval Conference (TREC) QA Track  
TriviaQA   - A Large Scale Dataset for Reading Comprehension and Question Answering.
Visual Genome   - an ongoing effort to connect structured image concepts to language.
WikiSQL   - A large annotated semantic parsing corpus for developing natural language interfaces.