100+ Interesting Data Sets for Statistics Robert Seaton |
1940 USA census |
6 Dataset Lists Curated by Data Scientists |
A comprehensive list of data sets for machine learning |
Allen Brain Observatory - Standardized in vivo survey of physiological activity in the mouse visual cortex. |
Data Depot - DataDepot is a set of tools for collaboratively uploading, sharing, and analyzing data. You can use DataDepot to track personal data, to explore public data, and to engage with scientific data. |
Datasets for Data Mining, Analytics and Knowledge Discovery |
LinkData.org - A data publishing community/hub website. |
Linked Data @ VU |
Mathematical Retrieval Project |
Million Song Dataset |
Nomao datasets - Data deduplication, learning to rank, online reviews, recommendations, text generation, voting networks. |
Open speech corpora list Josh Meyer |
Pizza&Chili Corpus Gonzalo Navarro, Paolo Ferragina |
Publicly Available Large Data Sets for DB Research Daniel Lemire |
RedditSota - State-of-the-art results for all machine learning problems. |
Research Pipeline Data sets |
Teens and Online Privacy - 2012 survey with questions about teens' attitudes towards privacy and their information management practices. |
Time series data: classification and clustering datasets - A very diverse set of clustering/classification data. |
UCI machine learning repository - UC Irvine Machine Learning Repository. |
Yahoo Data Sets - Includes n-grams and anonymized query logs. |