Datasets and Applications

Kaggle

Kaggle offers many datasets for public download. It also hosts data science competitions, often with significant cash prizes.

OpenRefine

OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.

KD Nuggets

KD Nuggets hosts a number of open datasets.

Data World

Data World is a commercial site hosting thousands of datasets. It offers search and hosting facilities with a free preview mode.

Awesome Public Datasets

An awesome list of public datasets hosted on GitHub.

AWS Large Datasets

A collection of large datasets hosted on AWS.

data.gov

U.S. government open data tools and datasets.

UCI Machine Learning Repo

UCI datasets for machine learning.

Open Data Studio

Talend offers a free version of Open Data Studio.

RStudio

RStudio is a set of integrated tools designed to help you be more productive with R.
