Machine Learning Datasets

This repository contains a copy of machine learning datasets used in tutorials on MachineLearningMastery.com.

This repository was created so that the datasets used in tutorials remain available and do not depend on unreliable third-party hosts.

Many tutorials link directly to the raw dataset URL, so dataset filenames should not be changed once they have been added to the repository.
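As a rough illustration of why stable filenames matter, the sketch below shows how a tutorial might build a raw URL from a filename and parse the downloaded CSV. It uses only the Python standard library. The URL layout, the `OWNER` placeholder, the `master` branch name, and the helper names `raw_dataset_url`, `parse_csv`, and `fetch_dataset` are all assumptions made for this example, not details stated in this README. The sample rows are made-up values used only to exercise the parser.

```python
import csv
import io
from urllib.request import urlopen


def raw_dataset_url(filename, owner="OWNER", repo="Datasets", branch="master"):
    """Build a raw-file URL for a dataset.

    The URL layout, owner, repo, and branch are assumptions for this
    sketch; substitute the real values for the repository you use.
    """
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{filename}"


def parse_csv(text):
    """Parse CSV text into a list of rows, skipping blank lines."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


def fetch_dataset(url):
    """Download a CSV file and parse it (requires network access)."""
    with urlopen(url) as response:
        return parse_csv(response.read().decode("utf-8"))


# The filename is the only part of the URL that identifies the dataset,
# so renaming a file would break every tutorial that hard-codes this link.
url = raw_dataset_url("iris.csv")

# Offline demonstration with made-up rows instead of a real download.
sample_rows = parse_csv("5.1,3.5,1.4,0.2,setosa\n\n4.9,3.0,1.4,0.2,setosa\n")
print(url)
print(len(sample_rows), "rows,", len(sample_rows[0]), "columns")
```

With real owner and branch values filled in, `fetch_dataset(raw_dataset_url("some-file.csv"))` would return the rows of that file; if the file were renamed, the same call would fail with an HTTP error.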

Datasets

This section provides a summary of the datasets in this repository.

Binary Classification Datasets

  • Breast Cancer (Wisconsin)
  • Breast Cancer (Yugoslavia)
  • Bank Note Authentication
  • Horse Colic
  • Ionosphere
  • Pima Indians Diabetes
  • Sonar Returns

Multiclass Classification Datasets

  • Glass Identification
  • Iris Flower Species
  • Wheat Seeds
  • Abalone Age (can also be framed as regression)
  • Wine Quality (can also be framed as regression)

Regression Datasets

  • Boston Housing
  • Longley Economic
  • Auto Insurance Total Claims

Univariate Time Series Datasets

  • Daily Minimum Temperatures in Melbourne
  • Daily Maximum Temperatures in Melbourne
  • Daily Female Births in California
  • Monthly International Airline Passengers
  • Monthly Armed Robberies in Boston
  • Monthly Sunspots
  • Monthly Champagne Sales
  • Monthly Shampoo Sales
  • Monthly Car Sales
  • Monthly Mean Temperatures in Nottingham Castle
  • Monthly Specialty Writing Paper Sales
  • Yearly Water Usage in Baltimore

Multivariate Time Series Datasets

  • Hourly Pollution Levels in Beijing
  • Minutely Individual Household Electric Power Consumption
  • Human Activity Recognition Using Smartphones
  • Indoor Movement Prediction