


This repository is deprecated. No further datasets will be added, and the included Colab notebooks won't be updated. There are other websites dedicated to datasets, such as Kaggle.

If you want a dataset to be taken down because it violates copyright, please let me know at copyright@torosyan.dev.


Datasets for Machine Learning

This repo contains image and text datasets for training various ML models.

This repository is licensed under the GNU GPL v3; all rights to the content belong to the respective owners.

| Dataset | Archive | Description | Image/Char Count | Colab notebook |
|---|---|---|---|---|
| Capital English letters, 32x32 | Download | The entire English alphabet, in capital letters, in Segoe UI Bold. | 26 | - |
| TechCrunch articles about startups | Download | Various TechCrunch articles about startups. | 48180 | GPT-2 |
| M. Saryan paintings, 512x512 | Download | Paintings from the Armenian painter Martiros Saryan. | 106 | HyperGAN |
| Tweets about ML | Download | 2018 Tweets with #MachineLearning | 6179 | - |

2021 in progress