High-level library for:
- downloading and unarchiving,
- discovering
datasets as well as pre-trained models for transfer learning.
The PDL library is generated by the scripts/generate.py script, which depends on the https://lnkr-api.zerotosingularity.com API. That API is currently online but not yet publicly accessible. Feel free to contact me at jan@zerotosingularity.com if you want to have your dataset added.
$ pip install pdl
from pdl import pdl
# Download a file (zip, tar, tgz, tar.gz)
pdl.download(url, data_dir="data/", keep_download=False, overwrite_download=False, verbose=False)
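To make the "download and unarchive" step concrete, here is a minimal standard-library sketch of what such a call might do. It is not pdl's actual implementation: the function name `download_and_unpack` is hypothetical, and only the `data_dir` and `keep_download` parameter names are borrowed from the call above.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def download_and_unpack(url, data_dir="data/", keep_download=False):
    """Hypothetical sketch: fetch an archive into data_dir and unpack it there.

    Returns the path the archive was downloaded to.
    """
    os.makedirs(data_dir, exist_ok=True)
    # Use the last path segment of the URL as the local file name.
    filename = os.path.basename(urlparse(url).path)
    archive_path = os.path.join(data_dir, filename)
    urllib.request.urlretrieve(url, archive_path)
    # unpack_archive infers the format (zip, tar, tgz, tar.gz) from the extension.
    shutil.unpack_archive(archive_path, data_dir)
    if not keep_download:
        # Assumption: keep_download=False means the archive is deleted after unpacking.
        os.remove(archive_path)
    return archive_path
```

The sketch leaves out overwrite handling and verbose output, which the real function exposes through `overwrite_download` and `verbose`.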
Below you can find the currently supported datasets with their simplest invocation. You can still pass the core parameters: data_dir, keep_download, overwrite_download, verbose. Additionally, you can use info_only to print info about the dataset.
from pdl import pdl
# Download cifar-10 (http://www.cs.utoronto.ca/~kriz/cifar.html)
pdl.cifar_10()
# Example of more control, which can also be applied to the datasets below:
pdl.cifar_10(data_dir="my-data-dir/")
pdl.cifar_10(data_dir="my-data-dir/", verbose=True)
pdl.cifar_10(data_dir="my-data-dir/", overwrite_download=True, verbose=True)
pdl.cifar_10(data_dir="my-data-dir/", keep_download=True, verbose=True)
pdl.cifar_10(data_dir="my-data-dir/", keep_download=True, overwrite_download=True, verbose=True, info_only=False)
pdl.cifar_10("my-data-dir/", True, True, True)
# Download cifar-100 (http://www.cs.utoronto.ca/~kriz/cifar.html)
pdl.cifar_100()
# Download the Google Street View House (GSVH) numbers, cropped version (http://ufldl.stanford.edu/housenumbers/)
pdl.gsvh_cropped()
# Download the Google Street View House (GSVH) numbers, full version (http://ufldl.stanford.edu/housenumbers/)
pdl.gsvh_full()
# Download MNIST (http://yann.lecun.com/exdb/mnist/)
pdl.mnist()
# Download the MovieLens dataset (http://files.grouplens.org/datasets/movielens/)
pdl.movie_lens_latest()
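The dataset functions above all accept the core parameters plus info_only. The sketch below shows one way a generated dataset function could forward those options to a core download function. It is an illustration only: `make_dataset_function` is a hypothetical name, the defaults are assumed to match the core call, and the printed info format is made up.

```python
def make_dataset_function(name, url, download):
    """Hypothetical sketch: build a dataset function around a core download callable."""

    def dataset(data_dir="data/", keep_download=False,
                overwrite_download=False, verbose=False, info_only=False):
        if info_only:
            # Assumption: info_only describes the dataset and downloads nothing.
            print(f"{name}: {url}")
            return None
        # Forward the core options unchanged.
        return download(url, data_dir=data_dir, keep_download=keep_download,
                        overwrite_download=overwrite_download, verbose=verbose)

    return dataset
```

With this positional order, a call such as `dataset("my-data-dir/", True, True, True)` sets data_dir, keep_download, overwrite_download and verbose in that order.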
from pdl import pdl
# Get the file name from a url
pdl.get_filename(url)
# Get the location of a file
pdl.get_file_location(data_dir, filename)
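For readers curious what these two helpers amount to, here is a small standard-library sketch of the same idea. The names `filename_from_url` and `file_location` are hypothetical stand-ins, and pdl's own behaviour (for example around query strings) may differ.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring any query or fragment."""
    return os.path.basename(urlparse(url).path)


def file_location(data_dir, filename):
    """Join a data directory and a file name into a single path."""
    return os.path.join(data_dir, filename)


# example.com is a placeholder URL, not a real dataset location.
name = filename_from_url("https://example.com/archives/sample.tar.gz?download=1")
print(name)                          # sample.tar.gz
print(file_location("data/", name))  # data/sample.tar.gz
```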
To run the tests from the command line, simply run:
$ pytest
For more details on pytest, see "Getting started with pytest".
- cats-dataset
- cifar_100_matlab
- cifar_100_python
- cifar_10_matlab
- cifar_10_python
- coco_2014
- coco_2015
- coco_2017
- dogscats
- glove-6b
- gsvh_cropped
- gsvh_full
- imagenette-160
- imagenette-320
- imagenette-full
- imdb
- joe-go
- mnist
- mnist-csv
- mnist-fashion
- movie_lens_latest
- nfpa
- open-images-dataset-v4
- oxford-iiit-pet
- pascal-voc-2007
- pascal-voc-2012
- sentiment140
- standford-squad
- trec-1
- trec-2
- trec-3
- trec-4
- trec-5
- trec-6
- trec-7
- trec-8
- trec-9
- twenty-newsgroups
- uecfood-100
- uecfood-256
- vqa-2015
- vqa-2016
- vqa-2017
- wikitext103
- yolo9000_weights
- yolo_voc_weights
- yolov2_tiny_voc_weights
- yolov2_tiny_weights
- yolov2_weights
- yolov3_weights