This dataset is proposed in the paper, (K. Yamaguchi, M. Hadi Kiapour, & T. L. Berg. 2013) Paper Doll Parsing: Retrieving Similar Styles to Parse Clothing Items..
https://github.com/kyamagu/paperdoll
data/paperdoll_dataset.mat
: 568,340 snap image urls with categories- 53 categories
data/fashionista_v0.2.mat
: 685 snap images and segmentation mapstruths
(struct array): ground truth annotatationpredictions
(struct array): predicted parsing resultstest_index
: samples used for training
See the official page's Data foramt for the original schema of the mat files.
Our json schema is the followings.
labels/paperdoll.json
- 568,340 samples
snap_id
(int): starts from 1snap_url
(str): image urlpost_url
(str): post page url including the snapitems
category_id
(int): 1 to 53category
(str)
labels/categories.tsv
category_id
(int): 1 to 53category
(str)
pipenv sync
or
pip install -r requirements.txt
cd $DATASET_ROOT/raw
wget http://vision.cs.stonybrook.edu/~kyamagu/paperdoll/data-v1.0.tar
tar xvf data-v1.0.tar
wget https://github.com/kyamagu/paperdoll/raw/master/data/chictopia/chictopia.sql.gz
gunzip -c chictopia.sql.gz | sqlite3 chictopia.sqlite3
python mat2json_paperdoll.py
the followings will be made from data/paperdoll_dataset.mat
.
labels/paperdoll.json
labels/categories.tsv
python download_images.py
If time out occurs,
- increase
-r
(the number of requests) more than 100 (default) - increase
-t
(the time to wait until time out) more than 180[s] (default)