Official GitHub repository for the paper "Celeb-500K: A Large Training Dataset for Face Recognition".
link: https://pan.baidu.com/s/1FTasU-aHzEhkkqA2aI4K-g password: tqmn
Download the image URL files from the link above, then run DownloadImages.py to download the images.
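
DownloadImages.py in this repository is the script to use. For readers who want to see roughly what such a step involves, here is a minimal sketch of a URL-list downloader. It assumes a plain text file with one image URL per line, which may not match the actual format of the files behind the link; the function names, the hash-based file naming, and the timeout are all hypothetical choices made for illustration.

```python
import hashlib
import http.client
import os
import urllib.request


def target_path(url, out_dir):
    """Map a URL to a stable local file name (hypothetical naming scheme)."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    # Keep a known image extension if the URL has one, else default to .jpg.
    ext = os.path.splitext(url.split("?")[0])[1].lower()
    if ext not in (".jpg", ".jpeg", ".png"):
        ext = ".jpg"
    return os.path.join(out_dir, digest + ext)


def read_urls(list_path):
    """Read non-empty lines from the URL list (assumed: one URL per line)."""
    with open(list_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def download_all(list_path, out_dir, timeout=10):
    """Download every URL in the list; return (succeeded, failed) counts."""
    os.makedirs(out_dir, exist_ok=True)
    ok, failed = 0, 0
    for url in read_urls(list_path):
        dest = target_path(url, out_dir)
        if os.path.exists(dest):
            # Already fetched in an earlier run, so skip it.
            ok += 1
            continue
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                data = resp.read()
            with open(dest, "wb") as out:
                out.write(data)
            ok += 1
        except (OSError, ValueError, http.client.HTTPException):
            # Dead links are common in web-crawled lists; count and move on.
            failed += 1
    return ok, failed
```

Skipping files that already exist lets an interrupted run be restarted without fetching everything again, which matters for a list of this size.
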
Next, apply face detection and alignment using whichever tools suit your needs.
We suggest 2-3 rounds of label cleaning, as described in the paper, to obtain a cleaned dataset.
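
The cleaning procedure itself is specified in the paper and is not reproduced here. As a rough illustration of what one kind of multi-round cleaning can look like, the sketch below uses a generic centroid-similarity filter, which is a swapped-in technique and not necessarily the authors' method. It assumes you already have one embedding vector per image for a given identity (from any face recognition model); the function names, the 0.5 threshold, and the default of 2 rounds are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def clean_identity(embeddings, threshold=0.5, rounds=2):
    """Return indices of images kept for one identity.

    Each round recomputes the centroid from the images kept so far and
    drops images whose similarity to it falls below the threshold.
    """
    kept = list(range(len(embeddings)))
    for _ in range(rounds):
        if not kept:
            break
        center = centroid([embeddings[i] for i in kept])
        kept = [i for i in kept if cosine(embeddings[i], center) >= threshold]
    return kept
```

Running more than one round helps because removing outliers shifts the centroid toward the true identity, so the next round can catch mislabeled images that the first one missed.
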
Note that two files are too large to upload (Baidu requires a VIP account to upload files larger than 4 GB); I am working on this.
@inproceedings{cao2018celeb,
title={Celeb-500K: A Large Training Dataset for Face Recognition},
author={Cao, Jiajiong and Li, Yingming and Zhang, Zhongfei},
booktitle={2018 25th IEEE International Conference on Image Processing (ICIP)},
pages={2406--2410},
year={2018},
organization={IEEE}
}