Face Recognition dataset

IMDb-Face dataset

Running instructions

  • Download 'IMDb-Face.csv' file from https://drive.google.com/open?id=134kOnRcJgHZ2eREu8QRi99qj996Ap_ML
  • python imdb_crawl.py
    • Arguments
    -c: crop each image with its bounding box
    -d: delete the existing data directory
    
  • If you save non-cropped images, the corresponding bounding boxes are also recorded in a bb.txt file in each directory.
  • Make sure 'IMDb-Face.csv' and 'imdb_crawl.py' are located in the same directory.
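
The two flags and the cropping behaviour described above could look roughly like this. This is a minimal sketch, not the actual contents of imdb_crawl.py: the long option names (--crop, --delete), the (x1, y1, x2, y2) box layout, the bb.txt record format and all helper names are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Parser mirroring the -c / -d flags (the long names are assumed)."""
    parser = argparse.ArgumentParser(description="IMDb-Face crawler (sketch)")
    parser.add_argument("-c", "--crop", action="store_true",
                        help="crop each image with its bounding box")
    parser.add_argument("-d", "--delete", action="store_true",
                        help="delete the existing data directory")
    return parser


def crop_rows(pixels, rect):
    """Crop a row-major image (list of rows) to rect = (x1, y1, x2, y2), end-exclusive."""
    x1, y1, x2, y2 = rect
    return [row[x1:x2] for row in pixels[y1:y2]]


def bb_line(filename, rect):
    """Format one bb.txt record; this 'name x1 y1 x2 y2' layout is hypothetical."""
    return filename + " " + " ".join(str(v) for v in rect)
```

With -c the image would be cropped before saving; without it, the image is saved whole and a line such as the one produced by bb_line is appended to bb.txt.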

Megaface dataset

To run the MegaFace test, including identification (1M distractors) and verification (@1e-6):

  1. Download distractors and probe dataset
  2. Preprocess dataset
  3. Generate bin files with a trained face recognition model
  4. Run megaface devkit

Download dataset

  • Distractors
    wget -c --user 'id' --password 'pwd' http://megaface.cs.washington.edu/dataset/download/content/MegaFace_dataset.tar.gz
  • Facescrub
    • wget -c --user 'id' --password 'pwd' http://megaface.cs.washington.edu/dataset/download/content/downloaded.tgz
    • python facescrub.py
      Note that many of the URLs cannot be accessed.
  • Both datasets can be accessed at http://megaface.cs.washington.edu/participate/challenge.html
  • Dataset structure
     MEGAFACE -- distractors -- parent id -- ids -- images
              |                                  |- json file for each image 
              |
              |- facescrub -- ids -- images, bb.txt
                                  |- bb.txt
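
The layout above can be traversed with a small helper like the following. It is only a sketch: the set of image extensions and the function names are assumptions, and the per-image JSON files and bb.txt are skipped simply because they do not have an image extension.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def list_distractor_images(root):
    """Yield images under <root>/distractors/<parent id>/<id>/."""
    base = Path(root) / "distractors"
    for path in sorted(base.glob("*/*/*")):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            yield path


def list_facescrub_images(root):
    """Yield images under <root>/facescrub/<id>/, ignoring bb.txt."""
    base = Path(root) / "facescrub"
    for path in sorted(base.glob("*/*")):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            yield path
```

Counting the items yielded by each function is a quick way to check that the download and extraction produced the expected structure.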
     
  • Facescrub bounding box file
    • facescrub.py
    • Needs the FaceScrub bounding box text files for actors and actresses (see txt_files below), available from MegaFace.
    • Arguments
      - txt_files: [facescrub_actor.txt, facescrub_actress.txt]
      - timeout: timeout in seconds for accessing each URL (needed when downloading images)
      
      • This script can also download images from the text files, but many rows have missing URLs.
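
Because many URLs are missing or unreachable, a download step needs to fail quietly and move on. The sketch below shows one way to apply the timeout argument; it is not the code in facescrub.py. The function name and the injectable `opener` parameter (there only to make the function testable without network access) are assumptions.

```python
import http.client
import urllib.request


def fetch_image(url, timeout, opener=urllib.request.urlopen):
    """Return the image bytes, or None if the URL is missing, invalid, or unreachable."""
    if not url:
        return None  # row with a missing URL
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers URLError, HTTPError and socket timeouts;
        # ValueError covers malformed URLs.
        return None
```

A caller would skip any row for which fetch_image returns None rather than aborting the whole run.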

Preprocess

Preprocess with your face detection/alignment model.

Generate bin files

  • gen_megaface.py (requires your face recognition model)
    • Generates bin files for the MegaFace distractor and FaceScrub images using a trained face recognition model.

    • Arguments

      - megaface_path: path of pre-processed distractor images
      - facescrub_path: path of pre-processed facescrub images
      - megaface_noise: noise list of distractors
      - facescrub_noise: noise list of facescrub
      - megaface_bin_path: distractor bins save directory
      - facescrub_bin_path: facescrub bins save directory
      - ckpt: trained face recognition model
      - file_ending: suffix for the output files, e.g. with _baseline.bin, aaa.jpg -> aaa_baseline.bin
      
    • Resulting bin files of gen_megaface.py (e.g. file_ending: _baseline.bin)

      _baseline -- facescrub_bin -- ids -- bin files (***_baseline.bin)
                               | 
                               |- megaface_bin -- parent id -- ids -- bin files (***_baseline.bin)
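
The naming rule (aaa.jpg -> aaa_baseline.bin) and the mirrored directory structure can be expressed as a path mapping. This is a sketch under the assumption that the bin directory mirrors the sub-directories of the image directory; the function name and the example paths are hypothetical and not taken from gen_megaface.py.

```python
from pathlib import Path


def bin_path_for(image_path, image_root, bin_root, file_ending="_baseline.bin"):
    """Map <image_root>/<sub dirs>/aaa.jpg to <bin_root>/<sub dirs>/aaa_baseline.bin."""
    relative = Path(image_path).relative_to(image_root)
    # Drop the image extension and append the file ending.
    return Path(bin_root) / relative.parent / (relative.stem + file_ending)
```

The same function covers both trees: FaceScrub images have one sub-directory level (ids), distractors have two (parent id, ids).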
      

Run megaface devkit

  • On terminal,
    python run_experiment.py --file_ending _baseline.bin --out_root baseline_results -d

  • run_experiment.py

    • Executes identification and verification binary files (bin/Identification, bin/FuseResults).
    • Arguments
      - distractor_feature_path: distractor bin files path (megaface_bin_path of gen_megaface.py)
      - probe_feature_path: facescrub bin files path (facescrub_bin_path of gen_megaface.py)
      - file_ending: file ending format (file_ending of gen_megaface.py)
      - sizes: number of distractors, set to [1000000]
      
  • Caution

    • The binary files (bin/Identification, bin/FuseResults) can only be executed with OpenCV 2.4.
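
When the devkit is launched from another script, the terminal command shown earlier can be assembled as an argument list. This sketch uses only the flags that appear in that command (--file_ending, --out_root, -d); the helper name and its parameters are assumptions, and the meaning of -d is not documented here.

```python
import sys


def build_devkit_command(file_ending, out_root, with_d_flag=True,
                         script="run_experiment.py"):
    """Build the argument list for the run_experiment.py invocation."""
    cmd = [sys.executable, script,
           "--file_ending", file_ending,
           "--out_root", out_root]
    if with_d_flag:
        cmd.append("-d")  # flag taken verbatim from the documented command
    return cmd
```

Passing the returned list to subprocess.run(cmd, check=True) would start the experiment and raise an error if the devkit exits with a non-zero status.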

Reference