AFLW2COCO: convert AFLW annotation into COCO format

Annotated Facial Landmarks in the Wild (AFLW) [1] provides a large-scale collection of annotated face images gathered from the web, exhibiting a large variety in appearance (e.g., pose, expression, ethnicity, age, gender) as well as general imaging and environmental conditions. In total about 25k faces are annotated with up to 21 landmarks per image.

This repo provides a script that converts the AFLW annotations into a COCO-style JSON file, which can then be read by the provided PyTorch data loader.

Requirements

  • For annotation conversion:

    • sqlite3
    • json
  • For the data loader and visualization:

    • pytorch (version > 1.0)
    • opencv
    • cocoapi

Step 1: Download dataset

Download the dataset following the instructions described here. After downloading, extract the dataset files into a single directory, which serves as the root directory of the AFLW dataset. The directory should look as follows:

├── aflw-changelog.txt
├── aflw.sqlite
└── flickr
    ├── 0
    ├── 2
    └── 3
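
Before running the conversion, it can help to confirm that the extracted directory has the layout shown above. The following is a minimal sketch, not part of the repo: the helper name `missing_entries` is hypothetical, and it only checks for the two entries the conversion presumably needs (`aflw.sqlite` and `flickr`).

```python
from pathlib import Path

# Entries taken from the directory listing above.
EXPECTED_ENTRIES = ("aflw.sqlite", "flickr")


def missing_entries(dataset_root):
    """Return the expected entries that are absent under dataset_root.

    Hypothetical helper for illustration; the repo's scripts may perform
    different (or no) checks.
    """
    root = Path(dataset_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty list means both entries were found; otherwise the returned names indicate what is missing from the root directory.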

Step 2: Convert original annotations into COCO format

The original AFLW annotations (SQLite format) can be converted to COCO-style JSON using the script convert2coco.py:

python3 convert2coco.py -h
usage: Convert AFLW dataset's annotation into COCO json format [-h] [-v] [--dataset_root DATASET_ROOT] [--json JSON]

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  --dataset_root DATASET_ROOT
                        AFLW root directory
  --json JSON           output COCO json annotation file

Here, DATASET_ROOT is the root directory of the AFLW dataset (see Step 1). After conversion, the annotations are written to the file given by the --json argument (aflw_annotations.json by default).
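
To illustrate what such a conversion produces, here is a minimal sketch that reads face boxes and landmarks from an SQLite database and assembles a COCO-style dictionary. It is not the code in convert2coco.py. The three-table schema, the function name `build_coco`, the keypoint names, and the sample file name are all simplifying assumptions made for this example; the real aflw.sqlite layout differs. Only the output structure (`images`, `annotations`, `categories`, `[x, y, width, height]` boxes, and flat `[x, y, v]` keypoint triplets) follows the COCO keypoint format.

```python
import json
import sqlite3


def build_coco(conn, keypoint_names):
    """Build a COCO-style dict from a simplified, hypothetical schema.

    Assumed tables (NOT the real aflw.sqlite layout):
      images(image_id, file_name, width, height)
      faces(face_id, image_id, x, y, w, h)
      landmarks(face_id, kp_index, x, y)
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{
            "id": 1,
            "name": "face",
            "keypoints": list(keypoint_names),
            "skeleton": [],
        }],
    }
    for image_id, file_name, width, height in conn.execute(
            "SELECT image_id, file_name, width, height "
            "FROM images ORDER BY image_id"):
        coco["images"].append({"id": image_id, "file_name": file_name,
                               "width": width, "height": height})

    faces = conn.execute(
        "SELECT face_id, image_id, x, y, w, h "
        "FROM faces ORDER BY face_id").fetchall()
    for face_id, image_id, x, y, w, h in faces:
        # COCO keypoints are flat [x, y, v] triplets; v = 0 means "not labeled".
        keypoints = [0] * (3 * len(keypoint_names))
        for kp_index, kx, ky in conn.execute(
                "SELECT kp_index, x, y FROM landmarks WHERE face_id = ?",
                (face_id,)):
            keypoints[3 * kp_index: 3 * kp_index + 3] = [kx, ky, 2]
        coco["annotations"].append({
            "id": face_id,
            "image_id": image_id,
            "category_id": 1,
            "bbox": [x, y, w, h],  # COCO boxes are [x, y, width, height]
            "area": w * h,
            "iscrowd": 0,
            "keypoints": keypoints,
            "num_keypoints": sum(1 for v in keypoints[2::3] if v > 0),
        })
    return coco


# Tiny in-memory database with made-up values, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images(image_id INTEGER, file_name TEXT,
                        width INTEGER, height INTEGER);
    CREATE TABLE faces(face_id INTEGER, image_id INTEGER,
                       x REAL, y REAL, w REAL, h REAL);
    CREATE TABLE landmarks(face_id INTEGER, kp_index INTEGER, x REAL, y REAL);
    INSERT INTO images VALUES (1, 'flickr/0/example.jpg', 640, 480);
    INSERT INTO faces VALUES (10, 1, 100, 50, 200, 220);
    INSERT INTO landmarks VALUES (10, 0, 150, 120);
""")
coco = build_coco(conn, ["left_eye", "right_eye"])
print(json.dumps(coco, indent=2))
```

Landmarks that are absent for a face stay at `0, 0, 0`, which is how the COCO format marks unlabeled keypoints; this matters for AFLW because each face carries only up to 21 landmarks.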

Dataset visualization

An auxiliary script for loading (using a PyTorch data loader) and visualizing AFLW is also provided as visualize_dataset.py. To use this script, you need to install cocoapi.

python3 visualize_dataset.py -h
usage: Visualize AFLW dataset (COCO-style annotations) [-h] [-v] [--batch_size BATCH_SIZE]
                                                       [--dim DIM]

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  --batch_size BATCH_SIZE
                        set batch size
  --dim DIM             input image dimension
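
When images are resized before being batched, the annotations have to be rescaled with them. The sketch below shows one way to do this for a COCO-style annotation, under the assumption (not confirmed by the help text above) that --dim means each image is resized to a square of dim × dim pixels. The function name `resize_annotation` is hypothetical and is not taken from the repo.

```python
def resize_annotation(ann, width, height, dim):
    """Rescale a COCO-style annotation from (width, height) to (dim, dim).

    Hypothetical helper: assumes the image is stretched to a square of
    side `dim`, without preserving the aspect ratio.
    """
    sx, sy = dim / width, dim / height
    x, y, w, h = ann["bbox"]
    keypoints = list(ann.get("keypoints", []))  # copy; the input stays intact
    for i in range(0, len(keypoints), 3):
        if keypoints[i + 2] > 0:  # leave unlabeled keypoints at (0, 0, 0)
            keypoints[i] *= sx
            keypoints[i + 1] *= sy
    resized = {**ann,
               "bbox": [x * sx, y * sy, w * sx, h * sy],
               "keypoints": keypoints}
    if "area" in ann:
        resized["area"] = ann["area"] * sx * sy
    return resized
```

For example, an annotation on a 640 × 480 image resized with `dim=320` has its x coordinates halved and its y coordinates scaled by 2/3.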

Images and bounding box statistics

You can compute statistics on image widths and heights, as well as on face bounding-box widths and heights, using the following script:

python3 compute_dataset_statistics.py -h
usage: Compute AFLW dataset's statistics [-h] [-v] --dataset_root DATASET_ROOT [--json JSON]

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase output verbosity
  --dataset_root DATASET_ROOT
                        AFLW root directory
  --json JSON           COCO json annotation file
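
As a rough illustration of this kind of computation, the sketch below derives the mean and population standard deviation of image and bounding-box sizes from a COCO-style dictionary. It is not the code in compute_dataset_statistics.py, which may report different statistics; the function name `size_statistics` and the sample values are assumptions made for this example.

```python
import statistics


def size_statistics(coco):
    """Return mean and population std of image and face-box sizes.

    Expects a COCO-style dict with "images" (width, height) and
    "annotations" (bbox as [x, y, width, height]). Empty columns are skipped.
    """
    columns = {
        "image_width": [img["width"] for img in coco["images"]],
        "image_height": [img["height"] for img in coco["images"]],
        "bbox_width": [ann["bbox"][2] for ann in coco["annotations"]],
        "bbox_height": [ann["bbox"][3] for ann in coco["annotations"]],
    }
    return {
        name: {"mean": statistics.fmean(values),
               "std": statistics.pstdev(values)}
        for name, values in columns.items() if values
    }


# Made-up sample data, for demonstration only.
sample = {
    "images": [{"width": 640, "height": 480}, {"width": 320, "height": 240}],
    "annotations": [{"bbox": [100, 50, 200, 220]}, {"bbox": [10, 20, 100, 110]}],
}
stats = size_statistics(sample)
for name, values in stats.items():
    print(f"{name}: mean={values['mean']:.1f}, std={values['std']:.1f}")
```

In practice, the dictionary would come from `json.load` on the annotation file produced in Step 2.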

[1] Koestinger, Martin, et al. "Annotated facial landmarks in the wild: A large-scale, real-world database for facial landmark localization." 2011 IEEE international conference on computer vision workshops (ICCV workshops). IEEE, 2011.