Given a large number of face images, a face feature extraction component extracts the face features, and a face clustering model then clusters and archives the faces by person.
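The two-stage idea above (extract features, then cluster them) can be illustrated with a minimal, dependency-free sketch. It is not this project's implementation: the function name `cluster_features` and the parameters `k` and `min_sim` are hypothetical, and connected components over a k-nearest-neighbour graph are used here only as a simple stand-in for the infomap and faiss based clustering listed in the requirements.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cluster_features(features, k=2, min_sim=0.8):
    """Link each feature to its k most similar neighbours (cosine similarity
    >= min_sim) and return one cluster label per feature (connected components).

    Hypothetical helper for illustration only.
    """
    feats = [l2_normalize(f) for f in features]
    n = len(feats)
    parent = list(range(n))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        sims = [(sum(a * b for a, b in zip(feats[i], feats[j])), j)
                for j in range(n) if j != i]
        sims.sort(reverse=True)
        for sim, j in sims[:k]:
            if sim >= min_sim:
                parent[find(i)] = find(j)  # merge the two components

    # Relabel component roots as 0..m-1 in order of first appearance.
    labels, root_to_label = [], {}
    for i in range(n):
        root = find(i)
        labels.append(root_to_label.setdefault(root, len(root_to_label)))
    return labels
```

In practice the features would come from the pretrained model rather than hand-written vectors, and the brute-force neighbour search would be replaced by an index such as faiss for large collections.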
- Python >= 3.6
- sklearn
- infomap
- numpy
- faiss-gpu (or faiss-cpu)
- torch >= 1.2
- torchvision
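Before running the project, a quick check along the following lines can confirm that the listed packages are importable. This is a minimal sketch: the helper name `find_missing` is hypothetical, the module names are the usual import names (scikit-learn imports as `sklearn`, both faiss-gpu and faiss-cpu import as `faiss`), and the `torch >= 1.2` version constraint is not verified here.

```python
import importlib.util
import sys

# Import names corresponding to the requirements listed above.
REQUIRED_MODULES = ["sklearn", "infomap", "numpy", "faiss", "torch", "torchvision"]


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 6):
        print("Python >= 3.6 is required")
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable")
```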
Download the test data and the pretrained model from BaiduYun (password: trka).
Put the face pictures in the directory 'data/input_pictures/', all in that single directory as in 'data_sample'.
Put the pretrained models in the directory 'pretrain_models/'.
'data_sample': all pictures in a single directory.
'labeled_data_sample': labeled data; with it you can evaluate the clustering result by setting is_evaluate=True.
'pretrain_model': the pretrained feature extraction model. You can retrain the model on your own data (e.g. masked face features) with the [hfsoftmax](https://github.com/yl-1993/hfsoftmax) method.
python main.py
By default, the results are written to the directory 'data/output_pictures'.
The output directory is structured as follows:
.
├── data
│   ├── output_pictures
│   │   ├── 0
│   │   │   ├── 1.jpg
│   │   │   ├── 2.jpg
│   │   │   ├── 3.jpg
│   │   │   └── x.jpg
│   │   ├── 1
│   │   │   ├── 1.jpg
│   │   │   ├── 2.jpg
│   │   │   ├── 3.jpg
│   │   │   └── 4.jpg
│   │   ├── ...
│   │   └── n
│   │       ├── 1.jpg
│   │       ├── 2.jpg
│   │       └── 3.jpg
All pictures in directory n belong to the same person.
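A layout like the one above can be produced from a list of image paths and their cluster labels. The following is a minimal sketch under assumptions: the helper `archive_clusters` is hypothetical and is not taken from main.py, and it copies each file under its original name rather than renumbering it.

```python
import shutil
from pathlib import Path


def archive_clusters(image_paths, labels, output_dir="data/output_pictures"):
    """Copy each image into output_dir/<cluster label>/ (hypothetical helper)."""
    output_dir = Path(output_dir)
    for path, label in zip(image_paths, labels):
        target_dir = output_dir / str(label)
        target_dir.mkdir(parents=True, exist_ok=True)
        # Keep the original file name; copy2 also preserves file metadata.
        shutil.copy2(str(path), str(target_dir / Path(path).name))
    return output_dir
```

Copying rather than moving leaves 'data/input_pictures/' untouched, so the clustering can be rerun with different settings.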
If you want to evaluate the clustering result, label and organize the input pictures like the 'labeled_data_sample' data, in the following format:
.
├── data
│   ├── input_pictures
│   │   ├── people_0
│   │   │   ├── 1.jpg
│   │   │   ├── 2.jpg
│   │   │   ├── 3.jpg
│   │   │   └── x.jpg
│   │   ├── people_2
│   │   │   ├── 1.jpg
│   │   │   ├── 2.jpg
│   │   │   ├── 3.jpg
│   │   │   └── 4.jpg
│   │   ├── ...
│   │   └── people_n
│   │       ├── 1.jpg
│   │       ├── 2.jpg
│   │       └── 3.jpg
All pictures in the people_n directory belong to the same person.
In addition, set is_evaluate=True.
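To make the labeled layout concrete, the sketch below reads ground-truth labels from the directory names and scores a clustering with pairwise precision, recall and F-score. This is an assumption for illustration: the metric actually computed when is_evaluate=True is not stated here, and the helpers `load_labeled_pictures` and `pairwise_scores` are hypothetical names.

```python
from itertools import combinations
from pathlib import Path


def load_labeled_pictures(root="data/input_pictures"):
    """Return (paths, labels), using each sub-directory name as the person label."""
    paths, labels = [], []
    for person_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for image in sorted(person_dir.iterdir()):
            if image.is_file():
                paths.append(image)
                labels.append(person_dir.name)
    return paths, labels


def pairwise_scores(true_labels, pred_labels):
    """Pairwise precision, recall and F-score over all pairs of pictures."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_pred and same_true:
            tp += 1  # pair correctly placed in the same cluster
        elif same_pred:
            fp += 1  # different people merged into one cluster
        elif same_true:
            fn += 1  # one person split across clusters
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score
```

The pairwise loop is quadratic in the number of pictures, which is fine for small labeled samples but would need a contingency-table formulation for large datasets.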