- mxnet dataset to tfrecords
- backbone network architectures [vgg16, vgg19, resnet]
- backbone network architectures [resnet-se, resnext]
- LResNet50E-IR
- LResNet100E-IR
- Additive Angular Margin Loss (see the sketch below)
- CosineFace Loss
- train network code
- add validate during training
- multi-gpu training
- evaluate code
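The Additive Angular Margin (ArcFace) loss above penalizes the angle between an embedding and its class center. Below is a minimal sketch of the margin-adjusted logits, assuming L2-normalized embeddings and class weights; the function name and parameters are illustrative, not this repository's exact API, and the theta + m > pi edge case of the full formulation is omitted:

```python
import math
import tensorflow as tf

def arcface_logits(embeddings, labels, weights, num_classes, s=64.0, m=0.5):
    """Replace cos(theta) with cos(theta + m) for each sample's target class.

    embeddings: [batch, dim], L2-normalized features
    weights:    [dim, num_classes], L2-normalized class centers
    """
    cos_t = tf.matmul(embeddings, weights)                    # cos(theta), [batch, num_classes]
    sin_t = tf.sqrt(tf.maximum(1.0 - tf.square(cos_t), 1e-10))
    cos_mt = cos_t * math.cos(m) - sin_t * math.sin(m)        # cos(theta + m)
    mask = tf.one_hot(labels, depth=num_classes)
    logits = tf.where(tf.cast(mask, tf.bool), cos_mt, cos_t)
    return s * logits  # feed into softmax cross-entropy
```

CosineFace differs only in where the margin is applied: it subtracts it from the cosine directly, i.e. `s * (cos_t - m)` for the target class.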
```
epoch 0, total_step 20, total loss is 107.34 , inference loss is 80.60, weight deacy loss is 26.74, training accuracy is 0.000000, time 38.373 samples/sec
epoch 0, total_step 40, total loss is 109.65 , inference loss is 77.31, weight deacy loss is 32.34, training accuracy is 0.000000, time 38.281 samples/sec
epoch 0, total_step 60, total loss is 114.86 , inference loss is 82.29, weight deacy loss is 32.57, training accuracy is 0.000000, time 37.687 samples/sec
epoch 0, total_step 80, total loss is 104.92 , inference loss is 72.77, weight deacy loss is 32.15, training accuracy is 0.000000, time 38.402 samples/sec
epoch 0, total_step 100, total loss is 101.66 , inference loss is 69.99, weight deacy loss is 31.67, training accuracy is 0.000000, time 38.235 samples/sec
epoch 0, total_step 120, total loss is 101.70 , inference loss is 70.54, weight deacy loss is 31.16, training accuracy is 0.000000, time 37.822 samples/sec
epoch 0, total_step 140, total loss is 102.23 , inference loss is 71.61, weight deacy loss is 30.63, training accuracy is 0.000000, time 38.308 samples/sec
epoch 0, total_step 160, total loss is 103.26 , inference loss is 73.17, weight deacy loss is 30.08, training accuracy is 0.000000, time 38.054 samples/sec
epoch 0, total_step 180, total loss is 98.61 , inference loss is 69.07, weight deacy loss is 29.54, training accuracy is 0.000000, time 38.198 samples/sec
epoch 0, total_step 200, total loss is 95.20 , inference loss is 66.16, weight deacy loss is 29.04, training accuracy is 0.000000, time 38.217 samples/sec
```
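In the log above, the total loss is the sum of the inference (softmax cross-entropy) loss and the weight decay term. A hedged sketch of that decomposition in TensorFlow 1.x; the 5e-4 coefficient and the variable-name filter are assumptions, not necessarily the repository's exact values:

```python
import tensorflow as tf

def total_training_loss(labels, logits, weight_decay=5e-4):
    """Compose the total loss printed in the training log."""
    # inference loss: softmax cross-entropy over the (margin-adjusted) logits
    inference_loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
    # weight decay loss: L2 penalty over trainable weight matrices / conv kernels
    # (the name filter here is an assumption, not the repo's exact rule)
    wd_loss = weight_decay * tf.add_n(
        [tf.nn.l2_loss(v) for v in tf.trainable_variables()
         if 'W' in v.name or 'kernel' in v.name])
    return inference_loss + wd_loss
```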
- If you can't use a large batch size (>128), you should use a smaller learning rate.
- If you can't use a large batch size (>128), you can try batch renormalization (see `L_Resnet_E_IR_RBN.py`).
- If you use multiple GPUs, keep at least 16 images on each GPU.
- Try Group Normalization, implemented in `L_Resnet_E_IR_GBN.py` (a minimal sketch follows below).
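Group Normalization computes mean and variance within channel groups of each sample, so its statistics do not depend on batch size. A minimal sketch for NHWC tensors with statically known spatial dimensions, assuming 32 groups; this is not the exact implementation in `L_Resnet_E_IR_GBN.py`:

```python
import tensorflow as tf

def group_norm(x, groups=32, eps=1e-5, scope='group_norm'):
    """Group normalization over an NHWC tensor with static H, W, C."""
    with tf.variable_scope(scope):
        _, h, w, c = x.get_shape().as_list()
        g = min(groups, c)
        x = tf.reshape(x, [-1, h, w, g, c // g])
        # normalize within each group of channels, per sample
        mean, var = tf.nn.moments(x, [1, 2, 4], keep_dims=True)
        x = (x - mean) / tf.sqrt(var + eps)
        x = tf.reshape(x, [-1, h, w, c])
        gamma = tf.get_variable('gamma', [c], initializer=tf.ones_initializer())
        beta = tf.get_variable('beta', [c], initializer=tf.zeros_initializer())
        return x * gamma + beta
```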
model name | depth | normalization layer | batch size | total_steps | download |
---|---|---|---|---|---|
model A | 50 | group normalization | 16 | 1060k | model a |
dbname | accuracy |
---|---|
lfw | 0.9897 |
cfp_ff | 0.9876 |
cfp_fp | 0.84357 |
age_db30 | 0.914 |
model name | depth | normalization layer | batch size | total_steps | download |
---|---|---|---|---|---|
model B | 50 | batch normalization | 16 | 1100k | model_b |
dbname | accuracy |
---|---|
lfw | 0.9933 |
cfp_ff | 0.99357 |
cfp_fp | 0.8766 |
age_db30 | 0.9342 |
- TensorFlow 1.4–1.6
- TensorLayer 1.7
- CUDA 8 & cuDNN 6, or CUDA 9 & cuDNN 7
- Python 3
GPU | CUDA | cuDNN | TensorFlow | TensorLayer | MXNet | Gluon |
---|---|---|---|---|---|---|
Titan Xp | 9.0 | 7.0 | 1.6 | 1.7 | 1.1.0 | 1.1.0 |
DL tool | Max batch size (no BN, no PReLU) | Max batch size (BN only) | Max batch size (PReLU only) | Max batch size (BN and PReLU) |
---|---|---|---|---|
TensorLayer | (8000, 9000) | (5000, 6000) | (3000, 4000) | (2000, 3000) |
MXNet | (40000, 50000) | (20000, 30000) | (20000, 30000) | (10000, 20000) |
Gluon | (7000, 8000) | (3000, 4000) | no official method | None |
Notation: (8000, 9000) means a batch size of 8000 ran without OOM, while 9000 raised an OOM error.
TensorLayer | MXNet | Gluon |
---|---|---|
tensorlayer_batchsize_test.py | mxnet_batchsize_test.py | gluon_batchsize_test.py |
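The test scripts above determine the maximum feasible batch size by probing: build the graph at a candidate size, run one training step, and check whether it raises an OOM error. A rough sketch of that procedure in TensorFlow; `build_net` is a hypothetical placeholder for the network under test, not the actual script code:

```python
import tensorflow as tf

def fits_in_memory(batch_size, build_net):
    """Return True if one forward/backward pass at batch_size runs without OOM."""
    tf.reset_default_graph()
    images = tf.random_normal([batch_size, 112, 112, 3])
    loss = build_net(images)  # build_net: placeholder returning a scalar loss
    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
    try:
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            sess.run(train_op)
        return True
    except tf.errors.ResourceExhaustedError:  # out of GPU memory
        return False

def max_batch_size(build_net, step=1000, limit=50000):
    """Probe in fixed steps; returns (last_ok, first_oom) as in the table above."""
    last_ok = 0
    for bs in range(step, limit + step, step):
        if fits_in_memory(bs, build_net):
            last_ok = bs
        else:
            return last_ok, bs  # e.g. (8000, 9000): 8000 fits, 9000 OOMs
    return last_ok, None
```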