LiWentomng/OrientedRepPoints

Train with dota-train-dataset (1024, 14384 files), the mAP on dota-val-dataset is 70.84


Thank you for your code. I'm learning how to use it, but I've run into some problems and hope to get your help.
config: orientedreppoints_r50_demo.py
changes:
img_per_gpu=2 -> img_per_gpu=4
workers_per_gpu=2 -> workers_per_gpu=4
lr=0.01 -> lr=0.005
environment: 2 GPUs (Tesla P40)
mAP on val: 70.84
class APs: [89.43 73.79 40.19 66.33 73.53 82.06 88.16 90.86 60.59 86.46 65.51 64.86 71.29 57.60 51.94]
my question: when I use your checkpoint (trained on the trainval dataset) to detect on the dota-val-dataset, the mAP is about 82.
But the mAP of 70.84 (checkpoint trained on the dota-train-dataset, tested on val) feels lower than I expected (73 ~ 75). Is this normal?
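
For reference, a rough sketch of where the changed fields sit in the config (this is only the fields I touched, not the full file; the exact field names in orientedreppoints_r50_demo.py may differ slightly, and the SGD momentum/weight_decay here are just the usual mmdet defaults):

    # Sketch of the changed fields only, not the full config.
    data = dict(
        img_per_gpu=4,       # changed from 2
        workers_per_gpu=4,   # changed from 2
    )
    optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001)  # lr changed from 0.01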

Training on the train dataset and evaluating on the val dataset, my result reaches an mAP of 73.37447.
class APs: [89.89954584 75.09381718 51.91760568 69.30359075 75.60788996 82.47240929
88.02548317 90.72148874 66.22466264 87.10500443 69.58421786 68.80032583
72.45845151 61.51307246 51.88949827].
My trained model is here (password: aabb). You can try it.

I guess your lower result comes from these three aspects:

  1. My train set includes 15749 files (subsize=1024x1024, gap=200), which is more than yours. I use the script prepare_dota1_train_val.py to prepare the train and val datasets; you can refer to it.

  2. The learning rate is a sensitive factor for model training. My environment is as follows: 8 RTX 2080 Ti GPUs with 2 imgs per GPU (a total batch size of 16, versus 8 in your 2-GPU, 4-imgs-per-GPU setup), so a somewhat smaller learning rate is reasonable for you.
    You can try a learning rate of 0.006 or 0.008.

  3. You can also add “RandomRotate” to the config to get a better mAP, as follows:
    dict(type='RandomRotate', rate=0.5, angles=[30, 60, 90, 120, 150], auto_bound=False)
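
For example, it would sit in the train_pipeline of the config roughly like this (the surrounding transform names here are only placeholders from a typical mmdet-style pipeline, not copied from this repo's config):

    # Sketch only: transforms other than RandomRotate are placeholders.
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        # Rotate half of the training images by an angle drawn from the list below.
        dict(type='RandomRotate', rate=0.5, angles=[30, 60, 90, 120, 150], auto_bound=False),
        dict(type='Resize', img_scale=(1024, 1024), keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
    ]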

If you have any further questions about this, please let me know. I'll try to help you get the expected results.

Yeah, the learning rate does have a significant impact on the results. I got mAP 65 with 2 Tesla P40s, 4 imgs per GPU, lr=0.01 (trained on the dota-train-dataset, tested on the dota-val-dataset).
My dota-train-dataset includes 14384 files (subsize=1024×1024, gap=100); maybe that's what makes the difference in results.
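
Rough arithmetic of how the gap changes the number of crops per image, assuming the usual DOTA-style sliding-window split (standalone sketch, not the actual split script):

    import math

    def windows_per_dim(length, subsize=1024, gap=200):
        # The split window slides in steps of (subsize - gap); the last window
        # is shifted back to stay inside the image, so we count start positions.
        stride = subsize - gap
        if length <= subsize:
            return 1
        return math.ceil((length - subsize) / stride) + 1

    def tiles_per_image(width, height, subsize=1024, gap=200):
        return windows_per_dim(width, subsize, gap) * windows_per_dim(height, subsize, gap)

    # Example: a 2800 x 2800 DOTA image.
    print(tiles_per_image(2800, 2800, gap=200))  # 16 crops (4 x 4)
    print(tiles_per_image(2800, 2800, gap=100))  # 9 crops (3 x 3)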

Have you tried mixed precision training?
I added ‘fp16 = dict(loss_scale=512.)’ to the config file, but the mAP is just 4.78.
BTW: the mAP is 74.98 with the same config file and FP32 training.

I haven't tried mixed precision training with this model.
As far as I know, the Tesla P40 may not support FP16.
Besides, on a GPU that does support it, loss_scale=512 sets how much the loss (and therefore the gradients) is scaled up during training; an appropriate range is roughly 0-1000. I guess the model parameters were barely updated because the gradients are too small (they underflow) with FP16. Maybe a larger value will give a better result.
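
If you want to keep experimenting with FP16, something like this in the config (whether the dynamic option is available depends on the mmcv/mmdet version this code is built on):

    # A larger static loss scale (still within the rough 0-1000 range above)
    # scales the loss up more, so small gradients are less likely to underflow in FP16.
    fp16 = dict(loss_scale=1000.)

    # Or, if the underlying mmcv/mmdet version supports it, dynamic loss scaling:
    # fp16 = dict(loss_scale='dynamic')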