reshow/PRNet-PyTorch

for a good result

Opened this issue · 26 comments

Hi, would you mind telling me how your model was trained? I couldn't reach your model's performance using the code.

If you run the code directly and correctly, the result will be slightly worse than mine (2D landmark NME is about 3.30±0.03), since the number of parameters is smaller than in PRN's paper.
To achieve good performance, I employ a number of data augmentation methods that differ from PRN's, such as random erasing, Gaussian blur, etc. (a rough sketch follows this comment). These methods are somewhat arbitrary, so I removed them from my code.
Another way is to increase the number of parameters in the network. Here I use exactly the same network structure as the model PRN provides, yet my model size is 52 MB while the model in their paper is more than 150 MB. I'm not sure about this part.
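
As an illustration, random erasing and Gaussian blur could look roughly like the sketch below. The function names and parameter ranges are placeholder assumptions, not the exact augmentation code that was removed from the repo:

import random
import numpy as np
import cv2

def random_erase(img, max_area_ratio=0.1, fill=0):
    # Fill a random rectangle of an HxWxC uint8 image with a constant value.
    h, w = img.shape[:2]
    area = h * w * random.uniform(0.02, max_area_ratio)
    aspect = random.uniform(0.5, 2.0)
    eh, ew = int(np.sqrt(area * aspect)), int(np.sqrt(area / aspect))
    if eh >= h or ew >= w:
        return img
    y, x = random.randint(0, h - eh), random.randint(0, w - ew)
    out = img.copy()
    out[y:y + eh, x:x + ew] = fill
    return out

def random_gauss_blur(img, p=0.5, max_sigma=2.0):
    # With probability p, blur the image with a random Gaussian sigma.
    if random.random() > p:
        return img
    return cv2.GaussianBlur(img, (0, 0), random.uniform(0.1, max_sigma))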

I trained your model for 30 epochs, but only got a 2D NME of 3.8.

How about the NME on training data?

I didn't test on the training data.

Sorry, I mean the printed 'metrics0' for the training dataset and the evaluation dataset.

I'm sorry I didn't record it

I generated the dataset with the official generation method, not yours. Does this affect the results?

I reloaded the model and got this result:

[epoch:0, iter:111/7653, time:51] Loss: 0.1049 Metrics0: 0.0379

I didn't try it. There are some differences between our generation codes but I don't think they will affect the performance.

The metrics0 should reach 0.03 in less than 10 epochs.

Try to use my generation code.

And try to change line 96 in torchmodel.py as below, and remember to record metrics0:

scheduler_exp = optim.lr_scheduler.ExponentialLR(self.optimizer, 0.9)
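
For context, ExponentialLR multiplies the learning rate by gamma (0.9 here) every time scheduler.step() is called, typically once per epoch. A minimal sketch of how it could be wired up, assuming an Adam optimizer and a recent PyTorch version (the real optimizer and training loop in torchmodel.py may differ):

import torch
import torch.optim as optim

model = torch.nn.Conv2d(3, 8, 3)  # stand-in for the real network
optimizer = optim.Adam(model.parameters(), lr=1e-4)
scheduler_exp = optim.lr_scheduler.ExponentialLR(optimizer, 0.9)

for epoch in range(30):
    # ... run one epoch of training with optimizer.step() ...
    scheduler_exp.step()  # lr becomes 1e-4 * 0.9 ** (epoch + 1)
    print(epoch, scheduler_exp.get_last_lr())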

OK, I will try it. Thanks a lot.

I did it following all of your code, but the result is still not good.

This is the result:

[epoch:29, iter:7654/7653, time:1802] Loss: 0.0329 Metrics0: 0.0130

nme2d 0.04015569452557179
nme3d 0.054406630244023056
landmark2d 0.043106316771823916
landmark3d 0.05833802395872772

Looking forward to your reply.
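
For reference, the nme2d values here are fractions (0.040 ≈ 4.0%), so the 3.30 figure quoted earlier appears to be the same quantity expressed as a percentage. Below is a minimal sketch of how a 2D landmark NME is commonly computed, assuming normalization by the ground-truth bounding-box size as in the PRN paper (the repo's evaluation code may use a different normalizer):

import numpy as np

def nme_2d(pred, gt):
    # pred, gt: (N, 2) arrays of 2D landmark coordinates for one face.
    # Mean point-to-point error, normalized by sqrt(w * h) of the
    # ground-truth landmark bounding box (assumed convention).
    per_point = np.linalg.norm(pred - gt, axis=1)
    w, h = gt.max(axis=0) - gt.min(axis=0)
    return per_point.mean() / np.sqrt(w * h)

# e.g. a returned value of 0.033 corresponds to a 3.3% NME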

The result on the training set is good, and even better than mine, but the evaluation result is bad.
I guess this is because I removed some augmentation code. Please give me your email and I'll send it to you.
I'll update it right now.

This is my email: mjanddyy@gmail.com .

I've updated it. Sorry for the trouble.

thanks

I'm sorry to bother you again. I used your augmentation code and trained for about 45 epochs, but only got nme2d 0.03363224604973234 and nme3d 0.04689772832815957, and the loss is no longer decreasing. Is this normal?

I trained it again myself and got nme3d 0.0445 in 30 epochs.
I don't know what causes this difference.
You can try another learning rate scheduler in the code:

self.scheduler = optim.lr_scheduler.StepLR(self.optimizer, step_size=5, gamma=0.5)

and set the learning rate to 2.5e-5.

I used this scheduler a long time ago; it takes more epochs.
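
To make the difference concrete, here is a small sketch comparing the learning-rate curves of the two schedules discussed in this thread: ExponentialLR(gamma=0.9) starting from 1e-4 versus StepLR(step_size=5, gamma=0.5) starting from 2.5e-5. The base learning rates are taken from values mentioned in this thread and are otherwise assumptions:

import torch
import torch.optim as optim

param_a = [torch.nn.Parameter(torch.zeros(1))]
param_b = [torch.nn.Parameter(torch.zeros(1))]

opt_a = optim.Adam(param_a, lr=1e-4)
sched_a = optim.lr_scheduler.ExponentialLR(opt_a, gamma=0.9)

opt_b = optim.Adam(param_b, lr=2.5e-5)
sched_b = optim.lr_scheduler.StepLR(opt_b, step_size=5, gamma=0.5)

for epoch in range(1, 31):
    sched_a.step()
    sched_b.step()
    if epoch % 5 == 0:
        print(epoch, sched_a.get_last_lr()[0], sched_b.get_last_lr()[0])

# ExponentialLR(0.9) roughly halves the lr every 7 epochs (0.9 ** 7 ≈ 0.48),
# while StepLR halves it exactly every 5 epochs but starts from a much lower lr,
# which is why the StepLR schedule needs more epochs.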

To get nme2d=0.031, how many epochs did you train?

I suggest you adjust the learning rate, increasing or decreasing it tenfold, before changing the scheduler, to see if the result improves.

I don't remember, but 45 epochs is enough.

I first trained 30 epochs with lr=2e-4 and got nme2d 0.345, then decreased lr to 2e-5, retrained for 45 epochs, and got nme2d 0.336.

It's strange... Could you use an even smaller learning rate (lr=8e-6) to train it from the beginning? I intuitively think it will help.

OK, I will try it.

Excuse me again: if I use randomcolor in your augmentation code, the NME stays around 0.04 and can't drop to 0.03. Is this normal?

And if I use the smaller learning rate (lr=8e-6) to train from the beginning, the NME drops more slowly than before (lr=1e-4).

I don't use the RandomColor function in practice, so you can ignore it.
If you use a smaller learning rate, does it eventually reach a good result? If the speed is unbearable, you may try strategies such as warm-up (I don't actually use that).
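
Warm-up is not part of this repo as far as this thread goes, but a minimal sketch of a linear warm-up followed by exponential decay, implemented with LambdaLR, could look like the following. The warm-up length, base learning rate, and decay factor are placeholder assumptions:

import torch
import torch.optim as optim

warmup_epochs = 5
base_lr = 1e-4

model = torch.nn.Conv2d(3, 8, 3)  # stand-in for the real network
optimizer = optim.Adam(model.parameters(), lr=base_lr)

def warmup_then_decay(epoch):
    # Ramp the lr linearly from base_lr / warmup_epochs up to base_lr
    # over the first warmup_epochs epochs, then decay by 0.9 per epoch.
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    return 0.9 ** (epoch - warmup_epochs + 1)

scheduler = optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_then_decay)

for epoch in range(30):
    # ... run one epoch of training with optimizer.step() ...
    scheduler.step()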

I didn't get a good result with the smaller learning rate or with optim.lr_scheduler.StepLR. The best result is nme2d 0.336.