Could you provide some hints on how to achieve the performance of your pre-trained model from scratch?

Question

windpls opened this issue 8 years ago · 16 comments

Hi trigeorgis, thanks for your nice code. I have compiled it and run the training process, but I found a big gap between the model I trained myself and the one you provided. Data augmentation is one factor, as you mention in the README. Is there any other reason for this gap? And could you provide some details about your data augmentation? Thanks.

Answer 1 · 2016-12-07T17:41:12.000Z

Same result as yours. I cannot train a model that matches the one the author provided.

Answer 2 · 2016-12-07T19:03:06.000Z

Thanks @windpls for letting me know. Could you please give some more detail on the results you currently obtain, along with your CED curve? I aim to polish up the code soon, but it should still obtain results comparable to those in the paper.

Answer 3 · 2016-12-08T16:11:24.000Z

@trigeorgis I trained the model a long time ago to use as a baseline. If I remember correctly, the training error was about 20%, so I did not even test the model.

Answer 4 · 2016-12-09T13:12:02.000Z

The code has been changed since then, so I would suggest trying it again. I am also not sure what you mean by 20%. Do you mean that the total RMSE error was 0.2?

Answer 5 · 2016-12-12T04:19:10.000Z

I mean NMRSE = 0.2. I will wait for a new version of your code that supports the latest TF and try again. I no longer have access to the machine that has the old version of Bazel, so I cannot rebuild your TensorFlow. :(

Answer 6 · 2016-12-20T18:14:11.000Z

I ran your pretrained model with the help of the provided IPython script and achieved 41.14 AUC @0.08 on the 300W test set. I also trained a model myself, but with it I only achieve 35.76 AUC, compared to the 45.32 AUC you report in the paper.
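
For anyone comparing numbers like these, here is a minimal sketch of how an AUC @0.08 figure can be computed from per-image normalised errors. It assumes a common convention that is not confirmed anywhere in this thread: the cumulative error distribution (CED) is integrated from 0 to the threshold and divided by the threshold, so a perfect model scores 1.0 (multiply by 100 for a percentage). The name `ced_auc` and its parameters are hypothetical and do not come from the repository.

```python
def ced_auc(errors, threshold=0.08, steps=1000):
    """Area under the CED curve on [0, threshold], scaled to the range [0, 1].

    `errors` holds one normalised error per test image (assumed input format).
    """
    # Evenly spaced error levels from 0 up to the threshold.
    xs = [threshold * i / (steps - 1) for i in range(steps)]
    n = len(errors)
    # CED value at x: fraction of images whose error is at or below x.
    ced = [sum(e <= x for e in errors) / n for x in xs]
    # Trapezoidal integration of the CED curve.
    area = sum(
        (ced[i] + ced[i + 1]) * 0.5 * (xs[i + 1] - xs[i])
        for i in range(steps - 1)
    )
    # Divide by the threshold so that a perfect model scores 1.0.
    return area / threshold
```

Differences in the threshold, the error normalisation, or the test subset all change this number, so it is worth checking that two AUC values were computed the same way before comparing them.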

Answer 7 · 2017-04-22T13:34:52.000Z

@ShownX Hello, I could not compile extract_patches.cc in TensorFlow r0.11. You said that a slice operation and tf.stack could work. Could you please give me more details about that? I have spent two days on this problem, but I have not solved it... Thanks!

Answer 8 · 2017-04-22T19:57:39.000Z

@MoExplorer You can slice the tensor and join the slices together.
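
As an illustration of the slice-and-join idea, here is a minimal NumPy sketch. It is not ShownX's code and not a port of extract_patches.cc, whose exact behaviour (for example, how it treats fractional coordinates or image borders) is not described in this thread. The function name, the zero padding, and the rounding of centres to integers are all assumptions. In a TensorFlow graph the analogous building blocks would be `tf.slice` and `tf.stack` (called `tf.pack` before TensorFlow 1.0).

```python
import numpy as np

def extract_patches(image, centres, patch_size):
    """Cut a square patch around each (row, col) centre and stack the results.

    image:   array of shape (height, width, channels)
    centres: iterable of (row, col) positions, truncated to integers here
    returns: array of shape (num_centres, patch_size, patch_size, channels)
    """
    half = patch_size // 2
    # Zero-pad the borders so patches near the edge stay in bounds (assumption).
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="constant")
    patches = []
    for row, col in centres:
        # After padding by `half`, the patch centred on (row, col) starts at
        # index (row, col) of the padded image.
        top, left = int(row), int(col)
        patches.append(padded[top:top + patch_size, left:left + patch_size, :])
    # Join the individual slices along a new leading axis.
    return np.stack(patches, axis=0)
```

For example, calling this with a 5x5x1 image, the centres `[(2, 2), (0, 0)]` and `patch_size=3` yields an array of shape `(2, 3, 3, 1)`.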

Answer 9 · 2017-04-23T01:05:44.000Z

@ShownX Thanks for your reply. Do you mean that you wrote your own code that slices the tensor and joins the pieces together, as a replacement for extract_patches.cc? I'm a newbie in TensorFlow, so if it's convenient, could you please provide the corresponding code as a reference? I'd really appreciate it!

Answer 10 · 2017-04-23T01:13:55.000Z

@MoExplorer I am happy to help you, but I cannot distribute any code without permission.
TensorFlow really makes life harder. If you still want to try, this page would definitely help.

Answer 11 · 2017-04-23T01:34:07.000Z

@ShownX Thanks for your kind help. I have another question: does the TensorFlow you used to run the MDM code come from the source provided by the author? Mine (r.0.8.0) is installed inside Anaconda, but I failed to compile extract_patches.cc with it, which really confuses me...

Answer 12 · 2017-04-24T14:52:54.000Z

@MoExplorer I once compiled it successfully.

Answer 13 · 2017-05-25T07:40:57.000Z

@superphil0 Hello, you said you trained the model successfully and got a different result. But when I train the model, the training process always stops halfway because of the error "Model diverged with loss = NaN". I really don't know what's wrong... Have you ever encountered this error? If so, please help me, I'd really appreciate it!

Answer 14 · 2017-05-25T12:26:00.000Z

No, sorry, I didn't get this error, but you could try a lower learning rate. Good luck!
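
To show why a lower learning rate can help with a NaN loss, here is a toy sketch in plain Python. It has nothing to do with the MDM training code: it minimises the one-dimensional function f(w) = w² with plain gradient descent, and the function name and step count are made up for illustration. With a step size that is too large the iterate grows until it overflows and the loss becomes NaN, while a smaller step size converges.

```python
import math

def final_loss(learning_rate, steps=2000, w0=1.0):
    """Run gradient descent on f(w) = w**2 and return the final loss."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w             # derivative of w**2
        w -= learning_rate * grad  # plain gradient descent update
    return w * w

# Step size too large: |w| doubles every step, overflows to infinity,
# and the following update computes inf - inf, which is NaN.
diverged = final_loss(learning_rate=1.5)

# Smaller step size: w shrinks by a factor of 0.8 per step and converges.
converged = final_loss(learning_rate=0.1)

print(math.isnan(diverged), converged)
```

A NaN loss in a real network can have other causes as well, such as bad input data or a numerically unstable operation, so lowering the learning rate is only the first thing to try.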

Answer 15 · 2017-05-25T13:59:33.000Z

@superphil0 Thanks for your reply, I will give it a try. I have some other questions:
Did you set max_step to 100,000 when training the model?
Does the total training loss stop changing after a certain step? I found that every time I restart training, the value at which the training loss settles is different. Sometimes it even drops to zero, which really confuses me... Could you please give me some advice? Thanks a lot!

Answer 16 · 2018-03-22T13:54:09.000Z

Please have a look at the implementation provided in the experimental branch.

(It goes without saying that this is experimental)