Problem with 'self._valid_epoch()'

Question

Sonel-YXL opened this issue 2 years ago · 6 comments

This is great work.
When I first started encoder training, I ran into a problem at line 279 of "base_trainer.py", at "val_result = self._validd_Epoch": this function cannot be found. My current workaround is to comment out the validation modules. Is there a better way to solve this problem?
Thank you.

Answer 1 · 2023-02-09T19:58:40.000Z

It looks like a typo was introduced into your copy of the code. In the repo, that line is val_result = self._valid_epoch().

Answer 2 · 2023-02-10T12:49:13.000Z

Sorry, I didn't describe this problem very clearly.
The following is the original error message.

Train iteration: 900, loss: 3.044715, recogLoss: 3.044715, CER: 0.990476, WER: 1.000000, sec_per_iter: 0.320079, avg_loss: 3.146585, avg_recogLoss: 3.146585, avg_CER: 0.997909, avg_WER: 1.000000,
validate
Train iteration: 1000, loss: 2.888014, recogLoss: 2.888014, CER: 0.979681, WER: 1.000000, sec_per_iter: 0.319059, avg_loss: 3.046097, avg_recogLoss: 3.046097, avg_CER: 0.990086, avg_WER: 0.999848,
WARNING: upsampling image to fit size
Traceback (most recent call last):
  File "/home/sonel/code/dormitory_code_20221104/GAN/handwriting_line_generation-master/train.py", line 132, in <module>
    main(config, args.resume)
  File "/home/sonel/code/dormitory_code_20221104/GAN/handwriting_line_generation-master/train.py", line 78, in main
    trainer.train()
  File "/home/sonel/code/dormitory_code_20221104/GAN/handwriting_line_generation-master/base/base_trainer.py", line 279, in train
    val_result = self._valid_epoch()
  File "/home/sonel/code/dormitory_code_20221104/GAN/handwriting_line_generation-master/trainer/hw_with_style_trainer.py", line 464, in _valid_epoch
    pred, recon, losses = self.run(instance)
AttributeError: 'HWWithStyleTrainer' object has no attribute 'run'

Answer 3 · 2023-02-10T16:48:46.000Z

What is your training config?

Answer 4 · 2023-02-11T03:42:52.000Z

My training configuration is '-c configs/cf_IAM_hwr_cnnOnly_batchnorm_aug.json'
In the json file, I only changed the file_path.
I initially checked whether the class defines the _valid_epoch() function called in val_result = self._valid_epoch(), but I couldn't find it, so I wondered whether that was the cause.
Since this is only the validation step, I removed all the validation code, which does not affect the final training.
There must be a better solution than this.
Thank you for your detailed answer, and have a good day!

Answer 5 · 2023-02-13T04:34:40.000Z

Looks like a bug from a refactor. Line 464 should be pred, recon, losses = self.run_hwr(instance)
I've now fixed this in the repo. Change your code and try it.
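For anyone hitting the same traceback, here is a minimal sketch of how a leftover method name after a rename produces this AttributeError, and how the one-line change resolves it. The class names BaseTrainer, BrokenTrainer and FixedTrainer, the helper try_train, and the placeholder return values are all hypothetical stand-ins, not the repo's actual code.

```python
class BaseTrainer:
    """Hypothetical stand-in for the base trainer; not the repo's code."""

    def train(self):
        # The base class delegates validation to the subclass.
        return self._valid_epoch()


class BrokenTrainer(BaseTrainer):
    def run_hwr(self, instance):
        # Placeholder values; the real method would run the recognizer.
        return "pred", "recon", {"recogLoss": 0.0}

    def _valid_epoch(self):
        # Stale call left over from a rename: `run` no longer exists.
        pred, recon, losses = self.run("instance")
        return losses


class FixedTrainer(BrokenTrainer):
    def _valid_epoch(self):
        # Call the method under its current name.
        pred, recon, losses = self.run_hwr("instance")
        return losses


def try_train(trainer):
    """Run training and return either the result or the error message."""
    try:
        return trainer.train()
    except AttributeError as err:
        return str(err)


if __name__ == "__main__":
    print(try_train(BrokenTrainer()))
    print(try_train(FixedTrainer()))
```

The error only appears once validation actually runs, which is why training can proceed for many iterations before the crash.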

Answer 6 · 2023-02-14T12:45:08.000Z

Thank you for your advice. I will give it a try.