kimiyoung/ssl_bad_gan

Question about output value

Opened this issue · 2 comments

Thank you for providing us with the code.
I'm running cifar_trainer.py.
The code, data, and hyperparameters were used as-is, without modification.
The tail of cifar.FM+VI.run0 (in cifar10_log) looks like:

#189500 train: 0.0014, 0.0003 | dev: 0.9254, 0.1968 | best: 0.1787 | unl acc: 0.9633 | gen acc: 0.0328 | max unl acc: 0.9531 | max gen acc: 0.0259 | lab loss: 0.0235 | unl loss: 0.0856 | fm loss: 0.2706 | vi loss: -288.9219 | [Eval] unl acc: 0.9955, gen acc: 0.4895, max unl acc: 0.9950, max gen acc: 0.4595 | lr: 0.00060
#190000 train: 0.0003, 0.0000 | dev: 0.8812, 0.1845 | best: 0.1787 | unl acc: 0.9621 | gen acc: 0.0335 | max unl acc: 0.9526 | max gen acc: 0.0261 | lab loss: 0.0236 | unl loss: 0.0875 | fm loss: 0.2700 | vi loss: -288.7551 | **[Eval] unl acc: 0.9930, gen acc: 0.5690**, max unl acc: 0.9930, max gen acc: 0.5325 | lr: 0.00060

(1) After 190,000 iterations, I get the results above, and I do not understand how they relate to the 14.41% CIFAR-10 error rate reported in the paper.
In the [Eval] output there are unl acc and gen acc; which of these (if any) corresponds to that error rate?

(2) I have tried modifying the code to apply it to other data, but training does not work well.
In the paper, experiments were conducted on three datasets: MNIST, SVHN, and CIFAR-10.
Looking at trainer.py, MNIST uses a PixelCNN, while SVHN and CIFAR-10 do not. What is the reason for this difference?
The hyperparameter settings in config.py differ for each dataset.
Do these config settings play an important role in training? Thank you!


Here is the log I retrieved from one of our runs:

#189500 train: 0.0001, 0.0000 | dev: 0.7278, 0.1617 | best: 0.1596 | unl accuracy: 0.9687 | gen accuracy: 0.0292 | lab loss: 0.0219 | unl loss: 0.0751 | gen loss: 0.2216 | enc loss: -1.9610 | gen pnorm: 281.7022 | enc pnorm: 277.7327 | lr 0.0003
#190000 train: 0.0001, 0.0000 | dev: 0.7337, 0.1641 | best: 0.1596 | unl accuracy: 0.9636 | gen accuracy: 0.0283 | lab loss: 0.0219 | unl loss: 0.0793 | gen loss: 0.2187 | enc loss: -1.9604 | gen pnorm: 282.0112 | enc pnorm: 278.0923 | lr 0.0003
The error rate usually reaches around 14.5% at about #400000 iterations.
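For reference, a minimal sketch of how the logged numbers map to the paper's figure, assuming (as the logs above suggest) that the second `dev` number and the `best` field are dev error rates rather than accuracies; the helper below is illustrative and not part of the repository:

```python
import re

# Hypothetical helper: pull the "best" dev error from a log line and report it
# as a percentage. The field name follows the log format shown above.
def best_error_percent(log_line):
    match = re.search(r"best: ([0-9.]+)", log_line)
    return float(match.group(1)) * 100.0 if match else None

line = "#190000 train: 0.0001, 0.0000 | dev: 0.7337, 0.1641 | best: 0.1596 | ..."
print(best_error_percent(line))  # 15.96, which approaches ~14.5% by #400000
```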

In comparison, your log looks a bit odd, and it is difficult for us to tell where the problem is.

For now, our suggestion is to double-check that the experiment setup, including the hyperparameters and the PyTorch version, exactly matches the instructions. After that, please run the experiment again up to #400000 iterations. If there is still a problem, please let us know and we will look into it.
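For example, a quick environment sanity check before re-running (just an illustrative snippet; the exact version to match is whatever the repo's README specifies):

```python
import torch

# Compare against the PyTorch version the repository was developed with
# (see the README); mismatched versions can change training behaviour.
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```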

@zihangdai Is the discriminator's initial learning rate 6e-4 or 3e-4 by default? The log you posted shows a learning rate of 3e-4, but at 189,500 iterations (epoch 379) it should not yet have started to decay. Your config, on the other hand, shows 6e-4.
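To make the timing argument concrete, here is a purely hypothetical linear-decay schedule (the `decay_start`/`max_epochs` values are made up for illustration; the actual schedule is whatever the repo's trainer and config define):

```python
# Hypothetical schedule, for illustration only. If decay has not started yet,
# the printed lr should equal the configured initial value, which is why a
# logged 3e-4 at epoch 379 is hard to reconcile with a 6e-4 setting.
def learning_rate(base_lr, epoch, decay_start=600, max_epochs=1200, min_lr=1e-5):
    if epoch < decay_start:
        return base_lr  # e.g. epoch 379 would still print the initial lr
    frac = (epoch - decay_start) / float(max_epochs - decay_start)
    return max(min_lr, base_lr * (1.0 - frac))

print(learning_rate(6e-4, 379))  # 0.0006 -- no decay yet under this sketch
```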

edit: @kimiyoung