'checkpoints/gmf_factor8neg4_Epoch100_HR0.6391_NDCG0.2852.model'
SeekPoint opened this issue · 12 comments
mldl@ub1604:/ub16_prj/neural-collaborative-filtering/src$ python3 train.py
Range of userId is [0, 6039]
Range of itemId is [0, 3705]
MLP(
  (embedding_user): Embedding(6040, 8)
  (embedding_item): Embedding(3706, 8)
  (fc_layers): ModuleList(
    (0): Linear(in_features=16, out_features=64, bias=True)
    (1): Linear(in_features=64, out_features=32, bias=True)
    (2): Linear(in_features=32, out_features=16, bias=True)
    (3): Linear(in_features=16, out_features=8, bias=True)
  )
  (affine_output): Linear(in_features=8, out_features=1, bias=True)
  (logistic): Sigmoid()
)
Traceback (most recent call last):
  File "train.py", line 79, in <module>
    engine = MLPEngine(config)
  File "/home/mldl/ub16_prj/neural-collaborative-filtering/src/mlp.py", line 63, in __init__
    self.model.load_pretrain_weights()
  File "/home/mldl/ub16_prj/neural-collaborative-filtering/src/mlp.py", line 47, in load_pretrain_weights
    resume_checkpoint(gmf_model, model_dir=config['pretrain_mf'], device_id=config['device_id'])
  File "/home/mldl/ub16_prj/neural-collaborative-filtering/src/utils.py", line 14, in resume_checkpoint
    map_location=lambda storage, loc: storage.cuda(device=device_id))  # ensure all storage are on gpu
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 301, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'checkpoints/gmf_factor8neg4_Epoch100_HR0.6391_NDCG0.2852.model'
Creating a checkpoints folder yourself would fix this issue.
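The save/load paths in this repo are relative, so the directory has to exist before training. A minimal sketch, assuming you run train.py from the src directory:

    import os
    # create the directory that the checkpoint save/load code expects;
    # exist_ok avoids an error if it is already there
    os.makedirs('checkpoints', exist_ok=True)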
No, it doesn't work.
In fact, it's missing gmf_factor8neg4_Epoch100_HR0.6391_NDCG0.2852.model, the pretrained file.
OK, this repo's author mentioned that he tested it without pretraining. You can add the pretrained model yourself though.
What I did was simply set config['pretrain'] to False; it converges more slowly, but it still works.
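A minimal sketch of the relevant entry (the dict name mlp_config is my assumption; only the 'pretrain' and 'pretrain_mf' keys are confirmed in this thread):

    mlp_config = {
        # ... other hyperparameters ...
        'pretrain': False,  # train the MLP from scratch, no GMF weights needed
        # 'pretrain_mf' is only read when 'pretrain' is True:
        # 'pretrain_mf': 'checkpoints/gmf_factor8neg4_Epoch100_HR0.6391_NDCG0.2852.model',
    }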
@lovejasmine If you want to use a pretrained GMF model while training the MLP model, you have to specify where the pretrained GMF model file is, which means you have to train your GMF model first.
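Pieced together from the traceback above (utils.py line 14 and mlp.py line 47), the loading path looks roughly like this; the load_state_dict call is my assumption about what resume_checkpoint does after torch.load:

    import torch

    def resume_checkpoint(model, model_dir, device_id):
        # load the saved state_dict and move every storage onto the given GPU
        state_dict = torch.load(model_dir,
                                map_location=lambda storage, loc: storage.cuda(device=device_id))
        model.load_state_dict(state_dict)  # assumption: restore weights into the model

    # called from MLP.load_pretrain_weights() as:
    # resume_checkpoint(gmf_model, model_dir=config['pretrain_mf'], device_id=config['device_id'])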
@pd90506 You are right~
@LaceyChen17 Thank you for providing the code. It helped me a lot in learning PyTorch.
It works.
File "train.py", line 95, in
engine.save(config['alias'], epoch, hit_ratio, ndcg)
File "/Users/ahmetavci/Desktop/ncf-pytorch/src/engine.py", line 86, in save
save_checkpoint(self.model, model_dir)
File "/Users/ahmetavci/Desktop/ncf-pytorch/src/utils.py", line 9, in save_checkpoint
torch.save(model.state_dict(), model_dir)
File "/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 219, in save
return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
File "/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 142, in _with_file_like
f = open(f, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'checkpoints/gmf_factor8neg4-implict_Epoch0_HR0.1025_NDCG0.0454.model'
I still get the same error, even though I set config['pretrain'] to False.
@ahmetavci07
Mate, FileNotFoundError: [Errno 2] No such file or directory: 'checkpoints/gmf_factor8neg4-implict_Epoch0_HR0.1025_NDCG0.0454.model' refers to the output file of your own trained GMF model.
So what you really need to do is:
- train the GMF model first
- find the best model within the checkpoints directory (the file name will differ from run to run)
- then replace the setting with 'checkpoints/{}'.format(YOUR_BEST_MODEL_NAME), as in the sketch below
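For example, in the MLP config in train.py (a sketch; YOUR_BEST_MODEL_NAME is a placeholder for the checkpoint file your own GMF run actually produced):

    'pretrain': True,
    'pretrain_mf': 'checkpoints/{}'.format('YOUR_BEST_MODEL_NAME'),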
It works now, thanks!
Is it normal that my pre-trained best model is significantly stronger than the author's default model?

    'pretrain_mlp': 'checkpoints/{}'.format('mlp_factor8neg4_Epoch100_HR0.5606_NDCG0.2463.model'),  # the author's default model

But here is what I got: HR 0.64+ and NDCG 0.37+ @yihong-chen @sleung852
I didn't change the code, so why did I get such good results?