RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch
amirj opened this issue · 1 comments
amirj commented
I have an implicit MF model (called model_implicit below). I want to initialize user_embedding and item_embedding from this model to build a new model:
# create the representation layer
bilinear = BilinearNet(num_users=dataset_implicit.num_users,
                       num_items=dataset_implicit.num_items,
                       embedding_dim=LATENT_DIM,
                       user_embedding_layer=model_implicit._net.user_embeddings,
                       item_embedding_layer=model_implicit._net.item_embeddings)
When I try to train a model initialized with the above representation:
newmodel = ImplicitFactorizationModel(loss='bpr',
                                      representation=binonlinear,
                                      ....)
I got the following strange error:
~/user_preferences_model/multi_implicit.py in fit(self, interactions, verbose)
240
241 # leverage the current batch of users/items as positive instances
--> 242 positive_prediction = self._net(batch_user, batch_item)
243
244 # find some negative instances for the current batch_users
/users/tr.amirhj/anaconda3/envs/keras_gpu/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)
~/user_preferences_model/representations.py in forward(self, user_ids, item_ids)
105 item_embedding = item_embedding.squeeze()
106
--> 107 output_representation_users = self._net(user_embedding)
108 output_representation_items = self._net(item_embedding)
109
/users/tr.amirhj/anaconda3/envs/keras_gpu/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)
/users/tr.amirhj/anaconda3/envs/keras_gpu/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
89 def forward(self, input):
90 for module in self._modules.values():
---> 91 input = module(input)
92 return input
93
/users/tr.amirhj/anaconda3/envs/keras_gpu/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)
/users/tr.amirhj/anaconda3/envs/keras_gpu/lib/python3.6/site-packages/torch/nn/modules/linear.py in forward(self, input)
53
54 def forward(self, input):
---> 55 return F.linear(input, self.weight, self.bias)
56
57 def extra_repr(self):
/users/tr.amirhj/anaconda3/envs/keras_gpu/lib/python3.6/site-packages/torch/nn/functional.py in linear(input, weight, bias)
990 if input.dim() == 2 and bias is not None:
991 # fused op is marginally faster
--> 992 return torch.addmm(bias, input, weight.t())
993
994 output = input.matmul(weight.t())
RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/THCGeneral.cpp:411
What's the problem?
@maciejkula
amirj commented
It was my fault. I was using an old model with different dimensions.
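For anyone hitting the same error: a shape check before reusing layers from an old model can surface this kind of mismatch earlier, with a clearer message than the cuBLAS failure. The following is only a minimal sketch, assuming the mismatch is in the embedding weight shapes. The helper check_embedding_shape is hypothetical (it is not part of Spotlight or PyTorch), and the stand-in object and its sizes are made up for illustration; the helper relies only on the layer exposing .weight.shape, which torch.nn.Embedding does.

```python
from types import SimpleNamespace


def check_embedding_shape(layer, expected_rows, expected_dim, name="embedding"):
    """Return the layer's weight shape, or raise ValueError on a mismatch.

    Hypothetical helper: only assumes `layer.weight.shape` exists.
    """
    actual = tuple(layer.weight.shape)
    expected = (expected_rows, expected_dim)
    if actual != expected:
        raise ValueError(f"{name}: expected weight shape {expected}, got {actual}")
    return actual


# Stand-in for an embedding layer taken from an old model (made-up sizes).
# A real torch.nn.Embedding exposes the same .weight.shape attribute.
old_user_embeddings = SimpleNamespace(weight=SimpleNamespace(shape=(1000, 32)))

# Sizes match what the new model expects: the check passes.
ok_shape = check_embedding_shape(old_user_embeddings, 1000, 32, name="user_embeddings")

# The new model expects a different latent dimension: the check fails early.
try:
    check_embedding_shape(old_user_embeddings, 1000, 64, name="user_embeddings")
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)
```

In the setup from this issue, the same check could presumably be run on model_implicit._net.user_embeddings and model_implicit._net.item_embeddings, using dataset_implicit.num_users, dataset_implicit.num_items and LATENT_DIM as the expected sizes, before passing the layers to BilinearNet.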