I only have one GPU card, how to train psquare?
Arran5353 opened this issue · 1 comment
Arran5353 commented
Hi,
I commented out lines 444-445 of ./agents/psquare/psquare.py, but it didn't work.
Here is the complete error message:
Traceback (most recent call last):
File "train_psquare.py", line 105, in <module>
TrainLoop(opt).train()
File "/Persona-Dialogue-Generation/scripts/train_model_selfplay.py", line 271, in train
world.parley_episode(is_training=True, is_display=is_display)
File "/Persona-Dialogue-Generation/worlds/selfplay.py", line 186, in parley_episode
self.parley(is_display)
File "/Persona-Dialogue-Generation/worlds/selfplay.py", line 90, in parley
acts[0] = agents[0].act(is_display)
File "/Persona-Dialogue-Generation/agents/psquare/psquare.py", line 604, in act
act = self.batch_act(self.observation)
File "/Persona-Dialogue-Generation/agents/psquare/psquare.py", line 624, in batch_act
cand_inds, is_training)
File "/Persona-Dialogue-Generation/agents/psquare/psquare.py", line 814, in transmitter_predict
raise e
File "/Persona-Dialogue-Generation/agents/psquare/psquare.py", line 787, in transmitter_predict
sampling=True)
File "/Persona-Dialogue-Generation/agents/transmitter/gpt/model.py", line 103, in forward
prior_context = torch.cat([src_seq, start_tensor], dim=1)
RuntimeError: Expected object of backend CPU but got backend CUDA for sequence element 1 in sequence argument at position #1 'tensors'
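For reference, this error means `torch.cat` received tensors on different devices: `src_seq` is on CPU while `start_tensor` is on CUDA (or vice versa). A minimal sketch of a workaround (hypothetical helper name and tensor shapes, not the actual repository code) would be to move one tensor onto the other's device before concatenating:

```python
import torch

def safe_cat(src_seq, start_tensor):
    # Hypothetical helper: move start_tensor onto src_seq's device
    # so torch.cat sees tensors on the same backend.
    return torch.cat([src_seq, start_tensor.to(src_seq.device)], dim=1)

# Illustrative shapes only; in the failing run start_tensor was on CUDA.
src_seq = torch.zeros(2, 5, dtype=torch.long)
start_tensor = torch.ones(2, 1, dtype=torch.long)
print(safe_cat(src_seq, start_tensor).shape)  # torch.Size([2, 6])
```

I am not sure whether applying `.to(src_seq.device)` at line 103 of agents/transmitter/gpt/model.py is the right fix here, since the two-GPU assumption may be baked in elsewhere as well.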
There is a TODO in the code:
TODO: must at least two GPU as the receiver & transmitter cannot be run in the same GPU card within less than 24GB memory.
So I'm not sure whether this project can run on a single GPU.
Can anyone explain why this happened? Is there anything I can change so that the receiver & transmitter run on one GPU?
Thank you in advance!
SivilTaram commented
@Arran5353 That's weird. I will try to reproduce the problem when only one GPU card is available.