SivilTaram/Persona-Dialogue-Generation

I only have one GPU card, how can I train psquare?

Arran5353 opened this issue · 1 comment

Hi,

I commented out lines 444-445 of ./agents/psquare/psquare.py, but it doesn't seem to have worked.

Here is the complete error message:

```
Traceback (most recent call last):
  File "train_psquare.py", line 105, in <module>
    TrainLoop(opt).train()
  File "/Persona-Dialogue-Generation/scripts/train_model_selfplay.py", line 271, in train
    world.parley_episode(is_training=True, is_display=is_display)
  File "/Persona-Dialogue-Generation/worlds/selfplay.py", line 186, in parley_episode
    self.parley(is_display)
  File "/Persona-Dialogue-Generation/worlds/selfplay.py", line 90, in parley
    acts[0] = agents[0].act(is_display)
  File "/Persona-Dialogue-Generation/agents/psquare/psquare.py", line 604, in act
    act = self.batch_act(self.observation)
  File "/Persona-Dialogue-Generation/agents/psquare/psquare.py", line 624, in batch_act
    cand_inds, is_training)
  File "/Persona-Dialogue-Generation/agents/psquare/psquare.py", line 814, in transmitter_predict
    raise e
  File "/Persona-Dialogue-Generation/agents/psquare/psquare.py", line 787, in transmitter_predict
    sampling=True)
  File "/Persona-Dialogue-Generation/agents/transmitter/gpt/model.py", line 103, in forward
    prior_context = torch.cat([src_seq, start_tensor], dim=1)
RuntimeError: Expected object of backend CPU but got backend CUDA for sequence element 1 in sequence argument at position #1 'tensors'
```
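From the traceback, one of `src_seq` / `start_tensor` is still on the CPU while the other is on CUDA, so `torch.cat` refuses to combine them. Here is a minimal standalone sketch of that mismatch and the usual workaround of moving both tensors onto a common device first; the tensor names and shapes only mirror the traceback and are not the repo's actual code:

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the tensors in agents/transmitter/gpt/model.py.
src_seq = torch.randint(0, 100, (1, 16))                                  # left on the CPU
start_tensor = torch.full((1, 1), 50256, dtype=torch.long, device=device)  # placed on the GPU

# torch.cat requires every tensor in the list to be on the same device, which
# is exactly what the RuntimeError above complains about:
# prior_context = torch.cat([src_seq, start_tensor], dim=1)  # -> RuntimeError

# Moving both onto one device before concatenating avoids the error.
prior_context = torch.cat([src_seq.to(device), start_tensor.to(device)], dim=1)
print(prior_context.shape, prior_context.device)
```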

There is a TODO in the code:

> TODO: must at least two GPU as the receiver & transmitter cannot be run in the same GPU card within less than 24GB memory.

So I'm not sure whether this can run on a single GPU at all.
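Since the TODO says the two models need roughly 24GB together, here is a quick way to check how much memory a single card actually has, using only standard PyTorch calls (nothing repo-specific):

```python
import torch

# Report the total and currently allocated memory on GPU 0.
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB total")
print(f"currently allocated: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GB")
```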

Can anyone explain why this happened? Is there anything I can change so that the receiver & transmitter both run on one GPU?
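For context, this is roughly what I would expect a single-GPU setup to look like in plain PyTorch; the `receiver` / `transmitter` objects below are just placeholder modules, since I don't know where else the repo assigns devices besides the lines I commented out:

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0")

# Placeholder modules standing in for the real receiver and transmitter models.
receiver = nn.Linear(8, 8)
transmitter = nn.Linear(8, 8)

# Put both models on the same (only) card instead of splitting across two GPUs.
receiver.to(device)
transmitter.to(device)

# Inputs have to be moved to that device too, otherwise torch.cat / matmul
# raise the same CPU-vs-CUDA backend error as in the traceback above.
batch = torch.randn(4, 8, device=device)
out = transmitter(receiver(batch))
```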

Thank you in advance!

@Arran5353 That's strange. I will try to reproduce the problem when only one GPU card is available.