timeout when training
Closed this issue · 3 comments
csf123123 commented
Hi, when I try to train a model on aishell1, I run into a connect() timeout. Can you help me?
jctian98 commented
Hi, could you post some logs here so I can check the problem?
csf123123 commented
Sorry, I can't download the logs directly. The error is as follows: "RuntimeError: connect() timed out". The launch configs show: rdzv_configs: {'rank': 1, 'timeout': 900}
jctian98 commented
I'm a bit confused about the connect() function. Is that k2.connect(), or some communication operation in DDP? I suppose we don't need a dictionary called rdzv_configs?
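
If it does turn out to be on the DDP side, one thing worth checking is whether the machine running rank 1 can open a TCP connection to the master node at all. Below is a minimal sketch of that check using only the Python standard library. It assumes the job uses the usual MASTER_ADDR and MASTER_PORT environment variables (falling back to 127.0.0.1 and 29500 if they are unset); can_reach is a hypothetical helper name, not part of PyTorch or k2.

```python
import os
import socket


def can_reach(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: the launcher exports MASTER_ADDR / MASTER_PORT for the job.
    host = os.environ.get("MASTER_ADDR", "127.0.0.1")
    port = int(os.environ.get("MASTER_PORT", "29500"))
    print(f"{host}:{port} reachable: {can_reach(host, port)}")
```

Run it on the node that reports the timeout while the master process is up. If it prints False, the cause is more likely the network, a firewall, or a wrong address or port than anything in the training recipe.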