EricGuo5513/momask-codes

Nan in training data

aragakiyui611 opened this issue · 5 comments

When I run python train_vq.py --name rvq_name --gpu_id 1 --dataset_name t2m --batch_size 256 --num_quantizers 6 --max_epoch 50, it prints the error below. Are there NaNs in the dataset?
[Image: untitled]

Hi,

There should not be any NaN in HumanML3D. Please refer to the tutorial to check if your data is correct. Specifically, the error message should refer to the validation set. I suggest you check if you processed the dataset correctly. If it's overall correct, then just filter out the NaN sequence for a quick fix.

I regenerated the dataset and it still has NaNs. I tried to filter them out.
image

However, there is still a problem with the data.
[Image: untitled]
Installing the corresponding scipy and numpy versions did not help.

There's a problem with your current filter: it ignores the constraints on the covariance matrix, e.g. the diagonal elements must be larger than 0.
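To illustrate the constraint mentioned above, here is a minimal sketch of a sanity check on a covariance matrix. The function name `covariance_is_valid` and the `(n_samples, n_dims)` input layout are assumptions made for illustration; they are not taken from the momask-codes repository, and the check covers only finiteness and a strictly positive diagonal.

```python
import numpy as np


def covariance_is_valid(features):
    """Check basic constraints on the covariance of an (n_samples, n_dims) array.

    Hypothetical helper: returns True only if the inputs and the covariance
    are finite and every diagonal element (per-dimension variance) is > 0.
    """
    features = np.asarray(features, dtype=np.float64)
    # NaN or Inf in the inputs propagates into the covariance matrix.
    if not np.isfinite(features).all():
        return False
    # rowvar=False: each column is one variable, each row one observation.
    cov = np.atleast_2d(np.cov(features, rowvar=False))
    return bool(np.isfinite(cov).all() and (np.diag(cov) > 0).all())
```

A filter that drops NaN entries after the covariance has been computed can still leave a matrix that fails this kind of check, which is why removing the bad sequences up front is the safer option.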

What I suggest for a quick fix is to check the motion sequences in the dataset. Hopefully, there will be only a few illegal sequences. Then, filter out all the illegal sequences before any other operations. I also suggest you check which operation involves the NaN data, so that you do not mess the data up.
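The check suggested above could be sketched as follows. This assumes the motion sequences are stored as individual .npy files in one directory; the function name `find_nonfinite_motions` is hypothetical and not part of the repository.

```python
from pathlib import Path

import numpy as np


def find_nonfinite_motions(motion_dir):
    """Return the sorted names of .npy files that contain NaN or Inf values.

    Hypothetical helper: motion_dir is assumed to hold one .npy array
    per motion sequence.
    """
    bad = []
    for path in sorted(Path(motion_dir).glob("*.npy")):
        motion = np.load(path)
        # isfinite is False for both NaN and +/-Inf entries.
        if not np.isfinite(motion).all():
            bad.append(path.name)
    return bad
```

Calling it on the directory that holds the processed motion vectors lists the files to exclude, for example by removing their IDs from the train, validation, and test split files before training.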

Thank you! I found the data containing NaNs and deleted it. Only *007975.npy had NaNs.