Can't quite train this time

Question

corranmac opened this issue 3 years ago · 5 comments

Hi again,

I now have all my flow and maskrcnn files labelled correctly, but I'm getting the error below when I try to train with !python train.py config/config.json --data_folder=/content/Experiment/data/.
I'm also unsure whether this should point to my data folder containing the original, flow and maskrcnn frames, or just to the folder with the original frames. Neither way works.

Model has 264706 params
Model has 133122 params
Model has 416379 params
Model has 402945 params
Traceback (most recent call last):
  File "train.py", line 380, in <module>
    main(json.load(f))
  File "train.py", line 183, in main
    jif_all = get_tuples(number_of_frames, video_frames)
  File "/content/layered-neural-atlases/unwrap_utils.py", line 110, in get_tuples
    return torch.cat(jif_all, dim=1)
RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors. Available functions are [CPU, CUDA, QuantizedCPU, Autograd, Profiler, Tracer, Autocast]

This time I can't quite figure out how to fix it. Thanks so much in advance.

Answer 1 · 2021-10-27T02:35:07.000Z

Aha, so it was my mistake for not reading the documentation carefully. I was trying to pass data_folder on the command line without realizing I had to manually set this path in the config. I am now wondering whether there is a way to use Microsoft's DeepSpeed to decrease training time?
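
For anyone else hitting this, here is a minimal sketch of setting the path in the config from a notebook cell instead of editing the file by hand. It assumes only that the config is a flat JSON file with a "data_folder" key, as quoted later in this thread; the helper name set_data_folder is made up for illustration and is not part of the repository.

```python
import json
from pathlib import Path


def set_data_folder(config_path, data_folder):
    """Rewrite the "data_folder" entry of a JSON config file in place.

    Hypothetical helper, not part of the repository. All other keys in
    the config are left untouched.
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())
    config["data_folder"] = str(data_folder)
    config_path.write_text(json.dumps(config, indent=4))
    return config


# Example call (paths are placeholders, adjust them to your setup):
# set_data_folder("config/config.json", "/content/Experiment/data/<video_name>")
```

After that, training is started with just the config file as the argument.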

Answer 2 · 2021-10-31T18:54:49.000Z

Happy to hear it got sorted out! Regarding your question, we haven't tried Microsoft DeepSpeed. It would be great if the training time could be decreased :)

Answer 3 · 2022-01-15T23:38:28.000Z

Hi everyone, I have the same problem. I tried editing "config.json"
with my data path, so it looks like "data_folder": "/content/drive/MyDrive/ML_Test/data",
but it doesn't work. What am I doing wrong?
Thanks for your amazing paper!

Answer 4 · 2022-01-22T16:15:40.000Z

It should probably be "data_folder": "/content/drive/MyDrive/ML_Test/data/<video_name>".
So if the folder that contains the images of your video's frames is called "blackswan", you should edit config.json to have
"data_folder": "/content/drive/MyDrive/ML_Test/data/blackswan"
and the parent folder "/content/drive/MyDrive/ML_Test/data" should also contain the folders "blackswan_flow" and "blackswan_maskrcnn".
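
A quick way to check this layout before training is sketched below. It only tests that the three folders exist and are non-empty; the function name check_layout is made up for illustration, and the sketch assumes nothing about file names or formats inside the folders. A missing or empty frames folder would at least be consistent with the torch.cat error on an empty list reported above.

```python
from pathlib import Path


def check_layout(data_folder):
    """Return a list of problems with the folder layout described above.

    Hypothetical helper, not part of the repository. data_folder is the
    folder holding the video frames, e.g. ".../data/blackswan".
    """
    video_dir = Path(data_folder)
    # Sibling folders use the "_flow" and "_maskrcnn" suffixes named above.
    expected = [
        video_dir,
        video_dir.parent / (video_dir.name + "_flow"),
        video_dir.parent / (video_dir.name + "_maskrcnn"),
    ]
    problems = []
    for folder in expected:
        if not folder.is_dir():
            problems.append(f"missing folder: {folder}")
        elif not any(folder.iterdir()):
            problems.append(f"empty folder: {folder}")
    return problems


# Example call (path taken from the answer above):
# print(check_layout("/content/drive/MyDrive/ML_Test/data/blackswan"))
```

An empty list means all three folders were found and each contains at least one entry.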

Answer 5 · 2022-01-22T16:21:35.000Z

Wow, thank you so much for the explanation! I'll try it tonight. I also tried
to use RAFT to extract the optical flow, with no luck:
I can't integrate RAFT into my current project.
I will keep you posted.
Anyway, you are my hero!