google/next-prediction

train without activity prediction, Colab Running Time

Closed this issue · 8 comments

Hi,

I'm trying to train Next without the activity prediction module, to make a fair comparison with other models that only output trajectories. Is it enough to just remove the --add_activity argument when training?

Thanks a lot!

Yes. For both training and testing.

Hi, thanks for your response.
And may I ask how long it took to train on your local machine? I'm training on Colab Pro and it shows an implausible estimated time (thousands of hours). Since Pro provides a V100 GPU, I suspect too much time is being spent on I/O on Google Cloud.

I have not tried running it on Colab. With a 1080 Ti GPU and an i5 CPU, it takes about 36 hours to train with the default settings. I/O should not be the bottleneck: all data is packed into a .npz file, so in theory everything needed is loaded into RAM at the start. Could it be that RAM is insufficient? 24 GB should supposedly be enough.
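To illustrate why I/O should not dominate: a .npz archive is loaded once at startup and the arrays then live in RAM. A minimal NumPy sketch (the field names `obs_traj`/`pred_traj` here are hypothetical stand-ins, not the repo's actual keys):

```python
import os
import tempfile
import numpy as np

# Create a small stand-in for the prepared dataset (hypothetical field names).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "data.npz")
np.savez(path,
         obs_traj=np.random.rand(100, 8, 2),
         pred_traj=np.random.rand(100, 12, 2))

# Load the archive once at startup; each key access pulls that array into RAM.
data = np.load(path)
obs = data["obs_traj"]   # resident in memory after this line
print(obs.shape)         # → (100, 8, 2)

# Rough memory footprint of all arrays in the archive, in bytes.
total_bytes = sum(data[k].nbytes for k in data.files)
print(total_bytes)       # → 32000
```

If the total footprint of the real .npz exceeds the runtime's RAM, the OS starts swapping and the estimated time explodes, which would match the symptom above.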

Hi, I am using 25 GB of RAM, which helps, but the estimate is still thousands of hours. I have attached a screenshot with some of the initial outputs. Do they look correct? I don't think anything should differ after just the preprocessing.
[Screenshot 2021-04-24 230932]

The RAM might be limiting performance. I can't spot the problem from these outputs. Try checking what the CPU/RAM/GPU usage is during the run.
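On a Colab runtime you can check these from a notebook cell. A small stdlib-only sketch (the POSIX `sysconf` calls work on Linux, which is what Colab runs; `nvidia-smi` is queried only if present):

```python
import os
import shutil
import subprocess

# Total physical RAM via POSIX sysconf (Linux/Colab).
page_size = os.sysconf("SC_PAGE_SIZE")
total_ram_gb = page_size * os.sysconf("SC_PHYS_PAGES") / 1024 ** 3
print(f"Total RAM: {total_ram_gb:.1f} GB")

# GPU utilization and memory, if nvidia-smi is on PATH (GPU runtimes).
if shutil.which("nvidia-smi"):
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used,memory.total",
         "--format=csv"],
        capture_output=True, text=True,
    )
    print(out.stdout)
```

Low GPU utilization with high RAM usage would point at swapping; low utilization with idle RAM/CPU would point at a data-pipeline stall instead.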

It seems it settles on the right estimated time after I leave it training for a few hours. But still, thank you so much for your help!

Hi, just a follow-up question. When I try to resume training with --load, it still starts from global_step == 0.
Looking at the code, it seems to reset everything even when restoring. Shouldn't it import the .meta files first?

The code restores from the latest checkpoint in the model path. global_step == 0 only means the step counter is for the current run; nothing is actually reset. (The .meta file is only needed when re-importing the graph with import_meta_graph; here the graph is rebuilt in code and only the variable values are restored.)
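One way to confirm the restore worked is to look at the checkpoint prefix itself: TF1's `tf.train.latest_checkpoint(model_dir)` returns a prefix like `model-12000`, where the trailing number is the true global step at save time, even if the per-run loop counter restarts at 0. A stdlib-only sketch (the helper below is hypothetical, not from the repo; only the `name-step` naming pattern is the standard TensorFlow one):

```python
import re

def step_from_checkpoint(ckpt_prefix):
    """Parse the global step from a TF checkpoint prefix like 'save/model-12000'.

    tf.train.latest_checkpoint() returns such a prefix; the trailing number
    is the step at which the checkpoint was written.
    """
    m = re.search(r"-(\d+)$", ckpt_prefix)
    return int(m.group(1)) if m else 0

restored_step = step_from_checkpoint("models/next/model-12000")
print(restored_step)  # → 12000

# A per-run counter starting at 0 does not mean the restore failed:
for local_step in range(3):
    print("local step", local_step, "→ overall step", restored_step + local_step)
```

So the printed step count and the checkpoint's embedded step are two different counters; only the latter tracks total training progress.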