yuhao-nie/SkyGPT

Question about input data for training SkyGPT.

Opened this issue · 3 comments

Hi!
Thanks for the great work, and for releasing the implementation and the related algorithms. I am learning a lot from reading your paper, and the way you framed the problem as a probabilistic model is quite interesting.

I am currently trying to train SkyGPT.

However, I am not sure what the inputs should be for

  1. training the transformer (the data has to be specified by the user), and
  2. training the VQ-VAE — I am guessing it is an HDF5 file called GPT_full_2min.hdf5...

My understanding is that I have to generate the training samples for SkyGPT by

  1. running SkyGPT/script/reformat_input.py, and then
  2. feeding the result into SkyGPT/script/sample_gen.py.

This should produce an HDF5 file containing

  • 'train_data': [B, H, W, 3], np.uint8
  • 'train_idx': [B], np.int64 (start index of each video)
  • 'test_data': [B', H, W, 3], np.uint8
  • 'test_idx': [B'], np.int64

But here are my questions:

  1. What is the input to SkyGPT/script/reformat_input.py?
  2. Is GPT_full_2min.hdf5 the file produced by SkyGPT/script/sample_gen.py?
  3. What is the input for training the transformer?
  4. How do these data relate to the files you provided on Google Drive?

Hello, may I ask if the issue with the dataset you mentioned has been resolved?

Hello, have you solved the input issue for the reformat function here? Is the file missing, or do we need additional code to generate it?


Hello, these files are from the video_prediction_dataset.hdf5 file.