Question about input data for training SkyGPT.
GabbySuwichaya commented
Hi!
Thanks for the great work and thanks so much for releasing the implementation of your work and related algorithms. I am learning a lot from reading your paper. Also, your model and how you shaped the problem into a probabilistic model is quite interesting.
Therefore, I am currently trying to run the training of SkyGPT.
But I don't know what the inputs should be for:
- Training the transformer? (The data has to be specified by users.)
- Training the VQVAE? I am guessing it is an HDF5 file called
GPT_full_2min.hdf5
...
My understanding is that I will have to generate the samples for training SkyGPT by:
- running SkyGPT/script/reformat_input.py
- then using the result from the previous step as input for SkyGPT/script/sample_gen.py
Then, I will get an HDF5 file containing:
- 'train_data': [B, H, W, 3] np.uint8,
- 'train_idx': [B], np.int64 (start indexes for each video)
- 'test_data': [B', H, W, 3] np.uint8,
- 'test_idx': [B'], np.int64
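To make the expected layout concrete, here is a minimal sketch that checks whether a file matches the four datasets listed above. The names `check_layout` and `check_file` are hypothetical helpers and are not part of the SkyGPT repository; the check assumes the file is read with h5py and that the last axis of the image arrays is the RGB channel.

```python
# Expected layout as listed above: name -> (number of dims, dtype name).
EXPECTED = {
    "train_data": (4, "uint8"),   # [B, H, W, 3]
    "train_idx": (1, "int64"),    # [B], start index of each video
    "test_data": (4, "uint8"),    # [B', H, W, 3]
    "test_idx": (1, "int64"),     # [B']
}


def check_layout(datasets):
    """Return a list of problems; an empty list means the layout matches.

    `datasets` is any mapping from name to an object exposing `.shape` and
    `.dtype` (an open h5py.File or a dict of NumPy arrays both qualify).
    """
    problems = []
    for name, (ndim, dtype) in EXPECTED.items():
        if name not in datasets:
            problems.append(f"missing dataset: {name}")
            continue
        ds = datasets[name]
        if len(ds.shape) != ndim:
            problems.append(
                f"{name}: expected {ndim} dims, got shape {tuple(ds.shape)}"
            )
        if str(ds.dtype) != dtype:
            problems.append(f"{name}: expected dtype {dtype}, got {ds.dtype}")
        # Assumption: image arrays store RGB in the last axis.
        if name.endswith("_data") and len(ds.shape) == 4 and ds.shape[-1] != 3:
            problems.append(f"{name}: last axis should be 3, got {ds.shape[-1]}")
    return problems


def check_file(path):
    """Open an HDF5 file with h5py (imported lazily) and check its layout."""
    import h5py  # third-party; only needed when reading a real file

    with h5py.File(path, "r") as f:
        return check_layout(f)
```

Calling `check_file` on the output of the sample generation step would return an empty list if my understanding of the format is right, and a list of mismatches otherwise.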
But here are my questions:
- What is the input for SkyGPT/script/reformat_input.py?
- Is GPT_full_2min.hdf5 a resulting file from SkyGPT/script/sample_gen.py?
- What is the input for training the transformer?
- And how are these data related to the files that you provided on Google Drive?
zxy426-cyber commented
Hello, may I ask if the issue with the dataset you mentioned has been resolved?
xuanmi98 commented
Mahuapeng-collab commented