ttlmh/Bridge-Prompt

Having problem reproducing action segmentation result

Closed this issue · 6 comments

Hello,

I followed all of the steps in README.md with the GTEA dataset. Then, using the extracted frame features (gtea_vit_features_splt1), I followed steps 1 through 4 of ASFormer's README.md: download the pretrained model, download the dataset, replace the features with the extracted ones, and run the evaluation.

However, the following error occurs, and I could not resolve it:
python main.py --action=predict --dataset=gtea --split=1

UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.23.1
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
Model Size: 1130860
Traceback (most recent call last):
  File "/home/jckim/ASFormer/main.py", line 97, in <module>
    trainer.predict(model_dir, results_dir, features_path, batch_gen_tst, num_epochs, actions_dict, sample_rate)
  File "/home/jckim/ASFormer/model.py", line 406, in predict
    batch_input, batch_target, mask, vids = batch_gen_tst.next_batch(1)
  File "/home/jckim/ASFormer/batch_gen.py", line 101, in next_batch
    classes[i] = self.actions_dict[content[i]]
KeyError: '<take><bread> (24-76) [0]'

I also tried replacing all of the data with your data/gtea.zip instead of the data.zip provided by the ASFormer repository. It still doesn't work.

Is there anything that I've done wrong?
(If downloading the pretrained model you uploaded is the only solution, I can't download it from Baidu, because Baidu sign-up is currently unavailable for foreign users.)

Thank you in advance!

ttlmh commented

Hi, thanks for your interest in our work, and sorry for the late reply!

Since we have extracted new features for the downstream tasks, the ASFormer model pre-trained on the original I3D features is not applicable here. Thus, you would have to retrain ASFormer using the newly extracted ViT features. Some modifications to the ASFormer code are needed. You should change features_dim = 2048 in ASFormer/main.py to features_dim = 768. Moreover, in ASFormer/batch_gen.py, you should change Line 96 to features = np.load(batch_features[idx]).T.
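
As a rough illustration of why the transpose matters, here is a minimal sketch. It assumes the extracted ViT features are saved as (num_frames, 768) arrays and that ASFormer consumes them as (features_dim, num_frames); the file name and frame count are made up for the example.

```python
import os
import tempfile

import numpy as np

features_dim = 768   # ViT feature size (the I3D features used 2048)
num_frames = 1643    # hypothetical video length

# Stand-in for one extracted feature file, assumed to be (num_frames, features_dim).
tmp_dir = tempfile.mkdtemp()
feature_path = os.path.join(tmp_dir, "example_video.npy")  # hypothetical file name
np.save(feature_path, np.zeros((num_frames, features_dim), dtype=np.float32))

# Without .T the array is frames-first; with .T it becomes channels-first.
raw = np.load(feature_path)
features = np.load(feature_path).T

print(raw.shape)       # (1643, 768)
print(features.shape)  # (768, 1643)
```

Only the channels-first array has its first axis equal to features_dim, which is the layout the modified loading line is meant to produce.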

Also, if you require a certain pre-trained model, please let us know. We will try to upload it to Google Drive and Dropbox for your convenience.

Thanks!

Thank you for your help, but the problem is not solved.
I applied your instructions, and the problem still occurs.

I found that the reason is that the label files (e.g. S1_Cheese_C1.txt) from ASFormer and from your repository have different formats:

From yours:
line 1: (13-71) [0]

From ASFormer:
line 1: background

So the KeyError mentioned above occurs:

KeyError: '<take><bread> (24-76) [0]'

Is there anything else I should change?

ttlmh commented

I think you can simply use the labels provided by ASFormer. Our provided labels are used only for the pre-training procedure of Bridge-Prompt. Since the frame features are extracted once and then used offline, training the action segmentation model does not need to involve any Bridge-Prompt-related content.
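
To see in advance whether a set of label files will trigger this KeyError, a small check along the following lines may help. This is only a sketch: it assumes a mapping file in the "<id> <name>" form, and the mapping entries, label lines, and the helper unknown_labels are all hypothetical, not part of ASFormer or Bridge-Prompt.

```python
# Hypothetical mapping.txt content, assumed to be "<id> <name>" per line.
mapping_text = "0 take\n1 open\n2 background\n"

# Build the label-name -> class-index lookup used for per-frame labels.
actions_dict = {}
for line in mapping_text.splitlines():
    index, name = line.split()
    actions_dict[name] = int(index)


def unknown_labels(label_lines, actions_dict):
    """Return distinct labels missing from actions_dict, in order of first appearance."""
    missing = []
    for label in label_lines:
        if label not in actions_dict and label not in missing:
            missing.append(label)
    return missing


# One label per frame, as in the ASFormer-style ground-truth files.
asformer_style = ["background", "take", "take", "open"]
# A line in the other format, copied from the KeyError in this thread.
bridge_prompt_style = ["<take><bread> (24-76) [0]"]

print(unknown_labels(asformer_style, actions_dict))       # []
print(unknown_labels(bridge_prompt_style, actions_dict))  # ['<take><bread> (24-76) [0]']
```

Any label reported as missing would raise a KeyError as soon as it is looked up in the dictionary, which matches the failure shown in the traceback.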

Sorry to keep bothering you.

I successfully trained ASFormer with your assistance (Thank you ^v^), but the following error occurs when predicting:

python main.py --action predict --dataset gtea --split 1


RuntimeError: Given groups=1, weight of size [64, 768, 1], expected input[1, 1643, 768] to have 768 channels, but got 1643 channels instead


ttlmh commented

Hi, you may also change Line 409 in ASFormer/model.py to:
features = np.load(features_path + vid.split('.')[0] + '.npy').T
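
The error above reports an input of shape [1, 1643, 768], that is (batch, frames, channels), while a weight of size [64, 768, 1] expects the 768 channels on the second axis. As a hedged alternative to transposing unconditionally, a small guard like the one below could orient a feature array before it is batched. The helper to_channels_first is hypothetical and not part of ASFormer.

```python
import numpy as np


def to_channels_first(features, features_dim):
    """Return a 2-D feature array laid out as (features_dim, num_frames)."""
    if features.ndim != 2:
        raise ValueError(f"expected a 2-D array, got shape {features.shape}")
    if features.shape[0] == features_dim:
        # Already channels-first (also taken when both axes equal features_dim).
        return features
    if features.shape[1] == features_dim:
        # Frames-first, so transpose to channels-first.
        return features.T
    raise ValueError(f"no axis of size {features_dim} in shape {features.shape}")


frames_first = np.zeros((1643, 768), dtype=np.float32)
print(to_channels_first(frames_first, 768).shape)    # (768, 1643)
print(to_channels_first(frames_first.T, 768).shape)  # (768, 1643)
```

The guard accepts either layout and fails loudly when neither axis matches features_dim, so a mismatch shows up at load time instead of inside the first convolution.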

Problem solved. Thanks a lot!