CAD-120 Experimental Setting
RomeroBarata opened this issue · 12 comments
Dear Siyuan,
In regards to the CAD-120 experimental results, previous works do a 4-fold cross-validation where one of the subjects is left out for testing whilst the other three are used for training. However, looking through the source code it seems to me that you do a single train, val, test split of the data (e.g. line 61 of https://github.com/SiyuanQi/gpnn/blob/master/src/python/cad120_prediction.py). Could you please clarify whether the results in the paper were a 4-fold cross-validation or a single train, val, test split? Thank you for your attention and for your work.
Kind regards,
Romero
@RomeroBarata I'm joining this thread as I have the same question; I'm not sure they use a 4-fold cross-validation in the cross-subject setting. I will download the tmp/cad120/*.p files and try to find out which sequences end up in the training set and which in the test set.
Edit :
https://github.com/SiyuanQi/gpnn/blob/master/src/python/datasets/CAD120/cad120.py
lines 57-60
sequence_ids = pickle.load(open(os.path.join(args.tmp_root, 'cad120_data_list.p'), 'rb'))
train_num = 80
val_num = 20
test_num = 25
sequence_ids = np.random.permutation(sequence_ids)
I think the sequences are randomly shuffled, and the first 80 of the 125 go to the training set (with 20 for validation and 25 for testing), regardless of which subject performed them.
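For contrast, the leave-one-subject-out protocol that previous CAD-120 works follow can be sketched as below. The `sequence_subjects` mapping here is hypothetical (made-up sequence ids, a roughly even assignment to four subjects); in the real dataset each sequence is annotated with the subject who performed it:

```python
# Hypothetical mapping from sequence id to subject id; in CAD-120 each
# sequence is performed by one of four subjects.
sequence_subjects = {
    "seq_%03d" % i: "Subject%d" % (i % 4 + 1) for i in range(125)
}

def leave_one_subject_out_folds(sequence_subjects):
    """Yield (test_subject, train_ids, test_ids) for each of the 4 folds.

    Every sequence of the held-out subject goes to the test set, and all
    remaining sequences go to the training set, so no subject appears in
    both splits of the same fold.
    """
    subjects = sorted(set(sequence_subjects.values()))
    for test_subject in subjects:
        test_ids = [s for s, subj in sequence_subjects.items()
                    if subj == test_subject]
        train_ids = [s for s, subj in sequence_subjects.items()
                     if subj != test_subject]
        yield test_subject, train_ids, test_ids
```

Unlike `np.random.permutation` over all sequence ids, this guarantees the test subject is never seen during training, which is what makes the setting "cross-subject".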
Hi @CamilleMaurice, that is true. Also, using the pre-trained model provided by the authors I get results that are slightly different from the ones reported in the paper. Are you having the same issue?
@RomeroBarata I'm not able to run this code at the moment, but have you tried running it 4 times and computing the average F1-score? I'll try to run it soon, as I need to find the memory and time requirements for prediction.
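Averaging over folds, as suggested, might look like the sketch below. The per-fold precision/recall values are placeholders, not real CAD-120 results:

```python
# Hypothetical (precision, recall) pairs for each of the 4 folds; in
# practice these would come from evaluating the model on each held-out
# subject.
fold_scores = [(0.84, 0.80), (0.79, 0.77), (0.86, 0.82), (0.81, 0.80)]

def f1(precision, recall):
    """F1-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Cross-validated F1 is the mean of the per-fold F1 scores.
fold_f1 = [f1(p, r) for p, r in fold_scores]
mean_f1 = sum(fold_f1) / len(fold_f1)
```

A single run on one random split only gives one of these four numbers, which is why it isn't directly comparable to the cross-validated figures reported by earlier works.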
I made it work under Python 3; using their best model, I get an F1-score of 0.852.
Hi @CamilleMaurice, I get the same result as you.
@RomeroBarata About the experimental setting, other people have pointed out the same problem. See "Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation":
"We cannot compare to [34] as they do not follow the 4 fold cross-validation, a convention most of the previous works used."
Thank you very much for this reference @CamilleMaurice! :)
Can you provide the CAD-120 dataset?
@RomeroBarata Hello, can you share a google drive link for cad120 dataset? The official website of cad120 seems to be closed. I want to do some research on the dataset. Thanks a lot!
@CamilleMaurice Hello, can you share a google drive link for cad120 dataset? The official website of cad120 seems to be closed. I want to do some research on the dataset. Thanks a lot!
@Space-Xun Hello! I'm sorry, I no longer have the CAD-120 dataset, so I can't share it. I hope you'll find it; alternatively, maybe you can try another dataset such as Charades or Watch-n-Patch?