Feature Extraction Value Mismatch using C3D pretrained weights.

Question

Opened this issue 5 years ago · 11 comments

I've been trying to execute the code for C3D feature extraction at src/feature_extraction/main.py --C3D.

I extracted the feature vectors using your pipeline and the provided pre-trained weights, and compared them with the features already available in the NumPy file (1-C3D.npy). The shapes are the same, but the values do not match. This suggests that the model I'm using for feature extraction differs from the one used to generate the stored features.

Also, I'm using the 398x224 video to extract the features from the C3D pre-trained network.

Is there something wrong with my process?
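
To make the mismatch described above measurable, the two feature matrices can be compared numerically rather than by eye. The following is a minimal sketch, not part of the repository: `compare_features` is a hypothetical helper, the tolerance is an arbitrary choice, and the arrays are synthetic stand-ins for what `np.load` would return for the extracted and the provided features.

```python
import numpy as np

def compare_features(mine: np.ndarray, reference: np.ndarray) -> dict:
    """Summarize how far apart two feature matrices of shape (n_frames, dim) are."""
    if mine.shape != reference.shape:
        raise ValueError(f"shape mismatch: {mine.shape} vs {reference.shape}")
    diff = np.abs(mine - reference)
    # Row-wise cosine similarity; the small epsilon avoids division by zero.
    num = np.sum(mine * reference, axis=1)
    den = np.linalg.norm(mine, axis=1) * np.linalg.norm(reference, axis=1) + 1e-12
    return {
        "allclose": bool(np.allclose(mine, reference, atol=1e-4)),
        "max_abs_diff": float(diff.max()),
        "mean_cosine": float(np.mean(num / den)),
    }

# Synthetic stand-ins (assumption): real inputs would be loaded from .npy files.
rng = np.random.default_rng(0)
reference = rng.normal(size=(10, 16)).astype(np.float32)
mine = reference + rng.normal(scale=0.01, size=reference.shape).astype(np.float32)

report = compare_features(mine, reference)
print(report)
```

A high mean cosine similarity with a non-zero maximum difference would point to small preprocessing differences, whereas a low similarity would point to a different model or a different input.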

Answer 1 · 2019-11-04T10:28:41.000Z

Hi @ganesh92

I extracted all features from the high-quality videos. There may be a mismatch between the features themselves, since the frame resizing and the frame-rate resampling may have been performed differently (ffmpeg vs. OpenCV).
Even if the features differ, though, I would expect you to obtain similar results for the spotting task.

Best,

Answer 2 · 2019-11-04T11:19:20.000Z

I tried classifying using the features extracted from the 398*224 videos and the results were pretty poor. What is the video resolution for the training dataset? Can you send the corresponding link that was used for training with the labels?

Answer 3 · 2019-11-05T06:51:24.000Z

C3D expects a 1:1 aspect ratio for your videos. As far as I remember, I used the HQ videos, cropped the left and right bands to get a 1:1 aspect ratio, and downsampled the resolution to 112x112. Be aware that if you pass a video with a resolution higher than 112x112, most feature extractors will only crop the central part of the video.
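
The preprocessing described above (crop the side bands to a 1:1 aspect ratio, then downsample to 112x112) can be sketched on a single frame as follows. This is only an illustration under stated assumptions: both helper functions are hypothetical, the 398x224 frame size is taken from the question, and block averaging stands in for whatever interpolation ffmpeg or OpenCV actually applied, which the thread does not specify.

```python
import numpy as np

def center_crop_square(frame: np.ndarray) -> np.ndarray:
    """Crop the left/right (or top/bottom) bands so the frame becomes square."""
    h, w = frame.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return frame[top:top + side, left:left + side]

def downsample_by_block_mean(frame: np.ndarray, factor: int) -> np.ndarray:
    """Downsample by averaging factor x factor pixel blocks."""
    h, w, c = frame.shape
    if h % factor or w % factor:
        raise ValueError("frame sides must be divisible by factor")
    blocks = frame.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

# Dummy frame in height x width x channels layout (assumption).
frame = np.ones((224, 398, 3), dtype=np.float32)
square = center_crop_square(frame)                   # 224 x 224
small = downsample_by_block_mean(square, factor=2)   # 112 x 112
print(square.shape, small.shape)
```

Cropping first and resizing second keeps the aspect ratio intact. Resizing the full 398x224 frame directly to 112x112 would instead squash the image horizontally and produce different features.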

Answer 4 · 2019-12-26T05:59:20.000Z

Hi Silvio,

I have taken a PCA of the features from the testing dataset using my own implementation and I compared them against the PCA features from 1C3D_PCA512.npy file that has been provided. These features do not seem to match. Was there any special pre-processing done before running the PCA to generate the features?

I used the following code to compute the PCA of the test sample, 1_C3D.npy:

import numpy as np
from sklearn.decomposition import PCA

features = np.load("1_C3D.npy")  # C3D features of the test sample
pca = PCA(n_components=512)
result = pca.fit_transform(features)

I have even tried using center cropping with a 1:1 aspect ratio, but I am still not able to get the same results.
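
One possible reason for a PCA mismatch, which the thread does not confirm, is that a PCA basis depends on the data it was fit on: fitting on a single file gives a different projection than applying a basis fit on a larger set of features. The sketch below illustrates this with a NumPy-only PCA (via SVD, used here instead of sklearn to keep the example dependency-light). The helper names, sizes, and random data are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(data: np.ndarray, n_components: int):
    """Return (mean, components) of a PCA fit on data of shape (n_samples, dim)."""
    mean = data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def apply_pca(data: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Project data onto a previously fitted PCA basis."""
    return (data - mean) @ components.T

rng = np.random.default_rng(0)
# Stand-in for features pooled over many videos (assumption).
all_features = rng.normal(size=(200, 32)) * np.linspace(1.0, 5.0, 32)
# Stand-in for the features of a single video file (assumption).
one_video = all_features[:40]

mean_all, comp_all = fit_pca(all_features, n_components=8)
mean_one, comp_one = fit_pca(one_video, n_components=8)

reduced_shared = apply_pca(one_video, mean_all, comp_all)  # basis fit on all data
reduced_refit = apply_pca(one_video, mean_one, comp_one)   # basis refit on one file

print(reduced_shared.shape == reduced_refit.shape)
print(np.allclose(reduced_shared, reduced_refit))
```

The two reduced matrices have the same shape but different values. Even with identical fit data, the sign of each principal component is arbitrary, so an element-wise comparison of PCA outputs from two implementations can fail for that reason alone.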

Also, are there any labels for the HQ videos? I am unable to find them in the download section.

Answer 5 · 2020-01-05T11:14:13.000Z

Hi @ganesh92,

It's weird that you do not get the same features, but at least you should have similar results for the spotting task.

Regarding the labels for HQ videos, there is a text file alongside each HQ video that provides the starting time of the game in that video. To clarify, the LQ videos are trimmed at the start of the game, while the HQ videos are untrimmed. The labels for the spotting task are provided only for the trimmed LQ videos; with the starting time of the game in the HQ videos, you can adjust the spot times accordingly.
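
The time adjustment described above amounts to adding the game start offset of the HQ video to each spot time from the LQ labels. A minimal sketch follows, assuming both times are already expressed in seconds. The function name is hypothetical, and parsing of the text file is left out because its format is not described in this thread.

```python
def lq_to_hq_seconds(spot_seconds: float, game_start_seconds: float) -> float:
    """Map a spot time in the trimmed LQ video to the untrimmed HQ video."""
    if spot_seconds < 0 or game_start_seconds < 0:
        raise ValueError("times must be non-negative")
    # The LQ video starts at kick-off, so the HQ time is shifted by the offset.
    return spot_seconds + game_start_seconds

# Example values (assumption): a spot at 65.0 s in the LQ video,
# with the game starting 120.5 s into the HQ video.
hq_time = lq_to_hq_seconds(65.0, 120.5)
print(hq_time)  # 185.5
```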

Hope that helps

Answer 6 · 2020-04-09T08:59:41.000Z

Hi @SilvioGiancola,

I've experienced the same kind of mismatch that Ganesh described above, except that for me it happens when I execute the code for ResNet feature extraction.

I extracted the features using your provided implementation of ResNet at src/feature_extraction/main.py --ResNet on high quality videos.

Is there any modification or additional step needed to obtain the same features as the ones you provide?

Thanks for your answer.

Answer 7 · 2020-04-09T14:59:15.000Z

Are you using the C3D weights pre-trained on sports1M? The weights are available here.

For ResNet, I was using those weights.

Answer 8 · 2020-04-09T15:01:26.000Z

Also, consider that ResNet expects 224x224 frames. Before extracting the features, I resampled the video at 30fps and resized the frames to 224x224, but I don't remember whether I cropped the sides of the frames or changed the aspect ratio to fit the square format.

Answer 9 · 2020-04-09T16:48:57.000Z

Yes, I was using the ResNet weights you mentioned (https://drive.google.com/file/d/1sKKUH2Ozawu3epyg3YL6jsQ_f6dPcBJ_/view). I tried both with and without cropping the sides of the frames, but neither resolved the mismatch.

Maybe I am wrong, but the HQ videos are at 25 FPS, aren't they? How could I resample them at 30fps?

Answer 10 · 2020-04-21T18:51:37.000Z

Hi @SilvioGiancola,

Sorry for the spam, but could you elaborate on what you said about resampling the video at 30 fps?

Answer 11 · 2020-04-21T19:36:22.000Z

Sorry, my bad, it was 25fps. Also, I generated the features from the high-quality videos, so the results might differ if you use the LQ videos at 224-pixel height and 25 fps.
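
Since frame-rate resampling is one of the suspected sources of mismatch in this thread, it can help to check which source frames end up in the output. The sketch below picks the nearest source frame for each output frame. It is an assumption about one plausible selection rule, not a description of what ffmpeg or OpenCV actually do, and `resample_frame_indices` is a hypothetical helper.

```python
def resample_frame_indices(n_frames: int, src_fps: float, dst_fps: float) -> list:
    """For each output frame, pick the index of the nearest source frame."""
    if n_frames <= 0 or src_fps <= 0 or dst_fps <= 0:
        raise ValueError("n_frames and frame rates must be positive")
    n_out = int(n_frames * dst_fps / src_fps)  # frames in the resampled clip
    step = src_fps / dst_fps                   # source frames per output frame
    return [min(n_frames - 1, round(i * step)) for i in range(n_out)]

# A source already at 25 fps is left untouched.
unchanged = resample_frame_indices(10, 25, 25)
# A 50 fps source resampled to 25 fps keeps every second frame.
halved = resample_frame_indices(50, 50, 25)
print(unchanged)
print(halved[:5])
```

Two tools that round differently at this step would select different frames for non-integer ratios, which would be enough to change the extracted features.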
