
Learned Relation Networks


LeReNet

Something-something training records

Experiment 1

Models: I3D with 32-frame input, with non-local and mask non-local blocks. The I3D non-local affine model with 32-frame input is used as the pre-trained model.

| Model | Test Acc | Comments |
| --- | --- | --- |
| I3D in paper | 41.6 | |
| I3D non-local in paper | 44.4 | |
| I3D 32 input | 38.7 | 2.9 lower than I3D in paper |
| I3D non-local 32 input | 44.05 | 0.35 lower than I3D non-local in paper |
| I3D mask nlnet 32 input | 45.3 | 1.1 higher than I3D non-local |

Conclusion: The I3D result reported in the paper (41.6) still needs to be reproduced.

Experiment 2

Models: I3D with 8-frame input. Baseline: dropout rate 0.8, resize 256*320, crop 224, sample rate 4.
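As a rough illustration of how the input length and sample rate interact, here is a minimal sketch. It assumes that "sample rate" means a temporal stride between the frames fed to the model. The function name `sample_frame_indices` and the clamping of out-of-range indices to the last frame are hypothetical choices for this example, not taken from the LeReNet code.

```python
def sample_frame_indices(num_video_frames, clip_len=8, sample_rate=4, start=0):
    """Return clip_len frame indices spaced sample_rate apart, beginning at start.

    Assumption: indices past the end of the video are clamped to the last
    frame. The real data loader may pad or loop instead.
    """
    if num_video_frames <= 0:
        raise ValueError("video must contain at least one frame")
    last = num_video_frames - 1
    return [min(start + i * sample_rate, last) for i in range(clip_len)]


# 8-frame input on a 40-frame video, baseline stride 4 vs. stride 3.
baseline = sample_frame_indices(40, clip_len=8, sample_rate=4)
denser = sample_frame_indices(40, clip_len=8, sample_rate=3)
print(baseline)  # [0, 4, 8, 12, 16, 20, 24, 28]
print(denser)    # [0, 3, 6, 9, 12, 15, 18, 21]
```

Under this reading, an 8-frame clip covers 29 video frames at sample rate 4 and 22 frames at sample rate 3.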

| Model | Final Val | Final Train | Test Acc |
| --- | --- | --- | --- |
| Baseline | 28.64 | 77.86 | 31.62 |
| Drop rate 0.6 | 29.43 | 82.82 | 32.23 |
| Resize 224*224 | 31.55 | 91.17 | 30.10 |
| Resize 224*280 | 31.11 | 85.77 | 31.74 |
| Resize 240*300 | 30.17 | 81.82 | 30.45 |
| Sample rate 3 | 31.83 | 72.80 | 35.71 |
| Sample rate 3, resize 224*280 | 34.12 | 33.72 | 35.31 |

Conclusion:

  1. A dropout rate of 0.6 appears better than 0.8 (not certain).
  2. Sample rate 3 is much better than sample rate 4.
  3. Small feature maps cause overfitting: train and val accuracy go up while test accuracy does not.
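The overfitting point can be checked directly from the Experiment 2 numbers. The sketch below copies the table values and computes the gap between final train accuracy and test accuracy per run. Using this gap as the overfitting indicator is an assumption of the example, and the names `results` and `train_test_gap` are hypothetical.

```python
# (final_val, final_train, test_acc) copied from the Experiment 2 table.
results = {
    "Baseline": (28.64, 77.86, 31.62),
    "Drop rate 0.6": (29.43, 82.82, 32.23),
    "Resize 224*224": (31.55, 91.17, 30.10),
    "Resize 224*280": (31.11, 85.77, 31.74),
    "Resize 240*300": (30.17, 81.82, 30.45),
    "Sample rate 3": (31.83, 72.80, 35.71),
    "Sample rate 3, resize 224*280": (34.12, 33.72, 35.31),
}


def train_test_gap(row):
    """Final train accuracy minus test accuracy, in percentage points."""
    _final_val, final_train, test_acc = row
    return round(final_train - test_acc, 2)


gaps = {name: train_test_gap(row) for name, row in results.items()}

# Print runs from largest to smallest gap.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:32s} {gap:6.2f}")
```

With these values, the smallest resize (224*224) shows the largest train-test gap, and the plain sample rate 3 run shows a smaller gap than the baseline.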

Experiment 3

Models: I3D with 8-frame input. Baseline: dropout rate 0.7, resize 232*290, sample rate 3, Res5 stride 1.

| Model | Best Val | Final Val | Final Train | Test Acc | Early Model Acc |
| --- | --- | --- | --- | --- | --- |
| Baseline | 32.17 | 31.14 | 88.78 | 33.75 | 34.62 (115000) |
| Drop rate 0.5 | 32.92 | 32.23 | 91.62 | 33.67 | 34.12 |
| Resize 256*320 | 31.32 | 30.116 | 82.48 | 34.37 | 34.59 |
| Resize 232*348 | 30.95 | 30.64 | 80.35 | 34.84 | 35.29 |

Experiment 4

Models: I3D with 32-frame input.

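| Model (resize) | Len | Drop | Best Val | Final Val | Final Train | Final Test |
| --- | --- | --- | --- | --- | --- | --- |
| 256*320 | 32 | 0.75 | 37.42 | 37.34 | 78.57 | 43.47 |
| 256*376 | 32 | 0.75 | 35.88 | 35.52 | 72.86 | 43.37 |
| 232*290 | 28 | 0.75 | 39.40 | 38.91 | 85.12 | 42.74 |
| 232*290 | 32 | 0.7 | 39.23 | 39.23 | 85.70 | 42.94 |
| 232*290 | 32 | 0.85 | 39.08 | 39.08 | 81.88 | 42.79 |
| 232*348 | 32 | 0.75 | 38.14 | 37.88 | 76.88 | 44.44 |
| 224*360 | 32 | 0.75 | 38.18 | 37.37 | 85.86 | 44.51 |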
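One possible way to read the Experiment 4 table is to compare which run wins on validation accuracy with which run wins on test accuracy. The sketch below copies the table values. The tuple layout and variable names are hypothetical, chosen only for this example.

```python
# (resize, len, drop, best_val, final_val, final_train, final_test)
runs = [
    ("256*320", 32, 0.75, 37.42, 37.34, 78.57, 43.47),
    ("256*376", 32, 0.75, 35.88, 35.52, 72.86, 43.37),
    ("232*290", 28, 0.75, 39.40, 38.91, 85.12, 42.74),
    ("232*290", 32, 0.70, 39.23, 39.23, 85.70, 42.94),
    ("232*290", 32, 0.85, 39.08, 39.08, 81.88, 42.79),
    ("232*348", 32, 0.75, 38.14, 37.88, 76.88, 44.44),
    ("224*360", 32, 0.75, 38.18, 37.37, 85.86, 44.51),
]

BEST_VAL, FINAL_TEST = 3, 6  # column positions in each tuple

best_by_val = max(runs, key=lambda r: r[BEST_VAL])
best_by_test = max(runs, key=lambda r: r[FINAL_TEST])

print("best by val: ", best_by_val[:3], best_by_val[BEST_VAL])
print("best by test:", best_by_test[:3], best_by_test[FINAL_TEST])
```

On these numbers the two selections differ: the run with the highest best-val accuracy is 232*290 with len 28, while the highest final test accuracy comes from 224*360.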