Models: I3D with 32-frame input, with non-local and mask non-local blocks. The I3D non-local affine model with 32-frame input is used as the pre-training model.
Model | Test Acc | Comments |
---|---|---|
I3D in paper | 41.6 | |
I3D non-local in paper | 44.4 | |
I3D 32 input | 38.7 | 2.9 lower than I3D in paper |
I3D non-local 32 input | 44.05 | 0.35 lower than I3D non-local in paper |
I3D mask nlnet 32 input | 45.3 | 0.9 higher than I3D non-local in paper, 1.25 higher than I3D non-local 32 input |
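Conclusion: the I3D result in the paper (41.6) has not been reproduced yet (our 32-input run reaches 38.7). That experiment should be repeated until the paper number is matched.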
Models: I3D 8 input. Baseline == drop rate 0.8, resize 256*320, crop 224, sample rate 4.
Model | Final Val | Final Train | Test Acc |
---|---|---|---|
Baseline | 28.64 | 77.86 | 31.62 |
drop rate 0.6 | 29.43 | 82.82 | 32.23 |
Resize 224*224 | 31.55 | 91.17 | 30.10 |
Resize 224*280 | 31.11 | 85.77 | 31.74 |
Resize 240*300 | 30.17 | 81.82 | 30.45 |
Sample rate 3 | 31.83 | 72.80 | 35.71 |
Sample rate 3 Resize 224*280 | 34.12 | 33.72 | 35.31 |
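To compare how strongly each resize setting overfits, the gap between final train accuracy and test accuracy can be computed from the rows above. The snippet below is only a reading aid: the numbers are copied from the table, and the helper name `train_test_gap` is made up for this example.

```python
# Rows copied from the table above: (setting, final val, final train, test acc).
# Only the resize-only rows are listed, ordered from largest to smallest input.
rows = [
    ("Baseline (256*320)", 28.64, 77.86, 31.62),
    ("Resize 240*300", 30.17, 81.82, 30.45),
    ("Resize 224*280", 31.11, 85.77, 31.74),
    ("Resize 224*224", 31.55, 91.17, 30.10),
]


def train_test_gap(final_train, test_acc):
    """Gap, in accuracy points, between final train and test accuracy."""
    return round(final_train - test_acc, 2)


# Map each setting to its train-test gap.
gaps = {name: train_test_gap(train, test) for name, _val, train, test in rows}

for name, gap in gaps.items():
    print(f"{name:<20} train-test gap = {gap:.2f}")
```

In these rows the gap grows as the resize gets smaller, while test accuracy stays roughly flat or drops.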
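Conclusion:
- Drop rate 0.6 looks slightly better than 0.8 (32.23 vs 31.62 test acc), but the difference is small and not conclusive.
- Sample rate 3 is clearly better than 4 (35.71 vs 31.62 test acc).
- Smaller resize sizes (smaller feature maps) overfit: train and val accuracy go up while test accuracy does not (e.g. Resize 224*224: 91.17 train, 30.10 test).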
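For reference, a minimal sketch of what the sample rate changes for an 8-frame clip. It assumes that "sample rate" means the temporal stride between sampled frames, which these notes do not state explicitly. The function names are hypothetical and not taken from the training code.

```python
def sample_frame_indices(start, num_frames, sample_rate):
    """Indices of `num_frames` frames, taking one frame every `sample_rate` frames."""
    return [start + i * sample_rate for i in range(num_frames)]


def temporal_span(num_frames, sample_rate):
    """Number of source frames covered from the first to the last sampled frame."""
    return (num_frames - 1) * sample_rate + 1


# Assumption: the clip starts at frame 0; real training would pick a random start.
clip_rate4 = sample_frame_indices(0, 8, 4)
clip_rate3 = sample_frame_indices(0, 8, 3)

print(clip_rate4)  # [0, 4, 8, 12, 16, 20, 24, 28]
print(clip_rate3)  # [0, 3, 6, 9, 12, 15, 18, 21]
print(temporal_span(8, 4), temporal_span(8, 3))  # 29 22
```

Under this assumption, sample rate 3 covers a shorter window of the video (22 frames instead of 29) but samples it more densely.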
Models: I3D 8 input. Baseline == drop rate 0.7, resize 232*290, sample rate 3, Res5 stride 1.
Model | Best Val | Final Val | Final Train | Test Acc | Early Model Acc |
---|---|---|---|---|---|
Baseline | 32.17 | 31.14 | 88.78 | 33.75 | 34.62 (115000) |
drop rate 0.5 | 32.92 | 32.23 | 91.62 | 33.67 | 34.12 |
Resize 256*320 | 31.32 | 30.116 | 82.48 | 34.37 | 34.59 |
Resize 232*348 | 30.95 | 30.64 | 80.35 | 34.84 | 35.29 |
Models: I3D 32 input.
Resize | Frames | Drop rate | Best Val | Final Val | Final Train | Test Acc |
---|---|---|---|---|---|---|
256*320 | 32 | 0.75 | 37.42 | 37.34 | 78.57 | 43.47 |
256*376 | 32 | 0.75 | 35.88 | 35.52 | 72.86 | 43.37 |
232*290 | 28 | 0.75 | 39.40 | 38.91 | 85.12 | 42.74 |
232*290 | 32 | 0.7 | 39.23 | 39.23 | 85.70 | 42.94 |
232*290 | 32 | 0.85 | 39.08 | 39.08 | 81.88 | 42.79 |
232*348 | 32 | 0.75 | 38.14 | 37.88 | 76.88 | 44.44 |
224*360 | 32 | 0.75 | 38.18 | 37.37 | 85.86 | 44.51 |