Video-Moment-Retrieval-Papers
A summary of papers related to video moment retrieval / video grounding / video moment localization ...
The beginning of this stream
[Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos] AAAI 2019
[Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model] CVPR 2019
[Weakly Supervised Video Moment Retrieval From Text Queries] CVPR 2019
[Moment Retrieval via Cross-Modal Interaction Networks With Query Reconstruction] TIP 2020
[Adversarial Video Moment Retrieval by Jointly Modeling Ranking and Localization] MM 2020
New
Benchmark Results % (2021)
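The tables in this summary report "R@n IoU@m": the percentage of queries for which at least one of the top-n predicted moments overlaps the ground-truth moment with a temporal IoU of at least m. A minimal sketch of that metric follows. The function names and the `(start, end)` segment representation are illustrative assumptions, and whether the threshold comparison is `>=` or strict `>` varies between papers, so treat this as an approximation rather than any particular paper's evaluation script.

```python
from typing import Sequence, Tuple

# A moment is assumed to be a (start_sec, end_sec) pair; this layout is an assumption.
Segment = Tuple[float, float]


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(
    ranked_preds: Sequence[Sequence[Segment]],
    gts: Sequence[Segment],
    n: int,
    iou_threshold: float,
) -> float:
    """Percentage of queries with a top-n prediction at IoU >= iou_threshold."""
    hits = 0
    for preds, gt in zip(ranked_preds, gts):
        # A query counts as a hit if any of its n best-ranked moments is close enough.
        if any(temporal_iou(p, gt) >= iou_threshold for p in preds[:n]):
            hits += 1
    return 100.0 * hits / len(gts)


# Two toy queries, each with predictions sorted from most to least confident.
preds = [[(0.0, 10.0), (5.0, 15.0)], [(20.0, 30.0), (0.0, 5.0)]]
gts = [(5.0, 15.0), (0.0, 5.0)]
print(recall_at_n(preds, gts, n=1, iou_threshold=0.3))
print(recall_at_n(preds, gts, n=5, iou_threshold=0.5))
```

Under this definition, R@5 can never be lower than R@1 at the same IoU threshold, and a value can never increase as the threshold rises. Both properties are a quick sanity check when copying numbers from papers.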
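| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IVG | - | 63.22 | 43.84 | 27.10 | - | - | - | - | |
| TCN+DCM | - | - | 44.9 | 27.7 | - | - | - | - | |
| CI-MHA | - | 61.49 | 43.97 | 25.13 | - | - | - | - | |
| FVMR | - | 60.63 | 45.00 | 26.85 | - | 86.11 | 77.42 | 61.04 | |
| BPNet | - | 58.98 | 42.07 | 24.69 | - | - | - | - | |
| CABL | - | 66.34 | 48.12 | 27.60 | - | 88.91 | 79.32 | 63.41 | |
| LOCFORMER | - | 60.61 | 43.74 | 27.04 | - | - | - | - | |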
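
| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MMRG | 88.27 | 71.60 | 44.25 | - | 92.35 | 87.67 | 60.22 | - | |
| LoGAN | - | 51.67 | 34.68 | 14.54 | - | 92.74 | 74.30 | 39.11 | weakly-supervised |
| IVG | - | 67.63 | 50.24 | 32.88 | - | - | - | - | |
| TCN+DCM | - | - | 59.7 | 34.4 | - | - | - | - | I3D |
| CI-MHA | - | 69.87 | 54.68 | 35.27 | - | - | - | - | |
| FVMR | - | - | 55.01 | 33.74 | - | - | 89.17 | 57.24 | I3D |
| I2N-I3D | - | - | 52.28 | 31.32 | - | - | 80.65 | 54.17 | |
| BPNet | - | 65.48 | 50.75 | 31.64 | - | - | - | - | |
| LOCFORMER | - | 71.88 | 58.52 | 38.51 | - | - | - | - | |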
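
| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TCN+DCM | - | - | - | 37.5 | - | - | - | - |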

| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MMRG | 85.34 | 57.83 | 39.28 | - | 84.37 | 78.38 | 56.34 | - | * the figures look strangely high |
| IVG | 49.36 | 38.84 | 29.07 | 19.05 | - | - | - | - | |
| FVMR | 53.12 | 41.48 | 29.12 | - | 78.12 | 64.53 | 50.00 | - | |
| I2N-I3D | - | 31.47 | 29.25 | - | - | 52.65 | 46.08 | - | |
| BPNet | - | 25.96 | 20.96 | 14.08 | - | - | - | - | * the figures look strangely low |
| CABL | 49.16 | 38.98 | 27.65 | - | 73.12 | 59.96 | 46.24 | - | maybe the experiment setting is different |

Weakly-supervised methods
Comparison of some weakly-supervised methods on Charades-STA
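
| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LoGAN | - | 51.67 | 34.68 | 14.54 | - | 92.74 | 74.30 | 39.11 | weakly-supervised |
| TGA | - | 29.68 | 17.04 | 6.93 | - | 83.87 | 58.17 | 26.80 | weakly-supervised |
| SCN | - | 42.96 | 23.58 | 9.97 | - | 95.56 | 71.80 | 28.87 | weakly-supervised |
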
LoGAN: Tan, Reuben, et al. "LoGAN: Latent graph co-attention network for weakly-supervised video moment retrieval." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2021.
TGA: Niluthpol Chowdhury Mithun, Sujoy Paul, and Amit K. Roy-Chowdhury. Weakly supervised video moment retrieval from text queries. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 11592–11601, 2019.
SCN: Zhijie Lin, Zhou Zhao, Zhu Zhang, Qi Wang, and Huasheng Liu. Weakly-supervised video moment retrieval via semantic completion network. In Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
Benchmark Results % (2019 and before)
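
| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MCN | 42.80 | 21.37 | 9.58 | - | - | - | - | - | |
| CTRL | 49.09 | 28.70 | 14.0 | - | - | - | - | - | |
| ACRN | 50.37 | 31.29 | 16.17 | - | - | - | - | - | |
| QSPN | - | 45.3 | 27.7 | 13.6 | - | 75.7 | 59.2 | 38.3 | |
| TGN | 70.06 | 45.51 | 28.47 | - | 79.10 | 57.32 | 44.20 | - | |
| SCDM | - | 54.80 | 36.75 | 19.86 | - | 77.29 | 64.99 | 41.53 | |
| CBP | - | 54.30 | 35.76 | 17.80 | - | 77.63 | 65.89 | 46.20 | |
| TripNet | - | 48.42 | 32.19 | 13.93 | - | - | - | - | RL |
| ABLR | 73.30 | 55.67 | 36.79 | - | - | - | - | - | RL |
| ExCL | - | 63.30 | 43.6 | 24.1 | - | - | - | - | |
| PFGA | 75.25 | 51.28 | 33.04 | 19.26 | - | - | - | - | |
| WSDEC-X (Weakly) | 62.7 | 42.0 | 23.3 | - | - | - | - | - | |
| WSLLN (Weakly) | 75.4 | 42.8 | 22.7 | - | - | - | - | - | |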
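
| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CTRL | - | - | 23.63 | 8.89 | - | - | 58.92 | 29.52 | |
| ABLR | - | - | 24.36 | 9.01 | - | - | - | - | |
| SMRL | - | - | 24.36 | 11.17 | - | - | 61.25 | 32.08 | |
| ACL-K | - | - | 30.48 | 12.20 | - | - | 64.84 | 35.13 | |
| SAP | - | - | 27.42 | 13.36 | - | - | 66.37 | 38.15 | |
| QSPN | - | 54.7 | 35.6 | 15.8 | - | 95.8 | 79.4 | 45.4 | |
| MAN | - | - | 46.53 | 22.72 | - | - | 86.23 | 53.72 | |
| SCDM | - | - | 54.44 | 33.43 | - | - | 74.43 | 58.08 | |
| CBP | - | - | 36.80 | 18.87 | - | - | 70.94 | 50.19 | |
| TripNet | - | 51.33 | 36.61 | 14.50 | - | - | - | - | RL |
| ExCL | - | 65.1 | 44.1 | 23.3 | - | - | - | - | RL |
| PFGA | - | 67.53 | 52.02 | 33.74 | - | - | - | - | |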
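
| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TMN | 22.92 | - | - | - | 76.08 | - | - | - |
| MCN | 28.10 | - | - | - | 78.21 | - | - | - |
| TGN | 28.23 | - | - | - | 79.26 | - | - | - |
| MAN | 27.02 | - | - | - | 81.70 | - | - | - |
| WSLLN (Weakly) | 19.4 | - | - | - | 54.4 | - | - | - |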
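
| Method | R@1 IoU@0.1 | R@1 IoU@0.3 | R@1 IoU@0.5 | R@1 IoU@0.7 | R@5 IoU@0.1 | R@5 IoU@0.3 | R@5 IoU@0.5 | R@5 IoU@0.7 | Note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MCN | 2.62 | 1.64 | 1.25 | - | 2.88 | 1.82 | 1.01 | - | |
| CTRL | 24.32 | 18.32 | 13.30 | - | 48.73 | 36.69 | 25.42 | - | |
| TGN | 41.87 | 21.77 | 18.90 | - | 53.40 | 39.06 | 31.02 | - | |
| ACRN | 24.22 | 19.52 | 14.62 | - | 47.42 | 34.97 | 24.88 | - | |
| ACL-K | 31.64 | 24.17 | 20.01 | - | 57.85 | 42.15 | 30.66 | - | |
| SCDM | - | 26.11 | 21.17 | - | - | 40.16 | 32.18 | - | |
| CBP | - | 27.31 | 24.79 | 19.10 | - | 43.64 | 37.40 | 25.59 | |
| TripNet | - | 23.95 | 19.17 | 9.52 | - | - | - | - | RL |
| SMRL | 26.51 | 20.25 | 15.95 | - | 50.01 | 38.47 | 27.84 | - | RL |
| ABLR | 34.7 | 19.5 | 9.4 | - | - | - | - | - | RL |
| ExCL | - | 45.5 | 28.0 | 14.6 | - | - | - | - | |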