MCG-NJU/TDN

decord._ffi.base.DECORDError: [16:56:24] /github/workspace/src/video/video_reader.cc:151: Check failed: st_nb >= 0 (-1381258232 vs. 0) ERROR cannot find video stream with wanted index: -1

NEUdeep opened this issue · 6 comments

When I trained on Kinetics-400, this happened:

```
=> base model: resnet50
kinetics: 400 classes
[06/22 16:52:42 TDN]: storing name: TDN__kinetics_RGB_resnet50_avg_segment8_e100

Initializing TSN with base model: resnet50.
TSN Configurations:
input_modality: RGB
num_segments: 8
new_length: 1
consensus_module: avg
dropout_ratio: 0.5
img_feature_dim: 256
=> base model: resnet50
[06/22 16:52:43 TDN]: [TDN-resnet50]group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
[06/22 16:52:43 TDN]: [TDN-resnet50]group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
[06/22 16:52:43 TDN]: [TDN-resnet50]group: normal_weight has 143 params, lr_mult: 1, decay_mult: 1
[06/22 16:52:43 TDN]: [TDN-resnet50]group: normal_bias has 64 params, lr_mult: 2, decay_mult: 0
[06/22 16:52:43 TDN]: [TDN-resnet50]group: BN scale/shift has 232 params, lr_mult: 1, decay_mult: 0
[06/22 16:52:43 TDN]: [TDN-resnet50]group: custom_ops has 0 params, lr_mult: 1, decay_mult: 1
video number:234619
video number:19760
video number:234619
video number:19760
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
[06/22 16:52:54 TDN]: Epoch: [0][0/14663], lr: 0.02000 Time 6.965 (6.965) Data 3.851 (3.851) Loss 5.9819 (5.9819) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
[06/22 16:53:05 TDN]: Epoch: [0][20/14663], lr: 0.02000 Time 0.951 (0.853) Data 0.000 (0.183) Loss 6.1689 (6.5195) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
[06/22 16:53:19 TDN]: Epoch: [0][40/14663], lr: 0.02000 Time 0.555 (0.782) Data 0.000 (0.094) Loss 6.0868 (6.3538) Prec@1 0.000 (0.305) Prec@5 0.000 (0.305)
[06/22 16:53:34 TDN]: Epoch: [0][60/14663], lr: 0.02000 Time 0.476 (0.761) Data 0.000 (0.063) Loss 6.1435 (6.2322) Prec@1 0.000 (0.205) Prec@5 0.000 (0.820)
[06/22 16:53:47 TDN]: Epoch: [0][80/14663], lr: 0.02000 Time 0.908 (0.742) Data 0.000 (0.048) Loss 5.7867 (6.1691) Prec@1 0.000 (0.309) Prec@5 0.000 (0.772)
[06/22 16:54:01 TDN]: Epoch: [0][100/14663], lr: 0.02000 Time 0.554 (0.729) Data 0.000 (0.038) Loss 5.7885 (6.1329) Prec@1 0.000 (0.371) Prec@5 0.000 (1.114)
[06/22 16:54:14 TDN]: Epoch: [0][120/14663], lr: 0.02000 Time 0.680 (0.719) Data 0.000 (0.032) Loss 6.0279 (6.1143) Prec@1 0.000 (0.310) Prec@5 0.000 (0.930)
[06/22 16:54:27 TDN]: Epoch: [0][140/14663], lr: 0.02000 Time 0.491 (0.708) Data 0.000 (0.027) Loss 5.8971 (6.0965) Prec@1 0.000 (0.266) Prec@5 0.000 (1.064)
[06/22 16:54:41 TDN]: Epoch: [0][160/14663], lr: 0.02000 Time 0.519 (0.704) Data 0.000 (0.024) Loss 5.8185 (6.0764) Prec@1 0.000 (0.311) Prec@5 12.500 (1.242)
[06/22 16:54:54 TDN]: Epoch: [0][180/14663], lr: 0.02000 Time 0.613 (0.702) Data 0.000 (0.021) Loss 5.8592 (6.0648) Prec@1 0.000 (0.345) Prec@5 0.000 (1.312)
[06/22 16:55:08 TDN]: Epoch: [0][200/14663], lr: 0.02000 Time 0.519 (0.699) Data 0.000 (0.019) Loss 5.9776 (6.0537) Prec@1 0.000 (0.435) Prec@5 0.000 (1.368)
[06/22 16:55:21 TDN]: Epoch: [0][220/14663], lr: 0.02000 Time 0.500 (0.697) Data 0.000 (0.018) Loss 6.0370 (6.0481) Prec@1 0.000 (0.396) Prec@5 0.000 (1.527)
[06/22 16:55:35 TDN]: Epoch: [0][240/14663], lr: 0.02000 Time 0.714 (0.698) Data 0.000 (0.016) Loss 5.8629 (6.0379) Prec@1 0.000 (0.363) Prec@5 12.500 (1.556)
[06/22 16:55:49 TDN]: Epoch: [0][260/14663], lr: 0.02000 Time 0.565 (0.695) Data 0.000 (0.015) Loss 5.8003 (6.0317) Prec@1 0.000 (0.431) Prec@5 0.000 (1.628)
[06/22 16:56:13 TDN]: Epoch: [0][280/14663], lr: 0.02000 Time 0.572 (0.733) Data 0.000 (0.014) Loss 5.8998 (6.0280) Prec@1 0.000 (0.445) Prec@5 0.000 (1.601)
[06/22 16:56:27 TDN]: Epoch: [0][300/14663], lr: 0.02000 Time 0.524 (0.731) Data 0.000 (0.013) Loss 5.8075 (6.0214) Prec@1 0.000 (0.415) Prec@5 0.000 (1.620)
Traceback (most recent call last):
  File "main.py", line 361, in <module>
    main()
  File "main.py", line 211, in main
    train_loss, train_top1, train_top5 = train(train_loader, model, criterion, optimizer, epoch=epoch, logger=logger, scheduler=scheduler)
  File "main.py", line 260, in train
    for i, (input, target) in enumerate(train_loader):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 971, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
decord._ffi.base.DECORDError: Caught DECORDError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/workspace/mnt/storage/kanghaidong/new_video_project/video_project/TDN/ops/dataset.py", line 166, in __getitem__
    video_list = decord.VideoReader(video_path)
  File "/usr/local/lib/python3.6/dist-packages/decord/video_reader.py", line 55, in __init__
    uri, ctx.device_type, ctx.device_id, width, height, num_threads, 0, fault_tol)
  File "/usr/local/lib/python3.6/dist-packages/decord/_ffi/_ctypes/function.py", line 175, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/usr/local/lib/python3.6/dist-packages/decord/_ffi/base.py", line 78, in check_call
    raise DECORDError(err_str)
decord._ffi.base.DECORDError: [16:56:24] /github/workspace/src/video/video_reader.cc:151: Check failed: st_nb >= 0 (-1381258232 vs. 0) ERROR cannot find video stream with wanted index: -1
```
How can I solve this?

The error says decord could not find any video stream in the file (stream index -1 asks for the default video stream), which usually points to a corrupt or incompletely downloaded video. I think you should check your original video dataset to ensure that every video can be decoded successfully.
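
As a minimal sketch of such a check (the `VIDEO_ROOT` path is hypothetical; point it at your Kinetics-400 layout), you could scan the whole dataset and log every file that decord fails to open:

```python
import glob
import os

import decord
from decord._ffi.base import DECORDError

# Hypothetical dataset root; adjust to where your Kinetics-400 videos live.
VIDEO_ROOT = './kinetics-400-encode/train'

bad_videos = []
for path in glob.glob(os.path.join(VIDEO_ROOT, '*', '*.mp4')):
    try:
        vr = decord.VideoReader(path)
        if len(vr) == 0:
            bad_videos.append(path)  # container opens but holds no frames
    except DECORDError:
        bad_videos.append(path)      # no decodable video stream

print('undecodable videos:', len(bad_videos))
with open('bad_videos.txt', 'w') as f:
    f.write('\n'.join(bad_videos))
```

Any path that ends up in `bad_videos.txt` is a candidate for re-downloading or re-encoding.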

It can be decoded by decord, like this:
```python
import decord

vr = decord.VideoReader('./kinetics-400-encode/train/swimming_backstroke/IPynjSp5JuE_000133_000143.mp4')
print(type(vr))
print('video frames:', len(vr))
for i in range(len(vr)):
    # the video reader will handle seeking and skipping in the most efficient manner
    frame = vr[i]
    print(frame.shape)
```

The video decodes fine.

Is every video in the dataset OK?

Yes, every video is OK. Decoding with OpenCV also works, but it is very slow.

If you can check the dataset and find those videos that can't be decoded successfully by decord, I could send you our copy by email :)
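
As a stopgap while you hunt down the bad files, you could also catch the error inside the dataset's `__getitem__` and resample a different clip. A rough sketch (`load_item` is a placeholder, not the actual TDN code):

```python
import random

from decord._ffi.base import DECORDError

def robust_getitem(dataset, index, max_retries=5):
    # Fall back to a random other sample when a clip fails to decode.
    for _ in range(max_retries):
        try:
            return dataset.load_item(index)  # placeholder for the real decode + sampling call
        except DECORDError:
            index = random.randrange(len(dataset))
    raise RuntimeError('too many consecutive undecodable videos')
```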

Thanks, let me check. I have written files that support OpenCV decoding; if necessary, you can merge them in.
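
For reference, such an OpenCV fallback could look roughly like this (a sketch, not the actual files; OpenCV seeks and decodes sequentially, which is part of why it is slower than decord's indexed access):

```python
import cv2

def read_frames_opencv(video_path, indices):
    """Decode the frames at the given indices with OpenCV (BGR, HxWx3 arrays)."""
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError('cannot open %s' % video_path)
    frames = []
    for idx in sorted(indices):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek; can be slow or inexact on some codecs
        ok, frame = cap.read()
        if not ok:
            raise IOError('failed to decode frame %d of %s' % (idx, video_path))
        frames.append(frame)
    cap.release()
    return frames
```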