taoyang1122/adapt-image-models

Encoding error while loading val data on k400

Closed this issue · 1 comment

First of all, thanks for your great work!
My question is shown in the title, and the specific error message is as follows:

[>                 ] 777/19881, 1.0 task/s, elapsed: 769s, ETA: 18903s
Traceback (most recent call last):
  File "tools/test.py", line 364, in <module>
    main()
  File "tools/test.py", line 349, in main
    outputs = inference_pytorch(args, cfg, distributed, data_loader)
  File "tools/test.py", line 167, in inference_pytorch
    args.gpu_collect)
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/mmcv/engine/test.py", line 70, in multi_gpu_test
    for i, data in enumerate(data_loader):
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
decord._ffi.base.DECORDError: Caught DECORDError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/lzh/2022/tjq/adapt-image-models/mmaction/datasets/base.py", line 285, in __getitem__
    return self.prepare_test_frames(idx)
  File "/home/lzh/2022/tjq/adapt-image-models/mmaction/datasets/base.py", line 276, in prepare_test_frames
    return self.pipeline(results)
  File "/home/lzh/2022/tjq/adapt-image-models/mmaction/datasets/pipelines/compose.py", line 41, in __call__
    data = t(data)
  File "/home/lzh/2022/tjq/adapt-image-models/mmaction/datasets/pipelines/loading.py", line 965, in __call__
    container = decord.VideoReader(file_obj, num_threads=self.num_threads)
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/video_reader.py", line 42, in __init__
    ba, ctx.device_type, ctx.device_id, width, height, num_threads, 2)
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/_ffi/_ctypes/function.py", line 175, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/_ffi/base.py", line 63, in check_call
    raise DECORDError(py_str(_LIB.DECORDGetLastError()))
decord._ffi.base.DECORDError: [15:44:17] /io/decord/src/video/video_reader.cc:125: Check failed: st_nb >= 0 (-1381258232 vs. 0) ERROR cannot find video stream with wanted index: -1

Stack trace returned 10 entries:
[bt] (0) /home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/libdecord.so(dmlc::StackTrace(unsigned long)+0x50) [0x7f4606a29990]
[bt] (1) /home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/libdecord.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x1d) [0x7f4606a2aa7d]
[bt] (2) /home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/libdecord.so(decord::VideoReader::SetVideoStream(int)+0xee) [0x7f4606a7a6ae]
[bt] (3) /home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/libdecord.so(decord::VideoReader::VideoReader(std::string, DLContext, int, int, int, int)+0x3cd) [0x7f4606a7b28d]
[bt] (4) /home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/libdecord.so(+0x6a039) [0x7f4606a6a039]
[bt] (5) /home/lzh/anaconda3/envs/AIM/lib/python3.7/site-packages/decord/libdecord.so(DECORDFuncCall+0x52) [0x7f4606a26572]
[bt] (6) /home/lzh/anaconda3/envs/AIM/lib/python3.7/lib-dynload/../../libffi.so.7(+0x69dd) [0x7f465f9679dd]
[bt] (7) /home/lzh/anaconda3/envs/AIM/lib/python3.7/lib-dynload/../../libffi.so.7(+0x6067) [0x7f465f967067]
[bt] (8) /home/lzh/anaconda3/envs/AIM/lib/python3.7/lib-dynload/_ctypes.cpython-37m-x86_64-linux-gnu.so(_ctypes_callproc+0x2e7) [0x7f465c9ec437]
[bt] (9) /home/lzh/anaconda3/envs/AIM/lib/python3.7/lib-dynload/_ctypes.cpython-37m-x86_64-linux-gnu.so(+0x12ea4) [0x7f465c9ecea4]

I downloaded the K400 dataset used in my project from https://github.com/cvdfoundation/kinetics-dataset, and I made sure that none of my videos were damaged. Is there an error in my configuration file?
My configuration file is as follows:

model = dict(
    backbone=dict(drop_path_rate=0.2, adapter_scale=0.5, num_frames=8),
    cls_head=dict(num_classes=400),
    test_cfg=dict(max_testing_views=4))

# dataset settings
dataset_type = 'VideoDataset'
# data_root = 'data/kinetics400/train_256'
# data_root_val = 'data/kinetics400/val_256'
# ann_file_train = 'data/kinetics400/train_video_list.txt'
# ann_file_val = 'data/kinetics400/val_video_list.txt'
# ann_file_test = 'data/kinetics400/val_video_list.txt'
data_root = '/data/K400/k400/train'
data_root_val = '/data/K400/k400/'
ann_file_train = '/data/K400/kinetics400/kinetics400_train_list.txt'
ann_file_val = '/data/K400/kinetics400/kinetics400_val_list.txt'
ann_file_test = '/data/K400/kinetics400/kinetics400_test_list.txt'
img_norm_cfg = dict(
    mean=[122.769, 116.74, 104.04], std=[68.493, 66.63, 70.321], to_bgr=False)
train_pipeline = [
    dict(type='DecordInit'),
    dict(type='SampleFrames', clip_len=8, frame_interval=16, num_clips=1),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='RandomResizedCrop'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs', 'label'])
]
val_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=8,
        frame_interval=16,
        num_clips=1,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Flip', flip_ratio=0),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
test_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=8,
        frame_interval=16,
        num_clips=3,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 224)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Flip', flip_ratio=0),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
data = dict(
    videos_per_gpu=8,
    workers_per_gpu=2,
    val_dataloader=dict(
        videos_per_gpu=1,
        workers_per_gpu=1),
    test_dataloader=dict(
        videos_per_gpu=1,
        workers_per_gpu=1),
    train=dict(
        type=dataset_type,
        ann_file=ann_file_train,
        data_prefix=data_root,
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=ann_file_val,
        data_prefix=data_root_val,
        pipeline=val_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=ann_file_test,
        data_prefix=data_root_val,
        pipeline=test_pipeline))
evaluation = dict(
    interval=5, metrics=['top_k_accuracy', 'mean_class_accuracy'])

I thought it might be a problem with the decord version, but when I tried switching to decord 0.6.0 / 0.4.0 / 0.4.1, the same error was reported.
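
Opening a single val video with decord on its own, outside the mmaction pipeline, should show whether the file itself is readable or whether decord is at fault. A minimal sketch (the path below is only a placeholder, not one of my actual files):

import decord

video_path = '/data/K400/k400/val/some_clip.mp4'  # placeholder path; substitute a real val video
# Constructing the reader is enough to trigger the
# "cannot find video stream with wanted index: -1" error on a broken file.
vr = decord.VideoReader(video_path, num_threads=1)
print(len(vr), 'frames, first frame shape:', vr[0].shape)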

So, I have no choice but to bother you.
Looking forward to your reply, thank you very much!

I feel embarrassed; I was overconfident. I checked the val videos again and found that some of them were indeed damaged. I'm really sorry.
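
In case anyone else runs into the same error, a rough sketch like the one below can walk the whole val list and print every file decord cannot open. It uses the ann_file_val and data_root_val paths from my config and assumes each line of the list has the usual "<relative path> <label>" format:

import os
import decord

ann_file = '/data/K400/kinetics400/kinetics400_val_list.txt'  # ann_file_val from the config above
data_prefix = '/data/K400/k400/'                              # data_root_val from the config above

bad = []
with open(ann_file) as f:
    for line in f:
        rel_path = line.strip().split()[0]
        path = os.path.join(data_prefix, rel_path)
        try:
            decord.VideoReader(path, num_threads=1)
        except Exception:  # decord raises decord._ffi.base.DECORDError on unreadable files
            bad.append(path)

print(len(bad), 'unreadable videos')
for p in bad:
    print(p)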