ERROR cannot find video stream with wanted index: -1 #893

balaganeshmohan · 2021-05-31T11:43:45Z

Hey everyone, great implementation of the TSM framework. I am trying to train on a custom dataset, but I can't get past this error. I am not sure what it means, since it happens after the first iteration of training is done. Here is my log. I initially set the number of data loader workers to 0 (workers_per_gpu=0), which at least let training start; before that it would not start at all. I have checked all the video paths, and the files are present where they should be. Any help is appreciated.

sys.platform: linux
Python: 3.7.10 (default, May  3 2021, 02:48:31) [GCC 7.5.0]
CUDA available: True
GPU 0: Tesla T4
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.0_bu.TC445_37.28845127_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.8.1+cu101
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.1, CUDNN_VERSION=7.6.3, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.9.1+cu101
OpenCV: 4.1.2
MMCV: 1.3.5
MMCV Compiler: n/a
MMCV CUDA Compiler: n/a
MMAction2: 0.14.0+e9b7009
------------------------------------------------------------

2021-05-31 11:34:01,998 - mmaction - INFO - Distributed training: False
2021-05-31 11:34:02,264 - mmaction - INFO - Config: model = dict(
    type='Recognizer2D',
    backbone=dict(
        type='ResNetTSM',
        pretrained='torchvision://resnet50',
        depth=50,
        norm_eval=False,
        shift_div=8),
    cls_head=dict(
        type='TSMHead',
        num_classes=400,
        in_channels=2048,
        spatial_type='avg',
        consensus=dict(type='AvgConsensus', dim=1),
        dropout_ratio=0.5,
        init_std=0.001,
        is_shift=True),
    train_cfg=None,
    test_cfg=dict(average_clips='prob'))
optimizer = dict(
    type='SGD',
    constructor='TSMOptimizerConstructor',
    paramwise_cfg=dict(fc_lr5=True),
    lr=0.02,
    momentum=0.9,
    weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=20, norm_type=2))
lr_config = dict(policy='step', step=[20, 40])
total_epochs = 50
checkpoint_config = dict(interval=5)
log_config = dict(interval=20, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
dataset_type = 'VideoDataset'
data_root = 'Data/Train'
data_root_val = 'Data/Validation'
ann_file_train = 'train.txt'
ann_file_val = 'val.txt'
ann_file_test = 'test.txt'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_bgr=False)
train_pipeline = [
    dict(type='DecordInit'),
    dict(type='SampleFrames', clip_len=1, frame_interval=1, num_clips=8),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(
        type='MultiScaleCrop',
        input_size=224,
        scales=(1, 0.875, 0.75, 0.66),
        random_crop=False,
        max_wh_scale_gap=1,
        num_fixed_crops=13),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs', 'label'])
]
val_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=1,
        frame_interval=1,
        num_clips=8,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Flip', flip_ratio=0),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
test_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=1,
        frame_interval=1,
        num_clips=8,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Flip', flip_ratio=0),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
data = dict(
    videos_per_gpu=8,
    workers_per_gpu=0,
    train=dict(
        type='VideoDataset',
        ann_file='train.txt',
        data_prefix='Data/Train',
        pipeline=[
            dict(type='DecordInit'),
            dict(
                type='SampleFrames', clip_len=1, frame_interval=1,
                num_clips=8),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(
                type='MultiScaleCrop',
                input_size=224,
                scales=(1, 0.875, 0.75, 0.66),
                random_crop=False,
                max_wh_scale_gap=1,
                num_fixed_crops=13),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(type='Flip', flip_ratio=0.5),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCHW'),
            dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
            dict(type='ToTensor', keys=['imgs', 'label'])
        ]),
    val=dict(
        type='VideoDataset',
        ann_file='val.txt',
        data_prefix='Data/Validation',
        pipeline=[
            dict(type='DecordInit'),
            dict(
                type='SampleFrames',
                clip_len=1,
                frame_interval=1,
                num_clips=8,
                test_mode=True),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(type='CenterCrop', crop_size=224),
            dict(type='Flip', flip_ratio=0),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCHW'),
            dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
            dict(type='ToTensor', keys=['imgs'])
        ]),
    test=dict(
        type='VideoDataset',
        ann_file='test.txt',
        data_prefix='Data/Validation',
        pipeline=[
            dict(type='DecordInit'),
            dict(
                type='SampleFrames',
                clip_len=1,
                frame_interval=1,
                num_clips=8,
                test_mode=True),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(type='CenterCrop', crop_size=224),
            dict(type='Flip', flip_ratio=0),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCHW'),
            dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
            dict(type='ToTensor', keys=['imgs'])
        ]))
evaluation = dict(
    interval=5, metrics=['top_k_accuracy', 'mean_class_accuracy'])
work_dir = './work_dirs/tsm_r50_video_2d_1x1x8_50e_kinetics400_rgb/'
gpu_ids = range(0, 1)
omnisource = False
module_hooks = []

Use load_from_torchvision loader
2021-05-31 11:34:02,878 - mmaction - INFO - These parameters in pretrained checkpoint are not loaded: {'fc.weight', 'fc.bias'}
2021-05-31 11:34:06,481 - mmaction - INFO - Start running, host: root@8808a7539a3c, work_dir: /content/drive/My Drive/mmaction2/work_dirs/tsm_r50_video_2d_1x1x8_50e_kinetics400_rgb
2021-05-31 11:34:06,481 - mmaction - INFO - workflow: [('train', 1)], max: 50 epochs
2021-05-31 11:35:50,090 - mmaction - INFO - Epoch [1][20/494]	lr: 2.000e-02, eta: 1 day, 11:30:45, time: 5.180, data_time: 4.327, memory: 6980, top1_acc: 0.1500, top5_acc: 0.7312, loss_cls: 13.3731, loss: 13.3731, grad_norm: 50.7961
Traceback (most recent call last):
  File "tools/train.py", line 199, in <module>
    main()
  File "tools/train.py", line 195, in main
    meta=meta)
  File "/content/drive/MyDrive/mmaction2/mmaction/apis/train.py", line 163, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs, **runner_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
    for i, data_batch in enumerate(self.data_loader):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 557, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/drive/MyDrive/mmaction2/mmaction/datasets/base.py", line 287, in __getitem__
    return self.prepare_train_frames(idx)
  File "/content/drive/MyDrive/mmaction2/mmaction/datasets/base.py", line 261, in prepare_train_frames
    return self.pipeline(results)
  File "/content/drive/MyDrive/mmaction2/mmaction/datasets/pipelines/compose.py", line 41, in __call__
    data = t(data)
  File "/content/drive/MyDrive/mmaction2/mmaction/datasets/pipelines/loading.py", line 937, in __call__
    container = decord.VideoReader(file_obj, num_threads=self.num_threads)
  File "/usr/local/lib/python3.7/dist-packages/decord/video_reader.py", line 52, in __init__
    ba, ctx.device_type, ctx.device_id, width, height, num_threads, 2, fault_tol)
  File "/usr/local/lib/python3.7/dist-packages/decord/_ffi/_ctypes/function.py", line 175, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/usr/local/lib/python3.7/dist-packages/decord/_ffi/base.py", line 78, in check_call
    raise DECORDError(err_str)
decord._ffi.base.DECORDError: [11:37:01] /github/workspace/src/video/video_reader.cc:151: Check failed: st_nb >= 0 (-1381258232 vs. 0) ERROR cannot find video stream with wanted index: -1

innerlee · 2021-05-31T12:31:05Z

You can switch the video reader to another backend and see if that works.

balaganeshmohan · 2021-05-31T14:36:04Z

@innerlee Thanks for your reply. I tried PyAV and got a similar error:

Tuple index out of range

I believe it has to do with the data. Is there a way to skip videos that are undecodable?

irvingzhang0512 · 2021-05-31T16:07:21Z

There are quite a lot of issues about corrupted videos. Maybe we can add a script that decodes all videos one by one and finds the corrupted ones.
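
A minimal sketch of such a script, under some assumptions: the annotation file is taken to follow a `<relative path> <label>` layout with one video per line, and a video counts as decodable if `decord.VideoReader` can open it and read its first frame. The helper names (`read_annotation_paths`, `decord_can_decode`, `find_bad_videos`) are hypothetical and not part of MMAction2.

```python
import os
from typing import Callable, Iterable, List


def read_annotation_paths(ann_file: str, data_prefix: str) -> List[str]:
    """Collect video paths from an annotation file.

    Assumes each non-empty line looks like "<relative path> <label>".
    """
    paths = []
    with open(ann_file, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split off only the trailing label, so paths may contain spaces.
            rel_path = line.rsplit(maxsplit=1)[0]
            paths.append(os.path.join(data_prefix, rel_path))
    return paths


def decord_can_decode(path: str) -> bool:
    """Return True if decord can open the file and decode its first frame."""
    import decord  # imported lazily so the scan helper itself has no hard dependency

    try:
        reader = decord.VideoReader(path)
        if len(reader) == 0:
            return False
        reader[0]  # force at least one frame to be decoded
        return True
    except Exception:
        # decord reports unreadable files with DECORDError, as in the traceback above.
        return False


def find_bad_videos(
    paths: Iterable[str],
    can_decode: Callable[[str], bool] = decord_can_decode,
) -> List[str]:
    """Return every path that is missing or fails the decode check."""
    bad = []
    for path in paths:
        if not os.path.isfile(path) or not can_decode(path):
            bad.append(path)
    return bad
```

With the config from this thread, a call might look like `find_bad_videos(read_annotation_paths('train.txt', 'Data/Train'))`. Reading only the first frame is cheap but can miss damage later in a file; decoding every frame would be slower and more thorough.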

balaganeshmohan · 2021-05-31T16:47:14Z

@irvingzhang0512 I used the same dataset with the SlowFast repo, where PyAV gave me a similar error, but their training pipeline catches the exception so that corrupted videos are skipped with a warning. I'd like to add something similar here, but the codebase is too complicated for my level of expertise. Some help pinpointing what I can change would be appreciated.

irvingzhang0512 · 2021-05-31T17:03:10Z

I may take a look in two days

irvingzhang0512 · 2021-06-01T00:56:58Z

@innerlee @dreamerlin What do you think about this feature: when decoding a corrupted video fails, show a warning and choose a random video instead.
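
One way to picture that behaviour is a thin wrapper around a map-style dataset. This is only a sketch, not how MMAction2 implements anything: `RandomFallbackDataset`, `max_retries`, and `errors` are hypothetical names, and the wrapper does not forward other dataset attributes (such as evaluation methods) that a training framework may expect.

```python
import random
import warnings


class RandomFallbackDataset:
    """Wrap a map-style dataset; if loading a sample fails, warn and try a random one."""

    def __init__(self, dataset, max_retries=10, errors=(Exception,), seed=None):
        self.dataset = dataset
        self.max_retries = max_retries
        # Exception types treated as "corrupted sample". Narrow this if possible,
        # since a broad catch can also hide genuine bugs in the pipeline.
        self.errors = errors
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        for _ in range(self.max_retries + 1):
            try:
                return self.dataset[idx]
            except self.errors as err:
                warnings.warn(f"Skipping sample {idx}: {err!r}")
                # Pick a replacement index uniformly at random.
                idx = self.rng.randrange(len(self.dataset))
        raise RuntimeError(
            f"No loadable sample found after {self.max_retries} retries")
```

Silently substituting samples changes the effective data distribution and can hide how much of the dataset is broken, which is one argument for finding and fixing the bad files up front instead.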

innerlee · 2021-06-01T02:03:51Z

My thoughts:

Write a tool that iterates over all videos and outputs all the bad ones.
For bad videos, it's up to the users to
- either remove them from the list,
- or re-download them,
- or re-encode them (this will fix many errors).

Ill-formatted videos are inevitable so we should "take it seriously".
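
For the re-encoding option, a rough sketch that shells out to `ffmpeg` could look like the following. It assumes `ffmpeg` is on `PATH` and was built with the `libx264` encoder; the function names and the choice of H.264 video with AAC audio are illustrative assumptions, not something prescribed in this thread.

```python
import shutil
import subprocess


def build_reencode_command(src, dst):
    """Build an ffmpeg command that rewrites src as H.264 video with AAC audio."""
    return [
        "ffmpeg",
        "-y",               # overwrite dst if it already exists
        "-i", str(src),     # input file
        "-c:v", "libx264",  # re-encode the video stream
        "-c:a", "aac",      # re-encode the audio stream, if there is one
        str(dst),
    ]


def reencode(src, dst):
    """Run ffmpeg and return True if it exited successfully."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    result = subprocess.run(
        build_reencode_command(src, dst),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Writing to a new file rather than overwriting the source keeps the original available in case the re-encode fails or the file turns out to need a fresh download instead.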

balaganeshmohan · 2021-06-03T06:48:19Z

@innerlee @irvingzhang0512 I just wanted to update that the OpenCV decoder works without any problems.
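
For readers who want to try the same switch, here is a sketch of how the pipeline in the config above might be adapted. It assumes the OpenCV steps are registered as `OpenCVInit` and `OpenCVDecode` in the installed MMAction2 version, which is worth verifying; `use_opencv_backend` is a hypothetical helper, and the pipeline is shortened to its first few steps.

```python
# Assumed mapping from the Decord pipeline steps to their OpenCV counterparts.
DECORD_TO_OPENCV = {
    'DecordInit': 'OpenCVInit',
    'DecordDecode': 'OpenCVDecode',
}


def use_opencv_backend(pipeline):
    """Return a copy of the pipeline with Decord steps replaced by OpenCV ones."""
    return [
        dict(step, type=DECORD_TO_OPENCV.get(step['type'], step['type']))
        for step in pipeline
    ]


# First steps of the training pipeline from the config in this issue.
train_pipeline = [
    dict(type='DecordInit'),
    dict(type='SampleFrames', clip_len=1, frame_interval=1, num_clips=8),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
]

opencv_pipeline = use_opencv_backend(train_pipeline)
```

The same helper would need to be applied to the validation and test pipelines as well. A different backend may tolerate files that Decord rejects, but that does not necessarily mean the files are intact.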

halqadasi · 2024-09-11T20:34:52Z

My file was corrupted; after I replaced the video, everything worked very well.

innerlee assigned dreamerlin May 31, 2021

dreamerlin added the enhancement New feature or request label Jun 1, 2021

irvingzhang0512 mentioned this issue Jun 7, 2021 in "[Improvement] Add a tool to find invalid videos. #907" (merged)

kennymckormick closed this as completed in #907 Jun 13, 2021
