ERROR cannot find video stream with wanted index: -1 #893

Closed
balaganeshmohan opened this issue May 31, 2021 · 9 comments · Fixed by #907

@balaganeshmohan
balaganeshmohan commented May 31, 2021

Hey everyone, great implementation of the TSM framework. I am trying to train on a custom dataset, but I can't get past this error. I am not sure what it means, since it only appears after the first training iterations have completed. Here is my log. Training did not start at all until I set the number of data loader workers to 0. I have checked all the video paths, and every file is present where it should be. Any help is appreciated.

sys.platform: linux
Python: 3.7.10 (default, May  3 2021, 02:48:31) [GCC 7.5.0]
CUDA available: True
GPU 0: Tesla T4
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.0_bu.TC445_37.28845127_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.8.1+cu101
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.1, CUDNN_VERSION=7.6.3, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.9.1+cu101
OpenCV: 4.1.2
MMCV: 1.3.5
MMCV Compiler: n/a
MMCV CUDA Compiler: n/a
MMAction2: 0.14.0+e9b7009
------------------------------------------------------------

2021-05-31 11:34:01,998 - mmaction - INFO - Distributed training: False
2021-05-31 11:34:02,264 - mmaction - INFO - Config: model = dict(
    type='Recognizer2D',
    backbone=dict(
        type='ResNetTSM',
        pretrained='torchvision://resnet50',
        depth=50,
        norm_eval=False,
        shift_div=8),
    cls_head=dict(
        type='TSMHead',
        num_classes=400,
        in_channels=2048,
        spatial_type='avg',
        consensus=dict(type='AvgConsensus', dim=1),
        dropout_ratio=0.5,
        init_std=0.001,
        is_shift=True),
    train_cfg=None,
    test_cfg=dict(average_clips='prob'))
optimizer = dict(
    type='SGD',
    constructor='TSMOptimizerConstructor',
    paramwise_cfg=dict(fc_lr5=True),
    lr=0.02,
    momentum=0.9,
    weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=20, norm_type=2))
lr_config = dict(policy='step', step=[20, 40])
total_epochs = 50
checkpoint_config = dict(interval=5)
log_config = dict(interval=20, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
dataset_type = 'VideoDataset'
data_root = 'Data/Train'
data_root_val = 'Data/Validation'
ann_file_train = 'train.txt'
ann_file_val = 'val.txt'
ann_file_test = 'test.txt'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_bgr=False)
train_pipeline = [
    dict(type='DecordInit'),
    dict(type='SampleFrames', clip_len=1, frame_interval=1, num_clips=8),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(
        type='MultiScaleCrop',
        input_size=224,
        scales=(1, 0.875, 0.75, 0.66),
        random_crop=False,
        max_wh_scale_gap=1,
        num_fixed_crops=13),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs', 'label'])
]
val_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=1,
        frame_interval=1,
        num_clips=8,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Flip', flip_ratio=0),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
test_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=1,
        frame_interval=1,
        num_clips=8,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Flip', flip_ratio=0),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_bgr=False),
    dict(type='FormatShape', input_format='NCHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
data = dict(
    videos_per_gpu=8,
    workers_per_gpu=0,
    train=dict(
        type='VideoDataset',
        ann_file='train.txt',
        data_prefix='Data/Train',
        pipeline=[
            dict(type='DecordInit'),
            dict(
                type='SampleFrames', clip_len=1, frame_interval=1,
                num_clips=8),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(
                type='MultiScaleCrop',
                input_size=224,
                scales=(1, 0.875, 0.75, 0.66),
                random_crop=False,
                max_wh_scale_gap=1,
                num_fixed_crops=13),
            dict(type='Resize', scale=(224, 224), keep_ratio=False),
            dict(type='Flip', flip_ratio=0.5),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCHW'),
            dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
            dict(type='ToTensor', keys=['imgs', 'label'])
        ]),
    val=dict(
        type='VideoDataset',
        ann_file='val.txt',
        data_prefix='Data/Validation',
        pipeline=[
            dict(type='DecordInit'),
            dict(
                type='SampleFrames',
                clip_len=1,
                frame_interval=1,
                num_clips=8,
                test_mode=True),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(type='CenterCrop', crop_size=224),
            dict(type='Flip', flip_ratio=0),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCHW'),
            dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
            dict(type='ToTensor', keys=['imgs'])
        ]),
    test=dict(
        type='VideoDataset',
        ann_file='test.txt',
        data_prefix='Data/Validation',
        pipeline=[
            dict(type='DecordInit'),
            dict(
                type='SampleFrames',
                clip_len=1,
                frame_interval=1,
                num_clips=8,
                test_mode=True),
            dict(type='DecordDecode'),
            dict(type='Resize', scale=(-1, 256)),
            dict(type='CenterCrop', crop_size=224),
            dict(type='Flip', flip_ratio=0),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_bgr=False),
            dict(type='FormatShape', input_format='NCHW'),
            dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
            dict(type='ToTensor', keys=['imgs'])
        ]))
evaluation = dict(
    interval=5, metrics=['top_k_accuracy', 'mean_class_accuracy'])
work_dir = './work_dirs/tsm_r50_video_2d_1x1x8_50e_kinetics400_rgb/'
gpu_ids = range(0, 1)
omnisource = False
module_hooks = []

Use load_from_torchvision loader
2021-05-31 11:34:02,878 - mmaction - INFO - These parameters in pretrained checkpoint are not loaded: {'fc.weight', 'fc.bias'}
2021-05-31 11:34:06,481 - mmaction - INFO - Start running, host: root@8808a7539a3c, work_dir: /content/drive/My Drive/mmaction2/work_dirs/tsm_r50_video_2d_1x1x8_50e_kinetics400_rgb
2021-05-31 11:34:06,481 - mmaction - INFO - workflow: [('train', 1)], max: 50 epochs
2021-05-31 11:35:50,090 - mmaction - INFO - Epoch [1][20/494]	lr: 2.000e-02, eta: 1 day, 11:30:45, time: 5.180, data_time: 4.327, memory: 6980, top1_acc: 0.1500, top5_acc: 0.7312, loss_cls: 13.3731, loss: 13.3731, grad_norm: 50.7961
Traceback (most recent call last):
  File "tools/train.py", line 199, in <module>
    main()
  File "tools/train.py", line 195, in main
    meta=meta)
  File "/content/drive/MyDrive/mmaction2/mmaction/apis/train.py", line 163, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs, **runner_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
    for i, data_batch in enumerate(self.data_loader):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 557, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/drive/MyDrive/mmaction2/mmaction/datasets/base.py", line 287, in __getitem__
    return self.prepare_train_frames(idx)
  File "/content/drive/MyDrive/mmaction2/mmaction/datasets/base.py", line 261, in prepare_train_frames
    return self.pipeline(results)
  File "/content/drive/MyDrive/mmaction2/mmaction/datasets/pipelines/compose.py", line 41, in __call__
    data = t(data)
  File "/content/drive/MyDrive/mmaction2/mmaction/datasets/pipelines/loading.py", line 937, in __call__
    container = decord.VideoReader(file_obj, num_threads=self.num_threads)
  File "/usr/local/lib/python3.7/dist-packages/decord/video_reader.py", line 52, in __init__
    ba, ctx.device_type, ctx.device_id, width, height, num_threads, 2, fault_tol)
  File "/usr/local/lib/python3.7/dist-packages/decord/_ffi/_ctypes/function.py", line 175, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/usr/local/lib/python3.7/dist-packages/decord/_ffi/base.py", line 78, in check_call
    raise DECORDError(err_str)
decord._ffi.base.DECORDError: [11:37:01] /github/workspace/src/video/video_reader.cc:151: Check failed: st_nb >= 0 (-1381258232 vs. 0) ERROR cannot find video stream with wanted index: -1
@innerlee
Contributor

You can switch the video reader to another backend and see if that works.
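As a rough illustration of switching backends, the sketch below rewrites the init and decode steps of a pipeline list like the one in the config above. The helper `swap_video_backend` and the `BACKENDS` table are hypothetical names introduced here; the step names (`PyAVInit`/`PyAVDecode`, `OpenCVInit`/`OpenCVDecode`) are the ones I believe MMAction2 0.x registers, so check them against your installed version.

```python
# (init, decode) step names per backend. Assumed to match the names that
# MMAction2 0.x registers; verify against your installed version.
BACKENDS = {
    'decord': ('DecordInit', 'DecordDecode'),
    'pyav': ('PyAVInit', 'PyAVDecode'),
    'opencv': ('OpenCVInit', 'OpenCVDecode'),
}


def swap_video_backend(pipeline, backend):
    """Return a copy of `pipeline` whose init/decode steps use `backend`.

    Only the 'type' key is changed. Backend-specific arguments on those
    steps (if any) are kept as-is and should be reviewed by hand.
    """
    init_names = {init for init, _ in BACKENDS.values()}
    decode_names = {decode for _, decode in BACKENDS.values()}
    new_init, new_decode = BACKENDS[backend]

    swapped = []
    for step in pipeline:
        step = dict(step)  # shallow copy so the original config is untouched
        if step['type'] in init_names:
            step['type'] = new_init
        elif step['type'] in decode_names:
            step['type'] = new_decode
        swapped.append(step)
    return swapped
```

In a config file this could be applied as `train_pipeline = swap_video_backend(train_pipeline, 'opencv')`, and likewise for the validation and test pipelines.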

@balaganeshmohan
Author

@innerlee Thanks for your reply. I tried PyAV and got a similar error:

Tuple index out of range

I believe it has to do with the data. Is there a way to skip videos that cannot be decoded?

@irvingzhang0512
Contributor

There are quite a lot of issues about corrupted videos. Maybe we can add a script that decodes all videos one by one and reports the corrupted ones.
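A minimal sketch of such a script might look like the following. It is not the tool that was eventually merged; `paths_from_annotation_lines`, `open_with_decord`, and `find_bad_videos` are hypothetical names, and the annotation format is assumed to be one `relative/path label` pair per line, as in `train.txt` above.

```python
import os


def paths_from_annotation_lines(lines, data_prefix):
    """Yield full video paths from 'relative/path label' annotation lines."""
    for line in lines:
        line = line.strip()
        if line:
            # Drop the trailing label; everything before it is the path.
            yield os.path.join(data_prefix, line.rsplit(None, 1)[0])


def open_with_decord(path):
    """Open a video with decord and return its frame count."""
    import decord  # imported lazily so other openers work without decord
    return len(decord.VideoReader(path))


def find_bad_videos(paths, opener=open_with_decord):
    """Return a list of (path, reason) for videos that fail to open."""
    bad = []
    for path in paths:
        try:
            if opener(path) <= 0:
                bad.append((path, 'no frames'))
        except Exception as exc:  # decoder errors vary by backend
            bad.append((path, repr(exc)))
    return bad


if __name__ == '__main__':
    # Paths taken from the config in this issue; adjust to your dataset.
    with open('train.txt') as f:
        video_paths = list(paths_from_annotation_lines(f, 'Data/Train'))
    for path, reason in find_bad_videos(video_paths):
        print(f'{path}\t{reason}')
```

Opening a file only checks that the container and stream can be found; a stricter check would also decode a few frames from each video.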

@balaganeshmohan
Author

@irvingzhang0512 I used the same dataset with the SlowFast repo, where PyAV gave me a similar error, but their training pipeline catches the exception and skips corrupted videos with a warning. I'd like to add something similar here, but the codebase is too complicated for me. Some help pinpointing what I need to change would be appreciated.

@irvingzhang0512
Contributor

I may take a look in two days.

@irvingzhang0512
Contributor

@innerlee @dreamerlin What do you think about this feature: when decoding a corrupted video fails, show a warning and load a randomly chosen video instead?
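One way to picture this proposal is a thin wrapper around any map-style dataset. This is only a sketch of the idea, not code from MMAction2; `SkipCorruptedDataset` and its `max_retries` and `seed` parameters are hypothetical.

```python
import random
import warnings


class SkipCorruptedDataset:
    """Wrap a map-style dataset; on a load failure, warn and retry elsewhere."""

    def __init__(self, dataset, max_retries=10, seed=None):
        self.dataset = dataset
        self.max_retries = max_retries
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        # Reject out-of-range indices up front, because the except clause
        # below also swallows IndexError raised inside the decoder
        # (for example PyAV's "tuple index out of range").
        if not 0 <= idx < len(self.dataset):
            raise IndexError(idx)
        for _ in range(self.max_retries + 1):
            try:
                return self.dataset[idx]
            except Exception as exc:
                warnings.warn(
                    f'Failed to load sample {idx} ({exc!r}); '
                    'trying a random sample instead.')
                idx = self._rng.randrange(len(self.dataset))
        raise RuntimeError(
            f'Gave up after {self.max_retries} retries; too many bad samples?')
```

The trade-off is that silently substituting samples hides how many videos are broken, so it seems wise to log the warnings and fix or remove the offending files afterwards.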

@innerlee
Contributor

innerlee commented Jun 1, 2021

My thoughts:

  • Write a tool that iterates over all videos and outputs all the bad ones.
  • For bad videos, it's up to the users to
    • either remove them from the list,
    • or re-download them,
    • or re-encode them (this will fix many errors).

Ill-formatted videos are inevitable so we should "take it seriously".
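For the re-encoding option, a sketch using the `ffmpeg` command-line tool through `subprocess` could look like this. It assumes `ffmpeg` is installed and on the `PATH`; the codec choices (`libx264`, `aac`) and the helper names are illustrative assumptions rather than anything prescribed in this thread.

```python
import subprocess


def build_reencode_command(src, dst):
    """Build an ffmpeg command that re-encodes `src` into `dst`."""
    return [
        'ffmpeg',
        '-y',               # overwrite dst if it already exists
        '-i', src,          # input file
        '-c:v', 'libx264',  # re-encode video as H.264 (assumed choice)
        '-c:a', 'aac',      # re-encode audio as AAC (assumed choice)
        dst,
    ]


def reencode(src, dst):
    """Run ffmpeg and return True if it exits successfully.

    Raises FileNotFoundError if the ffmpeg binary is not installed.
    """
    result = subprocess.run(
        build_reencode_command(src, dst),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0
```

A file that still fails after re-encoding is probably damaged beyond repair and is a candidate for re-downloading or removal from the annotation list.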

@dreamerlin added the enhancement (New feature or request) label on Jun 1, 2021
@balaganeshmohan
Author

@innerlee @irvingzhang0512 I just wanted to update that the OpenCV decoder works without any problems.

@halqadasi

My file was corrupted. I re-downloaded the video, and after that it worked very well.
