Error while training #3

Open
yahooo-m opened this issue Nov 25, 2024 · 7 comments

@yahooo-m

Hi, when running train_joint.py, I hit the error "TypeError: VideoLISAForCausalLM.model_forward() missing 1 required positional argument: 'dense_indices'". I checked the input_dict, and it indeed does not contain that key; its keys are: ['image_paths', 'images', 'images_clip', 'input_ids', 'labels', 'attention_masks', 'masks_list', 'label_list', 'valid_indices', 'resize_list', 'offset', 'questions_list', 'sampled_classes_list', 'inference', 'conversation_list'].

@JosephPai
Collaborator

Hi, thanks for raising this issue.
'dense_indices' should be the 'valid_indices' in the input_dict.
I have updated dataset.py to use a consistent variable name.
Feel free to let me know if you have any other questions.
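For anyone on an older checkout who has not pulled the updated dataset.py yet, a minimal local workaround is to rename the key before the forward call. This is only a sketch: the key names come from this thread, but the surrounding training-loop variable (input_dict) is assumed:

# hypothetical workaround: map the old key name onto the argument
# that model_forward() expects; remove once dataset.py is updated
if "valid_indices" in input_dict and "dense_indices" not in input_dict:
    input_dict["dense_indices"] = input_dict.pop("valid_indices")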

@yahooo-m
Author

Thanks!
And how can I change the number of GPUs used for training?

@JosephPai
Collaborator

In the training script, if you want to train the model with only 4 GPUs, you can start the deepspeed job with:

deepspeed --include localhost:4,5,6,7 train_joint.py
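If the specific device IDs do not matter, the DeepSpeed launcher also accepts a plain GPU count (a standard DeepSpeed launcher option, not specific to this repo):

deepspeed --num_gpus=4 train_joint.py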

@Lexarymade

Hi, dear author, I'm wondering whether there will be evaluation scripts for the other datasets reported in the paper, e.g., Refer-DAVIS-17 and your proposed ReasonVOS dataset. Thanks!

@JosephPai
Collaborator

Hi @Lexarymade , we are actively working on organizing the data and evaluation scripts.
We just released the ReasonVOS benchmark: https://github.com/showlab/VideoLISA/blob/main/BENCHMARK.md
You can slightly modify the evaluation script of MeViS to evaluate the ReasonVOS benchmark, as their data structures are very similar.

As for Ref-DAVIS-17, its evaluation is a bit more involved because it relies on a separate evaluation toolkit. We will put together instructions on how to evaluate it soon.

@yahooo-m
Author

yahooo-m commented Nov 28, 2024

Hi, have you tried training the model on A100s? I find that it may take 8 days to train on 8 A100 GPUs. Is flash-attention not used in this project?

@JosephPai
Collaborator

Hi @yahooo-m , when developing this project we did not investigate the implementation with flash-attn. It seems that Phi-3 series models do not automatically enable flash-attn unless it is explicitly specified; this appears to be a common issue according to related upstream discussions.

To use flash-attn, you can modify the training script:

model = VideoLISAForCausalLM.from_pretrained(
    args.version, torch_dtype=torch_dtype, low_cpu_mem_usage=True,
    cache_dir="/home/ubuntu/.cache/huggingface/hub",
    attn_implementation="flash_attention_2",  # add this line
    **model_args
)
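Note that flash_attention_2 additionally requires the flash-attn package (pip install flash-attn --no-build-isolation) and a half-precision dtype (fp16/bf16). A quick sanity check after loading (the attribute below is a transformers internal, so treat this as a sketch rather than a stable API):

# confirm which attention backend the loaded model is using
print(model.config._attn_implementation)  # expect "flash_attention_2"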
