Question about DynamiCrafter #3

Open
Charles-ux-bit opened this issue Dec 28, 2024 · 0 comments

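Hello, and thank you very much for open-sourcing Divot. I have a question about DynamiCrafter.
The paper says that the learnable tokens are learned with the help of a diffusion model, and in the released code the diffusion-related component is the DynamiCrafter class. However, as far as I can see, DynamiCrafter is referenced only by the two config files Divot_detokenizer_stage1.yaml and Divot_detokenizer_stage2.yaml, and those two config files are used only by the eval scripts. In addition, the ContinuousLVLM_Video_Comp_Gen class at src/models_clm/models.py:208 contains the following line: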

input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake

```python
if has_image_input:
    video_embeds_comp = video_embeds.reshape(bz, -1, video_embeds.shape[-1])
    video_embeds_in = self.input_resampler(video_embeds_comp)
    video_embeds_in = video_embeds_in.reshape(bz, num_clips, -1, video_embeds_in.shape[-1])
    input_embeds[ids_cmp_mask] = video_embeds_in[embeds_cmp_mask].reshape(-1, video_embeds_in.shape[-1])
elif not self.freeze_input_resampler:
    video_embeds_comp_fake = torch.randn(bz, self.input_resampler.num_queries, self.input_resampler.input_dim).to(input_embeds.device, dtype=input_embeds.dtype)
    video_embeds_in_fake = self.input_resampler(video_embeds_comp_fake)
    input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```
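As far as I can tell, the `0.0 *` line does not change `input_embeds` numerically. It only connects the resampler's parameters to the autograd graph, so they receive an all-zero gradient on batches without video input. This pattern is commonly used to keep wrappers such as `DistributedDataParallel` from reporting unused parameters, but whether that is the intent here is my assumption. Below is a minimal sketch of the effect. The `nn.Linear` stand-in, the tensor shapes, and the variable names are hypothetical and are not taken from the Divot code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the input resampler (NOT the Divot class).
resampler = nn.Linear(4, 8)

num_queries = 3
input_embeds = torch.randn(2, 5, 8)  # (batch, seq_len, hidden), assumed shapes
before = input_embeds.clone()

# Batch without video input: push random "fake" features through the resampler.
fake_in = torch.randn(2, num_queries, 4)
fake_out = resampler(fake_in)  # shape (2, 3, 8)

# Adding 0.0 * fake_out keeps the values as they were, but the result now
# depends on the resampler's parameters in the autograd graph.
input_embeds[:, :num_queries] = input_embeds[:, :num_queries] + 0.0 * fake_out

loss = input_embeds.sum()
loss.backward()

values_unchanged = torch.equal(input_embeds.detach(), before)
grads_exist = all(p.grad is not None for p in resampler.parameters())
grads_are_zero = all(bool((p.grad == 0).all()) for p in resampler.parameters())
print(values_unchanged, grads_exist, grads_are_zero)
```

In this sketch the resampler's gradients exist but are all zero, so the step performs no actual update to it on such a batch.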

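So my question is: do the training scripts not include any code for training or fine-tuning the tokenizer? Do they only further fine-tune the LLM on top of an existing tokenizer and detokenizer?

Thanks.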
