Hello authors, and thank you for open-sourcing Divot. I have a question about DynamiCrafter:

The paper says the learnable tokens are trained through a diffusion model, and the diffusion-related component in the released code is the DynamiCrafter class. However, as far as I can tell, DynamiCrafter is only referenced by two config files, Divot_detokenizer_stage1.yaml and Divot_detokenizer_stage2.yaml, and those configs are only used by the eval scripts. Meanwhile, the ContinuousLVLM_Video_Comp_Gen class at src/models_clm/models.py:208 contains this line:
```python
input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```
In context:

```python
if has_image_input:
    video_embeds_comp = video_embeds.reshape(bz, -1, video_embeds.shape[-1])
    video_embeds_in = self.input_resampler(video_embeds_comp)
    video_embeds_in = video_embeds_in.reshape(bz, num_clips, -1, video_embeds_in.shape[-1])
    input_embeds[ids_cmp_mask] = video_embeds_in[embeds_cmp_mask].reshape(-1, video_embeds_in.shape[-1])
elif not self.freeze_input_resampler:
    video_embeds_comp_fake = torch.randn(bz, self.input_resampler.num_queries, self.input_resampler.input_dim).to(input_embeds.device, dtype=input_embeds.dtype)
    video_embeds_in_fake = self.input_resampler(video_embeds_comp_fake)
    input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```
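If I am reading it right, the `0.0 *` term in the `elif` branch looks like the common trick of running an otherwise-unused sub-module on fake input so that its parameters stay in the autograd graph and DistributedDataParallel does not raise an unused-parameter error. A minimal, self-contained sketch of that pattern (all names here, such as `Toy` and its `resampler`, are my own toy example, not code from the repo):

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    """Toy model whose resampler is only genuinely used when video input is present."""
    def __init__(self, dim=16, num_queries=4):
        super().__init__()
        self.resampler = nn.Linear(dim, dim)  # stand-in for input_resampler
        self.llm = nn.Linear(dim, dim)
        self.num_queries = num_queries
        self.dim = dim

    def forward(self, input_embeds, video_embeds=None):
        input_embeds = input_embeds.clone()
        if video_embeds is not None:
            # Real path: resampled video embeddings fill the first slots.
            input_embeds[:, :self.num_queries] = self.resampler(video_embeds)
        else:
            # Dummy path: numerically a no-op because of the 0.0 factor, but it
            # keeps self.resampler's parameters in the autograd graph so they
            # receive (zero) gradients instead of triggering DDP's
            # unused-parameter error.
            fake = torch.randn(input_embeds.size(0), self.num_queries, self.dim,
                               device=input_embeds.device, dtype=input_embeds.dtype)
            input_embeds[:, :self.num_queries] = (
                input_embeds[:, :self.num_queries] + 0.0 * self.resampler(fake))
        return self.llm(input_embeds).mean()

model = Toy()
loss = model(torch.randn(2, 8, 16))  # text-only batch, no video
loss.backward()
print(model.resampler.weight.grad.abs().sum())  # tensor(0.): grads exist, all zero
```

In the toy example, `model.resampler.weight.grad` comes back as an all-zero tensor rather than `None` after the text-only batch, which is exactly what keeps DDP happy. That reading would suggest the branch exists only for distributed-training bookkeeping, not for training the tokenizer itself.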
My question: do the released training scripts include no code for training or fine-tuning the tokenizer? That is, is the training only further fine-tuning of the LLM on top of the already-trained tokenizer and detokenizer?

Thanks a lot.