
[Bug] latents_attention_mask not used #168

Open
HaoyiZhu opened this issue Jan 27, 2025 · 2 comments
@HaoyiZhu

Environment

/

Describe the bug

Thanks for the amazing project!

In the data loader, latents are zero-padded and the padded positions are recorded in an attention mask. However, during training the model's forward pass does not appear to use this latent attention mask. Is this a bug?
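For context, a minimal sketch of how such a mask would typically be consumed when computing the loss. The function name and tensor shapes below are illustrative assumptions, not the repository's actual code:

```python
import torch

# Assumed shapes, for illustration only: latents of shape (B, C, T, H, W) and a
# per-frame mask of shape (B, T) where 1 marks real frames and 0 marks zero-padding.
def masked_diffusion_loss(model_pred: torch.Tensor,
                          target: torch.Tensor,
                          latents_attention_mask: torch.Tensor) -> torch.Tensor:
    # Broadcast the per-frame mask over the channel and spatial dimensions.
    mask = latents_attention_mask[:, None, :, None, None].to(model_pred.dtype)
    mask = mask.expand_as(model_pred)
    # Average the squared error over unmasked elements only, so padded frames
    # do not dilute the loss.
    return ((model_pred - target) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)
```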

Reproduction

/

@foreverpiano
Copy link
Collaborator

foreverpiano commented Jan 29, 2025

Thanks for pointing this out. latents_attention_mask is not used. I will remove it and refactor the code soon.

@foreverpiano foreverpiano self-assigned this Feb 3, 2025
@jzhang38
Collaborator

jzhang38 commented Feb 3, 2025

We are going to add support for variable-length sequence packing. The latent mask is intentionally retained for this future feature.
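For anyone following along, the sketch below shows the kind of role the mask could play in sequence packing. It is an assumption of how this might look, not the planned implementation; `pack_latents` and the flattened `(B, L, D)` token layout are hypothetical:

```python
import torch
import torch.nn.functional as F

# Illustrative sketch: pack variable-length samples into one contiguous token stream,
# assuming latents have been flattened to (B, L, D) tokens and latents_attention_mask
# is (B, L) with 1 = real token, 0 = padding.
def pack_latents(latents: torch.Tensor, latents_attention_mask: torch.Tensor):
    mask = latents_attention_mask.bool()
    seqlens = mask.sum(dim=1)                         # true length of each sample
    # Cumulative offsets [0, l0, l0+l1, ...] in the form varlen attention kernels expect.
    cu_seqlens = F.pad(seqlens.cumsum(0), (1, 0)).to(torch.int32)
    packed = latents[mask]                            # (total_real_tokens, D), padding removed
    return packed, cu_seqlens, int(seqlens.max())
```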
