
[Bug] latents_attention_mask not used #168

Open
HaoyiZhu opened this issue Jan 27, 2025 · 2 comments
@HaoyiZhu

Environment

/

Describe the bug

Thanks for the amazing project!

In the data loader, latents are zero-padded and the padded positions are recorded in an attention mask. However, during training the model's forward pass does not appear to use this latent attention mask. Is this a bug?
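For context, a minimal sketch of how such a mask would typically be consumed when computing the loss. The function name and tensor shapes below are illustrative assumptions, not the repository's actual code:

```python
import torch

# Assumed shapes, for illustration only: latents of shape (B, C, T, H, W) and a
# per-frame mask of shape (B, T) where 1 marks real frames and 0 marks zero-padding.
def masked_diffusion_loss(model_pred: torch.Tensor,
                          target: torch.Tensor,
                          latents_attention_mask: torch.Tensor) -> torch.Tensor:
    # Broadcast the per-frame mask over the channel and spatial dimensions.
    mask = latents_attention_mask[:, None, :, None, None].to(model_pred.dtype)
    mask = mask.expand_as(model_pred)
    # Average the squared error over unmasked elements only, so padded frames
    # do not dilute the loss.
    return ((model_pred - target) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)
```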

Reproduction

/

@foreverpiano
Copy link
Collaborator

foreverpiano commented Jan 29, 2025

Thanks for pointing this out. latents_attention_mask is not used. I will remove it and refactor the code soon.

@foreverpiano foreverpiano self-assigned this Feb 3, 2025
@jzhang38
Collaborator

jzhang38 commented Feb 3, 2025

We are going to add support for variable-length sequence packing. The latent mask is intentionally retained for this future feature.
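For anyone following along, the sketch below shows the kind of role the mask could play in sequence packing. It is an assumption of how this might look, not the planned implementation; `pack_latents` and the flattened `(B, L, D)` token layout are hypothetical:

```python
import torch
import torch.nn.functional as F

# Illustrative sketch: pack variable-length samples into one contiguous token stream,
# assuming latents have been flattened to (B, L, D) tokens and latents_attention_mask
# is (B, L) with 1 = real token, 0 = padding.
def pack_latents(latents: torch.Tensor, latents_attention_mask: torch.Tensor):
    mask = latents_attention_mask.bool()
    seqlens = mask.sum(dim=1)                         # true length of each sample
    # Cumulative offsets [0, l0, l0+l1, ...] in the form varlen attention kernels expect.
    cu_seqlens = F.pad(seqlens.cumsum(0), (1, 0)).to(torch.int32)
    packed = latents[mask]                            # (total_real_tokens, D), padding removed
    return packed, cu_seqlens, int(seqlens.max())
```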
