Fixing block size for Mistral-7B. #141

Open · wants to merge 3 commits into main
Conversation

Artyom17 (Contributor)

According to Mistral's paper, the block size for Mistral-7B should be 8192 (refs: https://arxiv.org/pdf/2310.06825.pdf, https://huggingface.co/docs/transformers/en/model_doc/mistral), but it is currently left at the default value of 2048.
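
For illustration, a minimal sketch of what such a fix might look like, assuming a gpt-fast-style `ModelArgs` dataclass and `transformer_configs` registry (the names here are illustrative, not the exact diff in this PR):

```python
from dataclasses import dataclass

@dataclass
class ModelArgs:
    block_size: int = 2048  # default maximum context length; too small for Mistral-7B
    n_layer: int = 32
    n_head: int = 32
    dim: int = 4096

transformer_configs = {
    # Before the fix, block_size is omitted and falls back to the 2048 default.
    # After the fix, it is set explicitly to 8192, Mistral-7B's trained context
    # window per https://arxiv.org/pdf/2310.06825.pdf.
    "Mistral-7B": dict(
        block_size=8192,
        n_layer=32,
        n_head=32,
        dim=4096,
    ),
}

args = ModelArgs(**transformer_configs["Mistral-7B"])
assert args.block_size == 8192
```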

@facebook-github-bot added the CLA Signed label on Mar 19, 2024.
Artyom17 (Contributor, Author)

It also saves some memory on the `freq_cis` tensor when a large block_size is used with a relatively small max_seq_length.
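
To illustrate the memory point, a rough sketch assuming a standard RoPE frequency table (the `precompute_freqs_cis` helper below is illustrative, not the repo's exact function): sizing the table by the actual `max_seq_length` instead of `block_size` shrinks it proportionally.

```python
import torch

def precompute_freqs_cis(seq_len: int, head_dim: int, base: float = 10000.0) -> torch.Tensor:
    # Standard RoPE table: one complex rotation per (position, frequency pair).
    freqs = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    t = torch.arange(seq_len)
    freqs = torch.outer(t, freqs)
    # complex64 tensor of shape (seq_len, head_dim // 2)
    return torch.polar(torch.ones_like(freqs), freqs)

block_size, max_seq_length, head_dim = 8192, 2048, 128

# Sized by block_size: always pays for the full 8192-position window.
full = precompute_freqs_cis(block_size, head_dim)
# Sized by max_seq_length: 4x smaller in this example.
trimmed = precompute_freqs_cis(max_seq_length, head_dim)

print(full.numel() * full.element_size(), "bytes vs", trimmed.numel() * trimmed.element_size(), "bytes")
```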
