Mistral Nemo Quantized Support #2727

Open
leflambeur opened this issue Jan 18, 2025 · 0 comments

leflambeur commented Jan 18, 2025

Hi,

I have recently been learning more about ML and Rust, and I love that Candle gives people the opportunity to use Rust directly.

I have been testing a few models and hit a couple of issues, mainly with quantised Nemo 2407 models, since a Q8 Nemo model seems to be the most my device can handle.

At first I tried writing my own code based on the Mistral example, since it mentions 2407. I then learned from another issue (which I can't find right now) that the 'quantized' example is recommended instead, as the Mistral example was built for a very specific set of models.

I get exactly the same error whether I run my own code, which emulates the quantized example, or run the 'quantized' example directly:

Error: shape mismatch in reshape, lhs: [1, 11, 4096], rhs: [1, 11, 32, 160]

I tested with multiple Nemo 2407 models, notably TheBloke's. Since the error is the same in both my code and the example, I am guessing that 2407 isn't supported with quantization.

Your readme and this line in the example confirm this:
https://github.com/huggingface/candle/blob/main/candle-examples/examples/mistral/main.rs#L266

Unfortunately, I don't know enough about tensors, or about how to deconstruct a GGUF model, to figure out the fix on my own without some guidance.
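
For reference, the two shapes in the error message can be checked by hand. The following is a minimal, std-only sketch of that arithmetic and does not use Candle itself. It assumes that Nemo 2407 has a hidden size of 5120 and 32 attention heads with an explicit head size of 128, and that the quantized code path derives the head size as hidden size divided by head count. Both are assumptions to verify against the model config and the Candle source, and `elem_count` and `can_reshape` are hypothetical helper names.

```rust
/// Number of elements in a tensor of the given shape.
fn elem_count(shape: &[usize]) -> usize {
    shape.iter().product()
}

/// A reshape is only valid when both shapes hold the same number of elements.
fn can_reshape(from: &[usize], to: &[usize]) -> bool {
    elem_count(from) == elem_count(to)
}

fn main() {
    // Shapes copied from the error message.
    let lhs: [usize; 3] = [1, 11, 4096];
    let rhs: [usize; 4] = [1, 11, 32, 160];
    println!("lhs elements: {}", elem_count(&lhs));
    println!("rhs elements: {}", elem_count(&rhs));
    println!("reshape possible: {}", can_reshape(&lhs, &rhs));

    // Assumption: Nemo 2407 uses hidden_size = 5120 and 32 attention heads.
    let hidden_size: usize = 5120;
    let n_heads: usize = 32;

    // Assumption: the failing code derives the head size from the hidden size.
    let derived_head_dim = hidden_size / n_heads; // 5120 / 32 = 160

    // Head size implied by the projected tensor in the error (lhs).
    let implied_head_dim = lhs[2] / n_heads; // 4096 / 32 = 128

    println!("derived head_dim: {derived_head_dim}");
    println!("implied head_dim: {implied_head_dim}");

    // With the implied head size the reshape would be consistent.
    let fixed: [usize; 4] = [1, 11, n_heads, implied_head_dim];
    println!("reshape with implied head_dim possible: {}", can_reshape(&lhs, &fixed));
}
```

If these assumptions hold, the mismatch comes from the head size being computed rather than read from the model metadata, so a fix would likely involve using the head size stored in the GGUF file when one is present.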
