This issue was moved to a discussion.

Total Size Of The App #3509

Closed
zain-ul-abedien opened this issue Oct 6, 2023 · 2 comments

Comments

@zain-ul-abedien

I am thinking about building an iOS app that will use a LLaMA model. What will the size of the model be when it runs on the phone?

@KerfuffleV2
Collaborator

I'm assuming you mean how much memory it uses while you're actually running the model?

That depends on things like the quantization type and the context size you set. You can assume it will require memory roughly equal to the size of the .gguf file, plus a small fixed overhead that's always needed, plus a variable amount that grows with the context size you set.
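As a rough illustration of that rule of thumb, here is a minimal sketch that adds the three parts together. The function names and the 256 MiB default overhead are placeholders chosen for this example, not values taken from llama.cpp; measure on the target device for real numbers.

```python
import os

GIB = 1024 ** 3
MIB = 1024 ** 2

def estimate_runtime_bytes(model_file_bytes, context_bytes, overhead_bytes=256 * MIB):
    """Rough total: weights (about the .gguf size) + fixed overhead + context buffers.

    overhead_bytes is a hypothetical placeholder, not a measured llama.cpp figure.
    """
    return model_file_bytes + overhead_bytes + context_bytes

def estimate_for_file(path, context_bytes, overhead_bytes=256 * MIB):
    """Same estimate, reading the model size from a .gguf file on disk."""
    return estimate_runtime_bytes(os.path.getsize(path), context_bytes, overhead_bytes)

# Example: a 4 GiB model file with 1 GiB reserved for the context.
total = estimate_runtime_bytes(4 * GIB, context_bytes=1 * GIB)
print(f"approx. {total / GIB:.2f} GiB")
```

The context term is the part that changes when you raise or lower the context size; the other two stay fixed for a given model file.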

This also depends on the type of model. LLaMA v1 models, for example, use a lot more memory for the context than LLaMA v2 models that use grouped-query attention (such as the 70B). Other models like StarCoder, Baichuan, etc. may also vary.
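One reason the context cost differs between architectures is the size of the key/value cache, which scales with the number of layers, the number of KV heads, and the context length. The sketch below uses the standard KV-cache formula and assumes a 16-bit cache; it ignores scratch buffers and anything else llama.cpp allocates, so treat it as a lower bound rather than the actual allocation. The function name is made up for this example.

```python
def kv_cache_bytes(n_layers, n_ctx, n_kv_heads, head_dim, bytes_per_element=2):
    """Approximate KV cache size: one K and one V tensor per layer.

    bytes_per_element=2 assumes an f16 cache (an assumption of this sketch).
    """
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_element

GIB = 1024 ** 3

# LLaMA 7B shape: 32 layers, 32 KV heads, head dimension 128, context 2048.
llama_7b = kv_cache_bytes(n_layers=32, n_ctx=2048, n_kv_heads=32, head_dim=128)

# 80-layer models: 64 KV heads (no grouped-query attention) vs 8 KV heads (GQA).
full_heads = kv_cache_bytes(n_layers=80, n_ctx=2048, n_kv_heads=64, head_dim=128)
gqa_heads = kv_cache_bytes(n_layers=80, n_ctx=2048, n_kv_heads=8, head_dim=128)

for name, size in [("7B", llama_7b), ("80-layer, 64 KV heads", full_heads),
                   ("80-layer, 8 KV heads", gqa_heads)]:
    print(f"{name}: {size / GIB:.3f} GiB")
```

With these shapes, cutting the KV heads from 64 to 8 shrinks the cache by a factor of eight at the same context length, and doubling the context doubles the cache.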

You can expect a Q4_K quantized 7B LLaMA model to require around 4 GB of RAM just to load, and then maybe another 1-2 GB depending on the context size. This is only a very rough ballpark figure to give you an idea of the general range.

@bachittle
Contributor

bachittle commented Oct 6, 2023

I was able to run StarCoder 1B on my iPhone 13; details here: #3284

7B models need to be heavily quantized (Q2_K) to fit in 4 GB of RAM, and need GPU offloading to run at decent speeds.

Expect to use 1-3 GB of storage space for these models.
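For a quick feel of where a storage range like that comes from, file size is roughly parameter count times average bits per weight, divided by eight. This is a minimal sketch; the bits-per-weight values below are illustrative assumptions, not exact figures for any llama.cpp quantization type, and real files also carry metadata and some tensors kept at higher precision.

```python
def approx_file_bytes(n_params, bits_per_weight):
    """Rough model file size from parameter count and average bits per weight."""
    return int(n_params * bits_per_weight / 8)

GB = 1000 ** 3

# Illustrative (assumed) averages: about 3 bits for a heavy quantization,
# about 4.5 bits for a mid-range one.
for label, n_params in [("1B", 1_000_000_000), ("7B", 7_000_000_000)]:
    for bits in (3.0, 4.5):
        size = approx_file_bytes(n_params, bits)
        print(f"{label} at ~{bits} bits/weight: {size / GB:.2f} GB")
```

The same number is a reasonable first guess for the RAM needed just to load the weights, before any context memory is added.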

@ggml-org ggml-org locked and limited conversation to collaborators Oct 6, 2023
@staviq staviq converted this issue into discussion #3512 Oct 6, 2023
