I'm assuming you mean how much memory it uses while you're actually running the model?
That depends on factors like the quantization type and the context size you set. As a rule of thumb, it will need memory roughly equal to the size of the .gguf file, plus a small fixed overhead that's always needed, plus a variable amount that grows with the context size you set (see the sketch below).
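As a very rough sketch of that additive estimate (the 500 MB fixed overhead here is just a placeholder assumption, not a measured value; the real overhead depends on the backend and settings):

```python
import os

def estimate_ram_bytes(gguf_path: str,
                       kv_cache_bytes: int,
                       fixed_overhead_bytes: int = 500 * 1024**2) -> int:
    """Rough RAM estimate: model weights (the .gguf file is loaded more or
    less as-is) + a fixed overhead for buffers and scratch memory (the
    500 MB default is an assumption) + the context-dependent KV cache."""
    weights = os.path.getsize(gguf_path)
    return weights + fixed_overhead_bytes + kv_cache_bytes
```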
This also depends on the model architecture. LLaMA v1 models, for example, use considerably more memory for the context than LLaMA v2 models. Other models like StarCoder, Baichuan, etc. may also vary.
You can expect a Q4_K-quantized 7B LLaMA model to require around 4 GB of RAM just to load, and then perhaps another 1-2 GB depending on the context size. This is a very inexact ballpark figure, just to give you an idea of the general range.
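To see where that context cost comes from: the KV cache stores a key and a value vector per layer per token. A back-of-the-envelope calculation for a 7B LLaMA (32 layers, 4096 embedding size, f16 cache at 2 bytes per element; exact numbers vary by model and cache type, so treat this as an illustration):

```python
def kv_cache_bytes(n_ctx: int,
                   n_layers: int = 32,
                   n_embd: int = 4096,
                   bytes_per_elem: int = 2) -> int:
    """KV cache size: 2 tensors (K and V) x layers x embedding dim x element
    size, per token of context. Defaults assume a 7B LLaMA with an f16 cache."""
    return 2 * n_layers * n_embd * bytes_per_elem * n_ctx

print(kv_cache_bytes(2048) / 1024**3)  # ~1.0 GiB at 2048 context
print(kv_cache_bytes(4096) / 1024**3)  # ~2.0 GiB at 4096 context
```

That works out to about 0.5 MiB per token of context, which is where the 1-2 GB figure above comes from.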
I am thinking about building an iOS app that will use a LLaMA model. What will the size of the model be when it runs on the phone?