Segfault / Memory error with 65B model (128GB RAM)

On an M1 Ultra / 128GB, running the 65B model:

```text
./main -m ./models/65B/ggml-model-q4_0.bin -t 8 -n 128 -p "The word empowerment has five possible definitions:"
```

produces this error after everything has been loaded correctly:

```text
ggml_new_tensor_impl: not enough space in the context's memory pool (needed 268478672, available 268435456)
```

30B runs fine (even on a 64GB M1 Max)

<details>
  <summary>Full output</summary>
  
  ```text
  (base) ➜  llama.cpp git:(master) ✗ ./main -m ./models/65B/ggml-model-q4_0.bin -t 8 -n 128 -p "The word empowerment has five possible definitions:"
main: seed = 1678533057
llama_model_load: loading model from './models/65B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 8192
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 64
llama_model_load: n_layer = 80
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 22016
llama_model_load: n_parts = 8
llama_model_load: ggml ctx size = 41477.73 MB
llama_model_load: memory_size =  2560.00 MB, n_mem = 40960
llama_model_load: loading model part 1/8 from './models/65B/ggml-model-q4_0.bin'
llama_model_load: .......................................................................................... done
llama_model_load: model size =  4869.09 MB / num tensors = 723
llama_model_load: loading model part 2/8 from './models/65B/ggml-model-q4_0.bin.1'
llama_model_load: .......................................................................................... done
llama_model_load: model size =  4869.09 MB / num tensors = 723
llama_model_load: loading model part 3/8 from './models/65B/ggml-model-q4_0.bin.2'
llama_model_load: .......................................................................................... done
llama_model_load: model size =  4869.09 MB / num tensors = 723
llama_model_load: loading model part 4/8 from './models/65B/ggml-model-q4_0.bin.3'
llama_model_load: .......................................................................................... done
llama_model_load: model size =  4869.09 MB / num tensors = 723
llama_model_load: loading model part 5/8 from './models/65B/ggml-model-q4_0.bin.4'
llama_model_load: .......................................................................................... done
llama_model_load: model size =  4869.09 MB / num tensors = 723
llama_model_load: loading model part 6/8 from './models/65B/ggml-model-q4_0.bin.5'
llama_model_load: .......................................................................................... done
llama_model_load: model size =  4869.09 MB / num tensors = 723
llama_model_load: loading model part 7/8 from './models/65B/ggml-model-q4_0.bin.6'
llama_model_load: .......................................................................................... done
llama_model_load: model size =  4869.09 MB / num tensors = 723
llama_model_load: loading model part 8/8 from './models/65B/ggml-model-q4_0.bin.7'
llama_model_load: .......................................................................................... done
llama_model_load: model size =  4869.09 MB / num tensors = 723

main: prompt: 'The word empowerment has five possible definitions:'
main: number of tokens in prompt = 11
     1 -> ''
  1576 -> 'The'
  1734 -> ' word'
  3710 -> ' emp'
  1680 -> 'ower'
   358 -> 'ment'
   756 -> ' has'
  5320 -> ' five'
  1950 -> ' possible'
 15848 -> ' definitions'
 29901 -> ':'

sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000


ggml_new_tensor_impl: not enough space in the context's memory pool (needed 268478672, available 268435456)
[1]    54172 segmentation fault  ./main -m ./models/65B/ggml-model-q4_0.bin -t 8 -n 128 -p
  ```
  
</details>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Segfault / Memory error with 65B model (128GB RAM) #12

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Segfault / Memory error with 65B model (128GB RAM) #12

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions