dynamic estimate of required memory usage #438

Green-Sky · 2023-03-23T18:24:04Z

uses observations made in #213 and replaces it.

fixes the "ggml_new_tensor_impl: not enough space in the context's memory pool" errors and the resulting segfaults.

this is still as much of a hack as it was before, but this time it is working.

this could potentially fix a bunch of issues (fixes #153).
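For readers unfamiliar with the idea, here is a rough sketch of what a dynamic estimate can look like. It is not the code from this PR: the helper names, the per-token measurement, and the 10% headroom are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: bytes needed to evaluate n_tokens, given a
// per-token cost measured during an earlier evaluation.
std::size_t estimate_buf_size(std::size_t mem_per_token, std::size_t n_tokens) {
    const std::size_t base = mem_per_token * n_tokens;
    return base + base / 10; // 10% headroom (assumed value)
}

// Grow the buffer only when the estimate exceeds its current size.
// Returns true if the buffer was resized.
bool ensure_buf_size(std::vector<unsigned char> & buf,
                     std::size_t mem_per_token,
                     std::size_t n_tokens) {
    const std::size_t needed = estimate_buf_size(mem_per_token, n_tokens);
    if (needed <= buf.size()) {
        return false;
    }
    buf.resize(needed);
    return true;
}
```

The point of the sketch is only that the buffer follows the observed workload instead of a fixed constant, so a larger context or batch leads to a larger pool before any tensor is allocated.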

llama.cpp

Green-Sky · 2023-03-23T19:17:08Z

hold up, need to fix perplexity.

update: still investigating.

mqy · 2023-03-23T19:42:02Z

@Green-Sky UB is hard to fix, I really appreciate!

I'll try this PR tomorrow. Before that, let me make a tentative suggestion:

Consider the case where a new segmentation fault shows up and takes time to fix.
Could we add an experimental CLI parameter (or environment variable) that lets users configure the maximum memory pool size according to their available RAM (e.g. at least 1 GB)? This would benefit users who have plenty of RAM and are eager to evaluate.
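A minimal sketch of such an override, assuming an environment variable. The name `LLAMA_MEM_POOL_MB` and both helper functions are hypothetical and do not exist in llama.cpp; the 1024 MiB floor mirrors the "at least 1 GB" in the suggestion.

```cpp
#include <cctype>
#include <cerrno>
#include <cstddef>
#include <cstdlib>

// Hypothetical variable name, not an existing llama.cpp option.
static const char * const POOL_ENV_VAR = "LLAMA_MEM_POOL_MB";

// Parse a pool size in MiB. Returns fallback_mb for missing or invalid
// input; valid values below 1024 MiB are raised to 1024 MiB.
std::size_t parse_pool_size_mb(const char * text, std::size_t fallback_mb) {
    const std::size_t min_mb = 1024;
    if (text == nullptr || !std::isdigit(static_cast<unsigned char>(*text))) {
        return fallback_mb; // unset, empty, signed, or non-numeric
    }
    char * end = nullptr;
    errno = 0;
    const unsigned long long value = std::strtoull(text, &end, 10);
    if (*end != '\0' || errno == ERANGE) {
        return fallback_mb; // trailing garbage or out of range
    }
    return value < min_mb ? min_mb : static_cast<std::size_t>(value);
}

// Read the override from the environment, falling back to the built-in size.
std::size_t pool_size_mb_from_env(std::size_t fallback_mb) {
    return parse_pool_size_mb(std::getenv(POOL_ENV_VAR), fallback_mb);
}
```

Keeping the parsing separate from the `std::getenv` call makes the validation easy to exercise without touching the process environment.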

Green-Sky · 2023-03-23T19:52:29Z

> UB is hard to fix, I really appreciate!

it is only UB if you run without address sanitizer 😉

Green-Sky · 2023-03-23T20:08:07Z

==1743825==ERROR: AddressSanitizer: allocator is out of memory trying to allocate 0x1000000000 bytes
    #0 0x7f6d6c849808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x561e0c966487 in llama_eval_internal /home/green/workspace/github/llama.cpp/llama.cpp:636
    #2 0x561e0c9682bb in llama_eval /home/green/workspace/github/llama.cpp/llama.cpp:1460
    #3 0x561e0c879365 in main /home/green/workspace/github/llama.cpp/main.cpp:229
    #4 0x7f6d6b890082 in __libc_start_main ../csu/libc-start.c:308

==1743825==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: out-of-memory ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144 in __interceptor_malloc
==1743825==ABORTING

so, 32 GiB is not enough to run perplexity (defaults) on 7B q4_0 (edit: with context 1024),
and 64 GiB fails to allocate on my machine ...
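For reference, the 64 GiB figure follows directly from the size in the AddressSanitizer report: 0x1000000000 bytes is 2^36 bytes. A small check (the names here are only illustrative):

```cpp
#include <cstdint>

constexpr std::uint64_t kGiB = 1ull << 30;

// Size requested in the AddressSanitizer report above.
constexpr std::uint64_t requested_bytes = 0x1000000000ull;

// Whole GiB contained in a byte count.
constexpr std::uint64_t bytes_to_gib(std::uint64_t bytes) {
    return bytes / kGiB;
}

static_assert(bytes_to_gib(requested_bytes) == 64,
              "0x1000000000 bytes is exactly 64 GiB");
```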

edit: #407 changes how this works

Green-Sky · 2023-03-23T20:41:58Z

I decided to wait on #439 and maybe #407, so i don't break perplexity.

Green-Sky · 2023-03-24T02:25:46Z

for some reason @ggerganov pushed 4870e45 👀

PerthGoat · 2023-03-24T05:17:42Z

> for some reason @ggerganov pushed 4870e45 👀

Unfortunately, this still doesn't fix the memory allocation issues :(

From what I can tell, it pretty much wraps stuff in vectors and adds an assertion to force the code to fail rather than segfaulting.
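To illustrate the difference between failing loudly and segfaulting, here is a small sketch of a vector-backed pool with a bounds check. It is not the code from 4870e45: the `MemoryPool` class is hypothetical, and it throws an exception where that commit reportedly uses an assertion, so the failure can be observed by the caller.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Minimal bump allocator over a std::vector-backed pool.
// Illustrative only: no alignment handling, no freeing.
class MemoryPool {
public:
    explicit MemoryPool(std::size_t size) : buf_(size), used_(0) {}

    // Hands out n bytes, or throws instead of writing past the pool.
    void * alloc(std::size_t n) {
        const std::size_t available = buf_.size() - used_;
        if (n > available) {
            throw std::runtime_error(
                "not enough space in the memory pool (needed " +
                std::to_string(n) + ", available " +
                std::to_string(available) + ")");
        }
        void * p = buf_.data() + used_;
        used_ += n;
        return p;
    }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> buf_;
    std::size_t used_;
};
```

A check like this turns memory corruption into a clear error, but as noted above it does not make the pool any larger, so the underlying sizing problem remains.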

Green-Sky · 2023-03-24T10:19:14Z

@ggerganov promised a memory overhaul in #407 (comment)

so i am closing this pr.

wafflecomposite · 2023-03-24T14:59:23Z

Runs smoothly, thanks!

Green-Sky · 2023-03-24T20:21:47Z

officially replaced by #473

Green-Sky changed the title ~~dynamic estimate of required memory usage~~ WIP: dynamic estimate of required memory usage Mar 23, 2023

Green-Sky marked this pull request as ready for review March 23, 2023 18:37

Green-Sky mentioned this pull request Mar 23, 2023

Scale buf_size linearly with n_ctx #213

Closed

Green-Sky force-pushed the memory_investigation branch from c9aa526 to 660e1df Compare March 23, 2023 18:41

Green-Sky changed the title ~~WIP: dynamic estimate of required memory usage~~ dynamic estimate of required memory usage Mar 23, 2023

Green-Sky force-pushed the memory_investigation branch from 660e1df to 636a954 Compare March 23, 2023 18:44

Green-Sky requested review from anzz1, sw and antimatter15 March 23, 2023 18:45

Green-Sky mentioned this pull request Mar 23, 2023

[mqy] ./examples/chatLLaMa: line 53: 33476 Segmentation fault: 11 #373

Closed

Green-Sky requested a review from ggerganov March 23, 2023 18:53

sw reviewed Mar 23, 2023

llama.cpp (outdated review comment)

Green-Sky force-pushed the memory_investigation branch 2 times, most recently from f0e79f4 to 4e64e37 Compare March 23, 2023 19:00

Green-Sky mentioned this pull request Mar 23, 2023

segmentation fault Alpaca #317

Closed

dynamic estimate of required memory usage

424281a

Green-Sky force-pushed the memory_investigation branch from 4e64e37 to 424281a Compare March 23, 2023 19:13

Green-Sky marked this pull request as draft March 23, 2023 19:32

Green-Sky mentioned this pull request Mar 23, 2023

Add support for batch size to --perplexity #407

Merged

Green-Sky added 2 commits March 23, 2023 21:45

fix perplexity - it's memory needs dont grow, so we skip it

2d262ea

cmake: make sanitizers link

5dd94f7

Green-Sky force-pushed the memory_investigation branch from 3c31292 to 5dd94f7 Compare March 23, 2023 20:46

Green-Sky closed this Mar 24, 2023

Green-Sky mentioned this pull request Mar 24, 2023

Segmentation Fault Error "not enough space in the context's memory pool" #52

Closed

wafflecomposite mentioned this pull request Mar 24, 2023

ggml_new_tensor_impl: not enough space in the context's memory pool (needed 717778556, available 454395136) #153

Closed

Green-Sky referenced this pull request Mar 24, 2023

Temporary bump the memory buffer size - hopefully fix issues from 483…

31572d9

…bab2

Green-Sky deleted the memory_investigation branch May 1, 2023 10:24

su77ungr mentioned this pull request May 11, 2023

ggml_new_tensor_impl: not enough space in the context's memory pool (needed 7082923680, available 7082732800) su77ungr/CASALIOY#3

Closed

AAbushady pushed a commit to AAbushady/llama.cpp that referenced this pull request Jan 27, 2024

Separate CuBLAS/hipBLAS (ggml-org#438)

4218641

Bearsaerker mentioned this pull request Mar 12, 2025

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Open
