dynamic estimate of required memory usage #438
hold up, need to fix perplexity. update: still investigating.
@Green-Sky UB is hard to fix, I really appreciate it! I'll try this PR tomorrow. Before that, let me make a tentative suggestion: think about the situation where a new segmentation fault occurs again but still takes time to fix.
it is only UB if you run without address sanitizer 😉
so, 32 GiB is not enough to run perplexity (defaults) on 7B q4_0. edit: with context 1024. edit: #407 changes how this works.
for some reason @ggerganov pushed 4870e45 👀
Unfortunately, this still doesn't fix the memory allocation issues :( From what I can tell, it pretty much wraps stuff in vectors and adds an assertion to force the code to fail rather than segfaulting.
@ggerganov promised a memory overhaul here #407 (comment) so I am closing this PR.
Runs smoothly, thanks!
officially replaced by #473
uses observations made in #213 and replaces it.
fixes
ggml_new_tensor_impl: not enough space in the context's memory pool
and the resulting segfaults.
this is still as much of a hack as it was before, but this time it is working.
this could potentially fix a bunch of issues. ( fixes #153 )
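
For readers who want the general idea of a "dynamic estimate" without reading the diff, here is a rough sketch. It is not the code from this PR: the function names, the base-plus-per-token model, and the 10% headroom are all assumptions made for illustration. It sizes a pool from the number of tokens instead of a fixed constant, and checks a request against the remaining space so that an oversized request can be reported instead of segfaulting.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper (not a ggml or llama.cpp function): estimate the pool
// size as a fixed base plus a per-token cost, with about 10% headroom.
std::size_t estimate_pool_bytes(std::size_t base_bytes,
                                std::size_t bytes_per_token,
                                std::size_t n_tokens) {
    // Guard against size_t overflow in the multiplication and addition below.
    assert(bytes_per_token == 0 ||
           n_tokens <= (std::numeric_limits<std::size_t>::max() - base_bytes) / bytes_per_token);
    const std::size_t needed = base_bytes + bytes_per_token * n_tokens;
    return needed + needed / 10;  // integer headroom of roughly 10%
}

// Hypothetical helper: true if a request of request_bytes still fits in a
// pool of pool_bytes with used_bytes already taken. A caller can fail with
// a clear error when this returns false rather than writing past the pool.
bool pool_fits(std::size_t pool_bytes,
               std::size_t used_bytes,
               std::size_t request_bytes) {
    return used_bytes <= pool_bytes && request_bytes <= pool_bytes - used_bytes;
}
```

In a sketch like this the per-token cost would have to be measured or derived from the model shape; where that number comes from is exactly the part that stays a hack.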