FIM completion flexible context #257

Open
kv-gits opened this issue Jun 4, 2024 · 4 comments
Comments

@kv-gits

kv-gits commented Jun 4, 2024

I would like to be able to set up or choose the context used for FIM completion: the current function, file, dependent files, or project, instead of a fixed-size window. I know this can be complicated across different languages, but it would be nice if possible.
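As a rough illustration only (not Twinny's actual implementation), here is a minimal sketch of what a "flexible" context could look like: building the FIM prefix/suffix from the whole document under a size budget, keeping the text closest to the cursor, rather than taking a fixed number of lines. All names and parameters here are made up for illustration.

```typescript
// Hypothetical sketch: build a FIM prefix/suffix from the whole document
// instead of a fixed number of lines around the cursor.
interface FimContext {
  prefix: string;
  suffix: string;
}

function buildFimContext(
  documentText: string,
  cursorOffset: number,
  maxChars: number // rough character budget; a real implementation would count tokens
): FimContext {
  // Take everything before/after the cursor, then trim the far ends so the
  // text closest to the cursor is kept when the budget is exceeded.
  let prefix = documentText.slice(0, cursorOffset);
  let suffix = documentText.slice(cursorOffset);

  const budgetPerSide = Math.floor(maxChars / 2);
  if (prefix.length > budgetPerSide) {
    prefix = prefix.slice(prefix.length - budgetPerSide); // drop the beginning
  }
  if (suffix.length > budgetPerSide) {
    suffix = suffix.slice(0, budgetPerSide); // drop the end
  }
  return { prefix, suffix };
}
```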

@rjmacarthy
Collaborator

Hello, could you please explain in some detail what is most important in this respect, and any technical details that might be helpful when adding this functionality?

Many thanks.

@slashedstar

I was fiddling with the context length option and it didn't seem to affect how much VRAM was being used, so I assume the context is fixed: even if you set it to include 999 lines, it will only include whatever fits in the default context limit (2048). Shouldn't we be able to adjust the actual token context window (num_ctx) instead?

@AndrewRocky
Contributor

AndrewRocky commented Sep 11, 2024

I was fiddling with the context length option and it didn't seem to affect how much VRAM was being used

@slashedstar, most LLM providers (their engines, to be exact) preallocate VRAM for the defined context length during model loading.

If Twinny sends a longer context than the provider's maximum context length, the beginning of the context will get truncated.
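For illustration only (not Twinny's or any engine's actual code), a minimal sketch of the truncation behaviour described above: when the prompt exceeds the context window, the oldest part is discarded. Real engines truncate in model tokens; this sketch approximates tokens by splitting on whitespace.

```typescript
// Illustrative only: an engine with a fixed context window drops the
// beginning of an over-long prompt. Whitespace splitting stands in for
// real tokenization here.
function truncateToContext(prompt: string, numCtx: number): string {
  const words = prompt.split(/\s+/);
  if (words.length <= numCtx) {
    return prompt;
  }
  // Keep the most recent `numCtx` words; the beginning is discarded.
  return words.slice(words.length - numCtx).join(" ");
}
```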

@slashedstar

I was fiddling with the context length option and it didn't seem to affect how much VRAM was being used

@slashedstar, most LLM providers (their engines, to be exact) preallocate VRAM for the defined context length during model loading.

If Twinny sends a longer context than the provider's maximum context length, the beginning of the context will get truncated.

I forgot to mention I was using Ollama. I assumed Twinny would automatically expand the context as needed (by passing the required num_ctx), because Continue does this: when you change the context length setting for FIM, the VRAM usage changes accordingly. Maybe I didn't set it up properly at the time, I can't really remember.
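For reference, a minimal sketch of what "passing the needed num_ctx" looks like against Ollama's /api/generate endpoint, where the context window can be requested per call via options.num_ctx. The model name and the numeric values below are placeholders; num_ctx, num_predict, and the suffix field are Ollama's documented parameters, and this is not Twinny's actual request code.

```typescript
// Ask Ollama for a larger context window per request by passing num_ctx
// in the options object of /api/generate.
async function fimCompletion(prefix: string, suffix: string): Promise<string> {
  const response = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "codellama:7b-code", // example FIM-capable model
      prompt: prefix,
      suffix: suffix,             // text after the cursor, for fill-in-the-middle
      stream: false,
      options: {
        num_ctx: 8192,    // context window; Ollama preallocates VRAM for this size
        num_predict: 128, // cap on generated tokens
      },
    }),
  });
  const data = await response.json();
  return data.response;
}
```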
