
Update llama_cpp_server.py fixing bugs with non-streaming response #85

Open · wants to merge 1 commit into master

Conversation

orangethewell

This commit fixes the following bugs:

  • `max_tokens` set to `-1` returns an error from the Python server; it now defaults to 8192 tokens for maximum capacity, though this value can be tested and tuned further;
  • `data["content"]` doesn't exist in the llama-cpp-python server response; it is better to pass the entire response JSON, since it has the same structure the extractor expects.

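For reference, a minimal sketch of what the fixed non-streaming request path could look like. The function name, the endpoint path (`/v1/completions`), and the payload keys are illustrative assumptions, not the actual code in `llama_cpp_server.py`:

```python
# Hedged sketch of the two fixes described above; names and endpoint are
# assumptions for illustration only, not the project's real implementation.
import requests

DEFAULT_MAX_TOKENS = 8192  # llama-cpp-python rejects -1, so use a large finite cap instead


def get_non_streaming_response(prompt: str, max_tokens: int = -1,
                               server_url: str = "http://localhost:8000") -> dict:
    # Fix 1: the llama-cpp-python server returns an error for max_tokens == -1,
    # so substitute a finite default instead of "unlimited".
    if max_tokens is None or max_tokens < 0:
        max_tokens = DEFAULT_MAX_TOKENS

    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "stream": False,
    }
    response = requests.post(f"{server_url}/v1/completions", json=payload)
    response.raise_for_status()
    data = response.json()

    # Fix 2: the llama-cpp-python server returns an OpenAI-style body (with a
    # "choices" list), not a top-level "content" field, so hand the whole JSON
    # object to the downstream extractor instead of data["content"].
    return data
```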
@Maximilian-Winter
Owner

@orangethewell Thank you, this is great. I just started working on the project again and will look into it next week.
