# changelog : llama-server REST API (#9291)
Not a REST API breaking change, but server-related: some environment variables were changed in #9308.
After #9398, in the completion response …
Breaking change #9776: better security control for public deployments.
Please note that GET …
Breaking change for …
Was the `/slots` endpoint removed?
For security reasons, `/slots` has been disabled by default since #9776, and this was mentioned in the breaking changes table. I just forgot to update the docs.
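For illustration, a minimal sketch of polling the re-enabled endpoint (the `--slots` startup flag, the address, and the API key here are assumptions, not details from the thread):

```python
import requests

# Sketch: query the slots monitoring endpoint, assuming the server was
# started with it enabled (e.g. `llama-server --slots ...`) and that an
# API key was configured at startup.
API_KEY = "dummy"  # placeholder for the key the server was started with

res = requests.get(
    "http://localhost:8080/slots",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
res.raise_for_status()
for slot in res.json():
    # `is_processing` replaced the older `state` field (see the changelog below)
    print(slot["id"], slot["is_processing"])
```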
Not an API change, but maybe good to know: the default web UI has changed. If you want to use the old completion UI, please follow the instructions in the PR.
For clarification, we will maintain OAI-compat for all APIs under the `/v1/` prefix.
NOTE: OAI support for …
Behavior of …
Added OAI-compat support for `/v1/completions`. If you want to use it with a downstream library, be sure to add the `/v1` prefix to the base URL:

```python
from openai import OpenAI

client = OpenAI(api_key="dummy", base_url="http://localhost:8080/v1")

res = client.completions.create(
    model="davinci-002",
    prompt="I believe the meaning of life is",
    max_tokens=8,
)
```

If you want to use the old non-OAI style, remove the `/v1` prefix.
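For contrast, a minimal sketch of the old non-OAI style (assuming the server's default address and its native `/completion` endpoint with `prompt` and `n_predict` fields):

```python
import requests

# Native llama-server completion request (non-OAI style, no /v1 prefix).
res = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": "I believe the meaning of life is",
        "n_predict": 8,  # number of tokens to generate
    },
)
print(res.json()["content"])
```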
This issue was closed because it has been inactive for 14 days since being marked as stale.
Guessing we want to keep this open. |
## Overview

This is a list of changes to the public HTTP interface of the `llama-server` example. Collaborators are encouraged to edit this post in order to reflect important changes to the API that end up merged into the `master` branch.

If you are building a 3rd party project that relies on `llama-server`, it is recommended to follow this issue and check it carefully before upgrading to new versions.

See also: the `libllama` API changelog.

## Recent API changes (most recent at the top)
- `/v1/chat/completions` now supports `tools` & `tool_choice` (see the sketch after this list)
- `/v1/completions` is now OAI-compat
- `logprobs` is now OAI-compat, defaults to pre-sampling probs
- `/embeddings` supports pooling type `none`
- Added `"tokens"` output to the `/completions` endpoint
- `penalize_nl` removed from `/slots` and `/props` responses
- Changes to `/slots` and `/props` responses
- `/slots` endpoint: removed `slot[i].state`, added `slot[i].is_processing`
- `/slots` is now disabled by default
- Endpoints now check for the API key if it's set
- Added a `/rerank` endpoint
- Added `[DONE]\n\n` in the OAI stream response to match the spec
- Added `seed_cur` to the completion response
- Changes to `/health` and `/slots`
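As a usage illustration for the `tools` & `tool_choice` entry above (a sketch only; the tool schema, model name, and server address are assumptions, not part of the issue body):

```python
from openai import OpenAI

client = OpenAI(api_key="dummy", base_url="http://localhost:8080/v1")

# Hypothetical tool definition following the OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

res = client.chat.completions.create(
    model="local",  # placeholder; llama-server serves the model loaded at startup
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
)
print(res.choices[0].message.tool_calls)
```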
For older changes, use: …
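One way to browse older server changes locally (an assumption on my part, with the `examples/server` path reflecting the repository layout at the time):

```python
import subprocess

# Print one-line summaries of past commits that touched the server example.
subprocess.run(["git", "log", "--oneline", "--", "examples/server"], check=True)
```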
## Upcoming API changes