@atbe commented on Jun 24, 2025

This PR adds log probability (logprobs) support to Ollama's OpenAI-compatible chat completion endpoints.

Key changes include:

  • Added the request parameters logprobs (boolean) and top_logprobs (integer) to ChatCompletionRequest in openai/openai.go.
  • Introduced new LogProb and TopLogProb structs in api/types.go and openai/openai.go to define the log probability response schema.
  • Updated CompletionRequest in llm/server.go to include LogProbs and TopLogProbs; the completion struct now includes CompletionProbabilities for parsing llama.cpp responses, and the Completion function passes n_probs to the underlying llama.cpp server.
  • Modified ChatMiddleware in openai/openai.go to extract the log probability settings from incoming requests and store them in the Gin context.
  • Updated ChatHandler in server/routes.go to retrieve these settings from the context and pass them to the llm.CompletionRequest.
  • Added conversion logic in openai/openai.go (toChatCompletion, toChunk) and server/routes.go to transform llama.cpp's completion_probabilities into the OpenAI-compatible logprobs format, including token, logprob, bytes, and top_logprobs.
  • Both streaming and non-streaming responses now include log probabilities.
  • Added test cases to openai/openai_test.go and a LOGPROBS_IMPLEMENTATION.md file documenting the feature.
