feat: add latency and token_usage info in ai-proxy access log #12042

Open, wants to merge 2 commits into base: master

Revolyssup
Contributor

Description

Fixes # (issue)
When traffic passes through the AI Gateway (APISIX), users want the access log to record additional variables relevant to AI proxy scenarios, covering both AI requests and responses.
The metadata users explicitly asked for, with explanations:

  • Token: Token usage must be recorded separately for the request phase and the response phase.
  • Latency: Time to first byte, i.e. how long it takes for the first response to arrive after the request is sent to the AI instance through the AI Gateway proxy.

Sample access log entry with the new fields:
127.0.0.1 - - [12/Mar/2025:16:24:57 +0530] 127.0.0.1:9080 "POST /anything HTTP/1.1" 429 349 1.088 "-" "curl/8.9.1" - - - "https://somerandom.com "ai_token_usage={\x22prompt_tokens\x22:0,\x22completion_tokens\x22:0,\x22total_tokens\x22:0}" "ai_time_to_first_byte(in seconds)=1.0880000591278""
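
For anyone consuming these logs downstream, here is a minimal Python sketch of pulling the two new fields out of a line like the one above. It assumes the field names `ai_token_usage` and `ai_time_to_first_byte(in seconds)` appear exactly as in the sample and that double quotes inside the JSON are escaped as `\x22`. `parse_ai_fields` is a hypothetical helper name used for illustration, not something this PR adds.

```python
import json
import re

# Sample line copied verbatim from the description (quotes are escaped as \x22).
SAMPLE = r'''127.0.0.1 - - [12/Mar/2025:16:24:57 +0530] 127.0.0.1:9080 "POST /anything HTTP/1.1" 429 349 1.088 "-" "curl/8.9.1" - - - "https://somerandom.com "ai_token_usage={\x22prompt_tokens\x22:0,\x22completion_tokens\x22:0,\x22total_tokens\x22:0}" "ai_time_to_first_byte(in seconds)=1.0880000591278""'''

# Assumption: the token usage object is flat, so the first "}" closes it.
TOKEN_RE = re.compile(r'ai_token_usage=(\{.*?\})')
TTFB_RE = re.compile(r'ai_time_to_first_byte\(in seconds\)=([0-9.]+)')


def parse_ai_fields(line):
    """Return the AI-related fields found in one access log line.

    Keys are only present when the corresponding field is found.
    """
    result = {}
    match = TOKEN_RE.search(line)
    if match:
        # Undo the \x22 escaping before decoding the JSON object.
        raw_json = match.group(1).replace(r'\x22', '"')
        result["token_usage"] = json.loads(raw_json)
    match = TTFB_RE.search(line)
    if match:
        result["time_to_first_byte"] = float(match.group(1))
    return result


if __name__ == "__main__":
    print(parse_ai_fields(SAMPLE))
```

Lines without the AI fields yield an empty dict, so the helper can be run over mixed access logs without special casing.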

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

dosubot (bot) added the size:M (This PR changes 30-99 lines, ignoring generated files) and enhancement (New feature or request) labels on Mar 12, 2025