
feat(llm): Add prompt caching for Anthropic Claude models #12164

Open · wants to merge 1 commit into base: main

Conversation

@Seayon (Contributor) commented Dec 27, 2024

Add prompt caching parameters for all Claude 3 series models, so that tagged text blocks can be cached and reused across requests to improve response speed. Each model can cache up to 4 text blocks.
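
For context, Anthropic's prompt caching works by marking stable prompt prefixes with cache_control breakpoints that later requests with the same prefix can reuse. Below is a minimal sketch of a direct call through the official anthropic Python SDK, not this PR's Dify integration; the model id and placeholder document text are illustrative assumptions (earlier SDK/API versions also gated this feature behind an anthropic-beta: prompt-caching-2024-07-31 header):

```python
# Minimal prompt-caching sketch against the Anthropic Messages API.
# Not the Dify integration from this PR; model id and text are placeholders.
import anthropic

# Stands in for a large, stable context block worth caching (caching only
# kicks in above a per-model minimum prompt length).
LONG_REFERENCE_TEXT = "Full text of the reference document goes here. " * 200

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system=[
        {"type": "text", "text": "You answer questions about the attached document."},
        {
            "type": "text",
            "text": LONG_REFERENCE_TEXT,
            # Cache breakpoint: the prefix up to and including this block is
            # cached and reused by later requests. At most 4 breakpoints are
            # allowed per request, matching the "up to 4 text blocks" above.
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": "Summarize the document."}],
)

# usage reports cache activity: cache_creation_input_tokens on the first call,
# cache_read_input_tokens on later calls that hit the cached prefix.
print(response.usage)
```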

Summary


Resolves #7382

Screenshots

Before

[screenshot: iShot_2024-12-30_09 11 53]

After

[screenshot: iShot_2024-12-30_09 10 41] [screenshot: iShot_2024-12-30_09 14 14]

Checklist

Important

Please review the checklist below before submitting your pull request.

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods

@dosubot (bot) added the size:M (This PR changes 30-99 lines, ignoring generated files) and ⚙️ feat:model-runtime labels on Dec 27, 2024
@Seayon force-pushed the support-prompt-caching branch from 4fea270 to 1c0365a on December 27, 2024 at 09:46
@crazywoola (Member) commented:

Please fix the lint errors.

@Seayon force-pushed the support-prompt-caching branch from 1c0365a to 1cecc25 on December 27, 2024 at 16:46
Add prompt caching parameters for all Claude-3 series models, supporting tagged text
caching to improve response speed. Each model can cache up to 4 text blocks.
@Seayon force-pushed the support-prompt-caching branch from b8dc3e1 to b09f068 on December 27, 2024 at 16:56
@Seayon (Contributor, Author) commented Dec 27, 2024

> Please fix the lint errors.

@crazywoola Done

Successfully merging this pull request may close these issues: Prompt Caching by Anthropic Claude