Improved stability of litellm models for reasoning models. #538
Hey @JoelNiklaus, since you have made most commits around LiteLLM.
Hi @satpalsr,
@JoelNiklaus Thanks. I'll just drop it as a separate issue for anyone to pick up.
kwargs["caching"] = False
logger.info("Response is empty, retrying without caching")
response = litellm.completion(**kwargs)

if content and "<think>" in content:
    logger.debug(f"Removing <think> tags from response: {content}")
Why are we removing think tags from the answer here? I think it should be done in the metric function, no?
If we are evaluating a reasoning model, the grader will look at the thinking tokens unless we remove them. Otherwise, we would need to remove them in every metric function.
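For illustration, the stripping step discussed here could look like the sketch below. It is not the code from this PR: the helper name `strip_think_tags` and the regex are assumptions, and it only handles well-formed `<think>...</think>` pairs.

```python
import re

# Hypothetical helper, not taken from the PR diff.
# DOTALL lets "." match newlines, so multi-line reasoning is removed too.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think_tags(content: str) -> str:
    """Remove every <think>...</think> block and trim surrounding whitespace."""
    return _THINK_BLOCK.sub("", content).strip()
```

An unclosed `<think>` tag is left untouched by this pattern, so a truncated generation would still reach the grader with its reasoning attached.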
Yeah, but in that case you lose the thinking traces in the details.
What we would need:
- keep the thinking traces in the details
- allow the user to choose whether or not to evaluate with thinking tags
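One way the two requirements above might fit together is sketched below. All names here (`ParsedResponse`, `parse_response`, `text_for_metric`, `include_thinking`) are hypothetical and do not come from the repository; the idea is simply to keep the raw response and the extracted reasoning alongside the answer, and let a flag decide what the metric sees.

```python
import re
from dataclasses import dataclass

# Hypothetical names throughout; this is a sketch, not the project's API.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", flags=re.DOTALL)


@dataclass
class ParsedResponse:
    raw: str              # full model output, kept for the details
    reasoning: list[str]  # contents of each <think> block
    answer: str           # output with <think> blocks removed


def parse_response(content: str) -> ParsedResponse:
    """Split a response into reasoning traces and the final answer."""
    reasoning = [block.strip() for block in _THINK_BLOCK.findall(content)]
    answer = _THINK_BLOCK.sub("", content).strip()
    return ParsedResponse(raw=content, reasoning=reasoning, answer=answer)


def text_for_metric(parsed: ParsedResponse, include_thinking: bool = False) -> str:
    """Return the text a metric should grade, depending on the user's choice."""
    return parsed.raw if include_thinking else parsed.answer
```

With this shape, nothing is discarded at generation time: the details can log `raw` or `reasoning`, while each metric receives whichever text the user selected.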
True. Maybe we can open an issue for that and add that improvement in a later PR?
Co-authored-by: Nathan Habib <30601243+NathanHB@users.noreply.github.com>
No description provided.