add async query to improve latency #62
base: main
Adds support for custom evaluation thresholds, introduces ThresholdedTrustworthyRAGScore type, and improves validation error handling with better documentation.
Co-authored-by: Anish Athalye <me@anishathalye.com>
…n in favor of the new Validator API.
src/cleanlab_codex/validator.py
Outdated
expert_answer = self._remediate(query)
if expert_answer == None:
    self._project._sdk_client.projects.entries.add_question(
        self._project._id, question=query,
If is_bad_response == True, and expert_answer = None, then there's extra work being done.
Make add_question async as well.
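One way this suggestion could look, as a minimal sketch only: wrap the blocking call in `asyncio.to_thread` so it can be awaited alongside other work. `add_question_blocking` is a hypothetical stand-in for the SDK's `add_question` call, not the real client, and the sleep merely simulates network latency.

```python
import asyncio
import time


def add_question_blocking(project_id: str, question: str) -> dict:
    """Hypothetical stand-in for the blocking SDK add_question call."""
    time.sleep(0.05)  # simulated network round trip (assumption)
    return {"project_id": project_id, "question": question}


async def add_question_async(project_id: str, question: str) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(
        add_question_blocking, project_id, question=question
    )


async def log_many(project_id: str, questions: list[str]) -> list[dict]:
    # gather() preserves input order in its result list.
    return await asyncio.gather(
        *(add_question_async(project_id, q) for q in questions)
    )
```

If the SDK ships a native async client, using that directly would likely be preferable to a thread wrapper.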
Why did you mark this resolved? This seems like a critical consideration.
If is_bad_response == True, and expert_answer = None, then extra compute may be running in these cases. We need to time this implementation against the original implementation over a bunch of cases where is_bad_response == True, and expert_answer = None.
Please ensure you've created at least 3 Codex projects from the same dataset, running the queries from this dataset in a different order in each one. Then verify that in each project the results using this async logic exactly match the results using our original logic, including when some of the questions have already been answered in the Codex project.
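The check requested above could be scripted roughly as follows. This is a sketch under assumptions: `make_original` and `make_candidate` are hypothetical factories that each return a fresh query-to-answer callable (standing in for a validator bound to a newly created Codex project), and nothing here calls the real SDK.

```python
import random
import time
from typing import Callable, Optional

Validate = Callable[[str], Optional[str]]


def run_and_time(fn: Validate, queries: list[str]) -> tuple[dict, float]:
    """Run fn over queries in order; return per-query results and elapsed seconds."""
    results = {}
    start = time.perf_counter()
    for q in queries:
        results[q] = fn(q)
    return results, time.perf_counter() - start


def compare(
    make_original: Callable[[], Validate],
    make_candidate: Callable[[], Validate],
    queries: list[str],
    n_orders: int = 3,
    seed: int = 0,
) -> list[dict]:
    """Compare both implementations over several shuffled query orders."""
    rng = random.Random(seed)
    report = []
    for _ in range(n_orders):
        order = list(queries)
        rng.shuffle(order)
        # Fresh instances per order, mirroring one new project per run.
        expected, t_orig = run_and_time(make_original(), order)
        actual, t_new = run_and_time(make_candidate(), order)
        report.append({
            "order": order,
            "mismatches": [q for q in order if expected[q] != actual[q]],
            "t_original": t_orig,
            "t_candidate": t_new,
        })
    return report
```

An empty `mismatches` list for every order is the pass condition; the two timing fields give the latency comparison for the same run.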
src/cleanlab_codex/validator.py
Outdated
# TODO: Make this async as well in the future (right now it takes 8% of the time)
with ThreadPoolExecutor(max_workers=1) as executor:
    executor.submit(
        self._project._sdk_client.projects.entries.add_question,
        self._project._id, question=query
    ).result()
Before merge, a SWE needs to confirm there isn't some simple latency gain left on the table in this line, or at least that the proper backend engineering tickets are in place to reduce this line's latency.
This line should ideally be near instant; backend operations shouldn't block it client side (e.g. actually organizing how the new query gets logged in the DB shouldn't block the client).
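For illustration, a client-side version of "don't block on this line" might submit the call to a long-lived executor and return without waiting on `.result()`. This is a sketch, not the PR's code: `BackgroundQuestionLogger` and the `add_question` callable passed to it are hypothetical names, and failures are only logged here.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

logger = logging.getLogger(__name__)


class BackgroundQuestionLogger:
    """Submits add_question calls to a shared pool without blocking the caller."""

    def __init__(self, add_question: Callable[..., object], max_workers: int = 2):
        self._add_question = add_question  # hypothetical SDK call (assumption)
        self._executor = ThreadPoolExecutor(max_workers=max_workers)

    def log(self, project_id: str, question: str) -> Future:
        # submit() returns immediately; the call runs on a worker thread.
        future = self._executor.submit(
            self._add_question, project_id, question=question
        )
        future.add_done_callback(self._report_failure)
        return future

    @staticmethod
    def _report_failure(future: Future) -> None:
        # Surface errors that would otherwise be silently dropped.
        if not future.cancelled() and future.exception() is not None:
            logger.warning("add_question failed: %s", future.exception())

    def close(self) -> None:
        # Wait for pending submissions before shutting down.
        self._executor.shutdown(wait=True)
```

The trade-off is that the caller no longer learns synchronously whether the question was logged, so this only fits if a dropped log entry is acceptable or handled elsewhere.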
Add async query to improve latency
NEW:

OLD: