
[FT] Rerun evaluations with new metrics based on completions saved in details file #467

Closed
JoelNiklaus opened this issue Dec 19, 2024 · 3 comments · Fixed by #488
Labels
feature request New feature/request

Comments

@JoelNiklaus
Contributor

Issue encountered

Currently, rerunning an evaluation with a new metric requires rerunning the entire inference, which can be very costly.

Solution/Feature

It would be great if we could specify a details file containing the predictions and use it to compute additional metrics.

@JoelNiklaus JoelNiklaus added the feature request New feature/request label Dec 19, 2024
@JoelNiklaus
Contributor Author

@clefourrier @NathanHB I am happy to implement this. Do you have suggestions on how best to solve this?

@NathanHB
Member

NathanHB commented Jan 2, 2025

It would be great! I think the best way would be to recreate sample_id_to_responses from the details file and run the metrics on those.

From the pipeline.py file:

```python
sample_id_to_responses = self._run_model()
self._compute_metrics(sample_id_to_responses)
```

You would need to inspect what is in sample_id_to_responses and try to reconstruct it from the details file.
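For reference, here is a minimal sketch of that idea. It assumes the details file is a parquet file with one row per sample, and the column names ("example" for the sample id, "predictions" for the saved completions) are guesses rather than lighteval's confirmed schema, so inspect your own details file first:

```python
# Sketch only: rebuild sample_id_to_responses from a saved details file.
# The column names ("example", "predictions") are assumptions, not
# lighteval's confirmed schema.
from collections import defaultdict

import pandas as pd


def load_responses_from_details(details_path: str) -> dict:
    """Rebuild a sample_id -> responses mapping from a details parquet file."""
    details = pd.read_parquet(details_path)
    sample_id_to_responses = defaultdict(list)
    for _, row in details.iterrows():
        # Group the saved predictions by sample id, mirroring the shape of
        # what self._run_model() would have produced.
        sample_id_to_responses[row["example"]].append(row["predictions"])
    return dict(sample_id_to_responses)


# Then, in place of rerunning inference in pipeline.py:
# sample_id_to_responses = load_responses_from_details("path/to/details.parquet")
# self._compute_metrics(sample_id_to_responses)
```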

@JoelNiklaus
Contributor Author

Great, will try that, thanks Nathan!
