Added diskcache to base model. #480
base: main
Conversation
Great addition! Ping us when ready :)
Thanks, I don't know when I'll have the capacity to add it to the other methods.
This might not be necessary anymore with PR #488.
Want us to close this one?
I personally think it would still be nice to have caching here too, but it is no longer strictly necessary for me.
It would still be useful for making local inference of large models more robust.
Some models are very expensive to run inference on (e.g., Llama-3.3-70B). When we need to rerun inference, for example to add a new metric, it would be very time-consuming and expensive, especially since at least 4 80GB GPUs are necessary for inference.
We might want to add a flag to enable/disable caching. Also, we might want it for the other methods like loglikelihood generation too.
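For illustration, here is a minimal sketch of what a disk-backed cache with an enable/disable flag could look like, using the `diskcache` package. The wrapper class, the method names (`greedy_until`, `_run_inference`), and the `use_cache` flag are hypothetical placeholders chosen for this sketch, not the actual API of this PR:

```python
import hashlib
import json

from diskcache import Cache


class CachedModel:
    """Sketch of a model wrapper that caches inference results on disk."""

    def __init__(self, cache_dir: str = ".inference_cache", use_cache: bool = True):
        # Flag to enable/disable caching, as proposed above (hypothetical name).
        self.use_cache = use_cache
        self.cache = Cache(cache_dir) if use_cache else None

    def _cache_key(self, method: str, prompt: str, **params) -> str:
        # Deterministic key over method name, prompt, and generation params,
        # so a rerun (e.g., to add a new metric) reuses previous outputs.
        payload = json.dumps(
            {"method": method, "prompt": prompt, "params": params}, sort_keys=True
        )
        return hashlib.sha256(payload.encode()).hexdigest()

    def greedy_until(self, prompt: str, **params) -> str:
        key = self._cache_key("greedy_until", prompt, **params)
        if self.use_cache and (cached := self.cache.get(key)) is not None:
            return cached
        result = self._run_inference(prompt, **params)  # the expensive GPU call
        if self.use_cache:
            self.cache.set(key, result)
        return result

    def _run_inference(self, prompt: str, **params) -> str:
        raise NotImplementedError  # stands in for the actual model forward pass
```

The same pattern would extend to the other methods mentioned (e.g., loglikelihood): each one only needs its own key prefix so cached results from different methods don't collide.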