Added diskcache to base model. #480

Open · wants to merge 3 commits into main
Conversation

JoelNiklaus
Contributor

Some models are very expensive to run inference on (e.g., Llama-3.3-70B). Rerunning inference, for example to add a new metric, would be very time-consuming and expensive, especially since at least 4×80GB GPUs are necessary for inference.

We might want to add a flag to enable/disable caching. We might also want it for the other methods, like loglikelihood, too.
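The idea can be sketched with the standard library alone (the PR itself uses the `diskcache` package, which offers the same get/set-by-key semantics); `DiskCachedModel`, its `generate` signature, and the `use_cache` flag below are hypothetical illustrations, not lighteval's actual API:

```python
import hashlib
import json
import tempfile
from pathlib import Path


class DiskCachedModel:
    """Toy stand-in for a model whose generate() results are cached on disk,
    so a rerun (e.g. to compute a new metric) skips the expensive inference."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.calls = 0  # counts actual (non-cached) inference calls

    def _key(self, prompt, **params):
        # Hash the prompt plus generation parameters into a stable cache key,
        # so changed sampling settings don't return stale results.
        payload = json.dumps({"prompt": prompt, **params}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def _expensive_generate(self, prompt):
        # Stand-in for real GPU inference.
        self.calls += 1
        return prompt.upper()

    def generate(self, prompt, use_cache=True, **params):
        # use_cache mirrors the enable/disable flag discussed above.
        path = self.cache_dir / f"{self._key(prompt, **params)}.json"
        if use_cache and path.exists():
            return json.loads(path.read_text())
        result = self._expensive_generate(prompt)
        if use_cache:
            path.write_text(json.dumps(result))
        return result


model = DiskCachedModel(tempfile.mkdtemp())
model.generate("hello world")  # runs "inference" and writes the cache entry
model.generate("hello world")  # served from disk, no second inference call
```

Keying on the full prompt plus generation parameters is what makes the cache safe to leave enabled by default; the same pattern would extend to loglikelihood-style methods by including the method name in the key.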

@NathanHB NathanHB (Member) left a comment

Great addition! Ping us when ready :)

@JoelNiklaus
Contributor Author

Thanks, I don't know when I will have the capacity to add it to the other methods.

@JoelNiklaus
Contributor Author

This might not be necessary anymore with PR #488.

@clefourrier
Member

Want us to close this one?

@JoelNiklaus
Contributor Author

I personally think it would still be nice to have caching here too, but it is no longer strictly necessary for me.

@JoelNiklaus
Contributor Author

It would still be useful for making local inference of large models more robust.
