
Auto determine how much of the model to load into RAM #9

Open
coreylowman opened this issue May 2, 2023 · 0 comments

Comments

@coreylowman
Owner

Use cases:

  1. You can fit the whole model into GPU ram
  2. You can fit part of the model into GPU ram
  3. You need to keep all the model weights on disk

In all these cases, we should be able to detect how much GPU RAM is available and use that to determine the maximum amount of the model to keep resident on the GPU. More advanced use cases, such as sharing the GPU with other applications, may need manual control over the memory budget, but that can be added later.
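A minimal sketch of how the decision could look, assuming a hypothetical `free_gpu_memory()` helper (e.g. wrapping CUDA's `cudaMemGetInfo`) and known per-layer weight sizes; the names and the greedy layer-packing strategy are illustrative, not an existing API in this repo:

```rust
/// How much of the model we decide to keep resident on the GPU.
enum LoadPlan {
    /// Case 1: the whole model fits in GPU RAM.
    FullyOnGpu,
    /// Case 2: only the first `num_layers_on_gpu` layers fit; the rest stay on disk.
    PartiallyOnGpu { num_layers_on_gpu: usize },
    /// Case 3: nothing fits; stream weights from disk each forward pass.
    OnDisk,
}

/// Hypothetical helper: bytes of free GPU memory (e.g. via CUDA's cudaMemGetInfo).
fn free_gpu_memory() -> usize {
    unimplemented!("query the driver for free GPU memory")
}

/// Pick a load plan from the free GPU memory and the per-layer weight sizes.
/// `headroom` reserves space for activations and other runtime allocations.
fn choose_load_plan(layer_sizes: &[usize], headroom: usize) -> LoadPlan {
    let budget = free_gpu_memory().saturating_sub(headroom);
    let total: usize = layer_sizes.iter().sum();

    if total <= budget {
        return LoadPlan::FullyOnGpu;
    }

    // Greedily place as many leading layers as the budget allows.
    let mut used = 0;
    let mut fits = 0;
    for &size in layer_sizes {
        if used + size > budget {
            break;
        }
        used += size;
        fits += 1;
    }

    if fits == 0 {
        LoadPlan::OnDisk
    } else {
        LoadPlan::PartiallyOnGpu { num_layers_on_gpu: fits }
    }
}
```

The `headroom` parameter is there because free memory at load time overstates what is safe to use: activations, KV caches, and allocator fragmentation all claim GPU RAM after the weights are loaded.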
