
Auto determine how much of the model to load into RAM #9

Open
coreylowman opened this issue May 2, 2023 · 0 comments

Comments

@coreylowman
Owner

Use cases:

  1. You can fit the whole model into GPU ram
  2. You can fit part of the model into GPU ram
  3. You need to keep all the model weights on disk

In all these cases, we should be able to detect how much GPU RAM is available and use that to determine the maximum amount of the model to keep resident on the GPU. More advanced use cases, such as sharing the GPU with other applications, may need manual control over the memory budget, but that can be added later.
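A minimal sketch of how the decision could look, assuming a hypothetical `free_gpu_memory()` helper (e.g. wrapping CUDA's `cudaMemGetInfo`) and known per-layer weight sizes; the names and the greedy layer-packing strategy are illustrative, not an existing API in this repo:

```rust
/// How much of the model we decide to keep resident on the GPU.
enum LoadPlan {
    /// Case 1: the whole model fits in GPU RAM.
    FullyOnGpu,
    /// Case 2: only the first `num_layers_on_gpu` layers fit; the rest stay on disk.
    PartiallyOnGpu { num_layers_on_gpu: usize },
    /// Case 3: nothing fits; stream weights from disk each forward pass.
    OnDisk,
}

/// Hypothetical helper: bytes of free GPU memory (e.g. via CUDA's cudaMemGetInfo).
fn free_gpu_memory() -> usize {
    unimplemented!("query the driver for free GPU memory")
}

/// Pick a load plan from the free GPU memory and the per-layer weight sizes.
/// `headroom` reserves space for activations and other runtime allocations.
fn choose_load_plan(layer_sizes: &[usize], headroom: usize) -> LoadPlan {
    let budget = free_gpu_memory().saturating_sub(headroom);
    let total: usize = layer_sizes.iter().sum();

    if total <= budget {
        return LoadPlan::FullyOnGpu;
    }

    // Greedily place as many leading layers as the budget allows.
    let mut used = 0;
    let mut fits = 0;
    for &size in layer_sizes {
        if used + size > budget {
            break;
        }
        used += size;
        fits += 1;
    }

    if fits == 0 {
        LoadPlan::OnDisk
    } else {
        LoadPlan::PartiallyOnGpu { num_layers_on_gpu: fits }
    }
}
```

The `headroom` parameter is there because free memory at load time overstates what is safe to use: activations, KV caches, and allocator fragmentation all claim GPU RAM after the weights are loaded.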
