[BugFix] Don't scan entire cache dir when loading model
Recent PR vllm-project#12926 added logging of the time to load model weights. To determine whether any new files were downloaded to the cache, it scans the entire local HF cache dir before and after to compare its size. This can be very expensive if the cache is large and/or it is on a remote filesystem mount, which is common.

With this fix, the time is still logged even if no files were downloaded, unless HF_HUB_OFFLINE is set. I think this is ok, since the logged time still includes the time to connect to the hub to check for the existence of new files.

Signed-off-by: Nick Hill <nhill@redhat.com>
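
For illustration only, here is a minimal sketch of the trade-off described above. It is not the actual vLLM code: `dir_size_bytes`, `hub_offline`, and `timed_download` are hypothetical names, and the set of truthy values accepted for HF_HUB_OFFLINE is an assumption. The first helper shows why a before/after size check costs one stat call per cached file; the last shows the cheaper alternative of timing the download step alone and skipping the report when offline.

```python
import os
import time
from typing import Callable, Mapping, Optional


def dir_size_bytes(path: str) -> int:
    """Walk the whole tree and sum file sizes (the expensive approach).

    Cost grows with the number of files in the cache, and every lstat
    is a round trip on a remote filesystem mount.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                # lstat does not follow symlinks, so linked files are
                # not counted twice.
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def hub_offline(env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical helper: treat common truthy strings as 'offline'."""
    value = env.get("HF_HUB_OFFLINE", "")
    return value.strip().upper() in {"1", "ON", "YES", "TRUE"}


def timed_download(download: Callable[[], object],
                   offline: bool) -> Optional[float]:
    """Time the download step without touching the cache directory.

    Returns elapsed seconds, or None when offline (nothing to report).
    """
    start = time.perf_counter()
    download()
    elapsed = time.perf_counter() - start
    return None if offline else elapsed
```

A caller could log the value returned by `timed_download(fetch, hub_offline())` whenever it is not `None`, where `fetch` stands in for whatever function actually downloads the weights. Even when nothing is downloaded, the elapsed time still covers the check against the hub.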