-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
planning: Supporting vision model (Llava and Llama3.2) #1493
Comments
Updates:
![]() |
We should ensure that |
@vansangpfiev and @hahuyhoang411 - can I get your thoughts to add to this list from my naive understanding? To support Vision models on Cortex, we need the following:
|
We probably need to consider changing the UX for inferencing with vision model, for example:
|
Thank you @vansangpfiev and @hahuyhoang411! Quick notes from call:
|
Added an action item, where model management should pull metadata from chat model file instead of projector file (just to make sure we tracked this) |
Problem Statement
To support Vision models on Cortex, we need the following:
v1/models/start
takes inmodel_path
(.gguf) andmmproj
parameters/chat/completions
to take in messages contentimage_url
1. Downloading model .gguf and mmprog file:
For fully compatible with Jan, cortex should be able to pull mmproj file along with GGUF file.
Let's take the image below for example.

Scenario steps:
.gguf
file) for user to select.mmproj
is also ended with.gguf
, we also listed that in the selection.So, we need to come up with a way so that cortex knows when to download the
mmproj
file along with traditional gguf file.cc @dan-homebrew , @louis-jan , @nguyenhoangthuan99, @vansangpfiev
Feature Idea
Couple of thoughts:
1.1. For CLI: Ignore file name contains
mmproj
when presenting selection list. And download it along with selected traditional gguf file.1.2. For API: Always scan the directory with same level as the URL provided. If there's a
mmproj
file name, cortex adds it to the download list.mmproj
file, return error with clear error message.The text was updated successfully, but these errors were encountered: