feat(multimodal): Video understanding #2318
mudler added a commit that referenced this issue on Oct 4, 2024:
Closes: #2318
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler added a commit that referenced this issue on Oct 4, 2024:
* feat(vllm): add support for image-to-text
  Related to #3670
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(vllm): add support for video-to-text
  Closes: #2318
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(vllm): support CPU installations
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(vllm): add bnb
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: add docs reference
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Apply suggestions from code review
  Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
siddimore pushed a commit to siddimore/LocalAI that referenced this issue on Oct 6, 2024:
* feat(vllm): add support for image-to-text
  Related to mudler#3670
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(vllm): add support for video-to-text
  Closes: mudler#2318
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(vllm): support CPU installations
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(vllm): add bnb
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: add docs reference
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Apply suggestions from code review
  Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
It should now be possible to expand the vision support to understand videos. Projects such as
https://github.com/Efficient-Large-Model/VILA
https://github.com/LLaVA-VL/LLaVA-NeXT
https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct?s=09
make this feasible today. Since OpenAI has announced GPT-4o, it makes sense to start looking into open solutions that we can plug into the API through specific backends.
llama.cpp: ggml-org/llama.cpp#9165
vLLM: #3670