Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

feat(multimodal): Video understanding #2318

Closed
mudler opened this issue May 13, 2024 · 0 comments · Fixed by #3729
Closed

feat(multimodal): Video understanding #2318

mudler opened this issue May 13, 2024 · 0 comments · Fixed by #3729
Labels
enhancement New feature or request roadmap up for grabs Tickets that no-one is currently working on

Comments

@mudler
Copy link
Owner

mudler commented May 13, 2024

It should be possible now to expand the vision support to understand videos, there are projects like
https://github.com/Efficient-Large-Model/VILA
https://github.com/LLaVA-VL/LLaVA-NeXT
https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct?s=09

which make this possible nowadays. Since OpenAI has announced GPT4o, makes sense start looking into open solutions that we can plug into the API with specific backends.

llama.cpp: ggml-org/llama.cpp#9165
vLLM: #3670

@mudler mudler added the enhancement New feature or request label May 13, 2024
@mudler mudler added roadmap up for grabs Tickets that no-one is currently working on labels May 13, 2024
mudler added a commit that referenced this issue Oct 4, 2024
Closes: #2318

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler added a commit that referenced this issue Oct 4, 2024
* feat(vllm): add support for image-to-text

Related to #3670

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vllm): add support for video-to-text

Closes: #2318

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vllm): support CPU installations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vllm): add bnb

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add docs reference

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Apply suggestions from code review

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
siddimore pushed a commit to siddimore/LocalAI that referenced this issue Oct 6, 2024
)

* feat(vllm): add support for image-to-text

Related to mudler#3670

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vllm): add support for video-to-text

Closes: mudler#2318

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vllm): support CPU installations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vllm): add bnb

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add docs reference

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Apply suggestions from code review

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
enhancement New feature or request roadmap up for grabs Tickets that no-one is currently working on
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant