# multimodal-models

Here are 5 public repositories matching this topic...

uncbiag / Awesome-Foundation-Models

A curated list of foundation models for vision and language tasks

transformer-models vision-transformer multimodal-models foundation-models large-language-models

Updated Nov 1, 2024

YingqingHe / Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D, and audio).

text-to-speech multimodality text-to-image text-to-audio text-to-video text-to-music multimodal-models aigc large-language-models text-to-3d multimodal-generation text-to-sound large-vision-language-models multimodal-large-language-models

Updated Nov 6, 2024
HTML

arman-aminian / video-search

Video Search with CLIP

nlp image-search clip zero-shot video-search multimodal multilingual-models multimodal-models

Updated Aug 13, 2023
Jupyter Notebook

pokarats / LAP-final-project

Multimodal Bi-Transformers (MMBT) in Biomedical Text/Image Classification

text-classification transformer image-classification transfer-learning attention-mechanism bert biomedical-image-processing attention-visualization multimodal-representation huggingface-transformers sparse-data-learning multimodal-models mmbt-model

Updated Apr 13, 2021
Jupyter Notebook

antonio-f / Phi-3-Vision

Phi-3-Vision model test - running locally

machine-learning computer-vision jupyter-notebook artificial-intelligence image-to-text multimodal-learning hands-on hugging-face multimodal-models llms running-locally tiny-models small-models phi-3 phi-3-vision

Updated May 29, 2024
Jupyter Notebook
