A state-of-the-art open visual language model | multimodal pretrained model
🦀️ CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/
Commanding robots using only language model prompts
Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"
Official repo for the paper "VCR: Visual Caption Restoration"; see arxiv.org/pdf/2406.06462 for details.
Build a simple, basic multimodal large model from scratch 🤖
Implementation of the paper "Learn 'No' to Say 'Yes' Better".
Code for the paper "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"
Scene and animal attribute retrieval from camera trap data with domain-adapted vision-language models
Universal Adversarial Perturbations for Vision-Language Pre-trained Models
[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
Chain of Images for Intuitively Reasoning
A benchmark for evaluating hallucinations in large visual language models
A from-scratch implementation of PaliGemma, built by following a YouTube tutorial to learn and demonstrate modern development practices.
CLI for converting UForm models to CoreML.