OpenVINO™ GenAI: Supported Models

Large language models

| Architecture | Models | Example HuggingFace Models |
| --- | --- | --- |
| ChatGLMModel | ChatGLM | |
| GemmaForCausalLM | Gemma | |
| GPTNeoXForCausalLM | Dolly, RedPajama | |
| LlamaForCausalLM | Llama 3, Llama 2, OpenLLaMA, TinyLlama | |
| MistralForCausalLM | Mistral, Notus, Zephyr | |
| PhiForCausalLM | Phi | |
| QWenLMHeadModel | Qwen | |

Note

LoRA adapters are supported.

The pipeline can work with other similar topologies produced by optimum-intel with the same model signature. The model is required to have the following inputs after the conversion:

  1. input_ids contains the tokens.
  2. attention_mask is filled with 1.
  3. beam_idx selects beams.
  4. position_ids (optional) encodes the position of the token currently being generated in the sequence.

The model must also have a single logits output.
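The signature requirement above can be checked mechanically once the input and output names of a converted model are known. The following is a minimal sketch, not part of OpenVINO GenAI: `REQUIRED_INPUTS`, `OPTIONAL_INPUTS`, and `check_signature` are hypothetical names introduced for illustration, and the function only compares names, not shapes or data types.

```python
# Illustrative sketch only: these names are hypothetical helpers,
# not part of the OpenVINO GenAI API.
REQUIRED_INPUTS = {"input_ids", "attention_mask", "beam_idx"}
OPTIONAL_INPUTS = {"position_ids"}  # listed for reference; its absence is not an error


def check_signature(input_names, output_names):
    """Return a list of problems; an empty list means the names match."""
    problems = []

    # Every required input must be present; position_ids may be absent.
    missing = REQUIRED_INPUTS - set(input_names)
    if missing:
        problems.append("missing inputs: " + ", ".join(sorted(missing)))

    # The model is expected to expose exactly one output, named logits.
    if list(output_names) != ["logits"]:
        problems.append("expected a single logits output")

    return problems


# A model with all four inputs and one logits output passes the check.
ok = check_signature(
    ["input_ids", "attention_mask", "beam_idx", "position_ids"], ["logits"]
)
# A model without beam_idx is reported as incomplete.
bad = check_signature(["input_ids", "attention_mask"], ["logits"])
```

The name lists would come from the converted model itself, for example by reading it with the OpenVINO runtime and listing the names of its inputs and outputs.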

Note

Models should belong to the same family and have the same tokenizers.

Image generation models

| Architecture | Text 2 image | Image 2 image | Inpainting | LoRA support | Example HuggingFace Models |
| --- | --- | --- | --- | --- | --- |
| Latent Consistency Model | Supported | Supported | Supported | Supported | |
| Stable Diffusion | Supported | Supported | Supported | Supported | |
| Stable Diffusion Inpainting | Not applicable | Not applicable | Supported | Supported | |
| Stable Diffusion XL | Supported | Supported | Supported | Supported | |
| Stable Diffusion XL Inpainting | Not applicable | Not applicable | Supported | Supported | |
| Stable Diffusion 3 | Supported | Not supported | Not supported | Not supported | |
| Flux | Supported | Supported | Supported | Partially Supported | |

Visual language models

| Architecture | Models | LoRA support | Example HuggingFace Models | Notes |
| --- | --- | --- | --- | --- |
| InternVL2 | InternVL2 | Not supported | | |
| LLaVA | LLaVA-v1.5 | Not supported | | |
| LLaVA-NeXT | LLaVa-v1.6 | Not supported | | |
| MiniCPMV | MiniCPM-V-2_6 | Not supported | | |
| Phi3VForCausalLM | phi3_v | Not supported | | GPU isn't supported.<br>These models' configs aren't consistent. It's required to override the default `eos_token_id` with the one from a tokenizer: `generation_config.set_eos_token_id(pipe.get_tokenizer().get_eos_token_id())`. |
| Qwen2-VL | Qwen2-VL | Not supported | | |
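The `eos_token_id` override mentioned in the phi3_v notes can be wrapped in a small helper. This is a sketch under stated assumptions: `sync_eos_token_id` is a hypothetical name, and the `Fake*` classes are stand-ins that mimic only the three methods named in the note so the example runs without a model. With a real pipeline, `pipe` and `generation_config` would be the objects provided by OpenVINO GenAI.

```python
def sync_eos_token_id(pipe, generation_config):
    """Copy the tokenizer's eos_token_id into generation_config and return it."""
    eos_token_id = pipe.get_tokenizer().get_eos_token_id()
    generation_config.set_eos_token_id(eos_token_id)
    return eos_token_id


# Stand-ins for illustration only; they are NOT OpenVINO GenAI classes.
class FakeTokenizer:
    def get_eos_token_id(self):
        return 7  # arbitrary value, not a real model's token id


class FakePipeline:
    def get_tokenizer(self):
        return FakeTokenizer()


class FakeGenerationConfig:
    def __init__(self):
        self.eos_token_id = -1  # placeholder for an inconsistent default

    def set_eos_token_id(self, token_id):
        self.eos_token_id = token_id


config = FakeGenerationConfig()
returned = sync_eos_token_id(FakePipeline(), config)
```

Calling the helper once, before generation starts, keeps the override in one place instead of repeating the chained call at every call site.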

Whisper models

| Architecture | Models | LoRA support | Example HuggingFace Models |
| --- | --- | --- | --- |
| WhisperForConditionalGeneration | Whisper | Not supported | |
| WhisperForConditionalGeneration | Distil-Whisper | Not supported | |

Some models may require access request submission on the Hugging Face page to be downloaded.

If https://huggingface.co/ is down, the conversion step won't be able to download the models.