Vladimir Mandic edited this page Oct 28, 2024 · 13 revisions

Access

Note: SD3.0 and SD3.5 are under gated access

Load Model

Components

SD3 model consists of:

  • MMDiT (multi-modal diffusion transformer)
    Note: "medium" primarily refers to the number of parameters in the MMDiT component: 2B
    StabilityAI may release smaller and/or larger variants, as the full SD3 has 8B parameters
  • VAE (variational autoencoder)
  • Multiple text encoders: two CLiP models and an optional t5 text encoder
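For reference, these components map roughly onto diffusers' StableDiffusion3Pipeline as sketched below. This is a minimal illustration, assuming the diffusers package and access to the gated stabilityai/stable-diffusion-3-medium-diffusers repo; SD.Next performs this loading for you.

```python
def load_sd3(model_id: str = "stabilityai/stable-diffusion-3-medium-diffusers"):
    """Load the full SD3 pipeline and expose the components listed above."""
    import torch  # imported lazily so the sketch stays lightweight
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # pipe.transformer    -> MMDiT (~2B parameters in the "medium" variant)
    # pipe.vae            -> variational autoencoder
    # pipe.text_encoder   -> first CLiP text encoder
    # pipe.text_encoder_2 -> second CLiP text encoder
    # pipe.text_encoder_3 -> optional t5 text encoder
    return pipe
```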

Load using Reference Models

Select: Networks -> Models -> Reference -> StabilityAI Stable Diffusion 3 Medium

Tip

To allow the SDNext server to access gated models, get your Huggingface token from your Huggingface profile -> settings -> access tokens, and enter it in SDNext -> settings -> diffusers -> huggingface token

Tip

Alternatively, log in via the Huggingface CLI and SDNext will use the token stored there

source venv/bin/activate
venv/bin/huggingface-cli login
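The CLI above stores the token locally. The same can be done programmatically via the huggingface_hub API, as sketched below; the token value itself is a placeholder you would replace with your own.

```python
def hf_login(token: str) -> None:
    """Store a Huggingface access token so gated repos (e.g. SD3) can be downloaded."""
    from huggingface_hub import login  # imported lazily; requires huggingface_hub

    login(token=token)  # persists the token for subsequent downloads
```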

Load using Manually provided single-file

Download SD3 models from Huggingface

Supported:

  • sd3_medium.safetensors: includes the MMDiT and VAE weights only; SD.Next will automatically load the CLiP models as needed
  • sd3_medium_incl_clips.safetensors: includes all necessary weights except for the t5 text encoder
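As a rough illustration of what single-file loading involves under the hood, diffusers exposes from_single_file for SD3 checkpoints. This is a sketch assuming a diffusers version with SD3 single-file support; SD.Next does the equivalent automatically when you select the checkpoint.

```python
def load_sd3_single_file(path: str = "sd3_medium_incl_clips.safetensors"):
    """Load a combined single-file SD3 checkpoint (MMDiT + VAE + CLiP weights)."""
    import torch
    from diffusers import StableDiffusion3Pipeline

    # Components missing from the file (e.g. the t5 encoder) must be
    # provided or loaded separately, as described above.
    return StableDiffusion3Pipeline.from_single_file(path, torch_dtype=torch.float16)
```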

Unsupported:

  • sd3_medium_incl_clips_t5xxlfp8.safetensors: contains all necessary weights, including an fp8 variant of the t5 text encoder; support for this version is planned in the near future due to the nature of the fp8 quantization packaged in the file
    The t5 encoder can instead be loaded/unloaded separately

Load Text Encoder

SD.Next allows changing the optional text encoder on-the-fly

Go to settings -> models -> text encoder and select the desired text encoder
T5 enhances text rendering and some details, but it's otherwise very lightly used and optional
Loading T5 greatly increases model resource usage and automatically enables sequential offloading
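Under the hood, running without T5 corresponds to the diffusers pattern of passing text_encoder_3=None, sketched below under the assumption of the gated SD3 Medium repo; SD.Next toggles this for you via the setting above.

```python
def load_sd3_without_t5(model_id: str = "stabilityai/stable-diffusion-3-medium-diffusers"):
    """Load SD3 without the heavy t5 encoder; text rendering quality may drop."""
    import torch
    from diffusers import StableDiffusion3Pipeline

    return StableDiffusion3Pipeline.from_pretrained(
        model_id,
        text_encoder_3=None,  # skip the optional t5 text encoder
        tokenizer_3=None,     # and its tokenizer
        torch_dtype=torch.float16,
    )
```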

Tip

If you want to frequently switch between text encoders, you can add that setting to quicksettings
