SD3
Note: SD3.0 and SD3.5 are under gated access
SD3 model consists of:
- MMDiT (multi-modal diffusion transformer)
  Note: "medium" refers primarily to the number of parameters in the MMDiT component: 2B; StabilityAI may release smaller and/or larger variants, as the full SD3 has 8B parameters
- VAE (variational autoencoder)
- Multiple text encoders:
  - CLIP-ViT/L
  - OpenCLIP-ViT/G
  - T5 v1.1 (optional, used primarily to render text)
Select: Networks -> Models -> Reference -> StabilityAI Stable Diffusion 3 Medium
Tip
To allow the SD.Next server to access the models, get your Huggingface token from your Huggingface profile -> Settings -> Access Tokens and enter it in SD.Next -> Settings -> Diffusers -> Huggingface token
Tip
Alternatively, log in with the Huggingface CLI and the token will be used from there:

```shell
source venv/bin/activate
venv/bin/huggingface-cli login
```
Download SD3 models from Huggingface
Supported:
- `sd3_medium.safetensors`: includes the MMDiT and VAE weights only; SD.Next will automatically load the CLiP models as needed
- `sd3_medium_incl_clips.safetensors`: includes all necessary weights except for the T5 text encoder

Unsupported:
- `sd3_medium_incl_clips_t5xxlfp8.safetensors`: contains all necessary weights including an fp8-quantized T5; support for this variant is planned for the near future due to the nature of the fp8 quantization packaged in the file
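The variants above differ only in which components are packaged in the file. An illustrative summary (not an SD.Next API) of what each checkpoint ships with, and what a loader must therefore fetch separately:

```python
# Which components each SD3 checkpoint file includes, per the list above.
# The dict and helper are illustrative, not part of SD.Next.
SD3_CHECKPOINTS = {
    "sd3_medium.safetensors": {"mmdit", "vae"},
    "sd3_medium_incl_clips.safetensors": {"mmdit", "vae", "clip-l", "clip-g"},
    "sd3_medium_incl_clips_t5xxlfp8.safetensors": {
        "mmdit", "vae", "clip-l", "clip-g", "t5-fp8",
    },
}

def missing_components(filename: str) -> set[str]:
    """Components a loader must fetch separately (T5 is optional, so excluded)."""
    required = {"mmdit", "vae", "clip-l", "clip-g"}
    return required - SD3_CHECKPOINTS[filename]
```

For example, loading `sd3_medium.safetensors` means the two CLIP encoders still have to come from elsewhere, which is what SD.Next does automatically.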
T5 can be loaded/unloaded separately: SD.Next allows changing the optional text encoder on-the-fly
Go to Settings -> Models -> Text encoder and select the desired text encoder
T5 enhances text rendering and some details, but it is otherwise very lightly used and optional
Loading T5 greatly increases model resource usage and automatically enables sequential offloading
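When loading SD3 directly with the diffusers library (outside SD.Next), the same T5 trade-off is exposed by passing `text_encoder_3=None` and `tokenizer_3=None` to `StableDiffusion3Pipeline.from_pretrained`, which is a documented diffusers pattern. A small sketch that builds the loading kwargs; the helper name and structure are illustrative:

```python
def sd3_load_kwargs(use_t5: bool) -> dict:
    """Map the optional-T5 choice onto diffusers from_pretrained() kwargs."""
    kwargs = {}
    if not use_t5:
        kwargs["text_encoder_3"] = None  # skip the optional T5 encoder
        kwargs["tokenizer_3"] = None
    return kwargs

# Usage (requires diffusers, torch, and gated-model access):
# from diffusers import StableDiffusion3Pipeline
# pipe = StableDiffusion3Pipeline.from_pretrained(
#     "stabilityai/stable-diffusion-3-medium-diffusers",
#     **sd3_load_kwargs(use_t5=False),
# )
# pipe.enable_sequential_cpu_offload()  # what SD.Next enables automatically with T5
```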
Tip
If you want to frequently switch between text encoders, you can add that setting to quicksettings