SD3
Note: SD3 is under gated access: go to https://huggingface.co/stabilityai/stable-diffusion-3-medium and fill the form to get access
The SD3 model consists of:
- MMDiT (multi-modal diffusion transformer)
  Note: "medium" primarily refers to the number of parameters in the MMDiT component: 2B
  StabilityAI may release smaller and/or larger variations, as the full SD3 has 8B parameters
- VAE (variational autoencoder)
- Multiple text encoders: CLIP-ViT/L, OpenCLIP-ViT/G, T5 Version 1.1
  TE3 (T5) is optional and used primarily to render text
Select: Networks -> Models -> Reference -> StabilityAI Stable Diffusion 3 Medium
Tip
To allow the SD.Next server to access the models, get your Huggingface token from your Huggingface profile -> settings -> access tokens and enter it in SD.Next -> settings -> diffusers -> huggingface token
Tip
Alternatively, log in with the Huggingface CLI and use the token from there
source venv/bin/activate
venv/bin/huggingface-cli login
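The CLI login above can also be done from Python. The following is a minimal sketch, not part of SD.Next itself: it assumes the `huggingface_hub` package is installed in the same venv and that the token is stored in an environment variable (`HF_TOKEN` is used here as an assumption). The helper names `read_hf_token` and `login_from_env` are hypothetical.

```python
import os


def read_hf_token(env_var: str = "HF_TOKEN"):
    """Return the token stored in env_var, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def login_from_env(env_var: str = "HF_TOKEN") -> bool:
    """Log in to Huggingface with the token from env_var.

    Returns False without contacting Huggingface when no token is set.
    """
    token = read_hf_token(env_var)
    if token is None:
        return False
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=token)  # stores the token for later downloads of gated models
    return True
```

Calling `login_from_env()` once before starting the server should have the same effect as the interactive CLI login, provided the token has been granted access to the gated SD3 repository.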
Download SD3 models from Huggingface
Supported:
- sd3_medium.safetensors: includes the MMDiT and VAE weights only, SD.Next will automatically load CLIP models as needed
- sd3_medium_incl_clips.safetensors: includes all necessary weights except for the T5 text encoder
Unsupported:
- sd3_medium_incl_clips_t5xxlfp8.safetensors: contains all necessary weights, including the T5 FP8 variant
  This file is not yet supported due to the nature of the FP8 quantization packaged in it; support is planned for the near future
  T5 can be loaded/unloaded separately
SD.Next allows changing the optional text encoder on-the-fly
Go to settings -> models -> text encoder and select the desired text encoder
Default is None; supported options are T5 FP8 and T5 FP16 (not recommended due to size)
T5 enhances text rendering and some details, but it is otherwise very lightly used and optional
Loading T5 greatly increases model resource usage and automatically enables sequential offloading
Tip
If you want to frequently switch between text encoders, you can add that setting to quicksettings
- Mandatory parameters:
  Sampler: Default
  Note: SD3 uses the custom sampler FlowMatchEulerDiscreteScheduler
  You can experiment with different samplers, but results are not guaranteed
- StabilityAI recommended parameters:
  Resolution: 1024x1024, CFG scale: 7.0, Steps: 28
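For reference, the recommended parameters above map onto a plain `diffusers` call roughly as follows. This is a sketch outside of SD.Next, not how SD.Next loads the model internally. It assumes `torch`, `diffusers` and `accelerate` are installed, that access to the gated repository has been granted, and that the diffusers-format repository id is `stabilityai/stable-diffusion-3-medium-diffusers`. The helper names `build_call_kwargs` and `generate` are hypothetical.

```python
# StabilityAI recommended parameters, expressed as diffusers call arguments.
RECOMMENDED = {
    "height": 1024,
    "width": 1024,
    "guidance_scale": 7.0,       # CFG scale
    "num_inference_steps": 28,   # steps
}


def build_call_kwargs(prompt: str, **overrides) -> dict:
    """Merge the recommended parameters with a prompt and optional overrides."""
    kwargs = {"prompt": prompt, **RECOMMENDED}
    kwargs.update(overrides)
    return kwargs


def generate(prompt: str,
             repo_id: str = "stabilityai/stable-diffusion-3-medium-diffusers",
             **overrides):
    """Generate one image; the repo id is an assumption, see the note above."""
    # Heavy imports are kept local so the helpers above work without a GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(repo_id, torch_dtype=torch.float16)
    # The pipeline's default scheduler is left unchanged, matching "Sampler: Default".
    pipe.enable_model_cpu_offload()  # reduces VRAM usage, requires accelerate
    return pipe(**build_call_kwargs(prompt, **overrides)).images[0]
```

For example, `generate("a photo of a cat holding a sign", num_inference_steps=20)` would keep the recommended resolution and CFG scale while overriding only the step count.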
- Add prompt attention parser
- Add preview
- Add inpainting
- Fix SD3Transformer2DModel not compatible with cross-attention