What Are Hypernetworks In Stable Diffusion?
Hypernetworks are a fine-tuning technique for steering Stable Diffusion image generations. A small neural network is attached to the model to modify its style; this small hypernetwork is inserted into the cross-attention modules of the noise-predictor U-Net.
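Conceptually, the inserted network is a pair of small MLPs that rewrite the text conditioning before it feeds the key and value projections of each cross-attention layer, added back through a scaled residual connection. The following is a minimal pure-Python sketch of that idea (toy dimensions and random weights for illustration, not the WebUI's actual implementation):

```python
import random

def linear(x, W):
    """Multiply vector x (length n) by weight matrix W (m rows of n columns)."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in W]

def make_mlp(dim, hidden):
    """A tiny two-layer MLP with small random weights, returned as a closure."""
    W1 = [[random.gauss(0, 0.02) for _ in range(dim)] for _ in range(hidden)]
    W2 = [[random.gauss(0, 0.02) for _ in range(hidden)] for _ in range(dim)]
    def mlp(x):
        h = [max(0.0, v) for v in linear(x, W1)]  # ReLU
        return linear(h, W2)
    return mlp

class Hypernetwork:
    """Transforms a text-conditioning vector before cross-attention sees it."""
    def __init__(self, dim, hidden=32, multiplier=1.0):
        self.mlp_k = make_mlp(dim, hidden)  # modifies the input to the key projection
        self.mlp_v = make_mlp(dim, hidden)  # modifies the input to the value projection
        self.multiplier = multiplier

    def __call__(self, context):
        # Residual update: a multiplier of 0 leaves the conditioning untouched.
        k_ctx = [c + self.multiplier * d for c, d in zip(context, self.mlp_k(context))]
        v_ctx = [c + self.multiplier * d for c, d in zip(context, self.mlp_v(context))]
        return k_ctx, v_ctx
```

Because the update is residual and scaled, setting the multiplier to 0 reproduces the unmodified model, which is why the strength knob described later behaves the way it does.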
Hypernetworks, LoRAs, and Textual Inversion Embeddings are alternatives for enhancing checkpoint model results, but they differ in training methodology and file size.
Hypernetworks were developed by Kurumuz as a way to control model generations and provide better text generation modules. They are distinct from the HyperNetworks introduced by David Ha, Andrew Dai, and Quoc V. Le in 2016.
Hypernetworks applied to Stable Diffusion’s cross-attention layers have shown good performance and can outperform fine-tuning in cases with limited data on the target concept.
Hypernetworks are usually 5–300 MB in file size and use the .pt, .ckpt, or .safetensors file extension.
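Those extensions are also how the WebUI decides which files in the hypernetworks folder to list. As an illustration (not the WebUI's actual discovery code), a folder scan filtering on them looks like this:

```python
from pathlib import Path

# Extensions the article lists for hypernetwork files.
HYPERNET_EXTS = {".pt", ".ckpt", ".safetensors"}

def find_hypernetworks(folder):
    """Return hypernetwork names (file stem only) found in the given folder."""
    return sorted(p.stem for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() in HYPERNET_EXTS)
```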
Where to Find Hypernetworks For Stable Diffusion?
The best place to find hypernetworks is Civitai; remember to filter the results by hypernetworks. Another place to find them is Hugging Face.
Hypernetworks are not as popular as LoRAs or textual inversions, so don’t expect to find a lot of them.
How to Use Hypernetworks In Stable Diffusion?
The most common way to use hypernetworks is with AUTOMATIC1111's WebUI: place the downloaded hypernetwork file in the following folder: *\stable-diffusion-webui\models\hypernetworks
You can activate the hypernetwork either by selecting it from AUTOMATIC1111's Hypernetwork tab or by typing <hypernet:filename:multiplier> in the prompt, for example <hypernet:incaseStyle_incaseAnythingV3:1>.
The filename is the hypernetwork's file name without the extension. The multiplier is the strength, or weight, of the hypernetwork when it is applied to the image-generation process: 0 is the same as disabling the hypernetwork, 1 is the standard strength, and 1.2 is a 20% increase over standard.
You need a checkpoint model to use hypernetworks; they don't work alone. The hypernet phrase is not part of the text prompt itself; it only tells the WebUI which hypernetwork to use.
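The WebUI strips the hypernet phrase out of the prompt before the text encoder ever sees it. As an illustration of that behavior (a simplified sketch, not the WebUI's actual parser), the phrase can be extracted like this:

```python
import re

# Matches <hypernet:filename:multiplier>; the multiplier is optional
# and defaults to 1.0 when omitted.
HYPERNET_RE = re.compile(r"<hypernet:([^:>]+)(?::([\d.]+))?>")

def extract_hypernets(prompt):
    """Return (clean_prompt, [(filename, multiplier), ...])."""
    found = [(m.group(1), float(m.group(2) or 1.0))
             for m in HYPERNET_RE.finditer(prompt)]
    clean = HYPERNET_RE.sub("", prompt).strip()
    return clean, found
```

Running it on a prompt containing `<hypernet:incaseStyle_incaseAnythingV3:1>` yields the cleaned prompt plus the hypernetwork name and strength to load.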
Image showing “standard” results without a hypernetwork enabled.
Hypernetwork used: none
Model used: Counterfeit
Image with a hypernetwork enabled.
Hypernetwork used: incaseStyle_incaseAnythingV3
Model used: Counterfeit
Image with a hypernetwork enabled and over-strengthened by 100% (hypernet:incaseStyle_incaseAnythingV3:2).
Hypernetwork used: incaseStyle_incaseAnythingV3
Model used: Counterfeit
Text prompt used to generate the images:
1 girl, 1 face, open mouth, portrait showing mainly face, front view, happily surprised, surprised facial expression, detailed, long hair, beautiful earrings, brown eyes, eyes open, surprised eyes, blond hair, violet shirt, purple shirt
Negative text prompt:
nsfw
Additional settings:
- Sampling method: Euler a
- Unchecked: Restore faces, Tiling, Hires. fix
- CFG Scale: 7
- Sampling steps: 20-25
- Seed: -1
- Script: none
The checkpoint model was kept the same and the text prompt was kept intentionally minimal to show the effects of hypernetworks.
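If you drive the WebUI through its txt2img API rather than the UI, the same settings map onto a JSON request body (field names per the standard /sdapi/v1/txt2img endpoint, available when the WebUI runs with the --api flag; the hypernetwork is still activated through the prompt, as above):

```python
import json

# The comparison settings above expressed as a /sdapi/v1/txt2img payload.
# POST this with any HTTP client to a running WebUI instance.
payload = {
    "prompt": "1 girl, 1 face, open mouth, ... <hypernet:incaseStyle_incaseAnythingV3:1>",
    "negative_prompt": "nsfw",
    "sampler_name": "Euler a",
    "cfg_scale": 7,
    "steps": 20,
    "seed": -1,          # -1 picks a random seed
    "restore_faces": False,
    "tiling": False,
    "enable_hr": False,  # Hires. fix unchecked
}
print(json.dumps(payload, indent=2))
```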
How to Train A Hypernetwork?
Preparation:
- Select a few high-quality images rather than a large quantity of poor-quality images.
- Train using an image size of 512×512 and a 1:1 ratio to avoid distortion.
- Utilize BLIP and/or deepbooru to generate labels for the images.
- Review each label, removing any incorrect ones and adding any missing ones.
- Refer to the Hypernetwork Style Training guide for activation, initialization, and network size recommendations.
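The label-review step is easy to automate partially: BLIP and deepbooru write one caption .txt per image, so a quick scan can flag images whose caption file is missing or empty before you start training. A small helper sketch (assuming the usual one-.txt-per-image layout):

```python
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def check_captions(folder):
    """Report images that are missing a caption .txt file or have an empty one."""
    missing = []
    for img in sorted(Path(folder).iterdir()):
        if img.suffix.lower() not in IMAGE_EXTS:
            continue
        cap = img.with_suffix(".txt")  # deepbooru/BLIP caption sits next to the image
        if not cap.exists() or not cap.read_text().strip():
            missing.append(img.name)
    return missing
```

Anything the scan flags still needs a manual pass, since a present caption can still contain wrong tags.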
Training:
- Set the learning rate schedule: 5e-5 for 100 steps, 5e-6 for 1500 steps, 5e-7 for 10,000 steps, and 5e-8 for 20,000 steps.
- Create a prompt template file (a .txt file) containing only [filewords].
- Run training for around 20,000 steps or fewer.
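In the WebUI's learning-rate field, the schedule above is written as a single string, `5e-5:100, 5e-6:1500, 5e-7:10000, 5e-8:20000`, where each rate applies up to the given step. A small sketch of how that stepwise lookup works (an illustration, not the WebUI's own scheduler code):

```python
def parse_schedule(spec):
    """Parse '5e-5:100, 5e-6:1500, ...' into [(lr, last_step), ...] pairs."""
    pairs = []
    for part in spec.split(","):
        lr, step = part.strip().split(":")
        pairs.append((float(lr), int(step)))
    return pairs

def lr_at(step, pairs):
    """Learning rate in effect at a given training step."""
    for lr, last_step in pairs:
        if step <= last_step:
            return lr
    return pairs[-1][0]  # past the schedule: keep the final rate

sched = parse_schedule("5e-5:100, 5e-6:1500, 5e-7:10000, 5e-8:20000")
```

Decaying the rate this way lets early steps move the hypernetwork quickly while later steps refine details without destabilizing it.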
Additional details for training a hypernetwork:
- Follow the recommended learning rate schedule to capture fine details and ensure safe training.
- Create a prompt template file with [filewords] if the labels are accurate, or use a regular hypernetwork txt file to reduce bias.
- Consider training for 5,000-10,000 steps for usable results or up to 20,000 steps for finer details.
- Keep VAE (Variational Autoencoder) enabled if your model uses it.
- Unload any other hypernetworks to avoid potential interference during training.
- If your model breaks and the preview tests show colorful noise, revert to an earlier model and reduce the learning rate.
- Avoid changing training data midway; it’s better to start over.
- For bulk cropping images, you can use Birme.
- If the loss exceeds 0.3, lower the learning rates, as this may indicate issues with the hypernetwork.