---
language:
- en
license_name: stabilityai-ai-community
license_link: LICENSE.md
library_name: diffusers
pipeline_tag: text-to-image
tags:
- text-to-image
base_model:
- suzushi/miso-diffusion-m-1.0
- stabilityai/stable-diffusion-3.5-medium
---
<div style="display: flex; justify-content: center; gap: 20px; margin-bottom: 20px;">
    <img src="demo1.png" width="400" />
    <img src="demo2.png" width="400" />
</div>

# Anime SD3.5 Medium Model

An attempt to fine-tune SD3.5 Medium.

## Version History

| Version | Base Training | Aesthetic Training | Total Epochs      |
|---------|---------------|--------------------|-------------------|
| alpha   | 250K images   | 0 images           | 1                 |
| beta    | 160K images   | 0 images           | 3                 |
| 1.0     | 600K images   | 0 images           | 2 + (3 from beta) |
| 1.1     | 710K images   | 0 images           | 5                 |
| 2.0     | 1.08M images  | 0 images           | 5                 |

## Training Methodology

Training is done on a GH200 with 96 GB of VRAM. Now that prior training has shown
decent results, I am slightly increasing the learning rate.

Training settings: Adafactor with a batch size of 40, `lr_scheduler`: cosine.

SD3.5-specific settings:

- `enable_scaled_pos_embed = true`
- `pos_emb_random_crop_rate = 0.2`
- `weighting_scheme = "flow"`
- `learning_rate = 8e-6`
- `learning_rate_te1 = 5e-6`
- `learning_rate_te2 = 5e-6`

Train CLIP: true, train T5-XXL: false.
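
To give a rough feel for what these numbers imply, here is a minimal Python sketch of the step count and a cosine learning-rate curve. It is not the trainer's actual code. The batch size and peak learning rates come from this card. The rest are assumptions made for illustration: the image count and epoch count are taken from the 2.0 row of the version table, the schedule is assumed to span the whole run with no warmup, and it is assumed to decay to zero. The helper names `steps_per_epoch` and `cosine_lr` are hypothetical.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Optimizer steps needed to see every image once (a partial last batch counts)."""
    return math.ceil(num_images / batch_size)


def cosine_lr(step: int, total_steps: int, peak_lr: float) -> float:
    """Cosine decay from peak_lr at step 0 down to 0 at total_steps.

    Assumption: no warmup and a final learning rate of zero.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values stated in this card.
BATCH_SIZE = 40
PEAK_LRS = {
    "learning_rate": 8e-6,
    "learning_rate_te1": 5e-6,
    "learning_rate_te2": 5e-6,
}

# Assumptions for illustration: figures from the 2.0 row of the version table,
# treated as one continuous run.
NUM_IMAGES = 1_080_000
EPOCHS = 5

total_steps = steps_per_epoch(NUM_IMAGES, BATCH_SIZE) * EPOCHS

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch(NUM_IMAGES, BATCH_SIZE)}")
    # Sample the curve at the start, middle, and end of the run.
    for step in (0, total_steps // 2, total_steps):
        rates = {name: cosine_lr(step, total_steps, lr) for name, lr in PEAK_LRS.items()}
        print(step, rates)
```

Under these assumptions each learning rate falls to half its peak at the midpoint of the run. A real run may differ if the trainer applies warmup or a non-zero minimum learning rate.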

## Support Me
At the moment, training one epoch costs around 130 dollars. If you like my project, please consider supporting me: https://ko-fi.com/suzushi2024

Lastly, huge thanks to meg, who has been supporting this project. Without him, this project would not have been possible!