simpletunner-full-novaskin

This is a full-rank fine-tune derived from stabilityai/stable-diffusion-3.5-medium.

No validation prompt was used during training.


Validation settings

CFG: 5.0
CFG Rescale: 0.0
Steps: 20
Sampler: FlowMatchEulerDiscreteScheduler
Seed: 42
Resolution: 64x64
Skip-layer guidance:

Note: The validation settings are not necessarily the same as the training settings.

You can find some example images in the following gallery:

Prompt
unconditional (blank prompt)

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, police officer with dark blue uniform: crisp button‐up shirt tucked into matching pants; sturdy black boots covering lower legs; black tactical gloves on hands; silver badge on left chest; face with focused expression, blue eyes under a peaked cap; short brown hair peeking at nape; utility belt with radio pouches around waist

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, fantasy dragon‐scale armor in deep red and gold: overlapping scale plates on chest and shoulders; gauntleted gloves with claw tips on hands; armored greaves on legs; sturdy boots with scale fins; face framed by horned helm opening to reveal green reptilian eyes; dark flowing hair braided into a tail at back

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, cyberpunk streetwear: black bio‐leather jacket with neon blue circuit patterns, high‐collar around neck; fingerless gloves on hands; slim cargo pants with glowing seams and knee pads; tech‐bonded boots; face with glowing augmented ocular implants and angular cheek tattoos; buzzed undercut hair in electric purple

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, samurai warrior in red and black lacquered armor: layered cuirass over kimono sleeves; wrapped hand‐guards on forearms; hakama trousers tying at calf; straw‐soled sandals on feet; face calm with golden eyes beneath a half‐mask; jet‐black topknot hairstyle

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, elven forest scout: green hooded cloak draping over slender shoulders; leather bracers on forearms; fitted tunic and leggings dyed in moss and bark tones; soft boots for silent steps; delicate hands with finger‐woven gloves; face with sharp emerald eyes and freckled cheeks; long auburn hair braided with leaf ornaments

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, steampunk mechanic: worn brown leather apron over soot‐stained shirt; fingerless leather gloves with metal knuckle plates on hands; reinforced trousers with tool pockets; heavy leather boots with brass buckles; face smudged with oil, bright hazel eyes behind round goggles; tousled sandy hair

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, high‐tech spacesuit: white and blue pressurized suit with panel lines on torso; glove‐sealed sleeves and articulated gauntlets on hands; reinforced leggings with cable conduits to boots; helmet viewport revealing calm face with hazel eyes and chin strap; short dark hair neatly cut

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, desert robes: flowing sand‐colored tunic over loose pants; wrapped cloth around forearms and calves; sand‐proof gauntlets on hands; leather sandals on feet; face partially veiled with brown scarf exposing only bright amber eyes; sun‐bleached blonde hair tied back

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, shining plate armor: breastplate embossed with crest over padded gambeson; articulated gauntlets on hands; greaves and sabatons covering legs and feet; face visible through open helm, with steely gray eyes and a cropped brown beard; short cropped hair

Negative Prompt
blurry, cropped, ugly

Prompt
minecraft skin, arcane mage robes: deep violet robe embroidered with glowing runes on torso and sleeves; delicate fingerless silk gloves on hands; flowing skirt over fitted leggings; soft pointed boots; face with luminous violet eyes and pale skin; long silver hair cascading over shoulders

Negative Prompt
blurry, cropped, ugly

The text encoder was not trained. You may reuse the base model text encoder for inference.
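Reusing the base text encoders could look roughly like the sketch below. It assumes the base repository follows the usual diffusers SD3 layout (subfolders `text_encoder`, `text_encoder_2`, `text_encoder_3`) and that you have access to it. The helper name `load_with_base_text_encoders` is hypothetical and not part of SimpleTuner or diffusers.

```python
def load_with_base_text_encoders(
    finetune_id="saviski/simpletunner-full-novaskin",
    base_id="stabilityai/stable-diffusion-3.5-medium",
):
    """Build the fine-tuned pipeline, taking the text encoders from the base model."""
    # Heavy dependencies are imported lazily, only when the function is called.
    import torch
    from diffusers import StableDiffusion3Pipeline
    from transformers import CLIPTextModelWithProjection, T5EncoderModel

    dtype = torch.bfloat16

    # Assumption: the base repo uses the standard SD3 subfolder names.
    text_encoder = CLIPTextModelWithProjection.from_pretrained(
        base_id, subfolder="text_encoder", torch_dtype=dtype
    )
    text_encoder_2 = CLIPTextModelWithProjection.from_pretrained(
        base_id, subfolder="text_encoder_2", torch_dtype=dtype
    )
    text_encoder_3 = T5EncoderModel.from_pretrained(
        base_id, subfolder="text_encoder_3", torch_dtype=dtype
    )

    # Components passed as keyword arguments override those in the fine-tune repo.
    return StableDiffusion3Pipeline.from_pretrained(
        finetune_id,
        text_encoder=text_encoder,
        text_encoder_2=text_encoder_2,
        text_encoder_3=text_encoder_3,
        torch_dtype=dtype,
    )
```

Calling `load_with_base_text_encoders()` returns a pipeline that can be used in the same way as the one in the Inference section.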

Training settings

Training epochs: 5
Training steps: 60260
Learning rate: 1e-06
- Learning rate schedule: cosine
- Warmup steps: 200
Max grad value: 1.0
Effective batch size: 12
- Micro-batch size: 12
- Gradient accumulation steps: 1
- Number of GPUs: 1
Gradient checkpointing: False
Prediction type: flow_matching (extra parameters=['shift=3'])
Optimizer: adamw_bf16
Trainable parameter precision: Pure BF16
Base model precision: no_change
Caption dropout probability: 0.1%
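The `shift=3` parameter on the prediction type most likely refers to the flow-matching timestep shift. The sketch below assumes it is the same transformation that diffusers' FlowMatchEulerDiscreteScheduler applies to its sigmas, `shift * sigma / (1 + (shift - 1) * sigma)`. The function name `shift_sigma` is illustrative only.

```python
def shift_sigma(sigma, shift=3.0):
    """Map a noise level in [0, 1] to its shifted value.

    Assumption: same formula as the static shift in diffusers'
    FlowMatchEulerDiscreteScheduler.
    """
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# With shift > 1, intermediate noise levels move towards 1.0 (more noise),
# while the endpoints 0.0 and 1.0 stay fixed.
for sigma in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{sigma:.2f} -> {shift_sigma(sigma):.4f}")
```

Under this assumption, a shift of 3 spends more of the schedule at high noise levels than an unshifted schedule does.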

Datasets

skins-64

Repeats: 0
Total number of images: 100000
Total number of aspect buckets: 1
Resolution: 0.004096 megapixels
Cropped: False
Crop style: None
Crop aspect: None
Used for regularisation data: No

skins-256

Repeats: 1
Total number of images: 20000
Total number of aspect buckets: 1
Resolution: 256 px
Cropped: False
Crop style: None
Crop aspect: None
Used for regularisation data: No

skins-heads

Repeats: 3
Total number of images: 5000
Total number of aspect buckets: 1
Resolution: 256 px
Cropped: False
Crop style: None
Crop aspect: None
Used for regularisation data: No
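The dataset figures above can be related to each other with a little arithmetic. This is a rough sketch: it assumes that `Repeats: N` means each image is seen N + 1 times per epoch, and that the megapixel value describes a square image. Neither assumption is stated in this card, and the helper names are hypothetical.

```python
import math

# Values copied from the dataset listings above.
datasets = {
    "skins-64": {"images": 100_000, "repeats": 0},
    "skins-256": {"images": 20_000, "repeats": 1},
    "skins-heads": {"images": 5_000, "repeats": 3},
}


def samples_per_epoch(images, repeats):
    # Assumption: repeats=N means every image is seen N + 1 times per epoch.
    return images * (repeats + 1)


def square_side_from_megapixels(megapixels):
    # Assumption: the area describes a square image.
    return math.isqrt(round(megapixels * 1_000_000))


per_dataset = {name: samples_per_epoch(**d) for name, d in datasets.items()}
total = sum(per_dataset.values())

print(per_dataset)
print("total samples per epoch (estimate):", total)
print("skins-64 side length in px:", square_side_from_megapixels(0.004096))
```

The last line shows that 0.004096 megapixels corresponds to a 64x64 image, which matches the validation resolution.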

Inference

import torch
from diffusers import DiffusionPipeline

model_id = 'saviski/simpletunner-full-novaskin'
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16

prompt = "An astronaut is riding a horse through the jungles of Thailand."
negative_prompt = 'blurry, cropped, ugly'

# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device) # the pipeline is already in its target precision level
model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
    width=64,
    height=64,
    guidance_scale=5.0,
).images[0]

model_output.save("output.png", format="PNG")

Exponential Moving Average (EMA)

SimpleTuner generates a safetensors variant of the EMA weights and a pt file.

The safetensors file is intended to be used for inference, and the pt file is for continuing finetuning.

The EMA model may give a more well-rounded result, but it will typically feel undertrained compared to the full model, because it is a running, decayed average of the model weights.
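To illustrate why a running decayed average lags behind the live weights, here is a minimal sketch of an EMA update on a toy scalar weight. The decay value and the `ema_update` helper are assumptions made for illustration, not SimpleTuner's actual settings or code.

```python
def ema_update(ema_weights, model_weights, decay=0.999):
    """Return new EMA weights: decay * ema + (1 - decay) * current."""
    return {
        name: decay * ema_weights[name] + (1.0 - decay) * model_weights[name]
        for name in ema_weights
    }


# Toy example: the live weight jumps from 0.0 to 1.0 and stays there.
ema = {"w": 0.0}
for _ in range(100):
    ema = ema_update(ema, {"w": 1.0}, decay=0.99)

# The average has closed only part of the gap to 1.0 after 100 steps.
print(round(ema["w"], 3))
```

The closer the decay is to 1, the smoother the average and the longer it takes to catch up with recent training progress.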
