How to create an original character LoRA [PonyV6 Training]
![How to create an original character LoRA [PonyV6 Training] featured image](https://dca.data-hub-center.com/content/uploads/2025/06/eye_catch_original-character-lora-pony-character-training-en.jpg)
This article uses Pony Diffusion V6 XL (PDXL), one of the SDXL-family models, to train a character LoRA. PDXL is a high-quality T2I model based on Stable Diffusion XL (SDXL) that is especially good at characters, both animal and human. Since many Pony-derived models are also publicly available, a LoRA trained on a Pony base can be used to generate in various styles. If you don’t know how to install Kohya ss GUI or how to create a dataset, start by reading the articles below.
Also, if you are new to LoRA training, I recommend starting with the SD1.5 model, which finishes training quickly.


Training with the Default Values of Kohya ss GUI
First, let’s try training using the default values of Kohya ss GUI, which I always introduce first in these character-training articles. I will use the following base model.

Dataset
The dataset is based on the data created in “How to Create Original Character LoRA【Dataset Edition】Production of training Images and Captions”. If you want to train with the same dataset, it is available on Patreon, but only paid supporters can download it.


Default Parameters
Once the dataset is ready, use the following parameters for training. For training the Pony model, I have slightly modified the values. The parts that need to be input or changed are marked in red.
- Pretrained model name or path: ponyDiffusionV6XL_v6StartWithThisOne.safetensors
- Trained Model output name: DCAI_Girl_Pony_Def ※Model output name
- Instance prompt: dcai-girl ※The value is ignored by the captioning method used this time, but leaving it blank causes an error.
- Class prompt: 1girl ※Entered for the same reason as above.
- Repeats: 5 [Default: 40] ※There are 100 source training images, and we want 500 total images per epoch (100 × 5); see the sketch after this list.
- Presets: none
- LoRA type: Standard
- Train batch size: 1
- Epoch: 1
- Max train epoch: 0
- Max train steps: 1600
- Save every N epochs: 1
- Seed: 123 [Default: 0 = Random] ※Enter an arbitrary fixed number so runs stay comparable while you tune parameters.
- LR Scheduler: cosine
- Optimizer: AdamW8bit
- Learning rate: 0.0001 (1e-4)
- Text Encoder learning rate: 0.00005 (5e-5) [Default: 0.0001 (1e-4)] ※Changed to the value recommended in the official documentation.
- Unet learning rate: 0.0001 (1e-4)
- LR warmup (% of total steps): 10
- Max resolution: 1024, 1024 [Default: 512, 512] ※Native SDXL resolution
- Network Rank (Dimension): 8
- Network Alpha: 1
- clip_skip: 0 [Default: 1] ※Clip skip is disabled in SDXL
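As a quick sanity check on these values, here is a minimal Python sketch of the Dreambooth-style image-folder name that Kohya ss GUI derives from the Repeats, Instance prompt, and Class prompt fields, together with the step math. The dataset path is a hypothetical placeholder.

```python
from pathlib import Path

# Kohya-style (Dreambooth-format) image folders are named
# "{repeats}_{instance prompt} {class prompt}".
repeats, instance_prompt, class_prompt = 5, "dcai-girl", "1girl"
img_dir = Path("dataset/img") / f"{repeats}_{instance_prompt} {class_prompt}"
print(img_dir)  # dataset/img/5_dcai-girl 1girl

# 100 source images x 5 repeats = 500 images seen per epoch (batch size 1).
num_images, batch_size = 100, 1
steps_per_epoch = num_images * repeats // batch_size
print(steps_per_epoch)  # 500

# With Max train steps = 1600 set, the explicit step cap governs the run:
print(1600 / steps_per_epoch)  # 3.2 passes over the repeated dataset
```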
Test Generation Using the Trained LoRA
I generated test images with the trained LoRA in the A1111 WebUI, and they turned out as shown below. The same “ponyDiffusionV6XL_v6StartWithThisOne.safetensors” used as the training base model was used for generation. The image immediately below is from before applying the LoRA.


At first glance, the result looks highly proficient, but this image only looks good because the seed happened to be a hit; in reality, the costume came out unstable across generations.
About Score Tags Specific to the Pony Model
One of the features of the Pony model is the score tag. Let’s take a look at the list below.
- score_9: Extremely high quality (top rank)
- score_8_up: High quality (top 20% or so)
- score_7_up: Good (middle to upper class)
- score_6_up: Average (medium quality)
- score_5_up: Slightly low (somewhat rough, loose composition, etc.)
- score_4_up: Low quality (with breakdowns and blurs)
For example, specifying `score_8_up` means “score 8 or higher”, and it is often used in combination, as in `score_9, score_8_up`.
By including this quality tag in the image caption, you can adjust the training quality to some extent.
One thing to note is that if you use `score_9` when the training images are clearly of lower quality, the LoRA model may become unstable. Also, if the quality of the training images varies, you should lower the score tag to `score_6_up` or omit it altogether.
The comparison models were trained with the following settings, with a different score tag applied to each.
- Number of training images: 100
- Repeats: 5
- Train batch size: 1
- Epoch: 4
- Max train steps: 0
- Seed: 123
- LR Scheduler: cosine_with_restarts
- Optimizer: AdamW
- Optimizer extra arguments: betas=0.9,0.99 weight_decay=0.05
- Learning rate: 0.0004 (4e-4)
- Unet learning rate: 0.0004 (4e-4)
- Text Encoder learning rate: 0.00005 (5e-5)
- LR warmup (% of total steps): 10
- LR # cycles: 2
- Max resolution: 1024, 1024
- Network Rank (Dimension): 32
- Network Alpha: 16
- clip_skip: 0 ※Disabled in SDXL
Images are generated from each model with the A1111 WebUI, using prompts with various score settings: no score tags, `score_9` only, and all score tags included.
The generation settings are as follows.
- Basic positive prompt: dcai-girl, 1girl, solo, looking at viewer, solo, short hair, orange hair, brown eyes, animal ears, dress, meadow, sky, day, masterpiece, best quality
- Negative prompt: worst quality, low quality, bad anatomy, realistic, lips, inaccurate limb, extra digit, fewer digits, six fingers, monochrome, nsfw
- Checkpoint: ponyDiffusionV6XL_v6StartWithThisOne.safetensors
- Steps: 20
- Sampler: DPM++ SDE
- Schedule type: Karras
- CFG scale: 6
- Seed: 3740385248
- Size: 1024x1024
- VAE: sdxl.vae.safetensors







For this dataset, wouldn’t it be optimal to train with no score tags or around `score_8_up`, and to generate with `score_8_up` or with all score tags included plus the negative prompt? It was surprising that `score_4_up` was still of good quality.
To edit the dataset captions, use the “Search and Replace” feature in the “Batch Edit Captions” tab of the A1111 WebUI extension “WebUI Dataset Tag Editor”.

If you enter `1girl,` in “Search Text” and `1girl, score_8_up,` in “Replace Text”, the tag is added after `1girl`. Before rewriting, set “Search and Replace in” to `Entire Caption` so the whole caption is targeted. If you want to know the detailed method, please refer to the following article.
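If you prefer to script the same edit, here is a minimal Python sketch that rewrites Kohya-style caption files (one .txt per image). The folder path is a hypothetical placeholder; back up your captions before running anything like this.

```python
from pathlib import Path

caption_dir = Path("dataset/img/5_dcai-girl 1girl")  # hypothetical path
search, replace = "1girl,", "1girl, score_8_up,"

for txt in caption_dir.glob("*.txt"):
    caption = txt.read_text(encoding="utf-8")
    # Only rewrite captions that match and don't already carry the tag.
    if search in caption and "score_8_up" not in caption:
        txt.write_text(caption.replace(search, replace, 1), encoding="utf-8")
```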
Training with Prodigy Optimizer
This time, I’m thinking of training using the Prodigy Optimizer.
The Prodigy optimizer is essentially AdamW extended with an adaptive learning rate algorithm, and it has been attracting attention in recent LoRA and DreamBooth training. It estimates an appropriate learning rate on its own and tends to converge quickly, which makes it particularly suitable for LoRA fine-tuning and high-quality training.
How to Use the Prodigy Optimizer
The basic usage is to simply select `Prodigy` from the Optimizer dropdown in the Basic section of Kohya_ss GUI’s Parameters. Note that the learning rate is set to `1.0` for both the Unet and the Text Encoder. If you want to lower the effective learning rate, enter `d_coef=0.5` (default: 1.0) in Optimizer extra arguments instead of lowering the LR; this adjusts it to some extent. While the official documentation recommends the `cosine annealing` scheduler, you can also train high-quality LoRA with `cosine_with_restarts` or `polynomial`.
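For reference, selecting Prodigy in the GUI corresponds to constructing the Prodigy class from the prodigyopt package. A minimal PyTorch sketch, with a plain Linear layer standing in for the actual LoRA network:

```python
import torch
from prodigyopt import Prodigy  # pip install prodigyopt

net = torch.nn.Linear(16, 16)  # stand-in for the LoRA network

# Keep lr at 1.0 for both Unet and Text Encoder; Prodigy estimates the
# effective step size (its internal "D" value) on its own.
optimizer = Prodigy(net.parameters(), lr=1.0)

x = torch.randn(4, 16)
loss = net(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()

# To lower the effective learning rate, pass d_coef=0.5 instead of lowering lr.
```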
Setting Up Kohya_ss GUI with the Prodigy Optimizer
Now let’s set up the Kohya_ss GUI to use the Prodigy optimizer. The dataset is the same one used for training with the default parameters earlier. For this dataset, using Pony score tags made the art style of the original training images come through too strongly, so I did not use them.
About Optimizer extra arguments
As explained in the previous SDXL settings, we will use Optimizer extra arguments to manipulate parameters that cannot be set in the GUI.
First, let’s look at the arguments, referring to the official repository.

- params: The iterable of parameters to optimize, or a list of dictionaries defining parameter groups.
- lr: Learning rate adjustment parameter. Increases or decreases the learning rate of Prodigy. (Default: 1.0)
- betas: Coefficients for calculating the moving averages of the gradient and its square. (Default: (0.9, 0.999))
- beta3: Coefficient for calculating the step size of Prodigy using a moving average. If set to `None`, the square root of `beta2` is used. (Default: None)
- eps: A term added to the denominator outside the square-root operation to improve numerical stability. (Default: 1e-8)
- weight_decay: Weight decay, as L2 regularization. (Default: 0)
- decouple: Whether to use AdamW-style decoupled weight decay. (Default: True)
- use_bias_correction: Whether to enable Adam’s bias correction. (Default: False)
- safeguard_warmup: Excludes the learning rate (lr) from the denominator of the D estimate to avoid problems during the warm-up phase. (Default: False)
- d0: Initial D estimate for D-adaptation. There is rarely a need to change this. (Default: 1e-6)
- d_coef: Coefficient in the formula for estimating D. Values such as `0.5` or `2.0` can also generally be effective; adjusting this parameter is the recommended way to tune the optimization. (Default: 1.0)
- growth_rate: Limits how quickly the D estimate can increase, by this multiplicative rate. Using a value like `1.02` can give a certain learning-rate warm-up effect. (Default: float('inf') (unlimited))
- fsdp_in_use: If you are using sharded parameters, you need to set this to `True`. The optimizer will try to auto-detect this, but auto-detection does not work if you are using an implementation other than PyTorch’s built-in one. (Default: False)
- slice_p: Reduces memory usage by calculating the learning-rate adaptation statistics using only every `p`-th element of each tensor. Values greater than `1` are approximations of standard Prodigy. A value around `11` is reasonable. (Default: 1)
Among these, we will change the following arguments this time. Copy and paste the line below into Optimizer extra arguments. Be careful with the notation: no spaces before or after `=`, and arguments are separated by spaces, not commas.
weight_decay=0.05 d_coef=1.3 betas=0.9,0.99 safeguard_warmup=True use_bias_correction=True
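Written out as Python, that string maps onto the Prodigy constructor roughly as follows (a sketch with a stand-in parameter list, not the actual kohya wiring):

```python
import torch
from prodigyopt import Prodigy  # pip install prodigyopt

params = torch.nn.Linear(16, 16).parameters()  # stand-in for the LoRA parameters

optimizer = Prodigy(
    params,
    lr=1.0,                 # set in the GUI, not in the extra arguments
    weight_decay=0.05,
    d_coef=1.3,
    betas=(0.9, 0.99),      # "betas=0.9,0.99" is parsed into a tuple
    safeguard_warmup=True,
    use_bias_correction=True,
)
```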
About the Scheduler
This time, we will use `polynomial`. Polynomial is a scheduling method that decays the learning rate along a polynomial curve; the aim is to train quickly at first and hold the learning rate down in the latter half.
You can control the learning rate decay curve with the LR power setting in the GUI. Let’s take a look at the TensorBoard graph below.

By changing the LR power, you can create various learning rate decay curves. With the default `1` the curve is a straight line, identical to the “linear” scheduler. This time, we will set the LR power to `2` for training, which corresponds to the cyan curve.
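To see what LR power does numerically, here is a minimal sketch using the polynomial-decay schedule from the transformers library, which behaves like the GUI’s polynomial scheduler. The optimizer and step count here are stand-ins:

```python
import torch
from transformers import get_polynomial_decay_schedule_with_warmup

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=1.0)  # Prodigy-style base LR of 1.0

# power=2.0 bends the decay curve; power=1.0 is plain linear decay.
scheduler = get_polynomial_decay_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=3000, power=2.0
)

for step in range(3000):
    optimizer.step()
    scheduler.step()
    if step % 750 == 0:
        print(step, round(scheduler.get_last_lr()[0], 4))
```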
Training Parameters
In addition to the two settings mentioned earlier, I will also set up things like Shuffle caption, which I introduced in a previous article. The parts that need to be entered or changed in the list below are written in red.
- Pretrained model name or path: ponyDiffusionV6XL_v6StartWithThisOne.safetensors
- Trained Model output name: DCAI_Girl_Pony
- Instance prompt: dcai-girl ※The value is ignored by the captioning method used this time, but leaving it blank causes an error.
- Class prompt: 1girl ※Entered for the same reason as above.
- Repeats: 5 [Default: 40]
- Presets: none
- LoRA type: Standard
- Train batch size: 1
- Epoch: 6 [Default: 1] ※Total steps are adjusted via Epoch
- Max train epoch: 0
- Max train steps: 0 [Default: 1600] ※Total steps are adjusted via Epoch
- Save every N epochs: 0 [Default: 1] ※No need to save intermediate epochs to check progress
- Seed: 123 [Default: 0 = Random]
- LR Scheduler: polynomial [Default: cosine]
- Optimizer: Prodigy [Default: AdamW8bit]
- Optimizer extra arguments: weight_decay=0.05 d_coef=1.3 betas=0.9,0.99 safeguard_warmup=True use_bias_correction=True
- Learning rate: 1.0 [Default: 0.0001 (1e-4)] ※Recommended value for Prodigy
- Text Encoder learning rate: 1.0 [Default: 0.0001 (1e-4)] ※Recommended value for Prodigy
- Unet learning rate: 1.0 [Default: 0.0001 (1e-4)] ※Recommended value for Prodigy
- LR warmup (% of total steps): 0 [Default: 10] ※Warm-up is handled by Prodigy
- LR power: 2 [Default: 1]
- Max resolution: 1024, 1024 [Default: 512, 512] ※Native SDXL resolution
- Network Rank (Dimension): 32 [Default: 8]
- Network Alpha: 16 [Default: 1]
- Keep n tokens: 8 [Default: 0] ※Number of instance and class tokens to keep fixed at the start of the caption while shuffling
- clip_skip: 0 [Default: 1] ※Clip skip is disabled in SDXL
- Shuffle caption: true [Default: false]
- CrossAttention: sdpa [Default: xformers]
The TensorBoard graph for training with these parameters turned out as follows. Around step 1900 there appears to be a sign of overfitting, but since the final loss was still decreasing, I considered it OK.
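As a quick check of where that step falls, assuming batch size 1:

```python
num_images, repeats, epochs, batch_size = 100, 5, 6, 1
steps_per_epoch = num_images * repeats // batch_size  # 500
total_steps = steps_per_epoch * epochs                # 3000
print(total_steps, 1900 / steps_per_epoch)  # 3000 steps; step 1900 is ~epoch 3.8
```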


Training Results
As for the training results, I think we were able to create a reasonably high-quality LoRA. The quality of the face is not a problem in close-up shots, but for long shots I recommend using ADetailer. Also, there were cases where extra cloth was generated over the front of the skirt, so I added `apron` to the negative prompt. The overall generation settings are as follows.
- Positive prompt: dcai-girl, 1girl, solo, looking at viewer, solo, short hair, orange hair, brown eyes, animal ears, dress, blue dress, black skirt, upper body, white thighhighs, thigh strap, meadow, sky, day, masterpiece, best quality, score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up <lora:DCAI_Girl_Pony:1>
- Negative prompt: score_6, score_5, score_4, apron, worst quality, low quality, bad anatomy, realistic, lips, inaccurate limb, extra digit, fewer digits, six fingers, monochrome, nsfw
- Checkpoint: ponyDiffusionV6XL_v6StartWithThisOne.safetensors
- Steps: 20
- Sampler: DPM++ SDE
- Schedule type: Karras
- CFG scale: 6
- Seed: 3740385248
- Size: 1344x768
- VAE: sdxl.vae.safetensors
- ADetailer: on
- Hires upscaler: 4x-UltraSharp
The following is what was generated with the above parameters.

Applying to Models of the Same Lineage
You can apply the LoRA trained this time to checkpoints of the Pony lineage. Below are some samples with the LoRA applied; the generation parameters are the same as before.





The final LoRA is published on Civitai, so if you’re interested, feel free to download it.
Conclusion
This time, we trained a character LoRA using Pony Diffusion V6 XL (PDXL). Although Pony is often associated with NSFW content, it proved to be an excellent model for SFW content as well. At the time of writing, enthusiasm has faded somewhat under the influence of models like Illustrious-XL, but there are still quite a few derivative models, so a LoRA trained on Pony remains useful in many ways.
So far, I have explained how to train with SD1.5, SDXL, and Pony. Following this trend, next time I plan to train LoRA with Illustrious-XL.

