Installation guide for Kohya ss_GUI using GPU Cloud RunPod

- What is RunPod
- GPUs Available on RunPod
- About RunPod Pricing
- How to Start RunPod
- ⚠️Precautions when using Kohya ss_GUI with RunPod
- How to Stop Pod and Kohya ss_GUI
- How to Restart Pod and Kohya ss_GUI
- Troubleshooting
- How to Use Kohya ss_GUI
- How to Download Models like Flux.1 that Require Login to Jupyter Lab
- Conclusion
Have you ever felt frustrated when your PC becomes unusable during LoRA training, or given up on training SDXL because your PC’s specs are too low? In such cases, I recommend GPU cloud services, which let you rent high-performance GPU machines located in data centers by the hour. Services include Google Cloud (GCP), RunPod, Vast.ai, Lambda Labs, etc. This time, I will introduce how to use RunPod, which is inexpensive, suited to personal use, and has a relatively low running cost, making it suitable for beginners as well.
What is RunPod

RunPod is a GPU cloud ideal for those who want to advance AI development and experiments with “minimal settings for immediate use”, “wide selection of GPUs”, and “cost only for what you use”. It is particularly suitable for iterative and intermittent AI projects, personal use, and research purposes. The features are summarized below.
- Abundant regions: It is deployed in 24 regions globally, providing an environment where you can use GPUs with low latency.
- Rich GPU lineup: A wide range of high-performance GPUs from NVIDIA (H100, A100 80GB, RTX 5090, RTX 4090) can be selected.
- Flexible environment setup with container (Docker) based Pods: Many templates (OS, libraries, PyTorch/TensorFlow, etc.) are prepared, so you can start working immediately.
- Fast startup: With “FlashBoot”, you can launch a Pod in less than 20 seconds and start working immediately.
- Pay-as-you-go system with fine billing (per second): Pay only for what you use, reducing waste. No monthly contracts.
- Checkpoint saving with persistent volume: You can retain data even if you stop the Pod, and you can continue to use it by restarting.
- Zero data transfer fee: Another advantage is that there is no charge for uploading/downloading models and data.
✅The downside is that it can occasionally become unstable, but in most cases this can be resolved by restarting the Pod.
The free trial credit is only $0.10, but you can receive additional RunPod credits through the RunPod referral program. It is recommended for those who want to train models at low cost. And even if you have a high-spec PC, you will be freed from the hassle of not being able to use it during training.
GPUs Available on RunPod
You can rent the following representative GPUs on RunPod.
- NVIDIA GeForce RTX 4090
- NVIDIA GeForce RTX 5090
- NVIDIA RTX 6000 Ada Generation
- NVIDIA H200
- NVIDIA H100 80GB HBM3
- NVIDIA B200
If you want to check the latest list, you can check it from the link below.
About RunPod Pricing
When it comes to RunPod fees, the first thing to remember is that you are billed separately for Pod running time and for storage. You pay for the Pod only while it is running, and nothing once you terminate it, but storage is billed for as long as it exists on the server (at half price while the Pod is running). If you don’t want to be charged at all, you need to empty the storage (delete the deployed Pod).
Therefore, when using RunPod to train LoRA, if you deploy a Pod for each LoRA project and delete the Pod after downloading all the necessary data once training is finished, you can save on running costs, as no maintenance fees will accrue.
Also, the cost varies greatly depending on the GPU. The higher-end the GPU, the higher the usage fee.
There are two types of clouds, “Secure Cloud” and “Community Cloud”. The Community Cloud is about half the price of the Secure Cloud, but it may be busy and not available when you want to use it. However, if you want to train LoRA at the lowest cost, the Community Cloud is the cheapest option.
Estimate of Charges Simulation
Since usage fees are updated every Monday, I can’t state definitively that “you can use this GPU at this price.” However, I will explain using the fees at the time of writing as a reference.
If you rent an NVIDIA GeForce RTX 4090 on the Secure Cloud, it currently costs $0.69/hr. Let’s say it takes roughly 1 hour for initial Pod construction, model download, dataset upload, and so on. If one training session takes 1 hour, it would be nice to get a satisfactory model in one go, but let’s say you tweak some settings and train 4 times. The total is 5 hours, so the cost of LoRA training is about $3.45.
And since this Pod’s storage is set to a total of 70 GB, the maintenance fee is $0.014 per hour, or approximately $0.34 per day. ⚠️This fee can be reduced to zero by deleting the Pod.
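As a sanity check, the arithmetic above can be reproduced in the terminal. The rates used here are the assumed figures at the time of writing and will differ when you read this:

```shell
# Reproduce the cost estimate (assumed rates: $0.69/hr GPU, $0.014/hr for 70 GB of storage)
awk 'BEGIN { printf "GPU cost for 5 hours: $%.2f\n", 5 * 0.69 }'
awk 'BEGIN { printf "Storage cost per day: $%.3f\n", 24 * 0.014 }'
```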
If you buy a NVIDIA GeForce RTX 4090 now, it will cost around $2,500 to $3,500. In addition, GPUs are consumables and can break down after a few years of use. If you rent it, you won’t have to worry about breakdowns, so it could be said that it’s cost-effective.
How to Start RunPod
Account Registration and Payment
First, create an account and make a payment from the link below. The following link is for the RunPod referral program. If you register from this link and pay more than $10, you can receive a random RunPod credit of $5 to $500. However, you need to register with a Google account. *If you do not want to use a Google account, please register without the RunPod referral program.

When you go to the Runpod website from the link above, you will see a screen like the one below.

Click the “Claim Your Bonus” button to go to the account registration screen.
Once you reach the account registration screen, create an account with the “Sign up & Claim reward” button. Give permission to link with the specified Google account. If you do not want to use a Google account, register from “Sign up without the bonus” without the RunPod referral program.

Once your account is created, you will be taken to the Home screen.

Since the initial credit is only $0.10, which is hardly enough to do anything, first go to the payment screen via the “Billing” button under Account in the left menu.

At the top of the page, specify the payment amount ($10 or more) in the “How much credit do you want to add to your account?” section, either by clicking one of the preset amounts on the left or by entering a custom amount. Once entered, complete the purchase with the “Pay with card” button.

Deploying a Pod
The Pod we will deploy this time is compatible with Kohya ss_GUI. We will deploy it according to the 🔗official repository’s manual installation method.
Go to the Pod deployment page from the “Pods” button in the Manage menu. Here, you will choose the GPU to rent. This time, I want to use the RTX 4090 on the Secure Cloud. If you want to use it at the lowest cost, renting from the Community Cloud is a good option.

Choosing a region is surprisingly important when renting a GPU. Busy regions may have almost no vacancies when you want to use them. The speed of uploading/downloading models depends on the distance from the server to the PC, but it’s not a bad idea to choose a server that has a lot of vacancies even if it’s far away.
You can choose a region from the filter list at the top. If you don’t care about the region, you can rent it with the default “Any region” and the region that is vacant at that time will be assigned.

When you click on the GPU, the deployment settings will appear in the UI. Press the “Change Template” button to change the template.

Various templates will appear, so select “Runpod Pytorch 2.2.0”.

⚠️You might be tempted to proceed straight to deployment, but there is an important setting to change first. Edit the template from the “Edit Template” button in the Pod Template area.

What’s important at deployment is the “Volume Disk” capacity. This space stores Kohya ss_GUI itself, the safetensors files for training, the training images, the trained LoRA files, and so on, so allocate enough. However, the larger this capacity, the more idle storage cost you will incur on RunPod.
This time, I set it to 50 GB (up from the default 20 GB), assuming training of an SDXL model. In the case of Flux.1, the files are even larger, so you will need about 80 GB.

Instance Pricing is fine with On-Demand. If you rent an instance continuously for a long period, you can get a significant discount by using the Savings Plan. However, this requires prepayment.
There are options for “Encrypt Volume” and “Start Jupyter Notebook”. This time we will operate using Jupyter Notebook, so turn on its checkbox. Use “Encrypt Volume” if you want the data on the volume to be encrypted.
After changing the template settings, deploy from “Deploy On-Demand” at the bottom.

Wait for a while until the deployment is complete. When it’s done, a screen like the one below will appear.

Next, wait for Jupyter Lab to change from “Not Ready” to “Ready”, then click the Jupyter Lab link.

After a while, Jupyter Lab will start in your browser.

Installing Kohya ss_GUI
Once Jupyter Lab is launched, install Kohya ss_GUI.
Press the “Terminal” button located in the Other section at the bottom of the launcher list on the right side of the UI to launch the terminal.


After launching the terminal, confirm that the current directory is /workspace, then clone the Kohya ss_GUI repository by pasting and executing the following code.
git clone --recursive https://github.com/bmaltais/kohya_ss.git

The cloning should finish fairly quickly. Next, move into the kohya_ss directory and run the setup. Pasting the following code executes the directory change and setup in one go. This will take quite some time; wait patiently until “Installation completed... You can start the gui with ./gui.sh --share --headless” is displayed on the command line (approximately 10 to 15 minutes).
cd kohya_ss
./setup-runpod.sh

Once the setup is complete, launch Kohya ss_GUI with the following code.
./gui.sh --share --headless
After execution, a link will be displayed as shown in the image below after a while. Click on it to launch the Kohya ss_GUI UI in your browser.

Also, you can proceed with “clone > setup > launch” all at once with the following code.
git clone --recursive https://github.com/bmaltais/kohya_ss.git
cd kohya_ss
./setup-runpod.sh
./gui.sh --share --headless
Uploading a Dataset Using Jupyter Lab
Even if the installation of Kohya ss_GUI is complete, you can’t do anything as it is. You need to prepare a checkpoint model for training and a dataset of training images, etc.
First, create a new folder in the workspace, like “train”. ⚠️ You can create this directory structure as you like, but please change the file path explained later to match your structure.

Next, create a folder for training. This time, I named it DCAI-Girl-Illustrious. Name the folder so that the content of the LoRA to be trained is easy to identify.
Once the folder is created, move inside it and create three new folders: dataset, model, and log.
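The folder creation above can also be done from the terminal in one command. The project name below follows this guide’s example; replace it with your own:

```shell
# Create the project folder layout used in this guide (run from /workspace)
PROJECT=DCAI-Girl-Illustrious
mkdir -p "train/$PROJECT/dataset" "train/$PROJECT/model" "train/$PROJECT/log"
```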

Once the folder is created, move to the “dataset” folder. Then, drag and drop the dataset images into the folder to upload them.

Downloading a Training Model Using Jupyter Lab
Once the dataset upload is complete, the next step is to download the training checkpoint model.
Move to the train folder. Here, create a new folder called models.

Once you’ve moved to the folder, click on the “+” tab next to the command line tab that is currently open at the top right of the UI.
A launcher will appear, so open the same “Terminal” as before.

When the terminal opens, make sure the prompt reads root@HASH-NAME:/workspace/train/models#.
It is possible to upload the model directly from local, but it takes quite a bit of time to upload a file of several gigabytes, so let’s download it directly from HuggingFace or Civitai.
There are several ways to download files from the command line, but this time let’s use “wget” to download. ⚠️This method can only download models that do not require login. For models that require login (e.g., Flux.1 [Dev], etc.), use the “curl” command to download.
Let’s say you want to download the following model.
Go to the model page and copy the link with the “copy download link” button. Replace LINK and FILE_NAME in the code below and execute it.
wget --wait=10 --tries=10 "LINK" -O FILE_NAME.safetensors
In this case, it would look like this.
wget --wait=10 --tries=10 "https://huggingface.co/OnomaAIResearch/Illustrious-XL-v2.0/resolve/main/Illustrious-XL-v2.0.safetensors?download=true" -O illustriousXL_v2.safetensors

If you run the above code in the terminal, the model download will begin.

Also, if you want to download a model from Civitai, you can copy the link by right-clicking the “Download” button on the model page.
With that, the preparation for LoRA training is complete.
⚠️Precautions when using Kohya ss_GUI with RunPod
Usage is basically the same as running locally, but you cannot browse for file paths with an explorer as you can locally, so you need to enter paths directly.
How to Save the Config File
Once you open Kohya ss_GUI, switch to LoRA mode and open “Configuration” at the top.
Fill in “Load/Save Config file” as shown below. (You can change the file name as you like.)
/workspace/train/config.json

The config file is created in the /workspace/train folder with the 💾 button. You can call it up with the ↩️ button.
If you want to create a configuration file for each training LoRA, do as follows.
/workspace/train/DCAI-Girl-Illustrious/config.json
How to Specify the Training Model
Move to the model directory you downloaded in Jupyter Lab.
Right-click the downloaded model and copy the path with “Copy Path”.

In the “Model” section of Kohya ss_GUI, paste the path you just copied into “Pretrained model name or path”, typing a / first before pasting.

How to Specify a Dataset
Just like when specifying the model path, copy the path in Jupyter Lab and paste it into Kohya ss_GUI, adding a / at the beginning.

How to Stop Pod and Kohya ss_GUI
When stopping the Pod, please note that all data saved outside of /workspace will be deleted. With this configuration, no user data will be lost. However, packages installed during the Kohya ss_GUI setup, such as python3-tk, will be deleted, so you will need to run the setup again when you restart. *An official template with python3-tk etc. preinstalled would be convenient, but none is currently released.
With the above precautions in mind, we will stop.
The basic way to stop is to press the “Stop” button on the running Pod on RunPod’s Pods page. If you want to stop more carefully, open the Kohya ss_GUI terminal in Jupyter Lab and stop the virtual server with Ctrl + C before stopping the Pod.

The cost to the storage at the time of stopping is also displayed, so please refer to it.
Also, if you no longer need this Pod, or if you want to save on idle disk cost, you can completely delete the Pod with the “Terminate” button. *If you are going to delete it completely, make sure to download the necessary data first.
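One convenient way to grab everything before terminating is to bundle the project’s output folder into a single archive and download that one file from Jupyter Lab. A minimal sketch, using this guide’s example folder names (the mkdir line is a no-op on the Pod; it only creates a sample layout so the command is runnable as-is elsewhere):

```shell
# Bundle the trained LoRA output folder into one archive for easy download.
# On the Pod, run this from /workspace/train; adjust the folder name to your project.
mkdir -p DCAI-Girl-Illustrious/model   # no-op on the Pod; creates a sample layout elsewhere
tar czf DCAI-Girl-Illustrious-outputs.tar.gz DCAI-Girl-Illustrious/model
```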
How to Restart Pod and Kohya ss_GUI
As explained in the section on stopping the Pod, all data saved outside /workspace is erased, so when restarting you need to start from the Kohya ss_GUI setup.
On RunPod’s Pods page, click the stopped Pod you want to start; a button showing the usage price, such as “Start for $0.69/hr”, will appear, so click it to start. After starting, open Jupyter Lab.

As explained in the installation of Kohya ss_GUI, press the “Terminal” button to start the terminal.


After starting the terminal, the following commands will perform everything from setup to starting Kohya ss_GUI in one go.
apt update -y && apt install -y python3-tk
cd kohya_ss
./gui.sh --share --headless
The setup takes about 30 seconds to 1 minute. Once Kohya ss_GUI starts, a URL will appear, so click to open the UI and the startup is complete.
Troubleshooting
I can’t proceed any further due to an error
RunPod can be quite unstable, and things that used to work can suddenly stop working due to an error.
When it becomes unstable, try restarting the Pod. Most problems can be solved this way.
If that doesn’t solve the problem, try deploying a new Pod. It’s a hassle, but it can also solve the problem.
There’s no GPU when I try to start the Pod (Zero GPU Pods)

This is a common occurrence in busy regions. All you can do is wait and check for availability later.
If you really need to use it at that time, it’s a hassle, but it might be faster to rent a new Pod.
If you want to move your data to a new Pod, start the Pod that couldn’t get a GPU with 0 GPUs (CPU only), and then use Jupyter Lab to download the necessary data.
If moving data back and forth is a hassle, consider using paid Network Storage. Network Storage is a storage that can be accessed across Pods, so you can use it immediately by connecting it to an available Pod.
How to Use Kohya ss_GUI
Now that you’ve come this far, you can use Kohya ss_GUI just as you would locally. Its usage is explained in detail in the following articles.
Please refer to the following articles for training methods for original character LoRA, from SD1.5 to NoobAI XL.
![Featured image of How to create an original character LoRA [Illustrious-XL Training] Illustrious-XL Character training](https://dca.data-hub-center.com/content/uploads/2025/07/eye_catch_original-character-lora-illustrious-character-training-en.jpg)
![Featured image of How to create an original character LoRA [NoobAI XL Training] NoobAI XL Character training](https://dca.data-hub-center.com/content/uploads/2025/07/eye_catch_original-character-lora-noobai-character-training-en.jpg)
How to Download Models like Flux.1 that Require Login to Jupyter Lab
Flux.1 and similar models require you to log in before downloading. You cannot download them with plain wget because you lack download permissions. This time, I will introduce two ways to download “Flux.1 Dev” from Civitai using the curl command.
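The curl approach can be sketched as follows. This is only a sketch under assumptions: CIVITAI_TOKEN and MODEL_VERSION_ID are placeholders, which you must replace with an API token generated in your Civitai account settings and the version ID from the model page.

```shell
# Sketch: download a login-gated model with curl and an API token.
# $CIVITAI_TOKEN and MODEL_VERSION_ID are placeholders, not real values.
download_gated_model() {
  local url="$1" out="$2"
  # Send the token in the Authorization header; -L follows redirects to the file host
  curl -L -H "Authorization: Bearer ${CIVITAI_TOKEN:?set your API token first}" \
    "$url" -o "$out"
}
# Usage (replace MODEL_VERSION_ID with the version ID from the model page):
# download_gated_model "https://civitai.com/api/download/models/MODEL_VERSION_ID" flux1-dev.safetensors
```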
Conclusion
With RunPod, you don’t need to buy an expensive GPU worth thousands of dollars—you can simply rent high-performance GPUs at a low cost whenever you need them for LoRA training. By deploying a Pod and running Kohya ss_GUI through Jupyter Lab, you can work almost the same way as on your local PC, and once training is finished, you can delete the Pod to avoid ongoing storage costs. While there are some caveats, such as occasional instability and the need to reinstall setups after restarts, the ability to handle large-scale models like SDXL or Flux.1 efficiently and affordably is a major advantage. For those struggling with limited hardware or the inconvenience of not being able to use their PC during training, RunPod is a highly practical solution.