How to Use Qwen-Image-Edit-2511 in ComfyUI: Camera Angle LoRA Workflow

In this article, I introduce “Qwen-Image-Edit”, published by Alibaba’s Qwen team. Qwen-Image-Edit has gone through multiple version upgrades since its initial release in August 2025. This article focuses on “Qwen-Image-Edit-2511”, the latest version at the time of writing.
While similar image editing models like “FLUX.2” and “FLUX.1 Kontext” are also popular, Qwen-Image-Edit is easy to run locally under the Apache 2.0 license, excels at photography and commercial use cases, and can also handle illustration-style content — which is why I’m introducing it here.
What You’ll Learn in This Article
- Overview, version history, and architecture of Qwen-Image-Edit.
- How to set up and use the ComfyUI template “Qwen Image Edit 2511 – Material Replacement”.
- Walkthrough of the DCAI custom workflow with GGUF, MultiAngle Camera LoRA, and SeedVR2 upscaler support. (💎 Members only)
- How to convert camera angles using MultiAngle Camera LoRA. (💎 Members only)
- Issues with SageAttention on Windows and how to work around them. (💎 Members only)
What Is Qwen-Image-Edit?
What Is Qwen?
“Qwen” is a large-scale model family developed by Alibaba Cloud. The Qwen team continuously releases LLMs, multimodal models, and AGI-related projects, and in 2025 it released a notable model in the image editing field.

Qwen-Image-Edit is an image editing-focused version of the Qwen-Image series, released by Alibaba’s Qwen team. At the time of writing this article, “Qwen-Image-Edit-2511”, which has gone through two updates, is the latest version.
| Version | Release Date | Key Changes |
|---|---|---|
| Qwen-Image-Edit | August 2025 | Initial release |
| Qwen-Image-Edit-2509 | September 2025 | Multi-image editing support, improved character consistency |
| Qwen-Image-Edit-2511 | December 2025 | Significant character consistency improvements, LoRA integration, enhanced industrial design |
Architecture
Qwen-Image-Edit uses Qwen-Image as its backbone: a 20-billion-parameter (20B) multimodal diffusion transformer (MMDiT). It adopts dual-pass input to enable editing tasks.
Specifically, the input image is simultaneously passed through two encoders. The Qwen2.5-VL (vision-language encoder) captures high-level semantics such as object identity and scene context, while the VAE encoder encodes appearance information such as color, texture, and lighting. By fusing these two latent representations in the MMDiT’s diffusion core, editing becomes possible while independently controlling semantics and appearance.
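The dual-pass idea can be illustrated with a toy sketch (this is not the actual Qwen implementation; the token counts and dimensions are made up for illustration): one encoder produces a few high-level "semantic" tokens, the other produces many low-level "appearance" tokens, and the diffusion core attends over their concatenation.

```python
import numpy as np

# Toy illustration of dual-pass conditioning (NOT the real Qwen code).
def encode_semantic(image, dim=8):
    # stand-in for Qwen2.5-VL: a few high-level tokens
    rng = np.random.default_rng(0)
    return rng.standard_normal((4, dim))

def encode_appearance(image, dim=8):
    # stand-in for the VAE encoder: many low-level tokens
    rng = np.random.default_rng(1)
    return rng.standard_normal((16, dim))

def fuse(image):
    sem = encode_semantic(image)
    app = encode_appearance(image)
    # the MMDiT receives both streams jointly
    return np.concatenate([sem, app], axis=0)

tokens = fuse(image=None)
print(tokens.shape)  # (20, 8): 4 semantic + 16 appearance tokens
```

Because the two streams stay distinct until they meet inside the diffusion core, the model can weight them differently depending on whether an edit targets meaning or appearance.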
Key Features
It supports both semantic editing (high-level) and appearance editing (low-level). Semantic editing enables IP character creation, object rotation, style transfer, and more, while appearance editing allows adding, removing, or modifying elements while fully preserving everything outside the specified region.
Notably, the text editing feature supports both Chinese and English, allowing text within images to be added, removed, or modified while preserving the original font, size, and style.
New features in Qwen-Image-Edit-2509: Simultaneous editing of multiple images is now supported, with combinations such as “person + person”, “person + product”, and “person + scene” (optimal performance with up to 3 images). Single-image consistency has also been significantly improved.
New features in Qwen-Image-Edit-2511: Popular community-created LoRAs have been integrated into the base model without additional tuning, along with significant improvements in character consistency and enhancements to industrial product design and geometric reasoning.
The following two features have been officially confirmed as integrated functions (the specific LoRA names and creators have not been disclosed).
- Lighting Enhancement: Secondary relighting is now possible, allowing you to remove the existing lighting from the input image and apply the lighting from the reference image.
- Novel View Synthesis: It is now possible to move, rotate, and adjust the elevation angle of the camera.
Note that Qwen-Image-Edit particularly excels at use cases such as real photo editing, product photography, and text editing. It tends to struggle with edits that require fully preserving anime or illustration styles, but dedicated LoRAs can help address this in some cases.
License
At the time of writing, the license is Apache 2.0, making it an open-source model available for commercial use. ✅ Licenses are subject to change. If you are considering commercial use of the model, such as for generation services, always check the latest license.
How to Use the Qwen Image Edit 2511 – Material Replacement Template in ComfyUI
Let’s start by looking at the workflow example for “Qwen Image Edit 2511 – Material Replacement” from the template.
Open the template list and select Image from the GENERATION TYPE menu on the left.
Image-related templates will appear; select “Qwen Image Edit 2511 – Material Replacement”.
Opening the template will display the missing models. Download them by following the instructions, and you can run it right away.
The official documentation is as follows.
Models Used in the Qwen Image Edit 2511 - Material Replacement Workflow
- Diffusion Model
- Text Encoders
- VAE
- LoRA (Optional – 4-step Lightning acceleration)

Input Images for the Qwen Image Edit 2511 - Material Replacement Workflow
The two input images used in Qwen Image Edit 2511 – Material Replacement were not included in the official documentation. You can download them by opening the workflow in ComfyUI Cloud.
If you just need the assets, the same files are also uploaded to the drive below.
Nodes in the Qwen Image Edit 2511 - Material Replacement Workflow
This template uses subgraphs for a simple and easy-to-use layout. The key nodes are arranged in the main subgraph “Image Edit(Qwen-Image 2511)”, so let’s take a look.
FluxKontextImageScale
This node resizes input images to a size appropriate for Flux Kontext using the Lanczos algorithm.
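Conceptually, the node snaps the input to a preferred resolution with a similar aspect ratio and resizes with Lanczos. Here is a minimal sketch of that behavior; the bucket list below is an assumption for illustration, not the node's exact resolution table.

```python
from PIL import Image

# Assumed bucket list for illustration; the real node has its own table.
BUCKETS = [(672, 1568), (768, 1360), (1024, 1024), (1360, 768), (1568, 672)]

def pick_bucket(w, h):
    # choose the preset whose aspect ratio is closest to the input's
    ratio = w / h
    return min(BUCKETS, key=lambda wh: abs(wh[0] / wh[1] - ratio))

def kontext_scale(img: Image.Image) -> Image.Image:
    w, h = pick_bucket(*img.size)
    return img.resize((w, h), Image.LANCZOS)

square = Image.new("RGB", (900, 880))
print(kontext_scale(square).size)  # (1024, 1024)
```

Snapping to known-good resolutions matters because Flux Kontext (and similar edit models) behave best at the resolutions they were trained on.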
TextEncodeQwenImageEditPlus
This node takes text instructions and reference images (up to 3) and converts the “edit this image like this” information into conditioning.
It combines text prompts and images to generate conditioning data in a format that the Qwen Image Edit model can understand. When an optional VAE is connected, it also simultaneously generates reference latents (the reference image converted into a set of numerical values the model can process) from the input image.
ModelSamplingAuraFlow
This node applies special sampling settings designed for the AuraFlow model architecture to the model, inheriting SD3’s sampling framework. The shift parameter adjusts the sampling distribution.
AuraFlow is used here because AuraFlow and SD3/SD3.5 calculate shift using the same “linear scaling” method. In contrast, Flux uses an “exponential scaling” method, and the shift varies depending on image size. The AuraFlow node was likely chosen because it was the simplest way to specify a fixed shift value.
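The difference between the two scaling methods can be sketched as follows. The first function is the SD3/AuraFlow-style fixed shift; the second shows the Flux-style approach, where the effective shift is an exponential of a value that grows with image size. The constants are assumptions for illustration, not ComfyUI's source.

```python
import math

def shift_sigma(sigma, shift):
    # SD3/AuraFlow-style: one fixed shift warps every sigma the same way
    return shift * sigma / (1 + (shift - 1) * sigma)

def flux_shift(sigma, seq_len, base=0.5, max_=1.15, base_len=256, max_len=4096):
    # Flux-style: shift = exp(mu), where mu scales linearly with the
    # image token count, so larger images get a larger effective shift
    m = (max_ - base) / (max_len - base_len)
    mu = m * seq_len + (base - m * base_len)
    return shift_sigma(sigma, math.exp(mu))

print(round(shift_sigma(0.5, 3.0), 3))  # 0.75
```

With a fixed-shift node, the same sigma schedule is produced regardless of resolution, which is exactly the simple, predictable behavior the workflow wants.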
CFGNorm
This node can be used when you want a higher CFG scale (to increase prompt adherence) while suppressing image quality breakdown and color saturation. In editing workflows like Qwen Image Edit, it is used in combination to increase adherence to edit instructions while preventing image degradation. ✅ When using the Lightning 4-step LoRA, CFG becomes 1, so this node has no effect.
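A rough sketch of the idea (assumed behavior for illustration, not ComfyUI's exact code): after classifier-free guidance amplifies the prediction, rescale it so its magnitude matches the conditional prediction's magnitude, keeping the guided direction but clamping the norm that causes color saturation.

```python
import numpy as np

def cfg(cond, uncond, scale):
    # standard classifier-free guidance
    return uncond + scale * (cond - uncond)

def cfg_norm(cond, uncond, scale, strength=1.0):
    guided = cfg(cond, uncond, scale)
    norm_guided = np.linalg.norm(guided)
    norm_cond = np.linalg.norm(cond)
    # keep the direction from CFG, but clamp the magnitude
    return guided * (strength * norm_cond / norm_guided)

rng = np.random.default_rng(0)
cond, uncond = rng.standard_normal(8), rng.standard_normal(8)
out = cfg_norm(cond, uncond, scale=6.0)
print(np.isclose(np.linalg.norm(out), np.linalg.norm(cond)))  # True
```

This also makes clear why the node is a no-op at CFG 1: with scale=1 the guided prediction is already the conditional prediction, so there is nothing to rescale.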
Edit Model Reference Method(FluxKontextMultiReferenceLatentMethod)
The “FluxKontextMultiReferenceLatentMethod” node is for modifying conditioning data and setting a specific reference latent processing method.
The reference_latents_method here is index_timestep_zero. In Flow Matching-based models, timestep t=0 means “a completely clean image (no noise)”. In short, this node tells the model that “this reference image is a finished product with no noise applied” while generation proceeds.
Switch
This is a branching node that selects one of two inputs based on a condition and outputs it. Previously, custom nodes like Crystools were used for this, but it is now included as a core node.
How to Use the Qwen Image Edit 2511 - Material Replacement Workflow
Now let’s use the workflow. ⚠️ First, please be aware that if you are using ComfyUI with SageAttention 2.2 on Windows, the generation results will be completely black. Remove --use-sage-attention from the ComfyUI launch command to disable SageAttention. ✅ There is also a way to apply SageAttention to Qwen Image Edit, which will be explained in a later paid article.
Now let’s go through the steps.
- Load input images: Load the downloaded sofa image into the “Load Image” node at the top. Then load the fur texture image into the node below.
- Enter the prompt: Enter the editing instructions. The sample uses the following:
  Change the furniture leather difference in image 1 to the fur material in image 2.
- Enable/disable Lightning 4-Step LoRA: Set the value after the prompt to True to enable the Lightning 4-step LoRA.
- Load models: Verify that the appropriate models are loaded for the unet/text-encoder/vae/lora.
- Run the graph: Finally, execute the graph with the “Run” button. After a short wait, the edited image will be generated.
Generation Results
These are the results for the standard 40 steps and the Lightning 4-step LoRA. The same seed is used, but there is a noticeable difference in the fur texture. Generation times were measured in the author’s environment (RTX 3090).


For reference, below are the results using SageAttention. The same seed is used here as well. A slight quality degradation can be seen in the 40-step result.


Introducing the DCAI Qwen Image Edit 2511 + Multi-Angle Camera LoRA Workflow
In this section, I customize the Qwen Image Edit 2511 – Material Replacement template to create a general-purpose workflow for Qwen Image Edit 2511. The customizations are as follows.
- GGUF support
- MultiAngle Camera LoRA integration
- SeedVR2 Video Upscaler integration
- Windows SageAttention support
The workflow and input assets are available on Patreon. Only paid supporters can view and download them.

Sample Results
A character image and background image were composited, and the camera angle was further modified.
Summary
Qwen-Image-Edit-2511 is an image editing-focused model available for commercial use under the Apache 2.0 license. Its dual-pass processing via a 20B parameter MMDiT architecture enables editing while independently controlling semantics and appearance. I found that it excels at real photos, product photography, and text editing, while also being capable enough for illustration-style content.
Starting from the ComfyUI template, I introduced a DCAI custom workflow that incorporates GGUF, MultiAngle Camera LoRA, SeedVR2 upscaler, and Windows SageAttention support. Thanks to ComfyUI’s native support, the process from downloading models to executing the workflow is well-organized, making this a relatively easy model to get started with. The workflow and input assets are available on Patreon, so please make use of them.
Thank you for reading to the end.
If you found this even a little helpful, please support by giving it a “Like”!

