Wan2.2 Animate Guide | How to Turn Images into Videos with ComfyUI

In this article, I will explain “Wan2.2 Animate” in detail. Continuing my series on Wan2.2 models, this entry covers Wan2.2 Animate, a model that combines Image-to-Video (I2V) and Video-to-Video (V2V) technologies. By pairing a reference image with a reference video, you can freely control both the appearance and the motion of the generated video.
What You Will Learn
- The basic mechanism of Wan2.2 Animate and the differences between its two modes (Character Animation / Character Replacement)
- How to set up the ComfyUI official workflow, including required models and custom nodes
- Parameter explanations for each node (Points Editor, SAM2, WanAnimateImageToVideo, etc.)
- Preparing input materials, creating control videos, and generating reference images (💎Premium Article)
- High-quality video generation using a custom workflow and how to extend sampling (💎Premium Article)
- How to calculate frame counts and mask settings in Character Replacement mode (💎Premium Article)
What Is Wan2.2 Animate
“Wan2.2 Animate” is a model that applies the motion from a video to a character in an image by combining a single image (reference image) with a motion source video (reference video).
Specifically, it integrates Image-to-Video (I2V) and Video-to-Video (V2V) elements. Its key feature is the ability to apply motion (from the reference video) while preserving the character’s appearance (from the reference image).
Officially titled “Unified Character Animation and Replacement with Holistic Replication,” it specializes in animating characters while preserving their overall features.
Two Modes of Wan2.2 Animate
Wan2.2 Animate has two main modes:
- Character Animation Mode: A mode that applies only the motion from the reference video to the character in the reference image. The background is either preserved from the reference image or replaced with a simple background.
- Character Replacement Mode: A mode that swaps the character in the reference video with the character from the reference image. The background, lighting, and camera work from the reference video are preserved as-is.
By choosing the appropriate mode for your purpose, you can greatly expand the range of your video expressions.
Exploring the ComfyUI Template “Wan2.2 Animate”
Since this is not included in the ComfyUI templates, let’s download the workflow from the official ComfyUI tutorial page.
This workflow includes both “Character Animation Mode” and “Character Replacement Mode.” ✅Right after loading, it defaults to “Character Replacement Mode.”

Models Used in "Wan2.2 Animate"
The workflow uses the following models. It also requires a CLIP Vision model; place it in ComfyUI/models/clip_vision/.
Input Materials for the ComfyUI "Wan2.2 Animate" Workflow
The images and videos used as input are posted midway through the linked page. Download both the Reference Image and the Reference Video.
Custom Nodes for the ComfyUI "Wan2.2 Animate" Workflow
This workflow uses the following three custom nodes. Make sure to install them in advance.
Nodes in the ComfyUI "Wan2.2 Animate" Workflow
Since the basic structure is shared with other Wan2.2 series workflows, we will focus on the newly added nodes here.
About the Points Editor Node
The “Points Editor” node is used to specify arbitrary coordinates (points) on an image. It is used for point selection when performing segmentation with SAM2. The role of each parameter is as follows:
- bg_image: Loads the image on which to specify points.
- points_store: Internal string data that holds point information.
- bbox_store: Internal string data that holds bounding box information.
- bbox_format: Specifies the bounding box format (xyxy/xywh, etc.).
- width: Specifies the width of the editing canvas.
- height: Specifies the height of the editing canvas.
- normalize: When set to True, coordinates are output as ratios (0.0–1.0) relative to the image size. When set to False, coordinates are output as absolute pixel values (see the sketch below).
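To make the normalize setting concrete, here is a minimal Python sketch of the ratio/pixel conversion it implies. The dict layout is illustrative only, not the node's internal points_store schema:

```python
# Minimal sketch: converting Points Editor-style coordinates between
# normalized (0.0-1.0) ratios and absolute pixel values.

def to_pixels(points, width, height):
    """Convert normalized (0.0-1.0) points to absolute pixel coordinates."""
    return [{"x": p["x"] * width, "y": p["y"] * height} for p in points]

def to_normalized(points, width, height):
    """Convert pixel points to ratios relative to the image size."""
    return [{"x": p["x"] / width, "y": p["y"] / height} for p in points]

# Example: a point at the center of an 832x480 canvas
print(to_pixels([{"x": 0.5, "y": 0.5}], 832, 480))  # [{'x': 416.0, 'y': 240.0}]
```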
About the (Down)Load SAM2Model Node
The “(Down)Load SAM2Model” node loads the SAM2 (Segment Anything Model 2) model. If the model is not available locally, it will be downloaded automatically.
- model: Selects the SAM2 model to use (e.g., sam2_hiera_large.pt).
- segmentor: Specifies the segmentation mode to use (single/video/automaskgenerator). automaskgenerator automatically generates masks from the entire image.
- device: Specifies the device for computation (cuda/cpu).
- precision: Specifies the computation precision (fp16/bf16, etc.).
About the Sam2Segmentation Node
The “Sam2Segmentation” node uses the SAM2 model and specified point information to segment (extract) specific objects from an image.
- sam2_model: Input the loaded SAM2 model.
- image: Input the image to be segmented.
- coordinates_positive: Input the positive coordinates (areas to include) of the segmentation target. Accepts coordinate data from the “Points Editor” node, etc.
- coordinates_negative: Input the negative coordinates (areas to exclude) from the segmentation.
- bboxes: Input the bounding boxes (rectangular areas) surrounding the segmentation target.
- mask: Input an existing mask image to use as a hint for segmentation.
- keep_model_loaded: Sets whether to keep the model in VRAM after processing.
- individual_objects: When set to True, each detected object is processed individually.
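If you want to see what point-prompted segmentation looks like outside ComfyUI, here is a hedged sketch using Meta's standalone sam2 package (this is not the node's own code; the file name and click coordinates are placeholders):

```python
import numpy as np
import torch
from PIL import Image
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Load SAM2 from the Hugging Face hub (downloads on first use).
predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")

# "frame.png" is a placeholder; use any frame you want to segment.
image = np.array(Image.open("frame.png").convert("RGB"))

with torch.inference_mode():
    predictor.set_image(image)
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[416, 240]]),  # pixel coords of a positive click
        point_labels=np.array([1]),           # 1 = include, 0 = exclude
        multimask_output=False,
    )
# masks[0] is a (H, W) binary array covering the clicked object.
```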
About the Grow Mask Node
The “Grow Mask” node expands (grows) or shrinks a mask area. It is commonly used for fine-tuning segmentation boundaries.
- mask: Input the mask to process.
- expand: Specifies the number of pixels to expand the mask. A negative value shrinks it.
- tapered_corners: When set to True, corners are rounded for a more natural shape.
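Under the hood, growing a mask is a morphological dilation. A minimal PyTorch sketch of the idea, assuming a (H, W) binary mask (an illustration, not the node's actual implementation):

```python
import torch
import torch.nn.functional as F

def grow_mask(mask: torch.Tensor, expand: int) -> torch.Tensor:
    """Dilate (expand > 0) or erode (expand < 0) a (H, W) binary mask.

    Max-pooling dilates the mask; eroding is just dilating the inverse.
    """
    if expand == 0:
        return mask
    m = mask if expand > 0 else 1.0 - mask
    k = 2 * abs(expand) + 1  # odd kernel so the mask stays centered
    m = F.max_pool2d(m[None, None], kernel_size=k, stride=1,
                     padding=abs(expand))[0, 0]
    return m if expand > 0 else 1.0 - m
```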
About the Blockify Mask Node
The “Blockify Mask” node converts a mask into block-like (mosaic-like) shapes. It is used for removing fine noise or achieving specific stylization effects.
- mask: Input the mask to process.
- block_size: Specifies the block size. Larger values produce coarser blocks.
- device: Specifies the device for computation (cpu/gpu). The default is cpu.
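The blockify effect can be approximated by average-pooling the mask into cells and upsampling with nearest-neighbor interpolation. A sketch of that idea (illustrative, not the node's own code):

```python
import torch
import torch.nn.functional as F

def blockify_mask(mask: torch.Tensor, block_size: int) -> torch.Tensor:
    """Quantize a (H, W) mask into block_size x block_size cells.

    Averaging each cell, then nearest-neighbor upsampling, gives the
    mosaic look; thresholding at 0.5 keeps the result binary.
    """
    h, w = mask.shape
    small = F.avg_pool2d(mask[None, None], kernel_size=block_size,
                         stride=block_size, ceil_mode=True)
    big = F.interpolate(small, scale_factor=block_size, mode="nearest")[0, 0]
    return (big[:h, :w] > 0.5).float()
```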
About the WanAnimateImageToVideo Node
The “WanAnimateImageToVideo” node is the core node of Wan2.2 Animate: it generates a video by combining a reference image with a reference video.
- positive/negative: Input the prompts.
- vae: Input the VAE.
- clip_vision_output: Input the clip vision output.
- reference_image: Input the reference image of the character or subject.
- face_video: Input for applying a control video for facial expression movements.
- pose_video: Input for applying a control video for full-body poses and movements.
- background_video: Input the video used in Character Replacement mode.
- charactor_mask: Input the mask specifying the area of the subject to replace on the background_video in Character Replacement mode.
- continue_motion: Sets whether the generated video continues the motion from a previous video (or frame). Used for generating longer videos.
- width/height: Sets the video dimensions.
- length: Sets the video length in number of frames.
- batch_size: Sets the number of videos to generate at once.
- continue_motion_max_frames: Sets the number of reference frames for motion continuation. By referencing this many frames from the end of the previous video, it helps create smoother transitions between segments.
- video_frame_offset: Sets the starting frame position for reading the reference video.
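One practical note on length: Wan-family models compress time by a factor of 4 in the VAE, so frame counts of the form 4n + 1 (e.g., 77, 81) line up cleanly with the latent grid. A small helper based on that rule of thumb (the 16 fps figure is an assumption for Wan2.2 output, not something this workflow enforces):

```python
# Rule-of-thumb helpers for picking the "length" value of WanAnimateImageToVideo.

def snap_length(requested_frames: int) -> int:
    """Round a requested frame count up to the nearest 4*n + 1."""
    n = max(0, -(-(requested_frames - 1) // 4))  # ceil((requested - 1) / 4)
    return 4 * n + 1

def seconds_to_length(seconds: float, fps: int = 16) -> int:
    """Convert a clip duration to a frame count (16 fps assumed)."""
    return snap_length(round(seconds * fps))

print(seconds_to_length(5))  # 81 frames for a ~5 s clip at 16 fps
```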
About the TrimVideoLatent Node
The “TrimVideoLatent” node trims (cuts) the latent space data of a video from the beginning by a specified number of frames. It is used to adjust the length of reference videos.
- samples: Input the latent data to trim.
- trim_amount: Specifies the number of frames to trim from the beginning of the video. Increasing this value makes the video shorter.
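Since trim_amount counts latent frames, it helps to know how pixel frames map to latent frames. Assuming the Wan VAE's 4x temporal compression (the first frame is kept as-is, then one latent frame per four pixel frames), a quick sketch:

```python
def pixel_to_latent_frames(pixel_frames: int) -> int:
    """Pixel-to-latent frame count under 4x temporal compression (assumed)."""
    return (pixel_frames - 1) // 4 + 1

print(pixel_to_latent_frames(81))  # 21 latent frames for an 81-frame video
```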
About the ImageFromBatch Node
The “ImageFromBatch” node extracts specific images from an image batch.
- image: Input the batch of images.
- batch_index: Specifies the index (number) of the image to extract (starting from 0).
- length: Specifies how many images to extract (up to 4096 frames).
About the Batch Images Node
The “Batch Images” node combines multiple images into a single batch.
- image1: Input the first image.
- image2: Input the second image. These are combined and output.
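Both of these batch nodes boil down to simple tensor operations on ComfyUI's (batch, height, width, channels) image tensors. A self-contained illustration of what they do:

```python
import torch

# ComfyUI passes images as (batch, height, width, channels) tensors.
batch = torch.rand(10, 480, 832, 3)  # e.g., 10 frames of a decoded video

# ImageFromBatch: take `length` frames starting at `batch_index`.
batch_index, length = 2, 3
extracted = batch[batch_index:batch_index + length]  # shape (3, 480, 832, 3)

# Batch Images: concatenate two batches along the batch dimension.
image1, image2 = batch[:5], batch[5:]
combined = torch.cat([image1, image2], dim=0)        # shape (10, 480, 832, 3)
```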
How to Use the ComfyUI "Wan2.2 Animate" Workflow
Let’s try using the workflow in practice. Here, we will explain the steps using Character Replacement mode as an example.
- Load the downloaded reference image into “Load Image.”
- Load the downloaded reference video into “Load Video.”
- After confirming that all models are loaded correctly, click the “Run” button to execute.
After a while, a video will be generated in which the character from the reference image moves with the motion from the reference video. ✅The input is displayed at the top of the video, and you can see that this input video was not captured well.
Customizing the Official Workflow
From here, we will customize the official workflow for practical use. This time, splitting each process into separate workflows proved more convenient, so I divided it into three workflows. Including preparation, there are four steps in total.
- Preparing the input video: Preparing the video to be converted into a control video.
- Converting video to control video: Use DCAI_Video2Control-OpenPose to create a control video from the input video.
- Creating the reference image: Use DCAI_Control2RefImage-OpenPose to generate a reference image from the control video.
- Generating the video: Use DCAI_wan2_2_14B_animate to generate a video using the control video and reference image.
Workflow Generation Examples
When you run the custom workflow, a high-quality video is generated that applies the motion from the reference video to the character in the reference image. ⚠️In this example, a reference video with audio is used. Please be aware that sound will play when you unmute.
For this example, we used 🔗a video by satynek from Pixabay.
Wan2.2 Animate has two modes. The video above was generated using “Character Animation Mode,” and the video below was generated using “Character Replacement Mode.”
The workflows and input materials are available on Patreon. They are only accessible for download by paid supporters.

Summary
In this article, we covered everything from the basic mechanism of Wan2.2 Animate to practical usage in ComfyUI. Wan2.2 Animate is a powerful model that lets you add natural motion to characters while preserving their appearance by combining a reference image and a reference video. By using the two modes, “Character Animation Mode” and “Character Replacement Mode,” you can achieve a wide range of video expressions. Take advantage of the custom workflows and try generating high-quality videos for yourself.
Thank you for reading to the end.
If you found this even a little helpful, please support by giving it a “Like”!

