Wan 2.2 is a Mixture-of-Experts (MoE) AI video model from Wan AI. It activates 14B of its 27B total parameters during generation, which keeps cinematic output efficient. It supports text-to-video, image-to-video, and speech-to-video generation. Turbo mode delivers 720p video at 24 fps with prompt-controlled camera motion (zooms, pans, dolly shots) and strong temporal stability.
Speech-to-video (S2V) mode: provide an audio input and Wan 2.2 generates a matching video with synchronized visuals.
Mixture-of-Experts architecture with 14B active parameters (from 27B total) delivers Turbo-speed generation without sacrificing cinematic quality.
Outputs cinematic 720p video at 24 frames per second with prompt-guided camera motion — zoom, pan, dolly — and strong temporal stability.
From photorealistic to anime, oil painting to cyberpunk — Wan 2.2 handles diverse artistic styles with consistent quality.
Pick text-to-video, image-to-video, or speech-to-video mode depending on your input.
Describe your scene, including camera direction such as a zoom, pan, or dolly shot. For image-to-video or speech-to-video mode, also upload the reference image or audio file.
Get your cinematic 720p video fast — Wan 2.2 Turbo is one of the quickest models available.
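For readers who script their generations, the three steps above can be sketched as a small input check. This is only an illustration under stated assumptions: the function `build_request`, the field names, and the dictionary layout are hypothetical and do not describe any official Wan 2.2 API. The 720p and 24 fps values are simply the output figures quoted on this page.

```python
from typing import Optional

# The three modes named on this page.
MODES = ("text-to-video", "image-to-video", "speech-to-video")

# Modes that need an uploaded reference file (an image or an audio clip).
MODES_NEEDING_FILE = ("image-to-video", "speech-to-video")


def build_request(mode: str, prompt: str, reference_file: Optional[str] = None) -> dict:
    """Assemble a hypothetical request description (not a real Wan 2.2 API call)."""
    # Step 1: pick one of the three modes.
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")

    # Step 2: describe the scene, ideally with camera direction.
    if not prompt.strip():
        raise ValueError("prompt must not be empty")

    # Step 2 (continued): image and speech modes need a reference file.
    if mode in MODES_NEEDING_FILE and reference_file is None:
        raise ValueError(f"{mode} requires a reference file")

    # Step 3: the output figures quoted on this page, recorded for reference only.
    return {
        "mode": mode,
        "prompt": prompt.strip(),
        "reference_file": reference_file,
        "resolution": "720p",
        "fps": 24,
    }


if __name__ == "__main__":
    request = build_request(
        "text-to-video",
        "Slow dolly shot toward a lighthouse at dusk, oil painting style",
    )
    print(request)
```

Checking the mode and the reference file before anything is submitted catches the most common mistake, which is choosing image or speech mode without attaching the file it depends on.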
Everything about Wan 2.2