Video Model

Wan 2.2

Efficient AI Video with Speech-to-Video and Turbo Speed

Wan 2.2 is a Mixture-of-Experts (MoE) AI video model from Wan AI that activates 14B of its 27B parameters for efficient cinematic output. It supports text-to-video, image-to-video, and speech-to-video generation. Turbo mode delivers 720p at 24fps with prompt-controlled camera motion (zooms, pans, dolly shots) and strong temporal stability.

Key Features

Speech-to-Video

Speech-to-video (S2V) mode: provide audio input and Wan 2.2 generates a video with visuals synchronized to it.

MoE Architecture

Mixture-of-Experts architecture with 14B active parameters (from 27B total) delivers Turbo-speed generation without sacrificing cinematic quality.

Cinematic 720p @ 24fps

Outputs cinematic 720p video at 24 frames per second with prompt-guided camera motion — zoom, pan, dolly — and strong temporal stability.

Style Versatility

From photorealistic to anime, oil painting to cyberpunk — Wan 2.2 handles diverse artistic styles with consistent quality.

How to Use

  1. Choose Your Mode

    Pick text-to-video, image-to-video, or speech-to-video mode depending on your input.

  2. Write Prompt & Upload

    Describe your scene with camera direction. For image/speech mode, upload the reference file.

  3. Generate

    Get your cinematic 720p video fast — Wan 2.2 Turbo is one of the quickest models available.
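The three steps above can be sketched as a small input check that runs before submission. This is an illustrative sketch only: the page does not document a programmatic API, so `GenerationRequest`, `build_request`, the mode strings, and the file name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mode names mirroring the three modes described above.
# Modes that need an uploaded reference file, and the kind of file they need.
MODES_NEEDING_FILE = {"image-to-video": "image", "speech-to-video": "audio"}
ALL_MODES = {"text-to-video", *MODES_NEEDING_FILE}


@dataclass
class GenerationRequest:
    mode: str
    prompt: str
    reference_file: Optional[str] = None  # path to the uploaded image or audio


def build_request(
    mode: str, prompt: str, reference_file: Optional[str] = None
) -> GenerationRequest:
    """Check the inputs from steps 1 and 2 before generating (step 3)."""
    if mode not in ALL_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not prompt.strip():
        raise ValueError("prompt must describe the scene and camera direction")
    if mode in MODES_NEEDING_FILE and reference_file is None:
        kind = MODES_NEEDING_FILE[mode]
        raise ValueError(f"{mode} needs an uploaded {kind} file")
    return GenerationRequest(mode, prompt.strip(), reference_file)


request = build_request(
    "image-to-video",
    "Slow dolly shot toward a lighthouse at dusk, photorealistic",
    reference_file="lighthouse.jpg",  # hypothetical file name
)
print(request)
```

Calling `build_request("speech-to-video", "A singer on stage")` without a file raises `ValueError`, which matches the upload requirement in step 2.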

Frequently Asked Questions

Everything about Wan 2.2

Wan 2.2 is a Mixture-of-Experts AI video model from Wan AI. It uses 14B active parameters from a 27B model to deliver cinematic 720p video at 24fps with Turbo-speed generation.

Speech-to-video (S2V) lets you provide an audio file as input; Wan 2.2 then generates a video synchronized to it. It works well for creating videos from spoken narration or music.

Wan 2.2 outputs at 480p, 580p, or 720p at 24 frames per second in 16:9 or 9:16 aspect ratios.

Use Wan 2.2 for quick cinematic video, speech-to-video workflows, or efficient batch generation. For native audio in the video itself, consider Veo 3.1 or Seedance 2.0. For maximum duration, use Kling 3.0.
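
The output options listed in the FAQ (480p, 580p, or 720p at 24 frames per second, in 16:9 or 9:16) can be written down as a small settings check. This is a sketch under stated assumptions: the function name `output_settings` is hypothetical, and the 5-second clip length is an example value, not a documented limit.

```python
# Allowed values taken from the FAQ answer above.
RESOLUTIONS = ("480p", "580p", "720p")
ASPECT_RATIOS = ("16:9", "9:16")
FPS = 24


def output_settings(resolution: str, aspect_ratio: str, seconds: float) -> dict:
    """Validate the chosen options and compute the frame count for a clip."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect ratio must be one of {ASPECT_RATIOS}")
    if seconds <= 0:
        raise ValueError("clip length must be positive")
    return {
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "fps": FPS,
        "frames": round(seconds * FPS),  # frames = seconds x 24
    }


# Assumed example: a 5-second vertical clip at 720p.
settings = output_settings("720p", "9:16", 5)
print(settings["frames"])  # 120
```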

Start creating with Wan 2.2

Free credits on signup. No credit card required.