Image Model

Grok Imagine

xAI's Multimodal Model — Images and Videos with Synchronized Audio

Grok Imagine by xAI is a multimodal generation model supporting text-to-image, image-to-image, text-to-video, and image-to-video. Videos include automatically synchronized audio. Three generation modes (Normal, Fun, and Spicy) let you control creative tone and intensity. Output is available at 480p or 720p, with aspect ratios including 16:9, 9:16, and 1:1.

Key Features

Text & Image to Video

Generate videos from text prompts or animate existing images into smooth short clips — 6 seconds at 480p or 720p.

Synchronized Audio

Videos include automatically synchronized background audio matching the tone and motion — no separate editing needed.

Creative Generation Modes

Normal for standard results, Fun for expressive creative takes, and Spicy for more intense and artistic interpretations.

Image Generation & Editing

Create images from text or transform existing images, with strong prompt adherence and a bold, high-impact visual style.

How to Use

  1. Choose Output Type

    Select image or video generation. For video, choose text-to-video or image-to-video mode.

  2. Select Mode

    Pick Normal, Fun, or Spicy mode to set the creative tone.

  3. Generate

    Enter your prompt or upload an image, then generate. Videos include synchronized audio automatically.
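The three steps above can be sketched as a small request builder. This is a minimal illustration and not xAI's API: the `GenerationRequest` class, its field names, and the `build_request` function are hypothetical, and nothing here sends a network request. It only encodes the choices this page describes (output type, mode, prompt or image), plus the FAQ's note that Spicy mode is unavailable when an external image is used as video input. Treating every supplied image path as an "external image" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Values taken from this page; the lowercase spelling is an assumption.
OUTPUT_TYPES = {"image", "video"}
MODES = {"normal", "fun", "spicy"}


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for one generation job (not an official schema)."""
    output_type: str
    mode: str
    prompt: Optional[str] = None
    image_path: Optional[str] = None
    has_audio: bool = False


def build_request(output_type: str, mode: str,
                  prompt: Optional[str] = None,
                  image_path: Optional[str] = None) -> GenerationRequest:
    # Step 1: choose the output type.
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"unknown output type: {output_type!r}")

    # Step 2: select the mode (case-insensitive here for convenience).
    mode = mode.lower()
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    # Step 3: a prompt or an uploaded image is needed to generate anything.
    if not prompt and not image_path:
        raise ValueError("provide a prompt, an image, or both")

    # Constraint from the FAQ: no Spicy mode for video from an external image.
    if output_type == "video" and image_path and mode == "spicy":
        raise ValueError("Spicy mode is not available for video from an external image")

    # Per this page, videos carry synchronized audio automatically.
    return GenerationRequest(output_type, mode, prompt, image_path,
                             has_audio=(output_type == "video"))
```

For example, `build_request("video", "Fun", prompt="a fox running through snow")` returns a request with `mode == "fun"` and `has_audio == True`, while combining `"video"`, `"spicy"`, and an image path raises `ValueError`.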

Frequently Asked Questions

Everything about Grok Imagine

What is Grok Imagine?

Grok Imagine is xAI's multimodal generation model supporting text-to-image, image-to-image, text-to-video, and image-to-video. Videos automatically include synchronized audio.

Can Grok Imagine generate videos?

Yes! Grok Imagine supports text-to-video and image-to-video, generating 6-second clips at 480p or 720p with automatically synchronized background audio.

What generation modes are available?

Three modes: Normal for standard results, Fun for expressive creative takes, and Spicy for more intense interpretations. Note: Spicy mode is not available when using external image inputs for video.

What aspect ratios are supported?

Image aspect ratios: 1:1 and various portrait/landscape formats. Video: 2:3, 3:2, 1:1, 16:9, and 9:16.
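The video formats listed above can be checked before a job is submitted. The sketch below is illustrative only: `VIDEO_ASPECT_RATIOS`, `VIDEO_RESOLUTIONS`, and `frame_size` are hypothetical names, and the pixel dimensions assume that "480p" and "720p" refer to the shorter side of the frame, with the longer side rounded to an even number. This page does not state exact pixel sizes, so treat the numbers as an assumption rather than the model's actual output dimensions.

```python
from typing import Tuple

# Video aspect ratios and resolutions as listed on this page.
VIDEO_ASPECT_RATIOS = {"2:3", "3:2", "1:1", "16:9", "9:16"}
VIDEO_RESOLUTIONS = {"480p": 480, "720p": 720}


def frame_size(aspect_ratio: str, resolution: str) -> Tuple[int, int]:
    """Return (width, height), assuming the 'p' value is the shorter side."""
    if aspect_ratio not in VIDEO_ASPECT_RATIOS:
        raise ValueError(f"unsupported video aspect ratio: {aspect_ratio!r}")
    if resolution not in VIDEO_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")

    short_side = VIDEO_RESOLUTIONS[resolution]
    w, h = (int(part) for part in aspect_ratio.split(":"))

    # Scale the longer side from the ratio, then round to the nearest even value.
    long_side = 2 * round(short_side * max(w, h) / min(w, h) / 2)

    # Landscape and square ratios put the longer side on the width.
    if w >= h:
        return long_side, short_side
    return short_side, long_side


print(frame_size("16:9", "720p"))  # (1280, 720) under the short-side assumption
print(frame_size("9:16", "480p"))  # (480, 854) under the short-side assumption
```

A ratio outside the listed set, such as `"4:3"`, raises `ValueError` instead of silently producing a size.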

Start creating with Grok Imagine

Free credits on signup. No credit card required.
