Seven models, one generator. Each Kling model handles a different creative task — from prompt-led video and reference-locked animation to motion transfer and AI image creation. Use the comparison table below to find the right model for your workflow, then start generating.
Explore each model page for full technical specs, use cases, and generation examples.
Text-to-video, image-to-video, multi-shot sequencing, optional audio, and 4K-capable renders with physics-accurate motion.
Reference-guided video with style preservation, character consistency, and a locked visual identity across every frame.
Transfer gesture, dance, pose, or camera movement from a reference video to a still subject image at 1080p.
A lighter option for reference-video motion transfer, with 720p and 1080p output at lower credit cost.
Side-by-side specs for every Kling model. Credits shown are for a 5-second standard clip (720p 16:9, no audio) — actual cost scales with resolution, duration, and audio.
Not sure which model to pick? Match your creative task to the right Kling model.
Prompt-led video from text or image
Kling 3.0 is the most versatile video model — text-to-video, image-to-video, Draft Mode for fast iteration, and physics-accurate motion from a single prompt.
Reference-controlled video with style lock
Kling O3 preserves the visual identity of your reference image across every frame. Style, character, and composition stay locked — no drift.
Movement transfer from a reference video
Upload a dance, gesture, or camera-movement reference and transfer that motion to any still subject. Full-body capture at 1080p.
Generate a reference frame before video
Create the style frame, product concept, or character reference first — then feed it into Kling O3 or Kling 3.0 for video generation.
Common questions about choosing between Kling AI models.