Four AI video models, one tough prompt

In short

What this article says

Seedance 2.0 was the best AI video model in this test. It followed the first-person parachute prompt most completely, kept the helmet-camera feeling, included sound, and cost about $2.43. Veo 3.1 looked premium but rejected the original prompt and lost the first-person view on retry; Kling v3 Pro looked good but had weak motion; Wan 2.7 was cheap but not usable.

Test prompt: first-person BASE jump from a 150-meter abandoned smokestack into one muddy puddle.
Models compared: Seedance 2.0, Kling v3 Pro, Veo 3.1, and Wan 2.7 on fal.ai.
Winner: Seedance 2.0, because it followed the full idea with the least confusion.
Approximate working-clip costs: Seedance $2.43, Kling $1.34, Veo $3.20, Wan $1.20.

I had one hard idea for an AI video: a BASE jumper leaps from a 150-meter abandoned factory smokestack, filmed from their own helmet camera, and lands in the single muddy puddle in an otherwise dry yard. I gave the exact same prompt to four AI video models and checked two things — how real it looked, and how much it cost.

The test

I ran everything on fal.ai. It lets you use many video models from one place, in the same queue. So each model got the same prompt, and the prices were easy to compare side by side.

The prompt was hard on purpose. It asked for one long first-person shot, a clear sense of height, a parachute, an industrial yard, and a muddy landing. Easy prompts make every model look good. Hard prompts show you where a model breaks.

I picked four models, each for a reason:

Seedance 2.0 (ByteDance) — my baseline for a clean, cinematic look.
Kling v3 Pro (Kuaishou) — known for action and smart shot planning.
Veo 3.1 (Google) — the premium option.
Wan 2.7 (Alibaba) — a cheaper model that can render at 1080p.

The prompt, in one line: a long first-person action shot — real height, a parachute, an industrial yard, and a muddy landing.

The exact prompt

I turned the rough idea into one clear prompt that every model could read. Here it is, word for word:

Prompt

A cinematic first-person POV action-camera shot from the helmet of a fictional BASE jumper standing on the top edge of a 150-meter abandoned industrial concrete smokestack. The location is a deserted factory complex with cracked asphalt, rusted metal structures, dry concrete ground, and only one large dirty muddy puddle on a road far below. The jumper's gloved hands appear at the edge, looking down over the massive chimney, then leaps forward. A fast stomach-dropping descent along the side of the tall concrete smokestack, wind noise, subtle camera shake, dramatic height and scale. The parachute opens quickly, suspension lines briefly visible, the camera glides toward the mostly dry industrial yard. The landing target becomes clear: one muddy puddle in the middle of the road. The jumper lands directly into the dirty puddle with a huge muddy splash covering the lens. Realistic physics, gritty documentary style, overcast light, cinematic color grading, wide-angle GoPro lens, intense but safe stunt, no injury, no gore, no text, no logos.

Most models also took a negative prompt — a short list of things to keep out of the shot:

Negative prompt

injury, blood, gore, death, broken body, cartoon, animation, text, subtitles, captions, logos, watermark, low quality, blur, distorted camera
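Here is a minimal sketch of how the same prompt text could be packaged for all four models. It only builds the requests and sends nothing. The endpoint URLs are the ones listed in this article. The JSON field names (`prompt`, `negative_prompt`), the `Key` authorization scheme, and the helper name `build_requests` are assumptions for illustration, so check each model's page on fal.ai before relying on them.

```python
import json
import urllib.request

# Endpoints as listed in this article.
ENDPOINTS = {
    "Seedance 2.0": "https://queue.fal.run/bytedance/seedance-2.0/text-to-video",
    "Kling v3 Pro": "https://queue.fal.run/fal-ai/kling-video/v3/pro/text-to-video",
    "Veo 3.1": "https://queue.fal.run/fal-ai/veo3.1",
    "Wan 2.7": "https://queue.fal.run/fal-ai/wan/v2.7/text-to-video",
}


def build_requests(prompt, api_key, negative_prompt=None):
    """Build one POST request per model, all carrying the same prompt text."""
    body = {"prompt": prompt}  # assumed field name
    if negative_prompt is not None:
        body["negative_prompt"] = negative_prompt  # assumed field name
    data = json.dumps(body).encode("utf-8")
    headers = {
        "Authorization": f"Key {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return {
        name: urllib.request.Request(url, data=data, headers=headers, method="POST")
        for name, url in ENDPOINTS.items()
    }


# Placeholders stand in for the real prompt and key; nothing is sent here.
requests_by_model = build_requests("<full prompt text>", api_key="<your key>")
for name, req in requests_by_model.items():
    print(name, req.get_method(), req.full_url)
```

Building every request from one shared body is what keeps the comparison fair: no model gets a quietly different wording.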

Seedance 2.0 (Winner)

ByteDance · cinematic baseline

This was the best result. It kept the helmet-camera feeling, the hands, the parachute lines, and the rhythm of the story. It understood the whole idea, not just parts of it. My one real complaint was the landing — the fall into the puddle did not look fully real. Sound was included at no extra cost.

Endpoint: https://queue.fal.run/bytedance/seedance-2.0/text-to-video

Settings: 720p · 8s · audio on

Render: ~150s · seed 44209696

Cost: ≈ $2.43 (173.7 × $0.014)

Kling v3 Pro

Kuaishou · built for action

The picture looked good. The motion did not. I even turned on its "intelligent" shot planning, which is meant to split the scene into phases on its own (the edge, the drop, the parachute, the landing), but the motion still looked unnatural, even though Kling is built for action.

Endpoint: https://queue.fal.run/fal-ai/kling-video/v3/pro/text-to-video

Settings: 8s · audio on · shot_type "intelligent" · cfg_scale 0.5

Render: ~129s

Cost: ≈ $1.34 (9.6 × $0.14)

Veo 3.1

Google · the premium pick

This one was tricky. My first try used the exact prompt and was rejected — fal.ai returned a content-policy error. Big premium models often have stricter safety filters, and a stunt that looks dangerous (a leap off a 150-meter tower) can set them off. I softened the wording and sent it again. This time it rendered, but it dropped the first-person view for a third-person shot. It still looked great.

Endpoint: https://queue.fal.run/fal-ai/veo3.1

Settings: 1080p · 8s · audio on · safety_tolerance 4

Original: Rejected, content_policy_violation (422)

Retry: Softened wording → third-person clip

Cost: ≈ $3.20 (8 × $0.40)
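The reject-then-retry flow I went through with Veo can be sketched as a small wrapper. This is an illustration, not fal.ai's API: `ContentPolicyError`, `render_with_fallback`, and the `submit` callable are all hypothetical names, and `fake_submit` is a stand-in that only simulates a rejection.

```python
class ContentPolicyError(Exception):
    """Hypothetical error raised when a model rejects a prompt."""


def render_with_fallback(submit, prompt, softened_prompt):
    """Try the original prompt; on a policy rejection, retry once with softer wording.

    Returns (result, prompt_used) so the caller knows which wording produced the clip.
    """
    try:
        return submit(prompt), prompt
    except ContentPolicyError:
        return submit(softened_prompt), softened_prompt


def fake_submit(prompt):
    """Stand-in for a real render call: rejects any prompt containing 'leaps'."""
    if "leaps" in prompt:
        raise ContentPolicyError("content_policy_violation (422)")
    return f"clip for: {prompt}"


result, used = render_with_fallback(
    fake_submit,
    "jumper leaps from a tower",
    "jumper descends from a tower",
)
print(used)
```

Returning the prompt that was actually used matters here, because the softened wording is exactly where the first-person view was lost. You want to know which text produced the clip you are judging.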

Wan 2.7

Alibaba · the cheap challenger

Wan took the prompt, but the result fell apart. It looked like an odd video made from cuts that did not connect. It was the cheapest run at 1080p, but the clip was not usable. Unlike Veo, it did not block the scene.

Endpoint: https://queue.fal.run/fal-ai/wan/v2.7/text-to-video

Settings: 1080p · 8s · no audio · prompt expansion on

Render: ~186s · seed 2131596831

Cost: ≈ $1.20 (12 × $0.10)

The cost, and the winner

Counting the second Veo run, the working clips cost about $8.17 in total. The models are billed differently, some by clip length and some by compute time, so the cheapest render is not always the cheapest way to realize an idea. The winner was not the cheapest one. It was Seedance 2.0, the model that followed the whole idea with the least confusion.
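The total can be checked from the per-model figures reported above. The sketch below multiplies each billed quantity by its unit price, rounds to cents, and sums the results. The quantities are in whatever unit each model bills in, so the code treats them as opaque numbers; `clip_cost` is just a helper name chosen for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

# (billed quantity, unit price in USD) as reported for each working clip.
RUNS = {
    "Seedance 2.0": (Decimal("173.7"), Decimal("0.014")),
    "Kling v3 Pro": (Decimal("9.6"), Decimal("0.14")),
    "Veo 3.1": (Decimal("8"), Decimal("0.40")),
    "Wan 2.7": (Decimal("12"), Decimal("0.10")),
}


def clip_cost(quantity, unit_price):
    """Cost of one clip, rounded to the nearest cent."""
    return (quantity * unit_price).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


costs = {name: clip_cost(q, p) for name, (q, p) in RUNS.items()}
total = sum(costs.values(), Decimal("0"))

for name, cost in costs.items():
    print(f"{name}: ${cost}")
print(f"Total: ${total}")
```

The $8.17 figure is the sum of the rounded per-clip costs. Summing the unrounded products first would land a cent higher.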

The lesson: the best model is not the cheapest one, and not the one with the prettiest single frame. It is the one that follows your whole idea.

If you want to try this yourself

Start with one clear, hard idea. A tough prompt shows the real gap between models.
Write the full prompt once and send the same text to every model, so the test is fair.
Use one platform so the costs line up — and remember some bill by length, some by compute time.
Expect safety filters on premium models. If a prompt is blocked, soften it and try again.
Judge the whole clip, not one nice frame. Motion and story matter more.

Quick FAQ

Which AI video model won the parachute prompt test?

Seedance 2.0 won because it best preserved the first-person camera, parachute action, story rhythm, and overall prompt intent.

Was the cheapest AI video model the best choice?

No. Wan 2.7 was the cheapest working render at about $1.20, but the clip did not hold together. Seedance 2.0 cost more but produced the most usable result.

Why did Veo 3.1 not win?

The original Veo 3.1 run was blocked by a content-policy error. A softened retry rendered a good-looking clip, but it changed the first-person prompt into a third-person shot.