DebaterX

Inngest for Long-Running AI Pipelines

Why a background job runner became the backbone of DebaterX.

4 min read

An AI-generated debate takes about four minutes end-to-end. Script generation: 20 seconds. Image generation for both mascots: 45 seconds. Video synthesis: 90-120 seconds. Audio: 20 seconds. Composition and Mux upload: 30 seconds.

Four minutes is fine for the user, who's doing other things. Four minutes is not fine for a Vercel serverless function, which times out at 300 seconds on the Pro plan and way sooner on Hobby.

The naive approach — fire the whole pipeline from an API route and wait for it to finish — was always going to fail. I needed a job runner. I chose Inngest. Here's why.

The problem with serverless for long jobs

Serverless functions are optimized for short, stateless requests. HTTP in, compute briefly, HTTP out. Long-running anything fights the platform.

You can work around this with polling, with WebSockets, with queues — all of which introduce infrastructure you have to maintain. At some point, maintaining the workaround is more expensive than just using a job runner designed for this.

Why Inngest specifically

I evaluated a few options:

Inngest. Step-based workflows. Durable execution. Each step is retryable independently. Built-in observability for failed runs. TypeScript-native.

Temporal. Older, more feature-rich, used at large scale. Overkill for a product my size.

Self-hosted queues (BullMQ, RabbitMQ). Works fine, but requires operational expertise I'd rather not spend time on.

Vercel's own job system. Didn't exist when I started; when it launched, it didn't support the step-retry model I needed.

Inngest won because the steps model fit my pipeline exactly. Each stage of the debate generation is a natural step. Each step can retry independently if it fails. That's a huge operational win — if the video synthesis step fails, I don't re-run the script generation.

The pipeline

A debate job in DebaterX has these steps:

  1. Generate script (Gemini call, ~20s)
  2. Generate brand A image (Fal call, ~15s)
  3. Generate brand B image (Fal call, ~15s, parallel with step 2)
  4. Generate video (Fal call, ~90s)
  5. Generate audio (ElevenLabs call, ~20s)
  6. Composite (FFmpeg in a cloud function, ~15s)
  7. Upload to Mux (~10s)
  8. Update debate record (Supabase call, ~1s)
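A quick sanity check on those numbers. Taking the timings from the list, using the 120-second top of the video range quoted in the intro, and counting the parallel image pair only once:

```typescript
// Worst-case critical path through the pipeline, in seconds.
// Steps 2 and 3 run in parallel, so only the slower of the two counts.
const timings = {
  script: 20,
  imageA: 15,
  imageB: 15,
  video: 120, // top of the 90-120s range from the intro
  audio: 20,
  composite: 15,
  upload: 10,
  record: 1,
};

const criticalPathSeconds =
  timings.script +
  Math.max(timings.imageA, timings.imageB) + // parallel pair
  timings.video +
  timings.audio +
  timings.composite +
  timings.upload +
  timings.record;
// 20 + 15 + 120 + 20 + 15 + 10 + 1 = 201
```

About 200 seconds of compute in the worst case, consistent with the roughly four minutes quoted up top once queueing and the occasional retry are added.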

Each step is an Inngest step. Each step has its own retry policy. If step 4 fails, I retry step 4 without losing the work done in steps 1-3. If the whole job fails permanently, I can see exactly which step failed and why, in the Inngest dashboard.

The developer experience

Writing an Inngest function looks like writing normal code, with one wrapper:
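A self-contained sketch of that shape. The stage functions, step ids, and event payload are illustrative, and `step.run` here is a small stand-in that mimics Inngest's memoization contract so the snippet runs without the SDK; in the real code the wrapper is `inngest.createFunction(...)` with the same `step.run` calls inside:

```typescript
// Minimal stand-in for Inngest's step.run: each step's result is memoized
// by id, so a replayed (retried) run skips work that already completed.
type StepState = Map<string, unknown>;

function makeStep(state: StepState) {
  return {
    async run<T>(id: string, fn: () => Promise<T> | T): Promise<T> {
      if (state.has(id)) return state.get(id) as T; // replay: return memoized result
      const result = await fn();
      state.set(id, result); // real Inngest persists this server-side
      return result;
    },
  };
}

// Illustrative stage functions; the real ones call Gemini, Fal, etc.
const generateScript = async (topic: string) => `script for ${topic}`;
const generateImage = async (brand: string) => `image:${brand}`;
const generateVideo = async (script: string) => `video(${script})`;

// The pipeline reads like ordinary sequential code; durability lives in step.run.
async function generateDebate(topic: string, state: StepState) {
  const step = makeStep(state);

  const script = await step.run("generate-script", () => generateScript(topic));

  // Steps 2 and 3 run in parallel, as in the list above.
  const [imageA, imageB] = await Promise.all([
    step.run("image-a", () => generateImage("brand-a")),
    step.run("image-b", () => generateImage("brand-b")),
  ]);

  const video = await step.run("generate-video", () => generateVideo(script));
  return { script, imageA, imageB, video };
}
```

On a retry, the function is replayed from the top, and every completed `step.run` returns its memoized result instead of re-executing. That contract is why a failed video step never re-runs script generation.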

The step functions are declared as async callbacks. The orchestrator handles retries, backoff, and state persistence. My application code doesn't need to know any of that — it just calls steps in order.

This is much simpler than managing a queue manually. I write the pipeline the way I think about it. Inngest handles the durability.

The failure modes I've handled

Rate limits. When a model provider rate-limits me, the step retries with exponential backoff. The user sees a slight delay; I don't see any failures.
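The retry schedule looks roughly like this; the constants and the full-jitter strategy are my illustration of the general technique, not Inngest's exact implementation:

```typescript
// Exponential backoff with full jitter: the ceiling roughly doubles per
// attempt, capped at 60s, and the actual wait is randomized so retries
// from many concurrent jobs don't stampede the provider at once.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 60_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling; // uniform in [0, ceiling)
}
```

Attempt 0 waits under a second; by attempt 6 the ceiling hits the one-minute cap. A rate-limited step therefore stops hammering the provider almost immediately.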

Provider outages. When Fal has a bad hour (it happens), jobs wait in Inngest's retry queue. When Fal recovers, jobs resume automatically. No manual intervention.

Partial failures. If the video step succeeds but a later step fails, Inngest re-runs only the failed step; the video step's output is preserved. This saves me real money in GPU costs: a failed Mux upload (step 7) doesn't force me to regenerate the video (step 4).

The operational benefit

I spend almost zero time on pipeline reliability. The Inngest dashboard shows me which jobs are running, which have failed, and why. When I deploy a change, I can test it against a dead-letter queue of previously-failed jobs. The platform handles what I would otherwise be building myself.

For an AI video product with long pipelines, Inngest is the right tool. If I were building this two years later, I'd probably still choose it. It's one of those infra decisions that was right on day one and keeps being right.
