Vercel AI Gateway

Analysis of Vercel AI Gateway, a unified API proxy for accessing hundreds of AI models (text, image, video, embeddings) through a single endpoint with zero token markup, BYOK, and budget controls. Now serving 200K+ teams.

Key takeaways

  • A single API key provides access to hundreds of models across all major providers, now including image generation, video generation, and embeddings
  • No markup on tokens: you pay providers' list prices and get budget controls and observability; the fine print adds a Zero Data Retention fee of $0.10 per 1,000 requests and payment processing fees on credit purchases
  • Real production scale: tens of trillions of tokens served across 200K+ teams, per Vercel's AI Gateway Production Index
  • It is an API proxy, not an execution sandbox; Vercel's separate Sandbox product is the company's actual entry in that category

FAQ

Is Vercel AI Gateway free?

Only up to a point. New accounts get $5 of credit every 30 days. Beyond that, you pay provider list prices for tokens with no markup, though credit purchases carry a payment processing fee.

Can I use my own API keys?

Yes, BYOK (Bring Your Own Key) is supported with 0% markup. Route through the gateway for observability while using your own provider keys.

Is Vercel AI Gateway a sandbox?

No. It is a model-routing API proxy. Vercel's separate Sandbox product provides isolated compute for running untrusted or agent-generated code.

Overview

Vercel AI Gateway is a unified API proxy that provides access to hundreds of AI models through a single endpoint. It simplifies multi-provider deployments with budget controls, observability, automatic fallbacks, and no token markup. As of mid-2026 it has expanded beyond text to image generation, video generation (marked "New"), embeddings, and web search.[1][2]

Quick Reference

Website: vercel.com/ai-gateway
Documentation: vercel.com/docs/ai-gateway
Availability: GA, all Vercel plans
Scale (June 2026): Tens of trillions of tokens, 200K+ teams[3]

Category Fit

This profile sits in the AI Agent Sandboxes hub, but AI Gateway is not a sandbox — it executes no code and isolates nothing. It is a model-routing proxy, a direct competitor to OpenRouter and LiteLLM. It belongs in an AI gateway / inference-routing category; its presence here is as supporting infrastructure that agent sandboxes commonly call out to. Vercel's actual sandbox offering is the separate Vercel Sandbox product — now GA with usage-based pricing ($0.60 per million sandbox creations, Active-CPU billing at $0.128/hr on Pro, up to 32 vCPUs) — which is the better-fitting member for a sandboxes comparison.[4]


Adoption (as of June 2026)

Per Vercel's own AI Gateway Production Index, the gateway has served tens of trillions of tokens for 200K+ unique teams over seven months of production traffic. In May 2026, tokens grew 20% month-over-month and spend grew 43%. Anthropic took 65% of all gateway spend (32% of tokens), while the V4 launch pushed DeepSeek to 17% of token volume (third place, ahead of OpenAI) at roughly 1% of spend. Agentic traffic dominates: just under a quarter of requests end in a tool call, but those requests carry well over half of all tokens.[3]

These are vendor-published numbers; treat directional trends as more reliable than absolutes.
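
One way to read the share figures above is as a relative price index: a provider's share of spend divided by its share of tokens. The sketch below is only arithmetic on the percentages quoted from the Production Index; the function name is hypothetical, and the DeepSeek spend share is "roughly 1%" in the source, so its index is approximate.

```typescript
// Relative price index = share of spend / share of tokens.
// A value of 1.0 would mean tokens priced at the gateway-wide average.
// Hypothetical helper for illustration only.
function spendPerTokenIndex(spendSharePct: number, tokenSharePct: number): number {
  if (tokenSharePct <= 0) {
    throw new RangeError("tokenSharePct must be positive");
  }
  return spendSharePct / tokenSharePct;
}

// Shares quoted above for May 2026.
const anthropicIndex = spendPerTokenIndex(65, 32);
const deepseekIndex = spendPerTokenIndex(1, 17); // spend share is approximate

console.log({
  anthropicIndex,
  deepseekIndex,
  ratio: anthropicIndex / deepseekIndex,
});
```

On these figures Anthropic traffic costs about twice the gateway average per token and DeepSeek traffic a small fraction of it, which is consistent with the gap between the token and spend rankings.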


Key Features

One Key, Hundreds of Models

A single API key accesses OpenAI, Anthropic, Google, xAI, DeepSeek, and more. There is no need to manage separate keys for each provider, and Vercel places no rate limits of its own on queries.[2]

Unified API

Switch between providers and models with minimal code changes. Endpoints compatible with OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages are available, alongside native AI SDK v5/v6 integration.[1]
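
To illustrate what "minimal code changes" can look like against a Chat Completions-compatible endpoint, here is a minimal sketch that builds a request without sending it. The base URL, the bearer-token header, and the helper name are assumptions for illustration, not confirmed by this profile; check the AI Gateway documentation before relying on them.

```typescript
// ASSUMPTION: illustrative base URL for the OpenAI-compatible endpoint.
const GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1";

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds (but does not send) a Chat Completions request.
function buildChatRequest(model: string, prompt: string, apiKey: string): ChatRequest {
  return {
    url: `${GATEWAY_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      // Only the "provider/model" string changes when switching providers.
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Same request shape for two providers; the second model ID is a placeholder.
const anthropicRequest = buildChatRequest("anthropic/claude-opus-4.7", "Hello", "YOUR_KEY");
const otherRequest = buildChatRequest("openai/your-model-id", "Hello", "YOUR_KEY");
console.log(anthropicRequest.url === otherRequest.url);
```

Sending is then a matter of passing `url` and `init` to `fetch`; keeping construction separate makes the provider swap easy to test offline.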

Beyond Text

Image generation, video generation, embeddings, and web search are now first-class capabilities, added since early 2026.[1][2]

Zero Token Markup

Tokens cost exactly what providers charge — no middleman markup (0%), applying to both Vercel-provided keys and BYOK.[1] See Cautions for the fee fine print.

High Reliability

  • Automatic retries: failed requests are retried against alternative providers
  • Load balancing: traffic is distributed across providers
  • Fallbacks: define backup models for when the primary is unavailable, including automatic failover during provider outages[2]
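
The fallback behavior above can be pictured as an ordered list of models tried in turn. The sketch below is a client-side illustration of that idea only; the gateway performs its own failover server-side, and `callWithFallback` and the model IDs in the usage note are hypothetical names, not part of any Vercel API.

```typescript
// Any function that calls a model by ID and resolves with a result.
type ModelCall<T> = (modelId: string) => Promise<T>;

// Try each model in order and return the first success, recording failures.
async function callWithFallback<T>(
  modelIds: string[],
  call: ModelCall<T>,
): Promise<{ modelId: string; result: T }> {
  const errors: string[] = [];
  for (const modelId of modelIds) {
    try {
      return { modelId, result: await call(modelId) };
    } catch (err) {
      // Remember why this model failed, then move on to the next one.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${modelId}: ${message}`);
    }
  }
  throw new Error(`All models failed -> ${errors.join("; ")}`);
}
```

A caller would pass something like `["primary/model", "backup/model"]` plus a function that issues the request for a given ID. A production version would usually also distinguish retryable errors (timeouts, 5xx) from non-retryable ones (invalid requests), which this sketch does not.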

Observability

Built-in request traces, token counts, latency metrics, spend tracking, and TTFT (time to first token) in the Vercel dashboard, plus a "disallow prompt training" control.[1]
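
For readers unfamiliar with TTFT, the sketch below shows how the metric can be measured client-side on any stream of text chunks. It is an illustration of the concept, not the dashboard's implementation; `measureStream` is a hypothetical helper, and `Date.now()` limits resolution to milliseconds.

```typescript
interface StreamTiming {
  ttftMs: number | null; // null if the stream produced no chunks
  totalMs: number;
  text: string;
}

// Consume a text stream, recording time to first chunk and total duration.
async function measureStream(stream: AsyncIterable<string>): Promise<StreamTiming> {
  const start = Date.now();
  let ttftMs: number | null = null;
  let text = "";
  for await (const chunk of stream) {
    if (ttftMs === null) {
      // First chunk observed: this delay is the time to first token.
      ttftMs = Date.now() - start;
    }
    text += chunk;
  }
  return { ttftMs, totalMs: Date.now() - start, text };
}
```

Any `AsyncIterable<string>` works as input, for example the `textStream` returned by the AI SDK's `streamText`. Client-side numbers include network latency, so they will generally read higher than gateway-side figures.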

BYOK (Bring Your Own Key)

Use your own provider API keys while routing through the gateway for observability and fallback benefits, with no Vercel markup.[1]


Supported Providers

The gateway supports hundreds of models from providers including:

  • OpenAI — GPT-5.x, Responses API models
  • Anthropic — Claude Opus 4.x, Sonnet 4.x
  • Google — Gemini family
  • xAI — Grok 4.x
  • DeepSeek — V4 (17% of gateway token volume in May 2026)[3]
  • Meta and more — Llama via inference providers; text, image, and video models[2]

SDK Compatibility

Works seamlessly with:

  • Vercel AI SDK (native integration, v5 and v6)
  • OpenAI SDK (Chat Completions and Responses compatible endpoints)
  • Anthropic SDK (Messages-compatible endpoint)
  • LiteLLM (proxy support)
  • LlamaIndex (integration available)
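  • cURL / any HTTP client

// Vercel AI SDK example (AI SDK v5 or later, run as an ES module).
// A plain "provider/model" string is routed through AI Gateway by default;
// credentials are read from the environment (for example AI_GATEWAY_API_KEY).
import { generateText } from 'ai';

const { text } = await generateText({
  model: 'anthropic/claude-opus-4.7',
  prompt: 'What is the capital of France?',
});

console.log(text);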

Pricing

Component | Cost
Free credit | $5 every 30 days for new accounts[2]
Token costs | Provider list price (no markup)[1]
Gateway routing | Included
BYOK | Supported, 0% markup (pay provider directly)[1]
Team-wide Zero Data Retention | $0.10 per 1,000 successful requests[5]
Credit purchases | Payment processing fee (~3.2%, Stripe standard)[5]
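
To see how the fee lines in this table add up, here is a rough estimator. It is a sketch under stated assumptions: the rates are copied from the table, all spend is assumed to be funded by credit purchases, and the processing fee is assumed to apply to the full amount purchased. `estimateMonthlyCost` is a hypothetical helper, not part of any Vercel SDK.

```typescript
// Fee figures copied from the pricing table; confirm current values.
const ZDR_FEE_PER_1K_REQUESTS_USD = 0.1;
const PROCESSING_FEE_RATE = 0.032; // "~3.2%" in the table, so approximate

interface CostEstimate {
  tokenCostUsd: number;
  zdrFeeUsd: number;
  processingFeeUsd: number;
  totalUsd: number;
  overheadPct: number; // extra cost relative to list-price token spend
}

function estimateMonthlyCost(
  tokenCostUsd: number,
  successfulRequests: number,
  zdrEnabled: boolean,
): CostEstimate {
  const zdrFeeUsd = zdrEnabled
    ? (successfulRequests / 1000) * ZDR_FEE_PER_1K_REQUESTS_USD
    : 0;
  // ASSUMPTION: the processing fee applies to every dollar of credit bought,
  // and credits cover both token spend and the ZDR fee.
  const processingFeeUsd = (tokenCostUsd + zdrFeeUsd) * PROCESSING_FEE_RATE;
  const totalUsd = tokenCostUsd + zdrFeeUsd + processingFeeUsd;
  const overheadPct =
    tokenCostUsd > 0 ? ((totalUsd - tokenCostUsd) / tokenCostUsd) * 100 : 0;
  return { tokenCostUsd, zdrFeeUsd, processingFeeUsd, totalUsd, overheadPct };
}

console.log(estimateMonthlyCost(1000, 500000, true));
```

Under these assumptions, $1,000 of token spend across 500,000 successful requests with ZDR enabled works out to $50 in ZDR fees and about $33.60 in processing fees, for a total near $1,083.60, or roughly 8.4% above list-price token cost.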

Strengths

  • Simplicity — One key for all providers
  • Cost transparency — No markup on tokens
  • Reliability — Automatic fallbacks and retries
  • Observability — Built-in tracking in Vercel dashboard
  • Production-proven — Tens of trillions of tokens across 200K+ teams[3]

Cautions

  • Vercel-centric: the best experience is on the Vercel platform
  • Fee fine print: "No platform fees" marketing coexists with a per-request ZDR surcharge and payment processing fees; one reviewer calls this "a trust gap"[5]
  • Basic routing only: no conditional routing, A/B testing, or traffic splitting; slower than OpenRouter to add the newest models[6]
  • Serverless constraints: 5-minute function timeouts on Pro and 4.5 MB request bodies can pinch long-running agent workflows[6]
  • Vendor dependency: all traffic routes through Vercel; users have reported intermittent latency and provider-routing degradation[7]

What Developers Say

"I'm biased, but like Vercel's AI Gateway" — daniel_levine, Hacker News, August 2025[8]

"Response times usually were fast (3-4s), but since yesterday, providers started to peak at 45-60s for a single call" — user report, Vercel Community forum[7]

"When a vendor leads with 'no platform fees' and I find a per-request ZDR surcharge in the docs... that's a trust gap" — Folding Sky review, April 2026[5]

Direct Reddit threads on the gateway are sparse; most community discussion happens on Vercel's own forums and in OpenRouter comparison posts.


Competitive Position

vs Alternative | AI Gateway Advantage | Competitor Advantage
Direct provider APIs | One key, fallbacks, observability | No middleman, no ZDR/processing fees
LiteLLM Proxy | Managed, no infrastructure | Self-hosted, open-source, more config
OpenRouter | Vercel integration, no markup | Broader catalog, faster model adds, conditional routing[6]

Best fit: Teams using Vercel who want unified multi-provider access without managing infrastructure or paying markup.


Bottom Line

Vercel AI Gateway is the simplest way to access multiple AI providers from a single API, and June 2026 production-index data shows it operating at serious scale. Zero-markup pricing is genuinely competitive, though the ZDR surcharge and payment processing fees mean "no platform fees" deserves an asterisk.

Recommended for teams already on Vercel or AI SDK who want managed multi-provider routing with transparent token pricing. Not recommended as a sandboxes-category pick — it isn't one — or for teams needing conditional routing, A/B traffic splitting, or day-zero access to brand-new models, where OpenRouter or self-hosted LiteLLM fit better. Outlook: with agentic traffic now carrying over half of gateway tokens and Vercel pairing it with the GA Vercel Sandbox, expect the gateway to keep consolidating as the default routing layer for Vercel-hosted agents.[3][4]


Research by Ry Walker Research