Key takeaways
- A single API key provides access to hundreds of models across all major providers, now including image generation, video generation, and embeddings
- No markup on tokens — providers' list prices with budget controls and observability; fine print adds a $0.10/1,000-request Zero Data Retention fee and payment processing fees on credit purchases
- Real production scale — tens of trillions of tokens served across 200K+ teams per Vercel's AI Gateway Production Index
- It is an API proxy, not an execution sandbox — Vercel's separate Sandbox product is the company's actual entry in that category
FAQ
Is Vercel AI Gateway free?
New accounts get $5 of credit every 30 days. Beyond that, you pay provider list prices for tokens with no markup, though credit purchases carry a payment processing fee.
Can I use my own API keys?
Yes, BYOK (Bring Your Own Key) is supported with 0% markup. Route through the gateway for observability while using your own provider keys.
Is Vercel AI Gateway a sandbox?
No. It is a model-routing API proxy. Vercel's separate Sandbox product provides isolated compute for running untrusted or agent-generated code.
Overview
Vercel AI Gateway is a unified API proxy that provides access to hundreds of AI models through a single endpoint. It simplifies multi-provider deployments with budget controls, observability, automatic fallbacks, and no token markup. As of mid-2026 it has expanded beyond text to image generation, video generation (marked "New"), embeddings, and web search.[1][2]
| Quick Reference | |
|---|---|
| Website | vercel.com/ai-gateway |
| Documentation | vercel.com/docs/ai-gateway |
| Availability | GA, all Vercel plans |
| Scale (June 2026) | Tens of trillions of tokens, 200K+ teams[3] |
Category Fit
This profile sits in the AI Agent Sandboxes hub, but AI Gateway is not a sandbox — it executes no code and isolates nothing. It is a model-routing proxy, a direct competitor to OpenRouter and LiteLLM. It belongs in an AI gateway / inference-routing category; its presence here is as supporting infrastructure that agent sandboxes commonly call out to. Vercel's actual sandbox offering is the separate Vercel Sandbox product — now GA with usage-based pricing ($0.60 per million sandbox creations, Active-CPU billing at $0.128/hr on Pro, up to 32 vCPUs) — which is the better-fitting member for a sandboxes comparison.[4]
Adoption (as of June 2026)
Per Vercel's own AI Gateway Production Index, the gateway has served tens of trillions of tokens for 200K+ unique teams over seven months of production traffic. In May 2026, tokens grew 20% month-over-month and spend grew 43%. Anthropic took 65% of all gateway spend (32% of tokens), while DeepSeek's V4 launch pushed it to 17% of token volume — third place, ahead of OpenAI — at roughly 1% of spend. Agentic traffic dominates: just under a quarter of requests end in a tool call, but those requests carry well over half of all tokens.[3]
These are vendor-published numbers; treat directional trends as more reliable than absolutes.
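One way to read the spend and token shares together is to divide each provider's share of spend by its share of tokens, which gives a rough price index relative to the gateway-wide average (1.0). The sketch below uses the May 2026 figures quoted above; the function name is illustrative, and DeepSeek's spend share is entered as exactly 1% even though the source only says "roughly", so treat the output as an order-of-magnitude estimate.

```typescript
// Relative price index: share of spend divided by share of tokens.
// 1.0 means "priced at the gateway-wide average per token".
function relativePriceIndex(spendShare: number, tokenShare: number): number {
  if (tokenShare <= 0) throw new RangeError("tokenShare must be positive");
  return spendShare / tokenShare;
}

// Shares from the Production Index figures cited above.
const anthropicIndex = relativePriceIndex(0.65, 0.32); // about 2.03
const deepseekIndex = relativePriceIndex(0.01, 0.17);  // about 0.06 (spend share is approximate)
const priceRatio = anthropicIndex / deepseekIndex;     // about 34.5

console.log({ anthropicIndex, deepseekIndex, priceRatio });
```

On these inputs, a blended Anthropic token costs roughly 35 times as much as a blended DeepSeek token, which is why a provider can rank third by volume while barely registering in spend.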
Key Features
One Key, Hundreds of Models
A single API key accesses OpenAI, Anthropic, Google, xAI, DeepSeek, and more. There is no need to manage separate keys for each provider, and Vercel places no rate limits of its own on queries.[2]
Unified API
Switch between providers and models with minimal code changes. Endpoints compatible with OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages are available, alongside native AI SDK v5/v6 integration.[1]
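As a minimal sketch of what "minimal code changes" can look like against the Chat Completions-compatible endpoint, the helper below builds a request without sending it. The base URL, the `buildChatRequest` helper, the `YOUR_GATEWAY_KEY` placeholder, and the `openai/gpt-5` model id are assumptions for illustration; check the gateway documentation for the exact endpoint and the current model catalog.

```typescript
// Assumed base URL for the OpenAI-compatible endpoint; confirm against the docs.
const GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds (but does not send) a Chat Completions request for a "provider/model" id.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]): GatewayRequest {
  return {
    url: `${GATEWAY_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Switching providers changes only the model string.
const messages: ChatMessage[] = [{ role: "user", content: "What is the capital of France?" }];
const viaAnthropic = buildChatRequest("YOUR_GATEWAY_KEY", "anthropic/claude-opus-4.7", messages);
const viaOpenAI = buildChatRequest("YOUR_GATEWAY_KEY", "openai/gpt-5", messages); // illustrative id
```

Either request can then be sent with `fetch(req.url, req.init)`; the key, URL, and message format stay the same across providers.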
Beyond Text
Image generation, video generation, embeddings, and web search are now first-class capabilities, added since early 2026.[1][2]
Zero Token Markup
Tokens cost exactly what providers charge — no middleman markup (0%), applying to both Vercel-provided keys and BYOK.[1] See Cautions for the fee fine print.
High Reliability
- Automatic retries: Failed requests are retried against alternative providers
- Load balancing: Traffic is distributed across providers
- Fallbacks: Define backup models for when the primary is unavailable, including automatic failover during provider outages[2]
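The fallback behaviour described above happens inside the gateway, but the underlying idea is simple enough to sketch client-side. The code below is an illustration of an ordered fallback chain, not Vercel's implementation; `callWithFallback`, `ModelCall`, and the model ids in the usage example are hypothetical names.

```typescript
// A caller that sends one prompt to one model and resolves with its text.
type ModelCall = (model: string, prompt: string) => Promise<string>;

interface FallbackResult {
  model: string;      // the model that finally answered
  text: string;
  failures: string[]; // models that failed before it, in order
}

// Tries each model in order and returns the first successful response.
async function callWithFallback(
  models: string[],
  prompt: string,
  call: ModelCall,
): Promise<FallbackResult> {
  const failures: string[] = [];
  for (const model of models) {
    try {
      const text = await call(model, prompt);
      return { model, text, failures };
    } catch {
      // Record the failure and move on to the next model in the chain.
      failures.push(model);
    }
  }
  throw new Error(`All models failed: ${failures.join(", ")}`);
}

// Usage with a stub caller that simulates an outage on the primary model.
const stubCall: ModelCall = async (model) => {
  if (model === "primary/unavailable") throw new Error("simulated outage");
  return `answer from ${model}`;
};

callWithFallback(["primary/unavailable", "backup/available"], "ping", stubCall)
  .then((result) => console.log(result.model, result.failures));
```

A managed gateway adds value over this sketch mainly by applying the same logic server-side, with health data about each provider that a single client does not have.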
Observability
Built-in request traces, token counts, latency metrics, spend tracking, and TTFT (time to first token) in the Vercel dashboard, plus a "disallow prompt training" control.[1]
BYOK (Bring Your Own Key)
Use your own provider API keys while routing through the gateway for observability and fallback benefits, with no Vercel markup.[1]
Supported Providers
The gateway supports hundreds of models from providers including:
- OpenAI — GPT-5.x, Responses API models
- Anthropic — Claude Opus 4.x, Sonnet 4.x
- Google — Gemini family
- xAI — Grok 4.x
- DeepSeek — V4 (17% of gateway token volume in May 2026)[3]
- Meta and more — Llama via inference providers; text, image, and video models[2]
SDK Compatibility
Works seamlessly with:
- Vercel AI SDK (native integration, v5 and v6)
- OpenAI SDK (Chat Completions and Responses compatible endpoints)
- Anthropic SDK (Messages-compatible endpoint)
- LiteLLM (proxy support)
- LlamaIndex (integration available)
- cURL / any HTTP client
```typescript
// Vercel AI SDK example (top-level await requires an ES module)
import { generateText } from 'ai';

const { text } = await generateText({
  model: 'anthropic/claude-opus-4.7', // "provider/model" id
  prompt: 'What is the capital of France?',
});
```
Pricing
| Component | Cost |
|---|---|
| Free credit | $5 every 30 days for new accounts[2] |
| Token costs | Provider list price (no markup)[1] |
| Gateway routing | Included |
| BYOK | Supported, 0% markup (pay provider directly)[1] |
| Team-wide Zero Data Retention | $0.10 per 1,000 successful requests[5] |
| Credit purchases | Payment processing fee (~3.2%, Stripe standard)[5] |
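To see how the two fee lines interact with "no markup" token pricing, here is a rough estimator built from the table above. The function and field names are illustrative, the 3.2% rate is the approximate figure from the table, and applying the processing fee to token spend only is an assumption; free credit and BYOK billing are ignored.

```typescript
const ZDR_FEE_PER_1000_REQUESTS = 0.10; // USD, team-wide Zero Data Retention
const PROCESSING_FEE_RATE = 0.032;      // approximate rate on credit purchases

interface MonthlyUsage {
  tokenSpendUsd: number;       // token cost at provider list price
  successfulRequests: number;
  zdrEnabled: boolean;
  paidWithPurchasedCredits: boolean;
}

interface CostEstimate {
  zdrFeeUsd: number;
  processingFeeUsd: number;
  totalUsd: number;
  overheadPct: number; // fees as a percentage of token spend
}

function estimateMonthlyCost(u: MonthlyUsage): CostEstimate {
  const zdrFeeUsd = u.zdrEnabled
    ? (u.successfulRequests / 1000) * ZDR_FEE_PER_1000_REQUESTS
    : 0;
  // Assumption: the processing fee applies to the credits that cover token spend.
  const processingFeeUsd = u.paidWithPurchasedCredits
    ? u.tokenSpendUsd * PROCESSING_FEE_RATE
    : 0;
  const fees = zdrFeeUsd + processingFeeUsd;
  return {
    zdrFeeUsd,
    processingFeeUsd,
    totalUsd: u.tokenSpendUsd + fees,
    overheadPct: u.tokenSpendUsd > 0 ? (fees / u.tokenSpendUsd) * 100 : 0,
  };
}

// Example: $1,000 of tokens over 2 million requests, ZDR on, paid via credits.
const estimate = estimateMonthlyCost({
  tokenSpendUsd: 1000,
  successfulRequests: 2_000_000,
  zdrEnabled: true,
  paidWithPurchasedCredits: true,
});
console.log(estimate);
```

In this example the fees come to about $232 on $1,000 of token spend. The ZDR charge scales with request count rather than tokens, so workloads with many small requests see the largest relative overhead.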
Strengths
- Simplicity — One key for all providers
- Cost transparency — No markup on tokens
- Reliability — Automatic fallbacks and retries
- Observability — Built-in tracking in Vercel dashboard
- Production-proven — Tens of trillions of tokens across 200K+ teams[3]
Cautions
- Vercel-centric — Best experience on Vercel platform
- Fee fine print — "No platform fees" marketing coexists with a per-request ZDR surcharge and payment processing fees; one reviewer calls this "a trust gap"[5]
- Basic routing only — No conditional routing, A/B testing, or traffic splitting; slower than OpenRouter to add the newest models[6]
- Serverless constraints — 5-minute function timeouts on Pro and 4.5MB request bodies can pinch long-running agent workflows[6]
- Vendor dependency — All traffic routes through Vercel; users have reported intermittent latency and provider-routing degradation[7]
What Developers Say
"I'm biased, but like Vercel's AI Gateway" — daniel_levine, Hacker News, August 2025[8]
"Response times usually were fast (3-4s), but since yesterday, providers started to peak at 45-60s for a single call" — user report, Vercel Community forum[7]
"When a vendor leads with 'no platform fees' and I find a per-request ZDR surcharge in the docs... that's a trust gap" — Folding Sky review, April 2026[5]
Direct Reddit threads on the gateway are sparse; most community discussion happens on Vercel's own forums and in OpenRouter comparison posts.
Competitive Position
| vs Alternative | AI Gateway Advantage | Competitor Advantage |
|---|---|---|
| Direct provider APIs | One key, fallbacks, observability | No middleman, no ZDR/processing fees |
| LiteLLM Proxy | Managed, no infrastructure | Self-hosted, open-source, more config |
| OpenRouter | Vercel integration, no markup | Broader catalog, faster model adds, conditional routing[6] |
Best fit: Teams using Vercel who want unified multi-provider access without managing infrastructure or paying markup.
Bottom Line
Vercel AI Gateway is the simplest way to access multiple AI providers from a single API, and June 2026 production-index data shows it operating at serious scale. Zero-markup pricing is genuinely competitive, though the ZDR surcharge and payment processing fees mean "no platform fees" deserves an asterisk.
Recommended for teams already on Vercel or the AI SDK who want managed multi-provider routing with transparent token pricing. Not recommended as a sandboxes-category pick, since it is not a sandbox, or for teams needing conditional routing, A/B traffic splitting, or day-zero access to brand-new models, where OpenRouter or self-hosted LiteLLM fit better. Outlook: with agentic traffic now carrying over half of gateway tokens and Vercel pairing it with the GA Vercel Sandbox, expect the gateway to keep consolidating as the default routing layer for Vercel-hosted agents.[3][4]
Research by Ry Walker Research
Sources
- [1] Vercel AI Gateway Documentation
- [2] Vercel AI Gateway
- [3] Vercel AI Gateway Production Index — June 2026
- [4] Vercel Sandbox Pricing and Limits
- [5] Vercel AI Gateway: One Killer Feature, One Pricing Catch
- [6] Vercel AI Gateway vs OpenRouter (TrueFoundry)
- [7] Vercel Community: AI Gateway latency and provider routing issues
- [8] Hacker News discussion mentioning Vercel AI Gateway