Key takeaways
- $113M Series B led by CapitalG (Alphabet) at ~$1.3B post-money in May 2026 — more than double the $547M valuation from the June 2025 Series A — with NVIDIA, ServiceNow, MongoDB, Snowflake, and Databricks venture arms participating
- Weekly token volume grew 5x in six months, from 5 trillion to 25 trillion tokens/week, on pace for a quadrillion tokens this year across 8M+ developers and 400+ models
- The business model is a ~5% take on routed inference: provider list prices pass through with no token markup, monetized via a 5.5% fee on credit purchases and a 5% BYOK fee after 1M free requests/month
- Asset-light by design — OpenRouter owns no GPUs; it routes across 60+ providers with failover, cost/latency optimization, and zero-data-retention filtering, which is both the moat and the commoditization risk
FAQ
What is OpenRouter?
OpenRouter is a unified API gateway that gives developers access to 400+ large language models from 60+ providers through a single OpenAI-compatible endpoint, with automatic routing on price, latency, and uptime.
How much does OpenRouter cost?
Provider list prices pass through with no token markup; OpenRouter charges a 5.5% platform fee when you buy pay-as-you-go credits, and BYOK usage is free for the first 1M requests per month, then 5% of the equivalent OpenRouter cost.
Does OpenRouter host its own models?
No — it is a routing layer, not an inference provider; requests are proxied to 60+ upstream providers including Anthropic, OpenAI, Google, xAI, and open-weight hosts, with provider-level failover.
How is OpenRouter different from Vercel AI Gateway?
Both are zero-markup multi-model gateways; OpenRouter is the standalone category leader with 400+ models, public usage rankings, and a 25T tokens/week run rate, while Vercel's gateway is a feature of the Vercel platform.
Executive Summary
OpenRouter is the switchboard of the multi-model era: a unified gateway that puts 400+ large language models from 60+ providers behind a single OpenAI-compatible endpoint, routing each request on price, latency, and uptime with provider-level failover.[1][2] It owns no GPUs and hosts no models — provider list prices pass through "without any markup," and the company monetizes the convenience layer instead: a 5.5% fee on credit purchases and a 5% bring-your-own-key fee after 1M free requests per month, working out to roughly a 5% take on routed inference spend.[2][3][1]
The traction validates the thesis. Weekly token volume grew 5x in six months — from 5 trillion to 25 trillion tokens — putting OpenRouter on pace to process over a quadrillion tokens this year across 8M+ developers.[4] Sacra estimates $50M in annualized revenue as of March 2026, up from about $19M at the end of 2025.[1] In May 2026 the company closed a $113M Series B led by CapitalG (Alphabet's growth fund) at roughly $1.3B post-money — more than double the $547M valuation from its $40M Series A a year earlier — with the venture arms of NVIDIA, ServiceNow, MongoDB, Snowflake, and Databricks all participating.[5][4]
| Attribute | Value |
|---|---|
| Company | OpenRouter, Inc. (New York, NY)[1] |
| Founder | Alex Atallah (CEO)[1] |
| Founded | 2023[1] |
| Funding | $113M Series B (May 2026) led by CapitalG at ~$1.3B post; $40M Series A (June 2025, a16z + Menlo); $12.5M seed (a16z)[5][4][1] |
| Scale | 25T tokens/week; 8M+ developers; 400+ models; 60+ providers[4][1] |
| Revenue | ~$50M annualized (March 2026, Sacra estimate) on a ~5% take rate[1] |
Product Overview
The core loop: swap your OpenAI base URL for OpenRouter's, and one API key now reaches every major lab and open-weight host. OpenRouter implements the OpenAI API specification for /completions and /chat/completions, so existing OpenAI SDKs work as a drop-in replacement.[2] Behind the endpoint, intelligent routing handles provider failover, cost and latency optimization, and quality-aware selection — including filtering to zero-data-retention providers only.[4]
The past year expanded the surface beyond text: image, audio, speech, transcription, embedding, and video models now route through the same gateway, alongside enterprise controls — workspaces, spend management, and guardrails.[4] The public model rankings, which chart token share across the industry in near-real time, have become a de facto market-share scoreboard that HN threads cite as primary data.[6][7]
Key Capabilities
| Capability | Description |
|---|---|
| Unified API | OpenAI-compatible endpoint to 400+ models from 60+ providers[2][1] |
| Intelligent routing | Provider failover, cost/latency optimization, quality-aware routing[4] |
| Consolidated billing | One credit balance across all providers; per-key spend limits with periodic refill[3][7] |
| BYOK | Bring your own provider keys; first 1M requests/month free, then 5% fee[3] |
| Privacy controls | Zero prompt/completion logging by default; opt-in logging earns a 1% discount; ZDR provider filtering[2][4] |
| Multimodal | Image, audio, speech, transcription, embedding, and video models[4] |
| Free tier | Small free allowance plus rate-limited free models for testing[2] |
Technical Architecture
OpenRouter is a proxy, not a cloud: requests hit its routing layer, which selects an upstream provider per model based on price, latency, uptime, and user preferences, then streams the response back through the OpenAI-compatible interface.[1][2] The asset-light model — no GPUs, no model hosting — is what enables high gross margins on a ~5% take and instant coverage of every new model release.[1] The flip side is that output quality inherits whatever the routed provider does: third-party hosts vary in quantization and caching behavior, a recurring developer complaint (see What Developers Say).[7]
Key Technical Details
| Aspect | Detail |
|---|---|
| Deployment | Managed cloud gateway only; no self-hosting[6] |
| Model(s) | 400+ models from Anthropic, OpenAI, Google, xAI, DeepSeek, and 60+ providers[5][1] |
| Integrations | OpenAI SDK drop-in; supported natively by most agent CLIs and IDE tools[2] |
| Open Source | Platform proprietary; SDKs and docs on GitHub[6] |
Strengths
- The default multi-model on-ramp — one key, one OpenAI-compatible API, every frontier and open-weight model on release day; 8M+ developers use it, and "lowest friction" is the consistent community verdict.[4][7]
- Hypergrowth with revenue behind it — token volume up 5x in six months to 25T/week, and ~$50M annualized revenue (March 2026) versus ~$19M at the end of 2025.[4][1]
- No token markup, transparent take — provider list prices pass through; the platform fee sits on credit purchases (5.5%) and post-allowance BYOK (5%), so the economics are legible.[2][3]
- Billing controls providers still lack — prepaid credits, per-key spend limits with periodic refill, and hard caps; HN commenters single this out versus hyperscaler billing.[7]
- Strategic cap table as distribution — CapitalG, NVentures, ServiceNow, MongoDB, Snowflake, and Databricks venture arms signal enterprise-platform alignment, not just capital.[4]
Cautions
- The 5% take is a standing dare — Sacra flags that providers offering direct pricing incentives could undermine the routing value proposition, and at scale the fees compound; even fans say to migrate to first-party APIs for production volume.[1][7]
- Provider quality variance is your problem — the same open-weight model can be served at different quantizations and cache behaviors by different upstream hosts; community research found some third-party providers markedly worse.[7]
- BYOK allowance erodes under agent workloads — 1M free requests/month sounds large until agents fire multiple tool calls per task; past it, the 5% fee applies to traffic on keys you already pay for.[3]
- Free models are a data trade — community consensus is that free-tier model traffic should be assumed to feed someone's training data; ZDR filtering exists but is opt-in.[7][4]
- Single point of failure by design — putting one proxy in front of all inference concentrates uptime, retention, and compliance exposure in a company that publishes no formal SLA on its self-serve tiers.[3][1]
- Gateway competition is converging — Vercel, Cloudflare, and the hyperscalers now ship equivalent unified-API products, some bundled free or at matching fees.[7]
What Developers Say
The 253-comment Series B thread (May 2026) is the best single read on sentiment: heavy real-world usage, genuine affection for the convenience and billing controls, and consistent skepticism about fees at scale and routed-provider quality.[7]
"Originally I didn't understand why anyone would put a proxy between them and an LLM, but it actually adds some quite significant value." — simonw on Hacker News[7]
"It's definitely the best way to try out new models without fiddling with each providers distinct APIs." — minimaxir on Hacker News[7]
"IMO being able to buy credits and not have them locked to one provider is worth the 5% to me." — 542458 on Hacker News[7]
"I would gently encourage everyone to migrate to the 1st party APIs for pricing at scale." — Aurornis on Hacker News[7]
"[I] did some research on it to come up with a provider tiers list and found a bunch of open-source 3rd party hosts are simply trash tier." — GodelNumbering on Hacker News, on routed provider quality[7]
Pricing & Licensing
No subscription — pay-as-you-go credits with provider list prices passed through at no markup.[2][3]
| Tier | Price | Includes |
|---|---|---|
| Free | $0 | Small testing allowance; rate-limited free models[2] |
| Pay-as-you-go | Provider list price + 5.5% platform fee on credit purchases | All 400+ models, routing, failover, per-key spend limits[3] |
| BYOK | First 1M requests/month free, then 5% of equivalent OpenRouter cost | Use your own provider keys through the gateway[3] |
| Enterprise | Custom (discounted platform fees; inference not discounted) | Workspaces, spend management, guardrails, ZDR policies, invoicing/POs[3][4] |
Licensing model: Proprietary managed gateway; the API surface is the OpenAI specification, which keeps switching costs low by design.[2]
Hidden costs: The 5.5% fee applies when credits are purchased (crypto payments carry their own fee); BYOK turns from free to 5% past 1M requests/month, which agentic workloads can hit quickly; opting into prompt logging is what earns the advertised 1% discount.[3][2]
Competitive Positioning
Direct Competitors
| Competitor | Differentiation |
|---|---|
| Vercel AI Gateway | The most direct rival — zero token markup, BYOK, budget controls, 200K+ teams; bundled into the Vercel platform, while OpenRouter is the standalone leader with more models (400+ vs 100s) and the public rankings |
| Together AI | Vertically integrated — runs its own GPU fleet at ~$1B annualized revenue; OpenRouter routes across providers like Together rather than competing on infrastructure |
| Groq / Fireworks AI / DeepInfra | Inference providers that appear behind OpenRouter's router; choosing one directly removes the aggregator layer (and fee) at the cost of multi-provider flexibility |
| Cloudflare AI Gateway / AWS Bedrock / Vertex AI | Hyperscaler gateways with bundled compliance; Cloudflare's unified billing carries the same 5% fee, per HN commenters[7] |
| Portkey / Martian / Not Diamond | Smaller routing/gateway startups Sacra lists as direct competitors, without OpenRouter's volume or model breadth[1] |
When to Choose OpenRouter Over Alternatives
- Choose OpenRouter when: you need every model behind one key on release day, consolidated billing with hard spend caps, or routing/failover across providers — especially for multi-model apps and agent harnesses.
- Choose Vercel AI Gateway when: you already build on Vercel and want the gateway colocated with your deployment platform.
- Choose Together AI or another direct provider when: you've converged on specific open-weight models at production scale and want provider-grade SLAs without an aggregator fee.
- Choose a hyperscaler gateway when: your contracts, compliance, and spend commitments already live in AWS, Google Cloud, or Azure.
Ideal Customer Profile
Best fit:
- Developers and startups building multi-model products or agents who want one API, one bill, and instant access to new models
- Teams that value billing caps, per-key limits, and the ability to A/B models without per-provider account sprawl
- Enterprises moving from single-model pilots to multi-model production who want routing, failover, and ZDR filtering as managed infrastructure
Poor fit:
- High-volume production workloads on a single known model, where first-party APIs avoid the 5–5.5% platform economics
- Teams requiring contractual SLAs or self-hosted/in-VPC gateways
- Quality-critical open-weight inference where provider-level quantization variance is unacceptable without careful provider pinning
Viability Assessment
| Factor | Assessment |
|---|---|
| Financial Health | Strong — $113M Series B at ~$1.3B post (May 2026), ~$50M annualized revenue, asset-light margins[5][1] |
| Market Position | Category leader in standalone AI gateways — 25T tokens/week and rankings the industry treats as market data[4][7] |
| Innovation Pace | High — multimodal routing, enterprise controls, and quality-aware routing all shipped in the past year[4] |
| Community/Ecosystem | Deep — 8M+ developers, default support in agent CLIs and IDE tools, heavily discussed (and used) on HN[4][7] |
| Long-term Outlook | Good but contested — provider concentration, take-rate pressure, and gateway commoditization are the named structural risks[1] |
The bull case is that OpenRouter has become measurement infrastructure as much as routing infrastructure — when its rankings are how the industry tracks model share, the gateway is the default place inference lands. The bear case is structural: it sits on a ~5% toll between customers and providers who could discount it away, and Vercel, Cloudflare, and the hyperscalers are shipping the same primitive as a platform feature.[1][7] The Series B's strategic investor list — Alphabet, NVIDIA, ServiceNow, MongoDB, Snowflake, Databricks — suggests the platforms themselves are betting the neutral layer survives.[4]
Bottom Line
OpenRouter is the proven answer to a real problem: 400+ models from 60+ providers behind one OpenAI-compatible key, with billing controls the labs themselves only recently matched, growing 5x in six months to 25 trillion tokens a week. The trade is a ~5% convenience toll, inherited provider-quality variance on open-weight models, and a single managed dependency in front of all your inference — acceptable for experimentation and multi-model products, worth re-evaluating at single-model production scale.
Recommended for: Multi-model apps and agent builders, teams that want consolidated billing and spend caps, and anyone who needs new models on release day without per-provider integration work.
Not recommended for: Single-model production workloads at volumes where 5–5.5% is material, SLA-bound enterprises without an Enterprise contract, or teams needing self-hosted gateways.
Outlook: Watch whether the take rate holds as Vercel and Cloudflare bundle equivalent gateways, whether quality-aware routing neutralizes the provider-variance complaint, and whether the quadrillion-token pace turns the rankings into durable, network-effect infrastructure.
Research by Ry Walker Research • methodology
Sources
- [1] Sacra: OpenRouter revenue, valuation & growth rate
- [2] OpenRouter Docs: FAQ
- [3] OpenRouter Pricing
- [4] OpenRouter Blog: OpenRouter Raises $113M Series B
- [5] TechCrunch: OpenRouter more than doubles valuation to $1.3B in a year
- [6] OpenRouter Website
- [7] Hacker News: OpenRouter raises $113M Series B (253 comments)