← Back to research
·12 min read·product

OpenRouter

OpenRouter is the unified gateway to 400+ LLMs from 60+ providers through one OpenAI-compatible API — 25 trillion tokens/week (5x in six months), $50M annualized revenue on a ~5% take rate, and a $113M Series B led by CapitalG at a $1.3B valuation.

Key takeaways

  • $113M Series B led by CapitalG (Alphabet) at ~$1.3B post-money in May 2026 — more than double the $547M valuation from the June 2025 Series A — with NVIDIA, ServiceNow, MongoDB, Snowflake, and Databricks venture arms participating
  • Weekly token volume grew 5x in six months, from 5 trillion to 25 trillion tokens/week, on pace for a quadrillion tokens this year across 8M+ developers and 400+ models
  • The business model is a ~5% take on routed inference: provider list prices pass through with no token markup, monetized via a 5.5% fee on credit purchases and a 5% BYOK fee after 1M free requests/month
  • Asset-light by design — OpenRouter owns no GPUs; it routes across 60+ providers with failover, cost/latency optimization, and zero-data-retention filtering, which is both the moat and the commoditization risk

FAQ

What is OpenRouter?

OpenRouter is a unified API gateway that gives developers access to 400+ large language models from 60+ providers through a single OpenAI-compatible endpoint, with automatic routing on price, latency, and uptime.

How much does OpenRouter cost?

Provider list prices pass through with no token markup; OpenRouter charges a 5.5% platform fee when you buy pay-as-you-go credits, and BYOK usage is free for the first 1M requests per month, then 5% of the equivalent OpenRouter cost.

Does OpenRouter host its own models?

No — it is a routing layer, not an inference provider; requests are proxied to 60+ upstream providers including Anthropic, OpenAI, Google, xAI, and open-weight hosts, with provider-level failover.

How is OpenRouter different from Vercel AI Gateway?

Both are zero-markup multi-model gateways; OpenRouter is the standalone category leader with 400+ models, public usage rankings, and a 25T tokens/week run rate, while Vercel's gateway is a feature of the Vercel platform.

Executive Summary

OpenRouter is the switchboard of the multi-model era: a unified gateway that puts 400+ large language models from 60+ providers behind a single OpenAI-compatible endpoint, routing each request on price, latency, and uptime with provider-level failover.[1][2] It owns no GPUs and hosts no models — provider list prices pass through "without any markup," and the company monetizes the convenience layer instead: a 5.5% fee on credit purchases and a 5% bring-your-own-key fee after 1M free requests per month, working out to roughly a 5% take on routed inference spend.[2][3][1]

The traction validates the thesis. Weekly token volume grew 5x in six months — from 5 trillion to 25 trillion tokens — putting OpenRouter on pace to process over a quadrillion tokens this year across 8M+ developers.[4] Sacra estimates $50M in annualized revenue as of March 2026, up from about $19M at the end of 2025.[1] In May 2026 the company closed a $113M Series B led by CapitalG (Alphabet's growth fund) at roughly $1.3B post-money — more than double the $547M valuation from its $40M Series A a year earlier — with the venture arms of NVIDIA, ServiceNow, MongoDB, Snowflake, and Databricks all participating.[5][4]

AttributeValue
CompanyOpenRouter, Inc. (New York, NY)[1]
FounderAlex Atallah (CEO)[1]
Founded2023[1]
Funding$113M Series B (May 2026) led by CapitalG at ~$1.3B post; $40M Series A (June 2025, a16z + Menlo); $12.5M seed (a16z)[5][4][1]
Scale25T tokens/week; 8M+ developers; 400+ models; 60+ providers[4][1]
Revenue~$50M annualized (March 2026, Sacra estimate) on a ~5% take rate[1]

Product Overview

The core loop: swap your OpenAI base URL for OpenRouter's, and one API key now reaches every major lab and open-weight host. OpenRouter implements the OpenAI API specification for /completions and /chat/completions, so existing OpenAI SDKs work as a drop-in replacement.[2] Behind the endpoint, intelligent routing handles provider failover, cost and latency optimization, and quality-aware selection — including filtering to zero-data-retention providers only.[4]

The past year expanded the surface beyond text: image, audio, speech, transcription, embedding, and video models now route through the same gateway, alongside enterprise controls — workspaces, spend management, and guardrails.[4] The public model rankings, which chart token share across the industry in near-real time, have become a de facto market-share scoreboard that HN threads cite as primary data.[6][7]

Key Capabilities

CapabilityDescription
Unified APIOpenAI-compatible endpoint to 400+ models from 60+ providers[2][1]
Intelligent routingProvider failover, cost/latency optimization, quality-aware routing[4]
Consolidated billingOne credit balance across all providers; per-key spend limits with periodic refill[3][7]
BYOKBring your own provider keys; first 1M requests/month free, then 5% fee[3]
Privacy controlsZero prompt/completion logging by default; opt-in logging earns a 1% discount; ZDR provider filtering[2][4]
MultimodalImage, audio, speech, transcription, embedding, and video models[4]
Free tierSmall free allowance plus rate-limited free models for testing[2]

Technical Architecture

OpenRouter is a proxy, not a cloud: requests hit its routing layer, which selects an upstream provider per model based on price, latency, uptime, and user preferences, then streams the response back through the OpenAI-compatible interface.[1][2] The asset-light model — no GPUs, no model hosting — is what enables high gross margins on a ~5% take and instant coverage of every new model release.[1] The flip side is that output quality inherits whatever the routed provider does: third-party hosts vary in quantization and caching behavior, a recurring developer complaint (see What Developers Say).[7]

Key Technical Details

AspectDetail
DeploymentManaged cloud gateway only; no self-hosting[6]
Model(s)400+ models from Anthropic, OpenAI, Google, xAI, DeepSeek, and 60+ providers[5][1]
IntegrationsOpenAI SDK drop-in; supported natively by most agent CLIs and IDE tools[2]
Open SourcePlatform proprietary; SDKs and docs on GitHub[6]

Strengths

  • The default multi-model on-ramp — one key, one OpenAI-compatible API, every frontier and open-weight model on release day; 8M+ developers use it, and "lowest friction" is the consistent community verdict.[4][7]
  • Hypergrowth with revenue behind it — token volume up 5x in six months to 25T/week, and ~$50M annualized revenue (March 2026) versus ~$19M at the end of 2025.[4][1]
  • No token markup, transparent take — provider list prices pass through; the platform fee sits on credit purchases (5.5%) and post-allowance BYOK (5%), so the economics are legible.[2][3]
  • Billing controls providers still lack — prepaid credits, per-key spend limits with periodic refill, and hard caps; HN commenters single this out versus hyperscaler billing.[7]
  • Strategic cap table as distribution — CapitalG, NVentures, ServiceNow, MongoDB, Snowflake, and Databricks venture arms signal enterprise-platform alignment, not just capital.[4]

Cautions

  • The 5% take is a standing dare — Sacra flags that providers offering direct pricing incentives could undermine the routing value proposition, and at scale the fees compound; even fans say to migrate to first-party APIs for production volume.[1][7]
  • Provider quality variance is your problem — the same open-weight model can be served at different quantizations and cache behaviors by different upstream hosts; community research found some third-party providers markedly worse.[7]
  • BYOK allowance erodes under agent workloads — 1M free requests/month sounds large until agents fire multiple tool calls per task; past it, the 5% fee applies to traffic on keys you already pay for.[3]
  • Free models are a data trade — community consensus is that free-tier model traffic should be assumed to feed someone's training data; ZDR filtering exists but is opt-in.[7][4]
  • Single point of failure by design — putting one proxy in front of all inference concentrates uptime, retention, and compliance exposure in a company that publishes no formal SLA on its self-serve tiers.[3][1]
  • Gateway competition is converging — Vercel, Cloudflare, and the hyperscalers now ship equivalent unified-API products, some bundled free or at matching fees.[7]

What Developers Say

The 253-comment Series B thread (May 2026) is the best single read on sentiment: heavy real-world usage, genuine affection for the convenience and billing controls, and consistent skepticism about fees at scale and routed-provider quality.[7]

"Originally I didn't understand why anyone would put a proxy between them and an LLM, but it actually adds some quite significant value." — simonw on Hacker News[7]

"It's definitely the best way to try out new models without fiddling with each providers distinct APIs." — minimaxir on Hacker News[7]

"IMO being able to buy credits and not have them locked to one provider is worth the 5% to me." — 542458 on Hacker News[7]

"I would gently encourage everyone to migrate to the 1st party APIs for pricing at scale." — Aurornis on Hacker News[7]

"[I] did some research on it to come up with a provider tiers list and found a bunch of open-source 3rd party hosts are simply trash tier." — GodelNumbering on Hacker News, on routed provider quality[7]


Pricing & Licensing

No subscription — pay-as-you-go credits with provider list prices passed through at no markup.[2][3]

TierPriceIncludes
Free$0Small testing allowance; rate-limited free models[2]
Pay-as-you-goProvider list price + 5.5% platform fee on credit purchasesAll 400+ models, routing, failover, per-key spend limits[3]
BYOKFirst 1M requests/month free, then 5% of equivalent OpenRouter costUse your own provider keys through the gateway[3]
EnterpriseCustom (discounted platform fees; inference not discounted)Workspaces, spend management, guardrails, ZDR policies, invoicing/POs[3][4]

Licensing model: Proprietary managed gateway; the API surface is the OpenAI specification, which keeps switching costs low by design.[2]

Hidden costs: The 5.5% fee applies when credits are purchased (crypto payments carry their own fee); BYOK turns from free to 5% past 1M requests/month, which agentic workloads can hit quickly; opting into prompt logging is what earns the advertised 1% discount.[3][2]


Competitive Positioning

Direct Competitors

CompetitorDifferentiation
Vercel AI GatewayThe most direct rival — zero token markup, BYOK, budget controls, 200K+ teams; bundled into the Vercel platform, while OpenRouter is the standalone leader with more models (400+ vs 100s) and the public rankings
Together AIVertically integrated — runs its own GPU fleet at ~$1B annualized revenue; OpenRouter routes across providers like Together rather than competing on infrastructure
Groq / Fireworks AI / DeepInfraInference providers that appear behind OpenRouter's router; choosing one directly removes the aggregator layer (and fee) at the cost of multi-provider flexibility
Cloudflare AI Gateway / AWS Bedrock / Vertex AIHyperscaler gateways with bundled compliance; Cloudflare's unified billing carries the same 5% fee, per HN commenters[7]
Portkey / Martian / Not DiamondSmaller routing/gateway startups Sacra lists as direct competitors, without OpenRouter's volume or model breadth[1]

When to Choose OpenRouter Over Alternatives

  • Choose OpenRouter when: you need every model behind one key on release day, consolidated billing with hard spend caps, or routing/failover across providers — especially for multi-model apps and agent harnesses.
  • Choose Vercel AI Gateway when: you already build on Vercel and want the gateway colocated with your deployment platform.
  • Choose Together AI or another direct provider when: you've converged on specific open-weight models at production scale and want provider-grade SLAs without an aggregator fee.
  • Choose a hyperscaler gateway when: your contracts, compliance, and spend commitments already live in AWS, Google Cloud, or Azure.

Ideal Customer Profile

Best fit:

  • Developers and startups building multi-model products or agents who want one API, one bill, and instant access to new models
  • Teams that value billing caps, per-key limits, and the ability to A/B models without per-provider account sprawl
  • Enterprises moving from single-model pilots to multi-model production who want routing, failover, and ZDR filtering as managed infrastructure

Poor fit:

  • High-volume production workloads on a single known model, where first-party APIs avoid the 5–5.5% platform economics
  • Teams requiring contractual SLAs or self-hosted/in-VPC gateways
  • Quality-critical open-weight inference where provider-level quantization variance is unacceptable without careful provider pinning

Viability Assessment

FactorAssessment
Financial HealthStrong — $113M Series B at ~$1.3B post (May 2026), ~$50M annualized revenue, asset-light margins[5][1]
Market PositionCategory leader in standalone AI gateways — 25T tokens/week and rankings the industry treats as market data[4][7]
Innovation PaceHigh — multimodal routing, enterprise controls, and quality-aware routing all shipped in the past year[4]
Community/EcosystemDeep — 8M+ developers, default support in agent CLIs and IDE tools, heavily discussed (and used) on HN[4][7]
Long-term OutlookGood but contested — provider concentration, take-rate pressure, and gateway commoditization are the named structural risks[1]

The bull case is that OpenRouter has become measurement infrastructure as much as routing infrastructure — when its rankings are how the industry tracks model share, the gateway is the default place inference lands. The bear case is structural: it sits on a ~5% toll between customers and providers who could discount it away, and Vercel, Cloudflare, and the hyperscalers are shipping the same primitive as a platform feature.[1][7] The Series B's strategic investor list — Alphabet, NVIDIA, ServiceNow, MongoDB, Snowflake, Databricks — suggests the platforms themselves are betting the neutral layer survives.[4]


Bottom Line

OpenRouter is the proven answer to a real problem: 400+ models from 60+ providers behind one OpenAI-compatible key, with billing controls the labs themselves only recently matched, growing 5x in six months to 25 trillion tokens a week. The trade is a ~5% convenience toll, inherited provider-quality variance on open-weight models, and a single managed dependency in front of all your inference — acceptable for experimentation and multi-model products, worth re-evaluating at single-model production scale.

Recommended for: Multi-model apps and agent builders, teams that want consolidated billing and spend caps, and anyone who needs new models on release day without per-provider integration work.

Not recommended for: Single-model production workloads at volumes where 5–5.5% is material, SLA-bound enterprises without an Enterprise contract, or teams needing self-hosted gateways.

Outlook: Watch whether the take rate holds as Vercel and Cloudflare bundle equivalent gateways, whether quality-aware routing neutralizes the provider-variance complaint, and whether the quadrillion-token pace turns the rankings into durable, network-effect infrastructure.


Research by Ry Walker Research • methodology