OpenRouter | Ry Walker Research

Key takeaways

$113M Series B led by CapitalG (Alphabet) at ~$1.3B post-money in May 2026 — more than double the $547M valuation from the June 2025 Series A — with NVIDIA, ServiceNow, MongoDB, Snowflake, and Databricks venture arms participating
Weekly token volume grew 5x in six months, from 5 trillion to 25 trillion tokens/week, on pace for a quadrillion tokens this year across 8M+ developers and 400+ models
The business model is a ~5% take on routed inference: provider list prices pass through with no token markup, monetized via a 5.5% fee on credit purchases and a 5% BYOK fee after 1M free requests/month
Asset-light by design — OpenRouter owns no GPUs; it routes across 60+ providers with failover, cost/latency optimization, and zero-data-retention filtering, which is both the moat and the commoditization risk

FAQ

What is OpenRouter?

OpenRouter is a unified API gateway that gives developers access to 400+ large language models from 60+ providers through a single OpenAI-compatible endpoint, with automatic routing on price, latency, and uptime.

How much does OpenRouter cost?

Provider list prices pass through with no token markup; OpenRouter charges a 5.5% platform fee when you buy pay-as-you-go credits, and BYOK usage is free for the first 1M requests per month, then 5% of the equivalent OpenRouter cost.

Does OpenRouter host its own models?

No — it is a routing layer, not an inference provider; requests are proxied to 60+ upstream providers including Anthropic, OpenAI, Google, xAI, and open-weight hosts, with provider-level failover.

How is OpenRouter different from Vercel AI Gateway?

Both are zero-markup multi-model gateways; OpenRouter is the standalone category leader with 400+ models, public usage rankings, and a 25T tokens/week run rate, while Vercel's gateway is a feature of the Vercel platform.

Executive Summary

OpenRouter is the switchboard of the multi-model era: a unified gateway that puts 400+ large language models from 60+ providers behind a single OpenAI-compatible endpoint, routing each request on price, latency, and uptime with provider-level failover.^[1]^[2] It owns no GPUs and hosts no models — provider list prices pass through "without any markup," and the company monetizes the convenience layer instead: a 5.5% fee on credit purchases and a 5% bring-your-own-key fee after 1M free requests per month, working out to roughly a 5% take on routed inference spend.^[2]^[3]^[1]

The traction validates the thesis. Weekly token volume grew 5x in six months — from 5 trillion to 25 trillion tokens — putting OpenRouter on pace to process over a quadrillion tokens this year across 8M+ developers.^[4] Sacra estimates $50M in annualized revenue as of March 2026, up from about $19M at the end of 2025.^[1] In May 2026 the company closed a $113M Series B led by CapitalG (Alphabet's growth fund) at roughly $1.3B post-money — more than double the $547M valuation from its $40M Series A a year earlier — with the venture arms of NVIDIA, ServiceNow, MongoDB, Snowflake, and Databricks all participating.^[5]^[4]

Attribute	Value
Company	OpenRouter, Inc. (New York, NY)^[1]
Founder	Alex Atallah (CEO)^[1]
Founded	2023^[1]
Funding	$113M Series B (May 2026) led by CapitalG at ~$1.3B post; $40M Series A (June 2025, a16z + Menlo); $12.5M seed (a16z)^[5]^[4]^[1]
Scale	25T tokens/week; 8M+ developers; 400+ models; 60+ providers^[4]^[1]
Revenue	~$50M annualized (March 2026, Sacra estimate) on a ~5% take rate^[1]

Product Overview

The core loop: swap your OpenAI base URL for OpenRouter's, and one API key now reaches every major lab and open-weight host. OpenRouter implements the OpenAI API specification for /completions and /chat/completions, so existing OpenAI SDKs work as a drop-in replacement.^[2] Behind the endpoint, intelligent routing handles provider failover, cost and latency optimization, and quality-aware selection — including filtering to zero-data-retention providers only.^[4]

The past year expanded the surface beyond text: image, audio, speech, transcription, embedding, and video models now route through the same gateway, alongside enterprise controls — workspaces, spend management, and guardrails.^[4] The public model rankings, which chart token share across the industry in near-real time, have become a de facto market-share scoreboard that HN threads cite as primary data.^[6]^[7]

Key Capabilities

Capability	Description
Unified API	OpenAI-compatible endpoint to 400+ models from 60+ providers^[2]^[1]
Intelligent routing	Provider failover, cost/latency optimization, quality-aware routing^[4]
Consolidated billing	One credit balance across all providers; per-key spend limits with periodic refill^[3]^[7]
BYOK	Bring your own provider keys; first 1M requests/month free, then 5% fee^[3]
Privacy controls	Zero prompt/completion logging by default; opt-in logging earns a 1% discount; ZDR provider filtering^[2]^[4]
Multimodal	Image, audio, speech, transcription, embedding, and video models^[4]
Free tier	Small free allowance plus rate-limited free models for testing^[2]

Technical Architecture

OpenRouter is a proxy, not a cloud: requests hit its routing layer, which selects an upstream provider per model based on price, latency, uptime, and user preferences, then streams the response back through the OpenAI-compatible interface.^[1]^[2] The asset-light model — no GPUs, no model hosting — is what enables high gross margins on a ~5% take and instant coverage of every new model release.^[1] The flip side is that output quality inherits whatever the routed provider does: third-party hosts vary in quantization and caching behavior, a recurring developer complaint (see What Developers Say).^[7]

Key Technical Details

Aspect	Detail
Deployment	Managed cloud gateway only; no self-hosting^[6]
Model(s)	400+ models from Anthropic, OpenAI, Google, xAI, DeepSeek, and 60+ providers^[5]^[1]
Integrations	OpenAI SDK drop-in; supported natively by most agent CLIs and IDE tools^[2]
Open Source	Platform proprietary; SDKs and docs on GitHub^[6]

Strengths

The default multi-model on-ramp — one key, one OpenAI-compatible API, every frontier and open-weight model on release day; 8M+ developers use it, and "lowest friction" is the consistent community verdict.^[4]^[7]
Hypergrowth with revenue behind it — token volume up 5x in six months to 25T/week, and ~$50M annualized revenue (March 2026) versus ~$19M at the end of 2025.^[4]^[1]
No token markup, transparent take — provider list prices pass through; the platform fee sits on credit purchases (5.5%) and post-allowance BYOK (5%), so the economics are legible.^[2]^[3]
Billing controls providers still lack — prepaid credits, per-key spend limits with periodic refill, and hard caps; HN commenters single this out versus hyperscaler billing.^[7]
Strategic cap table as distribution — CapitalG, NVentures, ServiceNow, MongoDB, Snowflake, and Databricks venture arms signal enterprise-platform alignment, not just capital.^[4]

Cautions

The 5% take is a standing dare — Sacra flags that providers offering direct pricing incentives could undermine the routing value proposition, and at scale the fees compound; even fans say to migrate to first-party APIs for production volume.^[1]^[7]
Provider quality variance is your problem — the same open-weight model can be served at different quantizations and cache behaviors by different upstream hosts; community research found some third-party providers markedly worse.^[7]
BYOK allowance erodes under agent workloads — 1M free requests/month sounds large until agents fire multiple tool calls per task; past it, the 5% fee applies to traffic on keys you already pay for.^[3]
Free models are a data trade — community consensus is that free-tier model traffic should be assumed to feed someone's training data; ZDR filtering exists but is opt-in.^[7]^[4]
Single point of failure by design — putting one proxy in front of all inference concentrates uptime, retention, and compliance exposure in a company that publishes no formal SLA on its self-serve tiers.^[3]^[1]
Gateway competition is converging — Vercel, Cloudflare, and the hyperscalers now ship equivalent unified-API products, some bundled free or at matching fees.^[7]

What Developers Say

The 253-comment Series B thread (May 2026) is the best single read on sentiment: heavy real-world usage, genuine affection for the convenience and billing controls, and consistent skepticism about fees at scale and routed-provider quality.^[7]

"Originally I didn't understand why anyone would put a proxy between them and an LLM, but it actually adds some quite significant value." — simonw on Hacker News^[7]

"It's definitely the best way to try out new models without fiddling with each providers distinct APIs." — minimaxir on Hacker News^[7]

"IMO being able to buy credits and not have them locked to one provider is worth the 5% to me." — 542458 on Hacker News^[7]

"I would gently encourage everyone to migrate to the 1st party APIs for pricing at scale." — Aurornis on Hacker News^[7]

"[I] did some research on it to come up with a provider tiers list and found a bunch of open-source 3rd party hosts are simply trash tier." — GodelNumbering on Hacker News, on routed provider quality^[7]

Pricing & Licensing

No subscription — pay-as-you-go credits with provider list prices passed through at no markup.^[2]^[3]

Tier	Price	Includes
Free	$0	Small testing allowance; rate-limited free models^[2]
Pay-as-you-go	Provider list price + 5.5% platform fee on credit purchases	All 400+ models, routing, failover, per-key spend limits^[3]
BYOK	First 1M requests/month free, then 5% of equivalent OpenRouter cost	Use your own provider keys through the gateway^[3]
Enterprise	Custom (discounted platform fees; inference not discounted)	Workspaces, spend management, guardrails, ZDR policies, invoicing/POs^[3]^[4]

Licensing model: Proprietary managed gateway; the API surface is the OpenAI specification, which keeps switching costs low by design.^[2]

Hidden costs: The 5.5% fee applies when credits are purchased (crypto payments carry their own fee); BYOK turns from free to 5% past 1M requests/month, which agentic workloads can hit quickly; opting into prompt logging is what earns the advertised 1% discount.^[3]^[2]

Competitive Positioning

Direct Competitors

Competitor	Differentiation
Vercel AI Gateway	The most direct rival — zero token markup, BYOK, budget controls, 200K+ teams; bundled into the Vercel platform, while OpenRouter is the standalone leader with more models (400+ vs 100s) and the public rankings
Together AI	Vertically integrated — runs its own GPU fleet at ~$1B annualized revenue; OpenRouter routes across providers like Together rather than competing on infrastructure
Groq / Fireworks AI / DeepInfra	Inference providers that appear behind OpenRouter's router; choosing one directly removes the aggregator layer (and fee) at the cost of multi-provider flexibility
Cloudflare AI Gateway / AWS Bedrock / Vertex AI	Hyperscaler gateways with bundled compliance; Cloudflare's unified billing carries the same 5% fee, per HN commenters^[7]
Portkey / Martian / Not Diamond	Smaller routing/gateway startups Sacra lists as direct competitors, without OpenRouter's volume or model breadth^[1]

When to Choose OpenRouter Over Alternatives

Choose OpenRouter when: you need every model behind one key on release day, consolidated billing with hard spend caps, or routing/failover across providers — especially for multi-model apps and agent harnesses.
Choose Vercel AI Gateway when: you already build on Vercel and want the gateway colocated with your deployment platform.
Choose Together AI or another direct provider when: you've converged on specific open-weight models at production scale and want provider-grade SLAs without an aggregator fee.
Choose a hyperscaler gateway when: your contracts, compliance, and spend commitments already live in AWS, Google Cloud, or Azure.

Ideal Customer Profile

Best fit:

Developers and startups building multi-model products or agents who want one API, one bill, and instant access to new models
Teams that value billing caps, per-key limits, and the ability to A/B models without per-provider account sprawl
Enterprises moving from single-model pilots to multi-model production who want routing, failover, and ZDR filtering as managed infrastructure

Poor fit:

High-volume production workloads on a single known model, where first-party APIs avoid the 5–5.5% platform economics
Teams requiring contractual SLAs or self-hosted/in-VPC gateways
Quality-critical open-weight inference where provider-level quantization variance is unacceptable without careful provider pinning

Viability Assessment

Factor	Assessment
Financial Health	Strong — $113M Series B at ~$1.3B post (May 2026), ~$50M annualized revenue, asset-light margins^[5]^[1]
Market Position	Category leader in standalone AI gateways — 25T tokens/week and rankings the industry treats as market data^[4]^[7]
Innovation Pace	High — multimodal routing, enterprise controls, and quality-aware routing all shipped in the past year^[4]
Community/Ecosystem	Deep — 8M+ developers, default support in agent CLIs and IDE tools, heavily discussed (and used) on HN^[4]^[7]
Long-term Outlook	Good but contested — provider concentration, take-rate pressure, and gateway commoditization are the named structural risks^[1]

The bull case is that OpenRouter has become measurement infrastructure as much as routing infrastructure — when its rankings are how the industry tracks model share, the gateway is the default place inference lands. The bear case is structural: it sits on a ~5% toll between customers and providers who could discount it away, and Vercel, Cloudflare, and the hyperscalers are shipping the same primitive as a platform feature.^[1]^[7] The Series B's strategic investor list — Alphabet, NVIDIA, ServiceNow, MongoDB, Snowflake, Databricks — suggests the platforms themselves are betting the neutral layer survives.^[4]

Bottom Line

OpenRouter is the proven answer to a real problem: 400+ models from 60+ providers behind one OpenAI-compatible key, with billing controls the labs themselves only recently matched, growing 5x in six months to 25 trillion tokens a week. The trade is a ~5% convenience toll, inherited provider-quality variance on open-weight models, and a single managed dependency in front of all your inference — acceptable for experimentation and multi-model products, worth re-evaluating at single-model production scale.

Recommended for: Multi-model apps and agent builders, teams that want consolidated billing and spend caps, and anyone who needs new models on release day without per-provider integration work.

Not recommended for: Single-model production workloads at volumes where 5–5.5% is material, SLA-bound enterprises without an Enterprise contract, or teams needing self-hosted gateways.

Outlook: Watch whether the take rate holds as Vercel and Cloudflare bundle equivalent gateways, whether quality-aware routing neutralizes the provider-variance complaint, and whether the quadrillion-token pace turns the rankings into durable, network-effect infrastructure.

Research by Ry Walker Research • methodology

Sources