Key takeaways
- Acquired by Cloudflare (announced November 17, 2025; completed early 2026) — continues as a distinct brand with an unchanged API
- Simplest developer experience — run any model with one line of code, no GPU management
- 50,000+ community and official models, with deep integration into Cloudflare Workers AI underway
- Dominant in image and video generation use cases (Stable Diffusion, FLUX, etc.)
- Pay-per-use pricing, with public models billed per second of hardware time and official models billed per token, image, or second of output video
FAQ
What is Replicate?
A platform to run open-source AI models via API. One line of code, no GPU management required. It is now owned by Cloudflare.
Did Cloudflare acquire Replicate?
Yes. Cloudflare announced the acquisition on November 17, 2025, and it closed in early 2026. Replicate operates as a distinct brand, its API is unchanged, and its 50,000+ model catalog is being integrated into Cloudflare Workers AI.
How does Replicate pricing work?
Pay only for what you use. Public models are billed per second of hardware time; official models are billed by tokens, images, or seconds of output video.
What models can I run on Replicate?
More than 50,000 community-contributed and official models for image generation, LLMs, audio, video, and more.
Company Overview
Replicate is a developer-friendly platform for running open-source AI models via API.[1] The core premise is radical simplicity: run any model with one line of code, pay only when it runs, and never think about GPUs.
Status (as of June 2026): Replicate is now part of Cloudflare. The acquisition was announced on November 17, 2025 and completed in early 2026; financial terms were not disclosed.[2][3] Replicate continues to operate as a distinct brand with an unchanged API — "The API isn't changing. The models you're using today will keep working."[4] Before the acquisition, Replicate was backed by Y Combinator, Sequoia Capital, and other investors.[3]
Replicate has built a strong community marketplace where model creators publish and share models — more than 50,000 production-ready models as of the acquisition.[3] This community-driven approach has made it especially popular for image and video generation workloads, with customers including Character.ai, Photo.ai, and Magnific.[1]
What It Does
- Model API — Run tens of thousands of models via simple REST or Python API[5]
- Community models — Browse and run models published by community creators
- Official models — Frontier and popular models (FLUX, DeepSeek, Claude, video models) billed per token or output[6]
- Custom models — Deploy your own models using Cog (open-source packaging tool)
- Fine-tuning — Train custom versions of supported models
- Streaming — Real-time output streaming for LLMs and generative models
- Cloudflare integration (in progress) — The model catalog is being woven into Cloudflare Workers AI and the broader developer platform (R2, Vectorize, Queues, Durable Objects)[7]
How It Works
- Find a model on Replicate's explore page or bring your own
- Call the API with one line: `replicate.run("model/name", input={...})`
- Get results, delivered as URLs (images/video) or streamed text (LLMs)
- Pay per use — billed only for compute used during generation
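The API call in the steps above can also be pictured at the REST level. The sketch below builds, but does not send, a prediction request using only the Python standard library. The base URL, the endpoint path, the `Bearer` authorization header, the `{"input": ...}` body shape, and the model name `owner/model-name` are assumptions made for illustration and are not taken from this article; check the Replicate documentation[5] before relying on them.

```python
import json
import os
import urllib.request

# Assumed base URL for Replicate's REST API (not stated in this article).
API_BASE = "https://api.replicate.com/v1"


def build_prediction_request(model: str, inputs: dict, token: str) -> urllib.request.Request:
    """Build (without sending) a POST request that asks Replicate to run a model.

    `model` is an "owner/name" identifier; `inputs` is the model-specific input dict.
    """
    url = f"{API_BASE}/models/{model}/predictions"  # assumed endpoint shape
    body = json.dumps({"input": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical model name and prompt, for illustration only.
    token = os.environ.get("REPLICATE_API_TOKEN", "<your-token>")
    req = build_prediction_request("owner/model-name", {"prompt": "a watercolor fox"}, token)
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the shape of the call easy to inspect and test without an API token or network access.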
For custom models, package with Cog (Docker-based) and push to Replicate. The platform handles GPU allocation, scaling, and cold starts.
Pricing
Verified June 2026:[6]
- Pay per use — public models charged per second of hardware time; official models billed by tokens, images, or seconds of output video
- No idle costs for public models — they scale to zero when not in use; private models bill for dedicated uptime (fast-booting fine-tunes bill active time only)
- GPU tiers — pricing varies by hardware, including multi-GPU (2x/4x/8x) configurations
| Hardware | Cost |
|---|---|
| CPU | $0.000100/sec ($0.36/hr) |
| NVIDIA T4 | $0.000225/sec ($0.81/hr) |
| NVIDIA L40S | $0.000975/sec ($3.51/hr) |
| NVIDIA A100 (80GB) | $0.001400/sec ($5.04/hr) |
| NVIDIA H100 | $0.001525/sec ($5.49/hr) |
Official model examples: FLUX image models run $0.025–$0.04 per output image, and video generation models run roughly $0.09–$0.25 per second of output video.[6]
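To make the billing model concrete, here is a small sketch that turns the figures above into cost estimates. The per-second rates and the FLUX per-image range are copied from this section; the function names and the example workload (10 minutes of active L40S time per day) are hypothetical, and the always-on comparison assumes dedicated uptime is billed at the same per-second rate, which the article does not confirm.

```python
from decimal import Decimal

# Per-second hardware rates in USD, copied from the pricing table above.
RATE_PER_SECOND = {
    "CPU": Decimal("0.000100"),
    "NVIDIA T4": Decimal("0.000225"),
    "NVIDIA L40S": Decimal("0.000975"),
    "NVIDIA A100 (80GB)": Decimal("0.001400"),
    "NVIDIA H100": Decimal("0.001525"),
}

# Per-image price range quoted above for FLUX official models, in USD.
FLUX_PRICE_PER_IMAGE = (Decimal("0.025"), Decimal("0.04"))


def hardware_cost(hardware: str, seconds: int) -> Decimal:
    """Cost of `seconds` of billed time on the given hardware tier."""
    return RATE_PER_SECOND[hardware] * seconds


def flux_image_cost_range(images: int) -> tuple[Decimal, Decimal]:
    """Lowest and highest cost for `images` FLUX outputs at the quoted range."""
    low, high = FLUX_PRICE_PER_IMAGE
    return low * images, high * images


if __name__ == "__main__":
    # Hypothetical bursty workload: 10 minutes of active L40S time per day.
    active = hardware_cost("NVIDIA L40S", 10 * 60)
    # The same hardware kept up for a full day (assumed same per-second rate).
    always_on = hardware_cost("NVIDIA L40S", 24 * 60 * 60)
    print("scale-to-zero per day:", active)
    print("always-on per day:", always_on)
    print("1,000 FLUX images (low, high):", flux_image_cost_range(1000))
```

Under these assumptions, ten minutes of L40S time comes to $0.585, against $84.24 for a full day of uptime, which is why scale-to-zero billing suits bursty workloads. `Decimal` is used so the per-second rates multiply out exactly rather than accumulating floating-point error.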
Strengths
- Simplest DX — Lowest barrier to running AI models; one line of code
- Community marketplace — 50,000+ ready-to-use models[3]
- No GPU management — Entirely abstracted infrastructure
- Pay-per-use — No idle costs for public models, perfect for bursty workloads
- Image/video leader — Strong in generative media use cases
- Cog open-source — Model packaging without vendor lock-in
- Cloudflare backing — Acquisition removes standalone-startup viability risk and promises global-network performance gains[2]
Weaknesses / Risks
- Less control — Can't tune infrastructure, GPU selection, or optimization
- Cold starts — Models that haven't run recently take longer to start
- Limited custom model support — Less flexible than Baseten or Fireworks for complex deployments
- Integration uncertainty — Cloudflare says the brand and API persist, but long-term roadmap independence post-acquisition is unproven[4]
- LLM competition — Groq, Together AI, DeepInfra offer better LLM inference economics
- Image/video competition — fal has focused aggressively on generative media and contests Replicate's core niche
What Developers Say
Hacker News discussion of the Cloudflare acquisition was largely congratulatory, with some skepticism about deal economics and the crowded market:[8]
"Congrats Ben and team! I think this is Cloudflare's most notable acquisition yet?" — simonw, Hacker News (Nov 2025)
"I don't know that this was an acquisition in the sense that the Replicate investors and team made bank. I don't see a price tag, and the market for these 'run model' infra companies is pretty crowded." — echelon, Hacker News (Nov 2025)
"replicate 'only' raised $50m, so I'd wager the founders, investors and early employees did well here." — mritchie712, Hacker News (Nov 2025)
Competitive Landscape
vs. DeepInfra: DeepInfra focuses on LLM inference with lower per-token pricing. Replicate wins on model variety (image, video, audio).
vs. Baseten: Baseten offers more control and compliance. Replicate wins on simplicity.
vs. Modal: Modal lets you run arbitrary Python with GPU access. Replicate is model-specific but much simpler.
vs. fal: fal has concentrated on image/video generation and is Replicate's sharpest rival in generative media.
vs. Hugging Face Inference: Similar community-model approach. Replicate offers a simpler API and better scaling.
vs. Cloudflare Workers AI: No longer a competitor — Replicate's catalog is becoming the model layer of Workers AI itself.[7]
Ideal User
- Developers wanting to prototype with AI models quickly
- Startups building image/video generation products
- Indie hackers and makers who don't want to manage infrastructure
- Teams already on Cloudflare's developer platform (Workers, R2, Vectorize) wanting native model inference
- Teams exploring multiple models before committing to a platform
Bottom Line
Replicate is the "Heroku of AI models" — maximum simplicity at the cost of control — and it now has Cloudflare's network and balance sheet behind it. The community marketplace of 50,000+ models remains a unique moat, especially for image and video generation.
Recommended for: developers shipping AI features fast without becoming infrastructure experts, generative media products, and teams building on Cloudflare's developer platform.
Not recommended for: teams needing deep infrastructure control, custom optimization, or the lowest possible per-token LLM costs — Baseten, Fireworks, Groq, or Together AI fit better.
Outlook: Positive. The Cloudflare acquisition resolves standalone viability questions and points toward edge-native inference — run a model from a Worker, store results in R2 or Vectorize, manage agent state in Durable Objects.[7] Watch whether the promised brand and API independence holds as integration deepens.
Sources
- [1] Replicate Website
- [2] Cloudflare to Acquire Replicate (Press Release, Nov 17, 2025)
- [3] Cloudflare acquires AI deployment startup Replicate (SiliconANGLE)
- [4] Replicate is joining Cloudflare (Replicate Blog, Nov 17, 2025)
- [5] Replicate Documentation
- [6] Replicate Pricing
- [7] Why Replicate is joining Cloudflare (Cloudflare Blog)
- [8] Replicate is joining Cloudflare — Hacker News discussion