
Modal

Modal is a serverless Python cloud with elastic GPU scaling and AI agent sandboxes. It raised a $355M Series C at a $4.65B valuation in May 2026, with roughly $300M in annualized revenue.

Key takeaways

  • Raised a $355M Series C at a $4.65B valuation (May 2026); revenue grew ~5x to roughly $300M annualized
  • The only one of the seven sandbox providers natively integrated into the OpenAI Agents SDK that offers GPU acceleration; claims 1B+ sandboxes run and 100k+ concurrent
  • Python-first "programmable infrastructure" with sub-second cold starts and elastic scaling to hundreds of GPUs

FAQ

What is Modal?

Modal is a serverless Python cloud platform with elastic GPU scaling and AI agent sandboxes, used for inference, training, batch processing, and agent code execution with sub-second cold starts.

How much does Modal cost?

The Starter tier is free and includes $30/month in credits. Compute is billed per second: CPU at $0.0000131 per core-second, and GPUs from about $0.59/hr (T4) up to about $6.25/hr (B200). The Team tier is $250/month plus compute; Enterprise pricing is custom.

How much has Modal raised?

Modal has raised $465M+ total, including a $355M Series C at a $4.65B valuation in May 2026 led by General Catalyst and Redpoint Ventures.

Who competes with Modal?

For agent sandboxes: E2B, Daytona, Runloop, and Cloudflare Sandboxes. For GPU cloud: RunPod, Lambda Labs, and Beam.

Executive Summary

Modal is a serverless Python cloud platform that provides elastic GPU scaling with sub-second cold starts. Unlike sandbox-focused platforms, Modal is built for general AI/ML workloads including inference, training, and batch processing — making it the clear choice when GPU access is required. The agent boom has made Sandboxes a core product: Modal claims over 1 billion sandboxes run and 100k+ concurrent sessions, and in May 2026 it closed a $355M Series C at a $4.65B valuation as revenue grew roughly fivefold to ~$300M annualized.

Attribute | Value
Company | Modal Labs
Founded | 2021
Funding | $465M+ ($355M Series C at $4.65B valuation, May 2026)
Revenue | ~$300M annualized (May 2026), up from ~$60M in Sept 2025
Headquarters | New York, NY (also SF, Stockholm)

Product Overview

Modal was founded by Erik Bernhardsson (former CTO of Better, built Spotify's recommendation system) and Akshat Bubna. The platform's core innovation is "programmable infrastructure" — define compute requirements in Python decorators, and Modal handles the rest.
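
To make the "infrastructure in decorators" idea concrete, here is a toy sketch of the pattern in plain Python. It is not Modal's SDK: `ResourceSpec`, `REGISTRY`, and `function` are hypothetical names invented for illustration, and a real platform would send the recorded request to a control plane rather than keep it in a local dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ResourceSpec:
    """Hypothetical record of what a function asks the platform for."""
    gpu: Optional[str] = None   # e.g. "H100"; None means CPU-only
    cpu: float = 0.125          # requested cores
    memory_gib: float = 0.5     # requested memory


# Hypothetical in-process registry standing in for a platform control plane.
REGISTRY: Dict[str, ResourceSpec] = {}


def function(gpu: Optional[str] = None, cpu: float = 0.125,
             memory_gib: float = 0.5) -> Callable:
    """Decorator factory: record the resource request, return the function unchanged."""
    def decorate(fn: Callable) -> Callable:
        REGISTRY[fn.__name__] = ResourceSpec(gpu=gpu, cpu=cpu, memory_gib=memory_gib)
        return fn
    return decorate


@function(gpu="H100", cpu=2.0, memory_gib=16.0)
def embed(text: str) -> int:
    # Stand-in workload: count whitespace-separated tokens.
    return len(text.split())


@function()
def ping() -> str:
    # CPU-only function that relies on the default resource request.
    return "pong"
```

The point of the pattern is that the compute request lives next to the code that needs it, so there is no separate YAML or console configuration to keep in sync. Modal's actual decorator and its parameters are described in its own API reference.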

Modal has become the go-to platform for AI teams needing GPU access without the complexity of managing cloud infrastructure. Companies like Substack, Lovable, Ramp, and numerous AI startups rely on Modal for inference, training, and agent sandbox workloads — Lovable ran over 1 million sandboxes in 48 hours, peaking at 20,000 concurrent. Modal acquired sandbox-infrastructure startup Jamsocket in July 2025 and, as of April 2026, is one of seven hosted sandbox providers natively integrated into the OpenAI Agents SDK — and the only one offering GPU acceleration.
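
The Lovable figures can be turned into a rough throughput estimate. The sketch below is back-of-envelope arithmetic, not data published by Modal or Lovable: it assumes launches were spread over the full 48 hours and applies Little's law (average concurrency = arrival rate × mean lifetime), using the reported peak as an upper bound on average concurrency.

```python
# Figures quoted in the text above.
SANDBOXES = 1_000_000        # "over 1 million", so treated as a lower bound
WINDOW_SECONDS = 48 * 3600   # 48 hours
PEAK_CONCURRENT = 20_000

# Average launch rate over the window (sandboxes per second).
launch_rate = SANDBOXES / WINDOW_SECONDS

# Little's law: concurrency = rate * mean lifetime. Average concurrency cannot
# exceed the peak, so this is an upper bound on mean sandbox lifetime.
max_mean_lifetime_s = PEAK_CONCURRENT / launch_rate

print(f"average launch rate: {launch_rate:.2f} sandboxes/sec")
print(f"mean lifetime upper bound: {max_mean_lifetime_s / 60:.1f} minutes")
```

Under these assumptions the workload averages roughly six sandbox launches per second, and the typical sandbox lived for under an hour. Real traffic was almost certainly burstier than the average suggests.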

Key Capabilities

Capability | Description
Elastic GPU Scaling | A100, H100, and other GPUs with no quotas or reservations
Sub-Second Cold Starts | Containers launch instantly; no waiting for GPU allocation
Python-First | Define infrastructure as Python decorators; no YAML
Distributed Storage | Volumes mount across sandbox runs; durable filesystem survives restarts
Auto-Scaling | Scale from 0 to thousands of containers; 100k+ concurrent sandboxes
Sandboxes & Notebooks | Agent code execution with snapshotting; Notebooks with GPU memory snapshots (10x faster startup)

Product Surfaces / Editions

Surface | Description | Availability
Modal Functions | Serverless Python functions with GPU | GA
Modal Sandboxes | Isolated agent code execution; granular snapshotting; OpenAI Agents SDK native integration | GA
Modal Notebooks | Jupyter-style notebooks with GPU | GA
Web Endpoints | HTTP endpoints for inference APIs | GA
Scheduled Jobs | Cron-style scheduled execution | GA

Technical Architecture

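Modal isolates containers with gVisor, a user-space kernel that intercepts system calls before they reach the host kernel. This gives stronger isolation than a standard shared-kernel container, typically at a modest performance cost for syscall-heavy workloads. The platform is built for fast autoscaling, launching containers in under a second rather than minutes.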

┌─────────────────────────────────────────┐
│           Modal Platform                │
├─────────────────────────────────────────┤
│  ┌───────────┐ ┌───────────┐            │
│  │  Function │ │  Function │    ...     │
│  │  (gVisor) │ │  (gVisor) │            │
│  └─────┬─────┘ └─────┬─────┘            │
│        │             │                  │
│  ┌─────┴─────────────┴─────┐            │
│  │    Distributed FS       │            │
│  │    (Model Loading)      │            │
│  └─────────────────────────┘            │
│                                         │
│  ┌─────────────────────────────────┐    │
│  │   Multi-Cloud GPU Pool          │    │
│  │   (A100, H100, no quotas)       │    │
│  └─────────────────────────────────┘    │
└─────────────────────────────────────────┘

Key Technical Details

Aspect | Detail
Isolation | gVisor (user-space kernel)
Cold Start | Sub-second
GPU Support | B200, H200, H100, A100, L40S, L4, T4
Languages | Python-centric (TypeScript/JavaScript and Go SDKs for Sandboxes)
Open Source | No (proprietary platform)
Self-Hosting | No

Strengths

  • GPU access — Only sandbox-adjacent platform with elastic GPU scaling; no quotas or reservations
  • Developer experience — Praised as "how Python apps should deploy"; Vercel-like simplicity for AI
  • Fast cold starts — Sub-second container launches; tight feedback loops for development
  • Python-first — Infrastructure as code with decorators; no YAML configuration files
  • Auto-scaling — Scale from 0 to hundreds of GPUs automatically based on load
  • Well-funded — $465M+ raised; $355M Series C at $4.65B valuation (May 2026) with ~$300M annualized revenue
  • Sandbox scale proven — 1B+ sandboxes run; Lovable ran 1M sandboxes in 48 hours; only GPU-accelerated OpenAI Agents SDK provider
  • Great documentation — Extensive examples and tutorials for common AI use cases

Cautions

  • Python-centric — JavaScript and Go SDKs exist for Sandboxes, but the platform remains Python-first; not ideal for polyglot teams
  • Pricing premium — More expensive than Lambda Labs or Voltage Park for raw GPU hours; "costs can be higher under heavy usage compared to some alternatives"
  • No self-hosting — Fully managed; cannot deploy on-premises or in your own VPC
  • Spot instance issues — Reports of frequent preemption on spot instances for smaller workloads
  • Proprietary — Not open source; vendor lock-in risk for critical infrastructure
  • Sandboxes face new competition — Cloudflare Sandboxes hit GA in April 2026 and OpenAI's Agents SDK lists seven providers; sandbox features alone no longer differentiate

What Developers Say

"Modal is great, it's been able to handle us chunking 10k files/second. The developer experience is also great, so we highly recommend it." — williamzeng0, Hacker News

"They give us GPUs on-demand, which is critical... Pricing is about $2/hr per GPU. Long story short, things get VERY expensive quickly." — sid-the-kid, Hacker News

"I know of modal.com, which I believe is used by Codegen and Cognition." — sudb, Hacker News (May 2025)


Pricing & Licensing

Tier | Price | Includes
Starter | $0 + compute | $30/month free credits, 3 seats, 100 containers + 10 GPU concurrency
Team | $250/mo + compute | $100/month credits, unlimited seats, 1000 containers + 50 GPU concurrency
Enterprise | Custom | Volume discounts, higher GPU concurrency
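
The tier limits above can be read as a simple lookup. The helper below is a hypothetical sketch (`Tier`, `TIERS`, and `smallest_tier` are illustrative names, not part of any Modal tooling) that returns the lowest tier whose seat, container, and GPU concurrency limits cover a given requirement. Because the table gives no numeric limits for Enterprise, the sketch treats it as the fallback.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tier:
    name: str
    max_seats: Optional[int]   # None means unlimited seats
    max_containers: int
    max_gpus: int              # concurrent GPUs


# Limits copied from the tier table; listed from cheapest to most expensive.
TIERS: List[Tier] = [
    Tier("Starter", max_seats=3, max_containers=100, max_gpus=10),
    Tier("Team", max_seats=None, max_containers=1000, max_gpus=50),
]


def smallest_tier(seats: int, containers: int, gpus: int) -> str:
    """Return the first tier whose limits cover the request, else Enterprise."""
    for tier in TIERS:
        seats_ok = tier.max_seats is None or seats <= tier.max_seats
        if seats_ok and containers <= tier.max_containers and gpus <= tier.max_gpus:
            return tier.name
    return "Enterprise"


print(smallest_tier(seats=2, containers=50, gpus=4))
print(smallest_tier(seats=8, containers=300, gpus=20))
print(smallest_tier(seats=8, containers=300, gpus=80))
```

Seat count alone can force the move from Starter to Team, even for a team whose compute footprint fits comfortably inside the Starter limits.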

Compute costs (per second, as of June 2026):

Resource | Cost
CPU (physical core) | $0.0000131/core/sec (min 0.125 cores)
Memory | $0.00000222/GiB/sec
GPU (T4) | $0.000164/sec (~$0.59/hr)
GPU (A100 80GB) | $0.000694/sec (~$2.50/hr)
GPU (H100) | $0.001097/sec (~$3.95/hr)
GPU (B200) | $0.001736/sec (~$6.25/hr)
Volumes | $0.09/GiB/month (1 TiB/month free)
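
As a rough guide to what these per-second rates mean for a single job, the sketch below adds up CPU, memory, and GPU charges. It rests on two assumptions that the table does not confirm: that the three charges are additive for a GPU container, and that "min 0.125 cores" acts as a billing floor. `RATES_PER_SEC` and `estimate_cost` are hypothetical names, and Volume storage is left out.

```python
from typing import Optional

# Per-second rates in USD, copied from the table above.
RATES_PER_SEC = {
    "cpu_core": 0.0000131,
    "memory_gib": 0.00000222,
    "gpu_t4": 0.000164,
    "gpu_a100_80gb": 0.000694,
    "gpu_h100": 0.001097,
    "gpu_b200": 0.001736,
}
MIN_CPU_CORES = 0.125  # assumed to be a billing floor


def estimate_cost(seconds: float, cpu_cores: float, memory_gib: float,
                  gpu: Optional[str] = None, gpu_count: int = 1) -> float:
    """Estimate the USD cost of one container, assuming additive charges."""
    billed_cores = max(cpu_cores, MIN_CPU_CORES)
    per_sec = (billed_cores * RATES_PER_SEC["cpu_core"]
               + memory_gib * RATES_PER_SEC["memory_gib"])
    if gpu is not None:
        per_sec += gpu_count * RATES_PER_SEC[f"gpu_{gpu}"]
    return seconds * per_sec


# One hour on a single H100 with 4 cores and 32 GiB of memory.
print(f"${estimate_cost(3600, cpu_cores=4, memory_gib=32, gpu='h100'):.2f}")
```

Under these assumptions, CPU and memory add roughly 11% on top of the headline H100 hourly rate for this configuration, which is worth including when comparing against providers that quote a single all-in GPU price.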

Licensing model: Proprietary, usage-based pricing

Hidden costs: GPU costs add up quickly for training; spot instances can be unreliable


Competitive Positioning

Direct Competitors

Competitor | Differentiation
E2B | E2B is sandbox-focused with Firecracker; Modal is general compute with GPUs
Daytona | Daytona has Computer Use and open source; Modal has GPUs and broader compute
Runloop | Runloop focuses on agent development; Modal is for general AI/ML workloads
Cloudflare Sandboxes | GA April 2026 with persistent isolated environments; Modal has GPUs and Python-native DX
RunPod | RunPod has cheaper raw GPU; Modal has better DX and auto-scaling
Lambda Labs | Lambda has cheaper reserved GPUs; Modal has serverless scaling

When to Choose Modal Over Alternatives

  • Choose Modal when: You need GPU access, value developer experience, or want serverless Python deployment
  • Choose E2B when: You need dedicated AI agent sandboxes with Firecracker isolation
  • Choose RunPod when: You need the cheapest possible GPU access and can manage infrastructure
  • Choose Lambda Labs when: You have predictable GPU needs and want reserved capacity

Ideal Customer Profile

Best fit:

  • AI teams needing GPU access without infrastructure management
  • Python developers wanting Vercel-like deployment experience
  • Inference workloads with spiky or unpredictable traffic
  • Teams running ML training, fine-tuning, or batch processing
  • Startups and scale-ups valuing developer velocity over cost optimization

Poor fit:

  • Teams needing only lightweight, CPU-only agent sandboxes (E2B or Daytona may be cheaper fits)
  • Cost-sensitive workloads where GPU pricing is critical
  • Organizations requiring on-premises or self-hosted deployment
  • Polyglot teams needing first-class non-Python support beyond the JavaScript and Go Sandbox SDKs

Viability Assessment

Factor | Assessment
Financial Health | Very strong: $465M+ raised at $4.65B valuation; ~$300M annualized revenue
Market Position | Leader: Dominant in serverless GPU compute; top-tier agent sandbox provider
Innovation Pace | Rapid: Regular releases, expanding capabilities
Community/Ecosystem | Active: Strong developer advocacy, extensive docs
Long-term Outlook | Positive: Well-positioned for AI infrastructure growth

Modal has established itself as the "Vercel for AI" with strong developer experience and elastic GPU access. Its May 2026 Series C — $355M at $4.65B, quadrupling its valuation, with revenue up ~5x to ~$300M annualized — confirms it is the breakout company in this category. Main competitive pressure comes from cheaper GPU providers and major cloud vendors entering the sandbox space (Cloudflare Sandboxes reached GA in April 2026).


Bottom Line

Modal is the clear choice when you need GPU access with serverless simplicity. The Python-first developer experience, sub-second cold starts, and elastic GPU scaling make it the go-to platform for AI teams who don't want to manage infrastructure.

The trade-off is cost (premium over raw GPU providers) and lock-in (proprietary, no self-hosting). The sandbox story has strengthened considerably since early 2026: 1B+ sandboxes run, native OpenAI Agents SDK integration, and proof points like Lovable and Ramp make Modal a serious sandbox contender, not just a GPU cloud. For lightweight CPU-only agent sandboxes, E2B or Daytona may still be cheaper fits.

Recommended for: AI teams needing GPU access with serverless simplicity, inference APIs, training jobs, RL rollouts, Python-based batch processing, or GPU-accelerated agent sandboxes at scale.

Not recommended for: Cost-sensitive GPU workloads, polyglot teams outside Python, or organizations requiring self-hosted infrastructure.

Outlook: With $355M in fresh Series C capital and ~$300M annualized revenue, expect aggressive expansion in agent infrastructure — Sandboxes, Notebooks with memory snapshots, and enterprise compliance — while fending off Cloudflare and hyperscaler entrants.


Research by Ry Walker Research • methodology