Kosmos (Edison Scientific) | Ry Walker Research

Key takeaways

A single Kosmos run of up to 12 hours executes ~200 agent rollouts, writes an average of ~42,000 lines of analysis code, and reads ~1,500 full-text papers; users estimate this equals roughly six months of a human researcher's effort
Every statement in a Kosmos report is cited to either its own code or primary literature; independent scientists rated 79.4% of statements accurate, with synthesis claims (58%) far weaker than data analysis (85.5%)
Edison Scientific is a FutureHouse spinout co-led by Sam Rodriques and Andrew White and backed by a $70M seed; it offers a generous free academic tier, charges power users $200 per run, and has a strategic collaboration embedding Kosmos across Incyte's R&D

FAQ

What is Kosmos?

Kosmos is an autonomous "AI Scientist" from Edison Scientific that, given a goal and datasets, iterates across data analysis, literature search, and hypothesis generation over a run of up to 12 hours and returns a fully cited research report.

How much does Kosmos cost?

Edison maintains a generous free tier for academic researchers and charges power users and enterprise clients $200 per research run.

How does Kosmos work?

Given a goal and one or more datasets, Kosmos runs roughly 20 cycles of data analysis, literature search, and hypothesis generation — about 200 agent rollouts over up to 12 hours — citing every statement to code or a primary source.

How is Kosmos different from Sakana's AI Scientist?

Kosmos is a closed commercial platform focused on data-driven discovery across biology, chemistry, and materials with embedded enterprise pharma deployments, whereas Sakana's AI Scientist is an open-source system focused on autonomously writing and publishing ML papers.

Executive Summary

Kosmos is Edison Scientific's autonomous "AI Scientist": given a research goal and one or more datasets, it iterates across data analysis, literature search, and hypothesis generation, then returns a fully cited research report.^[1]^[2] A single run lasts up to 12 hours, fires roughly 200 agent rollouts across about 20 cycles, writes an average of ~42,000 lines of analysis code, and reads ~1,500 full-text papers — output users estimate equals roughly six months of a human researcher's effort.^[2]^[3] Edison positions the system as able to reason over 175 million full-text papers, clinical trials, and patents, and to run hundreds of research tasks in parallel.^[1]

Edison Scientific, co-led by Sam Rodriques (CEO) and Andrew White, is a for-profit spinout of FutureHouse, the Eric Schmidt-funded nonprofit AI-for-science lab; a portion of the FutureHouse team moved to Edison while the nonprofit continues foundational research.^[4]^[5] The company raised a $70M seed and reported that roughly 30,000 academic and biotech users had tried Kosmos, with interest from 6 of the top 10 pharma companies.^[6]^[5] In May 2026 it announced a strategic collaboration to embed Kosmos across Incyte's discovery and development lifecycle.^[7]

Attribute	Value
Company	Edison Scientific (FutureHouse spinout)^[4]
Leadership	Sam Rodriques (CEO), Andrew White^[4]^[5]
Funding	$70M seed^[6]
Paper	"Kosmos: An AI Scientist for Autonomous Discovery," Mitchener et al. (37 authors), arXiv:2511.02824^[2]
Pricing	Free academic tier; $200 per research run for power/enterprise users^[1]^[6]

Product Overview

Kosmos takes a stated goal plus one or more datasets and runs an autonomous discovery loop that interleaves three activities — computational data analysis, literature search, and hypothesis generation — over roughly 20 cycles per run.^[2] It operates interactively, sending updates mid-run so a researcher can follow along "like a colleague," and returns a structured, fully cited report at the end.^[1] Edison reports seven discoveries to date released with academic beta testers: three reproduced previously unpublished findings, and four are described as net-new contributions to the literature, spanning metabolomics, materials science, neuroscience, and statistical genetics.^[5]^[2]

Key Capabilities

Capability	Description
12-hour discovery runs	~20 cycles, ~200 agent rollouts per run^[2]
Code-driven analysis	Average ~42,000 lines of analysis code generated per run^[2]
Literature reasoning	~1,500 full-text papers read per run; corpus of 175M+ papers, trials, patents^[2]^[1]
Traceable citations	Every statement cited to its own code or primary literature^[2]
Parallel agents	Hundreds of research tasks executed in parallel^[1]

Technical Architecture

Kosmos is a closed, hosted platform delivered as research runs rather than installable software.^[1] The published system orchestrates an "AI Scientist" agent that alternates data-analysis and literature-search rollouts, accumulating findings across cycles into a single report; the authors emphasize that "Kosmos cites all statements in its reports with code or primary literature, ensuring its reasoning is traceable."^[2] The underlying foundation models and full agent stack are not disclosed in the public materials.^[2]^[1]
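Because the agent stack is undisclosed, the loop described above can only be illustrated, not reproduced. The sketch below is a minimal toy, assuming an even 1:1 alternation of data-analysis and literature-search rollouts and 10 rollouts per cycle (derived from ~200 rollouts over ~20 cycles). Every name in it (`Finding`, `Report`, `run_discovery`, the stub rollouts) is hypothetical and is not Edison's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical names throughout; Kosmos's real internals are not public.


@dataclass(frozen=True)
class Finding:
    statement: str
    source_kind: str  # "code" or "literature"
    source_ref: str   # e.g. a notebook cell id or a paper identifier


@dataclass
class Report:
    goal: str
    findings: List[Finding] = field(default_factory=list)

    def add(self, finding: Finding) -> None:
        # Traceability rule: reject any statement that lacks a source.
        if finding.source_kind not in ("code", "literature") or not finding.source_ref:
            raise ValueError(f"uncited statement rejected: {finding.statement!r}")
        self.findings.append(finding)


# A rollout sees the goal and the findings so far, and returns one new finding.
Rollout = Callable[[str, List[Finding]], Finding]


def run_discovery(goal: str, analyze: Rollout, search: Rollout,
                  cycles: int = 20, rollouts_per_cycle: int = 10) -> Report:
    report = Report(goal)
    for _ in range(cycles):
        for i in range(rollouts_per_cycle):
            # Assumed alternation: even slots analyze data, odd slots search literature.
            rollout = analyze if i % 2 == 0 else search
            report.add(rollout(goal, list(report.findings)))
    return report


def stub_analyze(goal: str, prior: List[Finding]) -> Finding:
    return Finding(f"analysis result {len(prior)}", "code", f"cell-{len(prior)}")


def stub_search(goal: str, prior: List[Finding]) -> Finding:
    return Finding(f"literature result {len(prior)}", "literature", f"paper-{len(prior)}")


report = run_discovery("example goal", stub_analyze, stub_search)
print(len(report.findings))  # 200 (20 cycles x 10 rollouts)
```

The design point the sketch tries to capture is that citation is enforced at the moment a statement enters the report, which is what makes errors findable later; how Kosmos actually enforces it is not described in the public materials.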

Key Technical Details

Aspect	Detail
Deployment	Hosted SaaS; runs purchased per execution; no self-hosting disclosed^[1]
Run profile	Up to 12 hours, ~200 agent rollouts, ~20 cycles^[2]
Verification	Independent scientists rated 79.4% of statements accurate^[2]
Models	Not publicly disclosed^[2]
Open Source	No — proprietary commercial platform^[1]
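The headline run figures can be turned into rough per-cycle rates. The arithmetic below is only a back-of-the-envelope sketch: it assumes work is spread evenly across cycles and that a run uses the full 12 hours, neither of which the sources state, and the function name `per_cycle_rates` is made up for illustration.

```python
# Approximate per-run figures as reported in this profile.
MAX_HOURS = 12
CYCLES = 20
ROLLOUTS = 200
CODE_LINES = 42_000
PAPERS_READ = 1_500
PRICE_PER_RUN_USD = 200  # list price for power/enterprise runs


def per_cycle_rates() -> dict:
    """Derive rough averages, assuming an even spread across cycles."""
    return {
        "rollouts_per_cycle": ROLLOUTS / CYCLES,
        "minutes_per_cycle": MAX_HOURS * 60 / CYCLES,
        "code_lines_per_cycle": CODE_LINES / CYCLES,
        "papers_per_cycle": PAPERS_READ / CYCLES,
        "usd_per_rollout": PRICE_PER_RUN_USD / ROLLOUTS,
    }


rates = per_cycle_rates()
for name, value in rates.items():
    print(f"{name}: {value}")
# rollouts_per_cycle: 10.0
# minutes_per_cycle: 36.0
# code_lines_per_cycle: 2100.0
# papers_per_cycle: 75.0
# usd_per_rollout: 1.0
```

These are averages implied by the published totals, not measured values; real cycles presumably vary in length and workload.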

Strengths

Traceability is built in, not bolted on — every statement in a Kosmos report links to either the code that produced it or a primary source, making the reasoning auditable rather than a black-box summary.^[2]
Genuine data-to-discovery scope — unlike paper-writing systems, Kosmos analyzes user datasets directly, and its strongest verified band is exactly that: 85.5% of data-analysis statements were rated supported.^[5]^[2]
Credible scientific lineage — Rodriques's bench science background plus the FutureHouse research record (the first AI agent to beat humans at real-world literature search) lends credibility in a field crowded with overclaiming.^[4]^[5]
Enterprise validation — a strategic collaboration embedding Kosmos across Incyte's discovery and development lifecycle is a concrete biopharma deployment, not a logo-on-a-slide pilot.^[7]
Real reproduction results — three of seven reported discoveries independently reproduced previously unpublished findings, a meaningful validity signal beyond novelty claims.^[5]

Cautions

One in five conclusions is wrong — independent reviewers rated only 79.4% of statements accurate, and synthesis claims fell to 58% supported, the weakest and most consequential category.^[2]^[5]
Speed claims may be overstated in practice — the "six months in a day" framing is a user estimate; critics note verification time can offset much of the apparent gain.^[3]^[5]
Closed and undisclosed — neither the foundation models nor the agent stack are public, so buyers cannot audit the system itself, only its citations.^[2]
"AI scientist" skepticism is the headwind — even Edison's CEO publicly cautions that AI "probably won't cure diseases anytime soon," and named domain scientists call the system too error-prone for unsupervised work.^[8]^[5]
Literature-pollution risk — researchers worry tools like Kosmos accelerate an "exponential increase of papers" that is not necessarily meaningful.^[5]

What Scientists Say

Reaction clusters in an Alzforum feature surveying named researchers who beta-tested or reviewed Kosmos, mixing enthusiasm with pointed skepticism.^[5]

"One in five conclusions is still wrong." — Georg Meisl, University of Cambridge, who advises treating Kosmos "more like getting an opinion from a well-read colleague than thorough analysis"^[5]

Betty Tijms of Amsterdam UMC worries about a "future where we don't read the papers we cite … or write papers of our own," and about an "exponential increase of papers" that is "not necessarily meaningful."^[5]

"It is hard not to be impressed with the capabilities of AI." — Lary Walker, Emory University, who recommends a "use, but verify" principle^[5]

Mathieu Bourdenx of University College London now runs most of his datasets through Kosmos "not only to make new insights, but also to validate or replicate his own findings."^[5]

"AI tools will become an integral part of the workflow for most labs in the near future." — Jason Moore, Cedars-Sinai^[5]

CEO Sam Rodriques himself frames Kosmos modestly — "people should think of it as a research tool," likening it to "a humble DNA cloning kit."^[5]

Pricing & Licensing

Tier	Price	Includes
Academic / Free	$0	Generous free tier for academic researchers^[1]^[6]
Power / Enterprise	$200 per run	Higher rate limits, additional features; per-run billing^[6]
Strategic collaboration	Custom	Embedded deployment across an organization's R&D (e.g., Incyte)^[7]

Licensing model: Proprietary, hosted SaaS sold per research run; no open-source or self-hosted option disclosed.^[1]

Hidden costs: The dominant cost is human verification — with ~20% of statements inaccurate and synthesis at 58% supported, every report needs expert review before use.^[2]^[5]
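To make the verification burden concrete, the reported accuracy rates can be converted into an expected count of unsupported statements. This is a rough sketch: the 100-statement report size is a hypothetical chosen for easy reading, the function name is invented, and the sources do not give the mix of statement types in a typical report, so the categories are shown separately rather than combined.

```python
# Accuracy rates as reported in this profile (share of statements rated accurate/supported).
REPORTED_ACCURACY = {
    "overall": 0.794,
    "data_analysis": 0.855,
    "synthesis": 0.58,
}


def expected_unsupported(n_statements: int, accuracy: float) -> float:
    """Expected number of statements a reviewer would find unsupported."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return n_statements * (1.0 - accuracy)


N = 100  # hypothetical report size, not a figure from the sources
for category, accuracy in REPORTED_ACCURACY.items():
    print(category, round(expected_unsupported(N, accuracy), 1))
# overall 20.6
# data_analysis 14.5
# synthesis 42.0
```

Under these assumptions a reviewer cannot know in advance which statements are the unsupported ones, so the practical workload is checking all of them, with synthesis claims deserving the closest attention.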

Competitive Positioning

Direct Competitors

Competitor	Differentiation
AI Scientist	Sakana's open-source system autonomously writes and publishes ML papers; Kosmos is closed, data-driven, and aimed at biology/chemistry discovery with enterprise pharma deployments
Deep Research	Open deep-research agents synthesize web sources into reports; Kosmos additionally analyzes user datasets with generated code and targets novel scientific discovery
Google DeepMind co-scientist	A frontier-lab "AI co-scientist" research effort; Kosmos counters with a shipping commercial product, per-run pricing, and a named pharma deployment

When to Choose Kosmos Over Alternatives

Choose Kosmos when you have proprietary datasets and want autonomous, code-backed analysis plus literature synthesis returned as a cited report, and you have domain experts to verify it.
Choose AI Scientist when you want an open, auditable, self-hostable system for autonomous ML experimentation and paper generation.
Choose Deep Research when the task is web-knowledge synthesis rather than dataset-driven discovery and cost/openness matter most.

Ideal Customer Profile

Best fit:

Biopharma and biotech R&D teams with rich proprietary datasets and expert reviewers in the loop
Academic labs that can use the free tier to triage hypotheses and replicate prior findings
Organizations wanting an embedded, continuously learning discovery layer across the R&D lifecycle (the Incyte model)

Poor fit:

Teams needing self-hosting, model transparency, or an auditable open-source core
Workflows that cannot absorb expert verification of every conclusion
Anyone treating outputs as publication-ready without human validation

Viability Assessment

Factor	Assessment
Financial Health	Strong for stage — $70M seed and a clear per-run revenue model^[6]
Market Position	Front-runner among commercial "AI Scientist" products, with a public arXiv preprint and a named pharma deployment^[2]^[7]
Innovation Pace	High — FutureHouse lineage and rapid productization of agentic literature/data research^[4]
Community/Ecosystem	~30,000 users trialed Kosmos; sentiment is engaged but cautious among domain scientists^[5]
Long-term Outlook	Hinges on closing the accuracy gap — synthesis at 58% supported is the credibility bottleneck^[2]^[5]

The differentiator is trust engineering: by citing every statement to code or a primary source, Kosmos makes its errors findable rather than hidden, which is the only way "AI scientist" claims survive a skeptical field.^[2] The open question is whether enterprise customers like Incyte report durable productivity gains net of the verification tax that a ~20% error rate imposes.^[7]^[5]

Bottom Line

Kosmos is the most credible commercial "AI Scientist" to date: a 12-hour autonomous loop that writes ~42,000 lines of code, reads ~1,500 papers, cites every claim to code or literature, and has reproduced unpublished findings — backed by a $70M seed and embedded in Incyte's R&D.^[2]^[6]^[7] The catch is the same one that dogs the whole category: one in five conclusions is wrong, synthesis is the weakest link, and named scientists treat it as a well-read colleague to be verified, not an oracle.^[2]^[5]

Recommended for: Biopharma R&D and academic labs with proprietary data and expert reviewers who can exploit autonomous analysis while verifying every conclusion.

Not recommended for: Teams needing model transparency or self-hosting, or any workflow that would trust Kosmos output without human review.

Outlook: Watch whether the accuracy gap — especially 58% on synthesis — closes with model improvements, and whether the Incyte collaboration yields disclosed, durable productivity gains net of verification cost.^[2]^[7]

Research by Ry Walker Research

Sources