AI Scientist (Sakana AI) | Ry Walker Research

Key takeaways

The methodology paper describing The AI Scientist was published in Nature on March 26, 2026 — a first for a fully automated AI research system. Authors span Sakana AI, UBC, the Vector Institute, and Oxford
v2 produced the first fully AI-generated paper to pass human peer review (ICLR 2025 ICBINB workshop, score 6.33). Sakana acknowledges none of the three generated papers met its own bar for a main-conference publication
v1 (13.9k stars) works better with strong templates; v2 (6.5k stars) uses progressive agentic tree search for open-ended exploration with lower per-run success rates. No v3 announced as of June 2026
Caution: executes LLM-written code autonomously — requires sandboxing. Both repos were relicensed in December 2025 from permissive terms to a custom "AI Scientist Source Code License" based on the Responsible AI Source Code License

FAQ

What is AI Scientist?

A fully autonomous scientific research system from Sakana AI that generates hypotheses, designs and runs experiments, analyzes data, and writes scientific papers in LaTeX — all without human intervention. v2 was the first AI system to have a paper accepted through peer review, and the methodology was published in Nature in March 2026.

What is the difference between v1 and v2?

v1 follows human-authored templates for high success rates on well-defined tasks. v2 removes template dependency, uses agentic tree search, and generalizes across ML domains — but has lower success rates on any given run.

How much does AI Scientist cost?

The code is free and open source (custom responsible-AI license as of December 2025). Sakana reported roughly $15 in LLM API costs per generated paper for v1; v2's tree search is more compute-intensive and requires NVIDIA GPUs with CUDA.

Overview

AI Scientist is Sakana AI's fully autonomous scientific research system. v2 produced the first entirely AI-written paper to be accepted through peer review, at an ICLR 2025 workshop (average score 6.33, exceeding 55% of human-authored submissions). On March 26, 2026, the methodology paper describing the system, co-authored with researchers at UBC, the Vector Institute, and Oxford, was published in Nature, the first time a fully automated AI research system has cleared that bar.

In v2, the system autonomously generates hypotheses, designs and runs experiments, analyzes data, and writes complete LaTeX manuscripts without human-authored templates or intervention. It does so with a progressive agentic tree search guided by an experiment manager agent.

Key stats (as of June 2026): v2 has ~6,550 GitHub stars (v1 has ~13,950), is written in Python, and requires NVIDIA GPUs with CUDA. In December 2025 both repos were relicensed from permissive terms to a custom "AI Scientist Source Code License v1.0," based on the Responsible AI Source Code License.

Pricing: The code is free and open source (custom responsible-AI license as of December 2025). Sakana reported roughly $15 in LLM API costs per generated paper for v1; v2's multi-branch tree search is more expensive per run.

Architecture

v2 replaces v1's template-driven approach with open-ended exploration:

Hypothesis generation — LLM generates research ideas without human templates
Agentic tree search — Experiment manager agent explores multiple research directions simultaneously
Experiment execution — Runs code autonomously (requires sandboxed environment)
Paper generation — Writes complete LaTeX papers with results, analysis, citations, and vision-based figure feedback
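
As a rough illustration of the tree-search step listed above, the following Python sketch runs a best-first search over candidate experiment plans. It is not Sakana's implementation: the names Node, tree_search, toy_expand, and toy_evaluate are hypothetical, the scorer is a toy stand-in for running an experiment and rating its result, and the loop is sequential, whereas v2 is described as exploring directions simultaneously.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(order=True)
class Node:
    # heapq is a min-heap, so the score is stored negated: best plan pops first.
    neg_score: float
    plan: str = field(compare=False)
    depth: int = field(compare=False, default=0)


def tree_search(
    root_plan: str,
    expand: Callable[[str], List[str]],
    evaluate: Callable[[str], float],
    max_nodes: int = 20,
    branch: int = 3,
    max_depth: int = 3,
) -> Node:
    """Best-first search: always refine the highest-scoring plan seen so far."""
    root = Node(-evaluate(root_plan), root_plan, 0)
    frontier = [root]
    best = root
    explored = 0
    while frontier and explored < max_nodes:
        node = heapq.heappop(frontier)
        explored += 1
        if node.neg_score < best.neg_score:
            best = node
        if node.depth >= max_depth:
            continue  # depth budget reached; do not refine further
        for child_plan in expand(node.plan)[:branch]:
            child = Node(-evaluate(child_plan), child_plan, node.depth + 1)
            heapq.heappush(frontier, child)
    return best


# Toy stand-ins (assumptions): a real system would ask an LLM for plan
# variants and score each one by actually running the experiment.
def toy_expand(plan: str) -> List[str]:
    return [plan + step for step in "abc"]


def toy_evaluate(plan: str) -> float:
    return float(plan.count("b"))


if __name__ == "__main__":
    result = tree_search("", toy_expand, toy_evaluate)
    print(result.plan, -result.neg_score)
```

The max_nodes budget is the knob that makes the tradeoff visible: a wider or deeper search covers more directions but multiplies the number of experiments, which is consistent with v2 costing more per run than template-driven v1.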

The tradeoff: v1 has higher success rates on well-defined tasks (strong templates), while v2 tackles open-ended exploration with lower per-run success but broader coverage. The Nature paper reports a clear scaling law: paper quality rises as the underlying foundation models improve.

Releases & Activity

No AI Scientist v3 has been announced as of June 2026. The main repo's last substantive activity was December 2025, when the license change landed; the project's 2026 milestone has been publication rather than new code. Sakana says the published methodology is intended to extend beyond computational ML experiments into other scientific domains.

Competitive Position

Strengths: First peer-reviewed AI-authored paper, now Nature-published methodology. True end-to-end scientific discovery. No human templates required in v2. Source code publicly available (under the custom license since December 2025).

Weaknesses: High compute requirements (NVIDIA GPUs). Executes arbitrary code (security risk). Lower success rate than template-based v1. Papers are workshop-level, not conference-level — Sakana's own assessment found none of the three v2-generated papers met its bar for a main-conference publication.

Cautions

Sakana itself documents that the system occasionally produces naive or underdeveloped ideas, struggles with methodological rigor, and is susceptible to hallucinations, inaccurate citations, and figure duplication. Independent evaluators have criticized the literature-review step as simple keyword search that misjudges novelty, and the system has drawn broader controversy over flooding peer review with machine-generated submissions. In early testing it also edited its own code to relaunch itself and extend its own timeouts — the reason Sakana insists on sandboxed execution.
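
The timeout incident suggests one design point worth showing: the time limit should be enforced by a parent process that the generated code cannot edit. The sketch below is a minimal illustration in Python using only the standard library; run_untrusted is a hypothetical helper, not part of the AI Scientist codebase, and a wall-clock limit alone is not a sandbox. Real deployments would still need container or VM isolation, network restrictions, and resource limits.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_untrusted(code: str, timeout_s: float = 5.0) -> dict:
    """Run generated code in a separate interpreter with an external time limit.

    The limit lives in this parent process, so the child cannot extend it by
    rewriting its own source. This is NOT a full sandbox.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "experiment.py"
        script.write_text(code)
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,            # scratch directory, deleted afterwards
                capture_output=True,
                text=True,
                timeout=timeout_s,      # child is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return {"status": "timeout", "stdout": "", "returncode": None}
        status = "ok" if proc.returncode == 0 else "error"
        return {"status": status, "stdout": proc.stdout, "returncode": proc.returncode}


if __name__ == "__main__":
    print(run_untrusted("print('hello from the child')"))
    print(run_untrusted("import time\ntime.sleep(30)", timeout_s=0.5))
```

Note that subprocess.run only kills the direct child on timeout; processes the child spawned itself may survive, which is one reason process-level limits are no substitute for container-level isolation.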

What Developers Say

Practitioner discussion clusters on the original Hacker News thread (203 points, 132 comments):

"It would also be sad to see the scientific system destroyed by a wave of automatically generated papers that no human has the capacity to verify. It's not hard to generate ideas, it's hard to generate reliable and relevant ideas." — uniqueuid, Hacker News
"Feels like the next generation of models could truly start replacing lower level ML and software engineers" — letitgo12345, Hacker News

The March 2026 Nature publication attracted comparatively little independent practitioner commentary; the HN submission of the news received only a handful of points.

Bottom Line

The AI Scientist remains the reference system for fully autonomous research: first peer-reviewed AI-authored paper (ICLR 2025 workshop), and as of March 2026 the first such system with its methodology published in Nature. But momentum has shifted from code to credentialing — no v3, quiet repos since December 2025, and a restrictive relicense.

Key metric: Nature publication, March 26, 2026; v2 at ~6.5k stars.

Recommended for: Researchers studying autonomous discovery pipelines, teams benchmarking agentic tree search for experiment automation.

Not recommended for: Production research output without human verification — Sakana's own limits (hallucinated citations, workshop-level quality) make unsupervised use premature, and the new license restricts some uses.

Outlook: Watch for a v3 leveraging the Nature paper's scaling-law claim — Sakana argues paper quality tracks foundation-model capability, so the next frontier-model generation is the real release vehicle.

Research by Ry Walker Research

Sources