AI Scientist

AI Scientist v2 is the first fully autonomous scientific discovery system to have a paper accepted through peer review (at an ICLR 2025 workshop). Its agentic tree search generates hypotheses, runs experiments, and writes LaTeX papers. 2.3k stars (v2), MIT license.

Key takeaways

  • First AI system to have a paper written entirely by AI accepted through peer review (ICLR 2025 workshop). End-to-end: hypothesis generation, experiment design, execution, data analysis, and LaTeX paper writing
  • v2 uses progressive agentic tree search guided by an experiment manager agent, removing reliance on human-authored templates. Generalizes across ML domains
  • v1 (12.3k stars) works better with strong templates; v2 (2.3k stars) is designed for open-ended scientific exploration, with lower per-run success rates but broader coverage
  • Caution: executes LLM-written code autonomously, so it requires a sandboxed environment. Uses NVIDIA GPUs with CUDA/PyTorch

FAQ

What is AI Scientist?

A fully autonomous scientific research system from Sakana AI that generates hypotheses, designs and runs experiments, analyzes data, and writes scientific papers in LaTeX — all without human intervention. v2 was the first AI system to have a paper accepted through peer review.

What is the difference between v1 and v2?

v1 follows human-authored templates for high success rates on well-defined tasks. v2 removes template dependency, uses agentic tree search, and generalizes across ML domains — but has lower success rates on any given run.

Overview

AI Scientist is Sakana AI's fully autonomous scientific research system. v2 made history as the first AI system to have a paper written entirely by AI accepted through peer review at an ICLR 2025 workshop.

The system autonomously generates hypotheses, designs experiments, runs them, analyzes data, and writes complete LaTeX manuscripts — all without human templates or intervention. v2 uses a progressive agentic tree search guided by an experiment manager agent.

Key stats: v2 has 2,277 GitHub stars (v1 has 12,330), an MIT-equivalent license, and is written in Python. It requires NVIDIA GPUs with CUDA.


Architecture

v2 replaces v1's template-driven approach with open-ended exploration:

  • Hypothesis generation — LLM generates research ideas without human templates
  • Agentic tree search — Experiment manager agent explores multiple research directions simultaneously
  • Experiment execution — Runs code autonomously (requires sandboxed environment)
  • Paper generation — Writes complete LaTeX papers with results, analysis, and citations
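
The agentic tree search step can be pictured as a best-first search over candidate experiments. The sketch below is an illustration under stated assumptions, not Sakana AI's implementation: `Node`, `best_first_search`, `toy_propose`, and `toy_evaluate` are hypothetical names, the "experiment" is a single number, and the scoring function is a toy stand-in for running code and judging the result. In the real system an LLM-driven agent would play the role of `propose` and `evaluate`.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One candidate experiment in the search tree (hypothetical structure)."""
    state: float
    score: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def best_first_search(
    root_state: float,
    propose: Callable[[float], List[float]],
    evaluate: Callable[[float], float],
    budget: int,
) -> Node:
    """Expand the highest-scoring unexpanded node until the budget is spent."""
    root = Node(root_state, evaluate(root_state))
    best = root
    counter = 0  # tie-breaker so heapq never has to compare Node objects
    frontier = [(-root.score, counter, root)]  # min-heap on negated score
    for _ in range(budget):
        if not frontier:
            break
        _, _, node = heapq.heappop(frontier)
        for child_state in propose(node.state):
            child = Node(child_state, evaluate(child_state), parent=node)
            node.children.append(child)
            counter += 1
            heapq.heappush(frontier, (-child.score, counter, child))
            if child.score > best.score:
                best = child
    return best


def path_to(node: Optional[Node]) -> List[float]:
    """Return the chain of states from the root down to `node`."""
    path = []
    while node is not None:
        path.append(node.state)
        node = node.parent
    return path[::-1]


# Toy stand-ins (assumptions): each step nudges the value by 1,
# and the score peaks at 3.0.
def toy_propose(x: float) -> List[float]:
    return [x - 1.0, x + 1.0]


def toy_evaluate(x: float) -> float:
    return -(x - 3.0) ** 2


best = best_first_search(0.0, toy_propose, toy_evaluate, budget=10)
print(best.state, path_to(best))  # 3.0 [0.0, 1.0, 2.0, 3.0]
```

The budget caps how many nodes are expanded, which is the knob that trades compute for coverage: a larger budget explores more branches, while a smaller one commits earlier to the most promising direction.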

The tradeoff: v1 achieves higher success rates on well-defined tasks because it leans on strong templates, while v2 tackles open-ended exploration with lower per-run success but broader coverage.


Competitive Position

Strengths: First peer-reviewed AI-authored paper. True end-to-end scientific discovery. No human templates required in v2. Open source.

Weaknesses: High compute requirements (NVIDIA GPUs). Executes arbitrary code (security risk). Lower success rate than template-based v1. Papers are workshop-level, not conference-level.
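
The arbitrary-code risk is why generated experiments should run in isolation. The sketch below is a minimal illustration, assuming a hypothetical helper named `run_untrusted`. It only shows the outer shape of such a wrapper: a separate process, a throwaway working directory, and a hard timeout. This is not a real sandbox and not how AI Scientist itself runs code; a container or virtual machine with restricted network and filesystem access would still be needed around it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_untrusted(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a separate Python process inside a throwaway directory.

    Hypothetical helper: it limits wall-clock time and keeps files in a
    temporary directory, but provides no real security boundary.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "experiment.py"
        script.write_text(code)
        try:
            proc = subprocess.run(
                # -I runs Python in isolated mode (ignores environment
                # variables and the user site-packages directory).
                [sys.executable, "-I", str(script)],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this exception.
            return {"ok": False, "timed_out": True, "stdout": "", "stderr": ""}
        return {
            "ok": proc.returncode == 0,
            "timed_out": False,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        }


result = run_untrusted("print(1 + 1)")
print(result["ok"], result["stdout"].strip())
```

Returning a plain dictionary rather than raising keeps a failed or runaway experiment from crashing the caller, so a search loop could record the failure and move on to the next candidate.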


Research by Ry Walker Research