Key takeaways
- The original open-source deep research agent: a planner generates research questions, execution agents gather information from 20+ sources, and a publisher aggregates the findings into comprehensive reports with citations
- 25.7k stars, Apache-2.0 license, active since May 2023. The longest-running and most battle-tested deep research agent in the ecosystem
- Works with any LLM provider. Supports web research, local document research, and hybrid modes. Claude Code skill integration via skills.sh
- Inspired by the Plan-and-Solve and RAG papers; addresses hallucination, speed, and bias by parallelizing agent work across multiple sources
FAQ
What is GPT Researcher?
An autonomous deep research agent that generates detailed, factual research reports with citations. It creates a research plan, dispatches crawler agents to gather information, then aggregates findings into a comprehensive report.
How does GPT Researcher differ from ChatGPT or Claude deep research?
GPT Researcher is fully open source (Apache-2.0), works with any LLM provider, and can research both web sources and local documents. Proprietary alternatives are locked to their respective platforms.
Overview
GPT Researcher is the original open-source deep research agent: an autonomous system that produces detailed, factual, and unbiased research reports with citations. Created in May 2023 by Assaf Elovic, it predates the deep research wave by nearly two years and remains the most-starred open-source deep research tool, at 25.7k stars.
The core architecture uses a planner/execution pattern: the planner generates research questions from a user query, crawler agents gather information from 20+ web sources in parallel, and a publisher aggregates all findings into a comprehensive report with source tracking.
Key stats: 25,753 stars, Apache-2.0 license, Python, active development since May 2023.
Architecture
The agent follows a structured pipeline:
- Create a task-specific agent based on the research query
- Generate questions that collectively form an objective opinion on the task
- Use crawler agents to gather information for each question in parallel
- Summarize and source-track each resource
- Filter and aggregate summaries into a final research report
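The pipeline above can be sketched as a minimal toy loop. Everything here is an illustrative placeholder (function names, the fixed question templates, the example URLs), not GPT Researcher's actual API; in the real system each step is an LLM call or a live web crawl.

```python
import asyncio

def plan_questions(query: str) -> list[str]:
    """Planner: derive sub-questions that together cover the query (toy version)."""
    return [f"{query}: background", f"{query}: current state", f"{query}: open problems"]

async def crawl_and_summarize(question: str) -> dict:
    """Crawler agent: gather sources for one question and summarize them (toy version)."""
    await asyncio.sleep(0)  # placeholder for network-bound scraping
    return {
        "question": question,
        "summary": f"Findings for '{question}'",
        "sources": [f"https://example.com/{abs(hash(question)) % 100}"],
    }

async def research(query: str) -> str:
    questions = plan_questions(query)
    # One crawler agent per research question, run concurrently.
    findings = await asyncio.gather(*(crawl_and_summarize(q) for q in questions))
    # Publisher: aggregate summaries into a report, keeping source tracking.
    body = "\n".join(f"- {f['summary']} [{f['sources'][0]}]" for f in findings)
    return f"# Report: {query}\n{body}"

report = asyncio.run(research("quantum error correction"))
print(report)
```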
The parallelized approach addresses key problems with naive LLM research: it reduces hallucination by cross-referencing multiple sources, increases speed through concurrent crawling, and mitigates bias by drawing from diverse sources.
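The cross-referencing idea can be illustrated with a small sketch: keep only claims corroborated by at least two independent sources. The claim data and the string-match filtering below are invented for illustration; in GPT Researcher this filtering happens inside an LLM aggregation step, not by literal matching.

```python
from collections import defaultdict

def cross_reference(claims_by_source: dict[str, list[str]], min_sources: int = 2) -> list[str]:
    """Keep claims supported by at least `min_sources` distinct sources (toy version)."""
    support = defaultdict(set)
    for source, claims in claims_by_source.items():
        for claim in claims:
            support[claim].add(source)
    return [claim for claim, sources in support.items() if len(sources) >= min_sources]

# Invented example data: one claim appears in only a single source.
claims = {
    "site-a": ["X released in 2023", "X has 25k stars"],
    "site-b": ["X released in 2023", "X was discontinued"],
    "site-c": ["X has 25k stars"],
}
print(cross_reference(claims))  # the single-source claim is dropped
```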
Competitive Position
Strengths: Longest-running deep research agent. Battle-tested, well-documented, large community. Apache-2.0 license. Works with any LLM provider. Claude Code skill integration.
Weaknesses: Prompt-based approach with no fine-tuned model. Newer tools such as Tongyi DeepResearch outperform it on research benchmarks with RL-trained specialized models.
Research by Ry Walker Research