Key takeaways
- Posted 74.4% pass@1 on SWE-bench Verified (372/500 tasks) with Claude 4 Sonnet in June 2025 — #1 among open-source agents and #2 overall at submission, with the full pipeline published as open source
- Self-hosting is the differentiator — a BSD-licensed server you run yourself, with enterprise fine-tuning of models on your own codebase and multi-GPU support, versus cloud-only rivals
- Broad IDE surface for its size — VS Code, the JetBrains family, Visual Studio, Neovim, and Sublime Text, with a Free tier and Pro from $10/month
FAQ
What is Refact.ai?
Refact.ai is an open-source AI coding agent from Small Magellanic Cloud AI that plugs into your IDE, plans and executes engineering tasks end-to-end, and can be fully self-hosted on your own hardware.
How much does Refact.ai cost?
A Free tier offers limited daily agent usage; Pro starts at $10/month; Enterprise (self-hosted, fine-tuning, priority support) is custom-priced. The core server and agent are open source.
What models does Refact.ai use?
Claude and GPT-4o-family models on the cloud plans, bring-your-own-key support for Gemini, Grok, OpenAI, and DeepSeek, and locally hosted models on the self-hosted server.
How is Refact.ai different from Cline?
Cline is a local extension that talks to cloud LLM providers; Refact.ai offers a full self-hosted server — including fine-tuning models on your own codebase — for teams that can't let code leave their infrastructure.
Executive Summary
Refact.ai is an open-source AI agent that handles engineering tasks end-to-end: it integrates with developers' tools, then plans, executes, and iterates until it reaches a successful result.[1] Built by London-based Small Magellanic Cloud AI Ltd., it ships as IDE plugins for VS Code, the JetBrains family, Visual Studio, Neovim, and Sublime Text, backed by an engine written in Rust.[2][1][3] The repository has 3,500+ GitHub stars and was under active development as of June 2026.[1]
Two things distinguish it in a crowded category. First, benchmarks: in June 2025 the Refact.ai Agent scored 74.4% pass@1 on SWE-bench Verified (372 of 500 tasks) using Anthropic's Claude 4 Sonnet — #1 among open-source agents and #2 overall at submission — plus #1 on SWE-bench Multimodal (35.59%), with the full pipeline published as open source.[4] Second, deployment: unlike cloud-only rivals, Refact.ai offers a fully self-hosted server with enterprise fine-tuning of models on a company's own codebase, multi-GPU support, and complete data ownership.[2]
| Attribute | Value |
|---|---|
| Company | Small Magellanic Cloud AI Ltd. (London, UK)[3] |
| Founded | Repo created April 2023[1] |
| Funding | Not publicly disclosed[3] |
| GitHub Stars | 3,500+ (June 2026)[1] |
| License | BSD-3-Clause[1] |
Product Overview
Refact.ai is an agent you use from inside your existing IDE: it takes a task, builds context from your codebase, and works through it — editing files, running tools, and iterating — rather than only suggesting completions.[1] Alongside the agent it offers context-aware code completion and chat, and on paid and self-hosted plans it can be fine-tuned on a company's own code for completions that match house conventions.[2]
Key Capabilities
| Capability | Description |
|---|---|
| Autonomous Agent | Plans, executes, and iterates on engineering tasks end-to-end[1] |
| Code Completion & Chat | Context-aware completion and in-IDE chat[2] |
| Tool Integrations | Integrates with developers' existing tools as part of agent runs[1] |
| Codebase Fine-Tuning | Train models on your own code (Enterprise/self-hosted)[2] |
| Self-Hosting | Run the full server on your own hardware with data ownership[2] |
| BYOK Models | Bring your own keys for Gemini, Grok, OpenAI, DeepSeek, and others[2] |
Product Surfaces
| Surface | Description | Availability |
|---|---|---|
| VS Code Plugin | Primary IDE surface[2] | GA |
| JetBrains Plugins | PyCharm, WebStorm, GoLand, IntelliJ, CLion[2] | GA |
| Other Editors | Visual Studio, Neovim, Sublime Text[2] | GA |
| Self-Hosted Server | Open-source server for on-prem deployment[2] | GA |
Technical Architecture
Refact.ai splits into IDE plugins and an engine/server layer; the core repository is primarily Rust under a BSD-3-Clause license.[1] In cloud mode, plugins talk to Refact's hosted models (Claude and GPT-4o family) or to providers via bring-your-own-key; in self-hosted mode, the server runs on your own GPUs with locally hosted models, fine-tuning, and multi-GPU optimization — the configuration aimed at teams whose code cannot leave their infrastructure.[2]
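The split between the two modes can be sketched as a small data model. This is an illustration of the paragraph above, not Refact.ai's actual configuration format: the `Mode` enum, the `model_sources` and `code_leaves_infrastructure` helpers, and the grouping of model names are all hypothetical, and self-hosted mode is simplified to local models only.

```python
from enum import Enum
from typing import Iterable, List


class Mode(Enum):
    CLOUD = "cloud"
    SELF_HOSTED = "self-hosted"


# Model and provider names as listed in this profile; the grouping is illustrative.
HOSTED_MODELS = ["Claude", "GPT-4o family"]
BYOK_PROVIDERS = ["Gemini", "Grok", "OpenAI", "DeepSeek"]


def model_sources(mode: Mode, byok_keys: Iterable[str] = ()) -> List[str]:
    """List where completions and agent calls could be served from in each mode."""
    if mode is Mode.SELF_HOSTED:
        # Simplification: self-hosted mode is modeled as local models only.
        return ["local models on your own GPUs"]
    keys = set(byok_keys)
    # Cloud mode: hosted models plus any BYOK provider the user has a key for.
    return HOSTED_MODELS + [p for p in BYOK_PROVIDERS if p in keys]


def code_leaves_infrastructure(mode: Mode) -> bool:
    """In this simplified model, only cloud mode sends code to a third party."""
    return mode is Mode.CLOUD


if __name__ == "__main__":
    print(model_sources(Mode.CLOUD, byok_keys=["Gemini"]))
    print(model_sources(Mode.SELF_HOSTED))
```

The point of the sketch is the asymmetry: cloud mode widens model choice, while self-hosted mode narrows it in exchange for keeping code on your own hardware.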
Key Technical Details
| Aspect | Detail |
|---|---|
| Deployment | Cloud, or fully self-hosted/on-prem server[2] |
| Model(s) | Claude, GPT-4o / GPT-4o mini; BYOK for Gemini, Grok, OpenAI, DeepSeek; local models when self-hosted[2] |
| Integrations | VS Code, JetBrains family, Visual Studio, Neovim, Sublime Text; developer tools during agent runs[2][1] |
| Open Source | Yes (BSD-3-Clause)[1] |
Benchmark pipeline: The SWE-bench harness Refact.ai used for its Verified and Multimodal submissions is itself open source, so the 74.4% pass@1 result can in principle be reproduced end-to-end by anyone with access to the same Claude 4 Sonnet model.[4]
Strengths
- Verified benchmark results — 74.4% pass@1 on SWE-bench Verified (372/500) and #1 on SWE-bench Multimodal (35.59%, 184/517) with Claude 4 Sonnet at its June 2025 submission, up from 70.4% with Claude 3.7[4]
- SWE-bench Lite leader — 60.0% (180/300 tasks), the top open-source result on the Lite leaderboard when published[5]
- Self-hosting as a first-class mode — full server on your own hardware with complete data ownership, not a cloud product with an on-prem afterthought[2]
- Fine-tuning on your codebase — enterprise deployments can train models on company code with multi-GPU optimization[2]
- Broad IDE coverage — VS Code, five JetBrains IDEs, Visual Studio, Neovim, and Sublime Text[2]
- Reproducible claims — the benchmark pipeline is published as open source, so the results can be checked rather than taken on trust[4]
- Cheap entry — Free tier with limited daily agent usage; Pro from $10/month[2]
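The benchmark percentages quoted in the list above follow directly from the task counts. As a quick sanity check, the sketch below recomputes them, treating pass@1 as resolved tasks divided by total tasks. The function names are illustrative and are not part of Refact.ai's published pipeline.

```python
# Task counts as quoted in this profile: (resolved, total).
RESULTS = {
    "SWE-bench Verified": (372, 500),
    "SWE-bench Multimodal": (184, 517),
    "SWE-bench Lite": (180, 300),
}


def pass_at_1(resolved: int, total: int) -> float:
    """Return the percentage of tasks resolved on the first attempt."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * resolved / total


def gain_in_tasks(old_pct: float, new_pct: float, total: int) -> int:
    """Convert a percentage-point improvement into a whole number of tasks."""
    return round((new_pct - old_pct) / 100.0 * total)


if __name__ == "__main__":
    for name, (resolved, total) in RESULTS.items():
        print(f"{name}: {pass_at_1(resolved, total):.2f}%")
    # The move from 70.4% to 74.4% on the 500-task Verified set.
    print(gain_in_tasks(70.4, 74.4, 500), "additional Verified tasks resolved")
```

Read this way, the four-point jump between model generations corresponds to 20 more resolved tasks out of 500, which is a useful scale to keep in mind when comparing leaderboard entries that differ by fractions of a point.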
Cautions
- Benchmark claims are dated — the headline SWE-bench results are from June 2025; agent leaderboards move quickly, and the current standings should be checked rather than assumed[4]
- Benchmark skepticism from the community — in the Hacker News launch discussion for Refact LLM, commenters asked how the team could be sure HumanEval problems had not leaked into the training data (the team cited LSH-based filtering) and pushed for human-acceptance comparisons against Copilot instead of static benchmarks[6]
- Small community relative to rivals — 3,500+ stars and 319 forks versus 60K+ stars for the category leaders, with fewer contributors and less third-party content[1]
- Funding opacity — no disclosed rounds; Tracxn lists the company without published funding amounts, making financial runway hard to assess from outside[3]
- Best benchmark results lean on Claude — the 74.4% score is powered by Claude 4 Sonnet, so fully air-gapped local-model deployments shouldn't expect leaderboard-level autonomy[4]
Pricing & Licensing
| Tier | Price | Includes |
|---|---|---|
| Free | $0 | Limited daily agent usage, completion and chat[2] |
| Pro | From $10/month | Expanded agent usage, premium models (Claude, GPT-4o family), BYOK[2] |
| Enterprise | Custom | Self-hosted/on-prem, fine-tuning on company codebase, multi-GPU optimization, code privacy, priority support[2] |
Licensing model: Open-core — the engine and self-hosted server are BSD-3-Clause open source, with paid cloud plans and a custom-priced enterprise tier on top.[1][2]
Hidden costs: Self-hosting shifts spend from subscriptions to GPU hardware and ops time; BYOK usage bills against your own provider keys.
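To make that trade-off concrete, the sketch below estimates a monthly self-hosting cost and the number of $10/month seats it would equal. Only the $10 figure comes from this profile; every other input is a hypothetical placeholder to replace with your own numbers. The comparison is also rough, since Enterprise is custom-priced and Pro does not include self-hosting or fine-tuning.

```python
import math

PRO_PRICE_PER_SEAT = 10.0  # USD per month, the "from" price quoted in the table above


def self_host_monthly_cost(
    gpu_capex: float,
    amortization_months: int,
    ops_hours: float,
    ops_hourly_rate: float,
    power_and_hosting: float = 0.0,
) -> float:
    """Amortized hardware plus ops time plus running costs, per month."""
    if amortization_months <= 0:
        raise ValueError("amortization_months must be positive")
    return (
        gpu_capex / amortization_months
        + ops_hours * ops_hourly_rate
        + power_and_hosting
    )


def break_even_seats(monthly_cost: float, per_seat_price: float = PRO_PRICE_PER_SEAT) -> int:
    """Smallest seat count whose subscription spend matches or exceeds monthly_cost."""
    return math.ceil(monthly_cost / per_seat_price)


if __name__ == "__main__":
    # All inputs below are hypothetical placeholders, not vendor figures.
    monthly = self_host_monthly_cost(
        gpu_capex=36_000,
        amortization_months=36,
        ops_hours=10,
        ops_hourly_rate=80,
        power_and_hosting=200,
    )
    print(f"Self-hosting: ${monthly:,.0f}/month")
    print(f"Equivalent Pro seats: {break_even_seats(monthly)}")
```

BYOK usage is left out on purpose: it bills per token against your own provider keys, so it scales with agent activity rather than with seats or hardware.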
Competitive Positioning
Direct Competitors
| Competitor | Differentiation |
|---|---|
| Cline | VS Code-native extension with GUI approval for every action; Refact.ai adds a self-hosted server and codebase fine-tuning Cline doesn't offer |
| OpenHands | Platform-style autonomous agent with a hosted cloud; Refact.ai leads with on-prem deployment and IDE-resident workflows |
| Goose | Local-first CLI/desktop agent under Linux Foundation governance; Refact.ai is IDE-plugin-first with an enterprise fine-tuning story |
| GitHub Copilot | Autocomplete-first incumbent; Refact.ai is an end-to-end agent that can also run fully on your own hardware |
| Tabby / Continue | Other self-hostable open-source assistants; Refact.ai differentiates with published SWE-bench agent results[4] |
When to Choose Refact.ai Over Alternatives
- Choose Refact.ai when: Your code cannot leave your infrastructure and you still want an autonomous agent — self-hosting plus fine-tuning on your own codebase is the core bet
- Choose Cline when: You live in VS Code and want the largest community with explicit approval for every action
- Choose OpenHands when: You want cloud-hosted background agents rather than IDE-resident ones
- Choose Goose when: You want a vendor-neutral, foundation-governed local agent independent of any editor
Ideal Customer Profile
Best fit:
- Teams with data-residency or compliance requirements that rule out cloud coding agents
- Organizations with GPUs who want models fine-tuned on their own codebase[2]
- JetBrains-heavy shops underserved by VS Code-only agents
- Budget-sensitive individuals — Free tier and $10/month Pro[2]
Poor fit:
- Teams that pick tools by community size and ecosystem depth
- Buyers who need disclosed funding and a long enterprise track record
- Users expecting leaderboard-level autonomy from fully local models
Viability Assessment
| Factor | Assessment |
|---|---|
| Financial Health | Unknown — funding not publicly disclosed[3] |
| Market Position | Niche challenger — 3,500+ stars, but the strongest published SWE-bench results among self-hostable agents[1][4] |
| Innovation Pace | Active — repo pushed within the last two weeks as of June 2026; benchmark scores improved 70.4% → 74.4% across model upgrades[1][4] |
| Community/Ecosystem | Modest — 319 forks; far smaller than Cline or Goose communities[1] |
| Long-term Outlook | Cautiously positive — clear differentiation (self-hosting + fine-tuning), but undisclosed funding and giant competitors are real risks |
Refact.ai punches above its star count: its June 2025 SWE-bench Verified submission outscored far larger open-source projects, and it occupies a defensible niche — the self-hosted, fine-tunable agent — that cloud-first rivals have largely ceded.[4] The open questions are financial (no disclosed funding) and competitive (whether the niche stays defensible as enterprise tiers of bigger agents add on-prem options).[3]
Bottom Line
Refact.ai is the self-hosting specialist among open-source coding agents: a BSD-licensed, end-to-end engineering agent with credible, reproducible benchmark results and an enterprise story — on-prem deployment plus fine-tuning on your own code — that the larger players don't match. Its community is small and its benchmark headlines are a year old, but the product targets a real constraint (code that can't leave the building) rather than competing head-on for mindshare.
Recommended for: Compliance-bound teams that need an autonomous coding agent on their own infrastructure, JetBrains-centric organizations, and anyone who wants codebase fine-tuning without sending code to a third party.
Not recommended for: Developers who optimize for ecosystem size and community support, or buyers who require a well-capitalized vendor with disclosed funding.
Outlook: Refact.ai's fate hinges on whether self-hosted agentic coding becomes a procurement requirement at scale. If regulated industries adopt agents the way they adopted self-hosted Git, its niche becomes a moat; if cloud agents win on capability, a 3.5K-star challenger will struggle for attention. Watch for refreshed benchmark submissions and any funding disclosure.
Research by Ry Walker Research
Sources
- [1] Refact.ai GitHub Repository
- [2] Refact.ai Website
- [3] Small Magellanic Cloud AI — Company Profile (Tracxn)
- [4] Refact.ai Agent achieved leading results on SWE-bench Multimodal and Verified
- [5] Open-Source Refact.ai Agent is SOTA on SWE-bench Lite With a 60.0% Score
- [6] Refact LLM launch discussion (Hacker News)