Refact.ai | Ry Walker Research

Key takeaways

Posted 74.4% pass@1 on SWE-bench Verified (372/500 tasks) with Claude 4 Sonnet in June 2025 — #1 among open-source agents and #2 overall at submission, with the full pipeline published as open source
Self-hosting is the differentiator — a BSD-licensed server you run yourself, with enterprise fine-tuning of models on your own codebase and multi-GPU support, versus cloud-only rivals
Broad IDE surface for its size — VS Code, the JetBrains family, Visual Studio, Neovim, and Sublime Text, with a Free tier and Pro from $10/month

FAQ

What is Refact.ai?

Refact.ai is an open-source AI coding agent from Small Magellanic Cloud AI that plugs into your IDE, plans and executes engineering tasks end-to-end, and can be fully self-hosted on your own hardware.

How much does Refact.ai cost?

A Free tier offers limited daily agent usage; Pro starts at $10/month; Enterprise (self-hosted, fine-tuning, priority support) is custom-priced. The core server and agent are open source.

What models does Refact.ai use?

Claude and GPT-4o-family models on the cloud plans, bring-your-own-key support for Gemini, Grok, OpenAI, and DeepSeek, and locally hosted models on the self-hosted server.

How is Refact.ai different from Cline?

Cline is a local extension that talks to cloud LLM providers; Refact.ai offers a full self-hosted server — including fine-tuning models on your own codebase — for teams that can't let code leave their infrastructure.

Executive Summary

Refact.ai is an open-source AI agent that handles engineering tasks end-to-end: it integrates with developers' tools, then plans, executes, and iterates until it achieves a successful result.^[1] Built by London-based Small Magellanic Cloud AI Ltd., it ships as IDE plugins for VS Code, the JetBrains family, Visual Studio, Neovim, and Sublime Text, backed by an engine written in Rust.^[2]^[1]^[3] The repository had 3,500+ GitHub stars and was under active development as of June 2026.^[1]

Two things distinguish it in a crowded category. First, benchmarks: in June 2025 the Refact.ai Agent scored 74.4% pass@1 on SWE-bench Verified (372 of 500 tasks) using Anthropic's Claude 4 Sonnet — #1 among open-source agents and #2 overall at submission — plus #1 on SWE-bench Multimodal (35.59%), with the full pipeline published as open source.^[4] Second, deployment: unlike cloud-only rivals, Refact.ai offers a fully self-hosted server with enterprise fine-tuning of models on a company's own codebase, multi-GPU support, and complete data ownership.^[2]

Attribute	Value
Company	Small Magellanic Cloud AI Ltd. (London, UK)^[3]
Founded	Repo created April 2023^[1]
Funding	Not publicly disclosed^[3]
GitHub Stars	3,500+ (June 2026)^[1]
License	BSD-3-Clause^[1]

Product Overview

Refact.ai is an agent you use from inside your existing IDE: it takes a task, builds context from your codebase, and works through it — editing files, running tools, and iterating — rather than only suggesting completions.^[1] Alongside the agent it offers context-aware code completion and chat, and on paid and self-hosted plans it can be fine-tuned on a company's own code for completions that match house conventions.^[2]

Key Capabilities

Capability	Description
Autonomous Agent	Plans, executes, and iterates on engineering tasks end-to-end^[1]
Code Completion & Chat	Context-aware completion and in-IDE chat^[2]
Tool Integrations	Integrates with developers' existing tools as part of agent runs^[1]
Codebase Fine-Tuning	Train models on your own code (Enterprise/self-hosted)^[2]
Self-Hosting	Run the full server on your own hardware with data ownership^[2]
BYOK Models	Bring your own keys for Gemini, Grok, OpenAI, DeepSeek, and others^[2]

Product Surfaces

Surface	Description	Availability
VS Code Plugin	Primary IDE surface^[2]	GA
JetBrains Plugins	PyCharm, WebStorm, GoLand, IntelliJ, CLion^[2]	GA
Other Editors	Visual Studio, Neovim, Sublime Text^[2]	GA
Self-Hosted Server	Open-source server for on-prem deployment^[2]	GA

Technical Architecture

Refact.ai splits into IDE plugins and an engine/server layer; the core repository is primarily Rust under a BSD-3-Clause license.^[1] In cloud mode, plugins talk to Refact's hosted models (Claude and GPT-4o family) or to providers via bring-your-own-key; in self-hosted mode, the server runs on your own GPUs with locally hosted models, fine-tuning, and multi-GPU optimization — the configuration aimed at teams whose code cannot leave their infrastructure.^[2]

Key Technical Details

Aspect	Detail
Deployment	Cloud, or fully self-hosted/on-prem server^[2]
Model(s)	Claude, GPT-4o / GPT-4o mini; BYOK for Gemini, Grok, OpenAI, DeepSeek; local models when self-hosted^[2]
Integrations	VS Code, JetBrains family, Visual Studio, Neovim, Sublime Text; developer tools during agent runs^[2]^[1]
Open Source	Yes (BSD-3-Clause)^[1]

Benchmark pipeline: The SWE-bench harness Refact.ai used for its Verified and Multimodal submissions is itself open source, so the 74.4% pass@1 result can, in principle, be re-run and checked independently.^[4]
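
The headline percentages in this profile are simple resolved-over-total ratios, and they can be sanity-checked from the task counts quoted alongside them. The sketch below does only that arithmetic; it does not run SWE-bench or touch Refact.ai's pipeline, and the names `RESULTS`, `pass_rate`, and `summarize` are illustrative, not part of any published tooling.

```python
# Resolved/total task counts exactly as quoted in this profile.
RESULTS = {
    "SWE-bench Verified": (372, 500),
    "SWE-bench Multimodal": (184, 517),
    "SWE-bench Lite": (180, 300),
}


def pass_rate(resolved: int, total: int) -> float:
    """Return the share of tasks resolved, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= resolved <= total:
        raise ValueError("resolved must be between 0 and total")
    return 100.0 * resolved / total


def summarize(results: dict) -> dict:
    """Map each benchmark name to its pass rate, rounded to two decimals."""
    return {name: round(pass_rate(r, t), 2) for name, (r, t) in results.items()}


if __name__ == "__main__":
    for name, pct in summarize(RESULTS).items():
        print(f"{name}: {pct:.2f}%")
```

Run as a script, this prints 74.40%, 35.59%, and 60.00%, which agrees with the figures cited here for Verified, Multimodal, and Lite respectively.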

Strengths

Verified benchmark results: 74.4% pass@1 on SWE-bench Verified (372/500) and #1 on SWE-bench Multimodal (35.59%, 184/517) with Claude 4 Sonnet at its June 2025 submission, up from 70.4% on Verified with Claude 3.7^[4]
SWE-bench Lite leader: 60.0% (180/300 tasks), the top open-source result on the Lite leaderboard when published^[5]
Self-hosting as a first-class mode: full server on your own hardware with complete data ownership, not a cloud product with an on-prem afterthought^[2]
Fine-tuning on your codebase: enterprise deployments can train models on company code with multi-GPU optimization^[2]
Broad IDE coverage: VS Code, five JetBrains IDEs, Visual Studio, Neovim, and Sublime Text^[2]
Reproducible claims: the benchmark pipeline is published as open source, so results can be checked rather than taken on trust^[4]
Cheap entry: Free tier with limited daily agent usage; Pro from $10/month^[2]

Cautions

Benchmark claims are dated: the headline SWE-bench results are from June 2025; agent leaderboards move quickly, so check the current standings rather than assuming them^[4]
Benchmark skepticism from the community: in the project's HN launch discussion, commenters questioned how the team could be sure HumanEval data hadn't leaked into training data (the team cited LSH-based filtering) and pushed for human-acceptance comparisons against Copilot instead of static benchmarks^[6]
Small community relative to rivals: 3,500+ stars and 319 forks versus 60K+ stars for the category leaders, with fewer contributors and less third-party content^[1]
Funding opacity: no disclosed rounds; Tracxn lists the company without published funding amounts, making financial runway hard to assess from outside^[3]
Best benchmark results lean on Claude: the 74.4% score is powered by Claude 4 Sonnet, so fully air-gapped local-model deployments shouldn't expect leaderboard-level autonomy^[4]
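
The LSH-based filtering cited in the HN discussion is not described in this profile, so the following is only a generic illustration of the idea, not Refact.ai's actual method. It is a minimal MinHash-with-banding sketch that flags training samples sharing a signature band with a benchmark item. Every name and parameter here (`NUM_HASHES`, `BANDS`, `shingles`, `candidate_pairs`, the sample documents) is hypothetical.

```python
from __future__ import annotations

import hashlib
import re
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 64  # signature length (illustrative choice)
BANDS = 16       # LSH bands; each band covers NUM_HASHES // BANDS rows


def shingles(text: str, k: int = 5) -> set[str]:
    """Lowercase, tokenize, and return the set of overlapping k-token shingles."""
    tokens = re.findall(r"\w+", text.lower())
    if len(tokens) < k:
        return {" ".join(tokens)} if tokens else set()
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def seeded_hash(seed: int, item: str) -> int:
    """64-bit hash of item; the seed selects one member of the hash family."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minhash(items: set[str]) -> tuple[int, ...]:
    """MinHash signature: the minimum hash value per seed over all shingles."""
    return tuple(min(seeded_hash(seed, s) for s in items) for seed in range(NUM_HASHES))


def candidate_pairs(docs: dict[str, str]) -> set[tuple[str, str]]:
    """Return document pairs whose signatures agree on at least one full band."""
    rows = NUM_HASHES // BANDS
    buckets: dict[tuple, list[str]] = defaultdict(list)
    for name, text in docs.items():
        items = shingles(text)
        if not items:
            continue  # nothing to compare for empty documents
        sig = minhash(items)
        for band in range(BANDS):
            buckets[(band, sig[band * rows:(band + 1) * rows])].append(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


if __name__ == "__main__":
    docs = {
        "benchmark/task_1": "def add(a, b):\n    return a + b  # sum of two numbers",
        "train/sample_9": "DEF ADD(a,b):   return a+b   # Sum of two numbers",
        "train/sample_3": "class Stack: push and pop items from a list based stack",
    }
    print(candidate_pairs(docs))
```

In the demo, the reformatted copy tokenizes identically to the benchmark item and is flagged, while the unrelated sample is not. A filter like this catches near-verbatim copies only; paraphrased or translated leaks would pass through, which is one reason the commenters' request for human-acceptance comparisons remains reasonable.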

Pricing & Licensing

Tier	Price	Includes
Free	$0	Limited daily agent usage, completion and chat^[2]
Pro	From $10/month	Expanded agent usage, premium models (Claude, GPT-4o family), BYOK^[2]
Enterprise	Custom	Self-hosted/on-prem, fine-tuning on company codebase, multi-GPU optimization, code privacy, priority support^[2]

Licensing model: Open-core — the engine and self-hosted server are BSD-3-Clause open source, with paid cloud plans and a custom-priced enterprise tier on top.^[1]^[2]

Hidden costs: Self-hosting shifts spend from subscriptions to GPU hardware and ops time; BYOK usage bills against your own provider keys.
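
To make that trade-off concrete, here is a rough break-even sketch. Only the $10/month Pro price comes from this profile; the hardware, amortization, and ops figures in the example are made-up placeholders, not Refact.ai numbers, and the `SelfHostCosts` and `break_even_seats` names are hypothetical. It also ignores BYOK provider bills and any custom Enterprise fee, so treat it as a way to frame the question rather than a quote.

```python
import math
from dataclasses import dataclass

PRO_PRICE_PER_SEAT = 10.0  # USD per month, the "from $10/month" Pro price cited in this profile


@dataclass
class SelfHostCosts:
    """Monthly cost model for a self-hosted deployment (all inputs are assumptions)."""
    gpu_hardware_usd: float        # up-front hardware spend
    amortization_months: int       # months over which the hardware is written off
    ops_hours_per_month: float     # engineer time spent running the server
    ops_hourly_rate_usd: float     # loaded hourly cost of that engineer
    power_and_hosting_usd: float = 0.0

    def monthly_total(self) -> float:
        if self.amortization_months <= 0:
            raise ValueError("amortization_months must be positive")
        hardware = self.gpu_hardware_usd / self.amortization_months
        ops = self.ops_hours_per_month * self.ops_hourly_rate_usd
        return hardware + ops + self.power_and_hosting_usd


def break_even_seats(costs: SelfHostCosts, seat_price: float = PRO_PRICE_PER_SEAT) -> int:
    """Smallest seat count at which subscriptions cost at least as much as self-hosting."""
    if seat_price <= 0:
        raise ValueError("seat_price must be positive")
    return math.ceil(costs.monthly_total() / seat_price)


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the calculation.
    example = SelfHostCosts(
        gpu_hardware_usd=3600.0,
        amortization_months=36,
        ops_hours_per_month=5.0,
        ops_hourly_rate_usd=40.0,
    )
    print(f"Self-hosting: ${example.monthly_total():.2f}/month")
    print(f"Break-even at {break_even_seats(example)} Pro seats")
```

With these placeholder inputs the monthly self-hosting cost is $300, which equals 30 Pro seats. In practice the decision is usually driven by the data-residency constraint rather than price, so the calculation mainly helps size the ops commitment.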

Competitive Positioning

Direct Competitors

Competitor	Differentiation
Cline	VS Code-native extension with GUI approval for every action; Refact.ai adds a self-hosted server and codebase fine-tuning that Cline doesn't offer
OpenHands	Platform-style autonomous agent with a hosted cloud; Refact.ai leads with on-prem deployment and IDE-resident workflows
Goose	Local-first CLI/desktop agent under Linux Foundation governance; Refact.ai is IDE-plugin-first with an enterprise fine-tuning story
GitHub Copilot	Autocomplete-first incumbent; Refact.ai is an end-to-end agent that can also run fully on your own hardware
Tabby / Continue	Other self-hostable open-source assistants; Refact.ai differentiates with published SWE-bench agent results^[4]

When to Choose Refact.ai Over Alternatives

Choose Refact.ai when: Your code cannot leave your infrastructure and you still want an autonomous agent — self-hosting plus fine-tuning on your own codebase is the core bet
Choose Cline when: You live in VS Code and want the largest community with explicit approval for every action
Choose OpenHands when: You want cloud-hosted background agents rather than IDE-resident ones
Choose Goose when: You want a vendor-neutral, foundation-governed local agent independent of any editor

Ideal Customer Profile

Best fit:

Teams with data-residency or compliance requirements that rule out cloud coding agents
Organizations with GPUs who want models fine-tuned on their own codebase^[2]
JetBrains-heavy shops underserved by VS Code-only agents
Budget-sensitive individuals — Free tier and $10/month Pro^[2]

Poor fit:

Teams that pick tools by community size and ecosystem depth
Buyers who need disclosed funding and a long enterprise track record
Users expecting leaderboard-level autonomy from fully local models

Viability Assessment

Factor	Assessment
Financial Health	Unknown — funding not publicly disclosed^[3]
Market Position	Niche challenger — 3,500+ stars, but the strongest published SWE-bench results among self-hostable agents^[1]^[4]
Innovation Pace	Active — repo pushed within the last two weeks as of June 2026; benchmark scores improved 70.4% → 74.4% across model upgrades^[1]^[4]
Community/Ecosystem	Modest — 319 forks; far smaller than Cline or Goose communities^[1]
Long-term Outlook	Cautiously positive — clear differentiation (self-hosting + fine-tuning), but undisclosed funding and giant competitors are real risks

Refact.ai punches above its star count: its June 2025 SWE-bench Verified submission outscored far larger open-source projects, and it occupies a defensible niche — the self-hosted, fine-tunable agent — that cloud-first rivals have largely ceded.^[4] The open questions are financial (no disclosed funding) and competitive (whether the niche stays defensible as enterprise tiers of bigger agents add on-prem options).^[3]

Bottom Line

Refact.ai is the self-hosting specialist among open-source coding agents: a BSD-licensed, end-to-end engineering agent with credible, reproducible benchmark results and an enterprise story — on-prem deployment plus fine-tuning on your own code — that the larger players don't match. Its community is small and its benchmark headlines are a year old, but the product targets a real constraint (code that can't leave the building) rather than competing head-on for mindshare.

Recommended for: Compliance-bound teams that need an autonomous coding agent on their own infrastructure, JetBrains-centric organizations, and anyone who wants codebase fine-tuning without sending code to a third party.

Not recommended for: Developers who optimize for ecosystem size and community support, or buyers who require a well-capitalized vendor with disclosed funding.

Outlook: Refact.ai's fate hinges on whether self-hosted agentic coding becomes a procurement requirement at scale. If regulated industries adopt agents the way they adopted self-hosted Git, its niche becomes a moat; if cloud agents win on capability, a 3.5K-star challenger will struggle for attention. Watch for refreshed benchmark submissions and any funding disclosure.

Research by Ry Walker Research • methodology

Sources