
Refact.ai

Refact.ai is an open-source, self-hostable AI coding agent from Small Magellanic Cloud AI with 3.5K+ GitHub stars, IDE plugins for VS Code and JetBrains, and a 74.4% SWE-bench Verified score (June 2025) — differentiated by on-prem deployment and fine-tuning on your own codebase.

Key takeaways

  • Posted 74.4% pass@1 on SWE-bench Verified (372/500 tasks) with Claude 4 Sonnet in June 2025 — #1 among open-source agents and #2 overall at submission, with the full pipeline published as open source
  • Self-hosting is the differentiator — a BSD-licensed server you run yourself, with enterprise fine-tuning of models on your own codebase and multi-GPU support, versus cloud-only rivals
  • Broad IDE surface for its size — VS Code, the JetBrains family, Visual Studio, Neovim, and Sublime Text, with a Free tier and Pro from $10/month

FAQ

What is Refact.ai?

Refact.ai is an open-source AI coding agent from Small Magellanic Cloud AI that plugs into your IDE, plans and executes engineering tasks end-to-end, and can be fully self-hosted on your own hardware.

How much does Refact.ai cost?

A Free tier offers limited daily agent usage; Pro starts at $10/month; Enterprise (self-hosted, fine-tuning, priority support) is custom-priced. The core server and agent are open source.

What models does Refact.ai use?

Claude and GPT-4o-family models on the cloud plans, bring-your-own-key support for Gemini, Grok, OpenAI, and DeepSeek, and locally hosted models on the self-hosted server.

How is Refact.ai different from Cline?

Cline is a local extension that talks to cloud LLM providers; Refact.ai offers a full self-hosted server — including fine-tuning models on your own codebase — for teams that can't let code leave their infrastructure.

Executive Summary

Refact.ai is an open-source AI agent that handles engineering tasks end-to-end — it integrates with developers' tools, plans, executes, and iterates until it achieves a successful result.[1] Built by London-based Small Magellanic Cloud AI Ltd., it ships as IDE plugins for VS Code, the JetBrains family, Visual Studio, Neovim, and Sublime Text, backed by an engine written in Rust.[2][1][3] The repository holds 3,500+ GitHub stars with active development as of June 2026.[1]

Two things distinguish it in a crowded category. First, benchmarks: in June 2025 the Refact.ai Agent scored 74.4% pass@1 on SWE-bench Verified (372 of 500 tasks) using Anthropic's Claude 4 Sonnet — #1 among open-source agents and #2 overall at submission — plus #1 on SWE-bench Multimodal (35.59%), with the full pipeline published as open source.[4] Second, deployment: unlike cloud-only rivals, Refact.ai offers a fully self-hosted server with enterprise fine-tuning of models on a company's own codebase, multi-GPU support, and complete data ownership.[2]

Attribute | Value
Company | Small Magellanic Cloud AI Ltd. (London, UK)[3]
Founded | Repo created April 2023[1]
Funding | Not publicly disclosed[3]
GitHub Stars | 3,500+ (June 2026)[1]
License | BSD-3-Clause[1]

Product Overview

Refact.ai is an agent you use from inside your existing IDE: it takes a task, builds context from your codebase, and works through it — editing files, running tools, and iterating — rather than only suggesting completions.[1] Alongside the agent it offers context-aware code completion and chat, and on paid and self-hosted plans it can be fine-tuned on a company's own code for completions that match house conventions.[2]

Key Capabilities

Capability | Description
Autonomous Agent | Plans, executes, and iterates on engineering tasks end-to-end[1]
Code Completion & Chat | Context-aware completion and in-IDE chat[2]
Tool Integrations | Integrates with developers' existing tools as part of agent runs[1]
Codebase Fine-Tuning | Train models on your own code (Enterprise/self-hosted)[2]
Self-Hosting | Run the full server on your own hardware with data ownership[2]
BYOK Models | Bring your own keys for Gemini, Grok, OpenAI, DeepSeek, and others[2]

Product Surfaces

Surface | Description | Availability
VS Code Plugin | Primary IDE surface[2] | GA
JetBrains Plugins | PyCharm, WebStorm, GoLand, IntelliJ, CLion[2] | GA
Other Editors | Visual Studio, Neovim, Sublime Text[2] | GA
Self-Hosted Server | Open-source server for on-prem deployment[2] | GA

Technical Architecture

Refact.ai splits into IDE plugins and an engine/server layer; the core repository is primarily Rust under a BSD-3-Clause license.[1] In cloud mode, plugins talk to Refact's hosted models (Claude and GPT-4o family) or to providers via bring-your-own-key; in self-hosted mode, the server runs on your own GPUs with locally hosted models, fine-tuning, and multi-GPU optimization — the configuration aimed at teams whose code cannot leave their infrastructure.[2]
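
As a rough illustration of the split described above, the sketch below models the three model-access paths (hosted models, bring-your-own-key, self-hosted) and picks one from a team's constraints. The names `Mode`, `Requirements`, and `pick_mode` are hypothetical and are not part of Refact.ai's API or configuration; the only facts taken from this article are which paths exist and that the self-hosted server runs on your own GPUs.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """The three model-access paths described in the text (labels are illustrative)."""
    CLOUD_HOSTED = "hosted models (Claude / GPT-4o family)"
    CLOUD_BYOK = "bring-your-own-key provider"
    SELF_HOSTED = "self-hosted server with local models"


@dataclass(frozen=True)
class Requirements:
    """Hypothetical team constraints used only for this sketch."""
    code_may_leave_infra: bool   # may source code be sent to a third party?
    has_provider_keys: bool      # does the team already hold LLM provider API keys?
    has_gpus: bool               # are local GPUs available for a self-hosted server?


def pick_mode(req: Requirements) -> Mode:
    """Choose an access path; code that must stay in-house forces self-hosting."""
    if not req.code_may_leave_infra:
        if not req.has_gpus:
            raise ValueError("self-hosting requires GPUs you control")
        return Mode.SELF_HOSTED
    if req.has_provider_keys:
        return Mode.CLOUD_BYOK
    return Mode.CLOUD_HOSTED


if __name__ == "__main__":
    regulated_team = Requirements(code_may_leave_infra=False, has_provider_keys=True, has_gpus=True)
    print(pick_mode(regulated_team).value)
```

The ordering is the point of the sketch: the data-residency constraint is checked first, because it rules out both cloud paths regardless of which keys a team holds.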

Key Technical Details

Aspect | Detail
Deployment | Cloud, or fully self-hosted/on-prem server[2]
Model(s) | Claude, GPT-4o / GPT-4o mini; BYOK for Gemini, Grok, OpenAI, DeepSeek; local models when self-hosted[2]
Integrations | VS Code, JetBrains family, Visual Studio, Neovim, Sublime Text; developer tools during agent runs[2][1]
Open Source | Yes (BSD-3-Clause)[1]

Benchmark pipeline: The SWE-bench harness Refact.ai used for its Verified and Multimodal submissions is itself open source, so the 74.4% pass@1 result can in principle be reproduced end-to-end by anyone with access to the same Claude 4 Sonnet model.[4]


Strengths

  • Verified benchmark results — 74.4% pass@1 on SWE-bench Verified (372/500) and #1 on SWE-bench Multimodal (35.59%, 184/517) with Claude 4 Sonnet at its June 2025 submission, up from 70.4% with Claude 3.7[4]
  • SWE-bench Lite leader — 60.0% (180/300 tasks), the top open-source result on the Lite leaderboard when published[5]
  • Self-hosting as a first-class mode — full server on your own hardware with complete data ownership, not a cloud product with an on-prem afterthought[2]
  • Fine-tuning on your codebase — enterprise deployments can train models on company code with multi-GPU optimization[2]
  • Broad IDE coverage — VS Code, five JetBrains IDEs, Visual Studio, Neovim, and Sublime Text[2]
  • Reproducible claims — benchmark pipeline published as open source rather than asserted[4]
  • Cheap entry — Free tier with daily agent usage; Pro from $10/month[2]
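
The percentages in this list are plain resolved-over-total ratios, so they can be checked directly from the task counts quoted above. A minimal sketch follows; the `pass_rate` helper is illustrative and is not part of any SWE-bench tooling.

```python
from fractions import Fraction

# Resolved/total task counts as quoted in this article.
RESULTS = {
    "SWE-bench Verified": (372, 500),
    "SWE-bench Multimodal": (184, 517),
    "SWE-bench Lite": (180, 300),
}


def pass_rate(resolved: int, total: int) -> float:
    """Return the share of tasks resolved as a percentage, rounded to two decimals."""
    if total <= 0 or not 0 <= resolved <= total:
        raise ValueError("need total > 0 and 0 <= resolved <= total")
    # Fraction keeps the division exact until the final conversion to float.
    return round(float(Fraction(resolved, total) * 100), 2)


for name, (resolved, total) in RESULTS.items():
    print(f"{name}: {pass_rate(resolved, total):.2f}% ({resolved}/{total})")
```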

Cautions

  • Benchmark claims are dated — the headline SWE-bench results are from June 2025; agent leaderboards move quickly, and the current standings should be checked rather than assumed[4]
  • Benchmark skepticism from the community — in the project's HN launch discussion, commenters questioned how the team could be sure HumanEval data hadn't leaked into training data (the team cited LSH-based filtering) and pushed for human-acceptance comparisons against Copilot instead of static benchmarks[6]
  • Small community relative to rivals: 3,500+ stars and 319 forks, versus 60K+ stars for the category leaders, with fewer contributors and less third-party content[1]
  • Funding opacity — no disclosed rounds; Tracxn lists the company without published funding amounts, making financial runway hard to assess from outside[3]
  • Best benchmark results lean on Claude — the 74.4% score is powered by Claude 4 Sonnet, so fully air-gapped local-model deployments shouldn't expect leaderboard-level autonomy[4]
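
The LSH-based filtering mentioned in the HN discussion is not described in detail in the sources cited here. As a generic illustration of the idea, and not the team's actual filter, the sketch below uses MinHash signatures with banding to flag training documents that are near-duplicates of a benchmark task. The shingle size, signature length, band count, and sample documents are all assumed values.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 64              # signature length (assumed value)
BANDS = 16                   # number of LSH bands (assumed value)
ROWS = NUM_HASHES // BANDS   # signature rows per band


def shingles(text: str, k: int = 5) -> set[str]:
    """Normalize whitespace and case, then return overlapping k-character shingles."""
    norm = " ".join(text.split()).lower()
    return {norm[i:i + k] for i in range(max(1, len(norm) - k + 1))}


def seeded_hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash; the seed selects one of NUM_HASHES hash functions."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minhash(text: str) -> tuple[int, ...]:
    """MinHash signature: the minimum hash of the shingle set under each seed."""
    items = shingles(text)
    return tuple(min(seeded_hash(seed, s) for s in items) for seed in range(NUM_HASHES))


def candidate_pairs(docs: dict[str, str]) -> set[tuple[str, str]]:
    """Return document pairs whose signatures agree on at least one full band."""
    buckets: dict[tuple, list[str]] = defaultdict(list)
    for name, text in docs.items():
        sig = minhash(text)
        for b in range(BANDS):
            buckets[(b, sig[b * ROWS:(b + 1) * ROWS])].append(name)
    pairs: set[tuple[str, str]] = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


# Hypothetical documents: the training copy differs from the task only in whitespace.
docs = {
    "benchmark_task": "def add(a, b):\n    return a + b",
    "training_copy": "def  add(a, b):\n\treturn a + b\n",
    "unrelated": "print('hello world')",
}
flagged = candidate_pairs(docs)
print(flagged)
```

Banding trades precision for recall: fewer rows per band catch looser near-duplicates but flag more false candidates, so flagged pairs would normally be confirmed with an exact similarity check before anything is dropped from a training set.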

Pricing & Licensing

Tier | Price | Includes
Free | $0 | Limited daily agent usage, completion and chat[2]
Pro | From $10/month | Expanded agent usage, premium models (Claude, GPT-4o family), BYOK[2]
Enterprise | Custom | Self-hosted/on-prem, fine-tuning on company codebase, multi-GPU optimization, code privacy, priority support[2]

Licensing model: Open-core — the engine and self-hosted server are BSD-3-Clause open source, with paid cloud plans and a custom-priced enterprise tier on top.[1][2]

Hidden costs: Self-hosting shifts spend from subscriptions to GPU hardware and ops time; BYOK usage bills against your own provider keys.
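
To make that trade-off concrete, the sketch below estimates a monthly self-hosting cost and the seat count at which Pro subscriptions would cost the same. Only the $10/month Pro price comes from this article; the hardware price, amortization window, ops hours, and hourly rate are hypothetical planning inputs, and the model ignores power, BYOK provider bills, and the custom-priced Enterprise tier.

```python
import math
from dataclasses import dataclass

PRO_SEAT_PRICE_USD = 10.0  # per month; the "from" price quoted in this article


@dataclass(frozen=True)
class SelfHostEstimate:
    """All fields are hypothetical planning inputs, not vendor figures."""
    gpu_hardware_usd: float      # up-front hardware spend
    amortization_months: int     # months over which the hardware is written off
    ops_hours_per_month: float   # maintenance time
    ops_hourly_rate_usd: float   # loaded cost of that time

    def monthly_cost(self) -> float:
        """Amortized hardware plus recurring ops time, in USD per month."""
        if self.amortization_months <= 0:
            raise ValueError("amortization_months must be positive")
        hardware = self.gpu_hardware_usd / self.amortization_months
        ops = self.ops_hours_per_month * self.ops_hourly_rate_usd
        return hardware + ops


def break_even_seats(estimate: SelfHostEstimate, seat_price: float = PRO_SEAT_PRICE_USD) -> int:
    """Smallest seat count at which subscriptions cost at least as much as self-hosting."""
    return math.ceil(estimate.monthly_cost() / seat_price)


# Example with made-up numbers: a $12,000 GPU box over 36 months, 4 ops hours at $75/hour.
example = SelfHostEstimate(12_000, 36, 4, 75)
print(f"self-hosting: ${example.monthly_cost():.2f}/month")
print(f"break-even at {break_even_seats(example)} Pro seats")
```

For compliance-bound teams the comparison is often moot, since cloud plans are ruled out regardless of price; the calculation mainly helps size the budget for the self-hosted path.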


Competitive Positioning

Direct Competitors

Competitor | Differentiation
Cline | VS Code-native extension with GUI approval for every action; Refact.ai adds a self-hosted server and codebase fine-tuning Cline doesn't offer
OpenHands | Platform-style autonomous agent with a hosted cloud; Refact.ai leads with on-prem deployment and IDE-resident workflows
Goose | Local-first CLI/desktop agent under Linux Foundation governance; Refact.ai is IDE-plugin-first with an enterprise fine-tuning story
GitHub Copilot | Autocomplete-first incumbent; Refact.ai is an end-to-end agent that can also run fully on your own hardware
Tabby / Continue | Other self-hostable open-source assistants; Refact.ai differentiates with published SWE-bench agent results[4]

When to Choose Refact.ai Over Alternatives

  • Choose Refact.ai when: Your code cannot leave your infrastructure and you still want an autonomous agent — self-hosting plus fine-tuning on your own codebase is the core bet
  • Choose Cline when: You live in VS Code and want the largest community with explicit approval for every action
  • Choose OpenHands when: You want cloud-hosted background agents rather than IDE-resident ones
  • Choose Goose when: You want a vendor-neutral, foundation-governed local agent independent of any editor

Ideal Customer Profile

Best fit:

  • Teams with data-residency or compliance requirements that rule out cloud coding agents
  • Organizations with GPUs who want models fine-tuned on their own codebase[2]
  • JetBrains-heavy shops underserved by VS Code-only agents
  • Budget-sensitive individuals — Free tier and $10/month Pro[2]

Poor fit:

  • Teams that pick tools by community size and ecosystem depth
  • Buyers who need disclosed funding and a long enterprise track record
  • Users expecting leaderboard-level autonomy from fully local models

Viability Assessment

Factor | Assessment
Financial Health | Unknown: funding not publicly disclosed[3]
Market Position | Niche challenger: 3,500+ stars, but the strongest published SWE-bench results among self-hostable agents[1][4]
Innovation Pace | Active: repo pushed within the last two weeks as of June 2026; benchmark scores improved 70.4% → 74.4% across model upgrades[1][4]
Community/Ecosystem | Modest: 319 forks; far smaller than Cline or Goose communities[1]
Long-term Outlook | Cautiously positive: clear differentiation (self-hosting + fine-tuning), but undisclosed funding and giant competitors are real risks

Refact.ai punches above its star count: its June 2025 SWE-bench Verified submission outscored far larger open-source projects, and it occupies a defensible niche — the self-hosted, fine-tunable agent — that cloud-first rivals have largely ceded.[4] The open questions are financial (no disclosed funding) and competitive (whether the niche stays defensible as enterprise tiers of bigger agents add on-prem options).[3]


Bottom Line

Refact.ai is the self-hosting specialist among open-source coding agents: a BSD-licensed, end-to-end engineering agent with credible, reproducible benchmark results and an enterprise story — on-prem deployment plus fine-tuning on your own code — that the larger players don't match. Its community is small and its benchmark headlines are a year old, but the product targets a real constraint (code that can't leave the building) rather than competing head-on for mindshare.

Recommended for: Compliance-bound teams that need an autonomous coding agent on their own infrastructure, JetBrains-centric organizations, and anyone who wants codebase fine-tuning without sending code to a third party.

Not recommended for: Developers who optimize for ecosystem size and community support, or buyers who require a well-capitalized vendor with disclosed funding.

Outlook: Refact.ai's fate hinges on whether self-hosted agentic coding becomes a procurement requirement at scale. If regulated industries adopt agents the way they adopted self-hosted Git, its niche becomes a moat; if cloud agents win on capability, a 3.5K-star challenger will struggle for attention. Watch for refreshed benchmark submissions and any funding disclosure.


Research by Ry Walker Research • methodology