Key takeaways
- The category went vertical in 2026: CodeGraph hit 47.4k stars within five months of its January launch — the biggest tool in the category — and GitNexus rocketed from ~1.2k to 42k stars between April and June, adding an enterprise tier. Code intelligence is no longer a niche.
- Local-first graphs are the winning pattern. The two breakout leaders (CodeGraph's embedded SQLite graph, GitNexus's zero-server LadybugDB) both pre-compute structure on-device and serve it over MCP — no cloud, no embeddings API, no code egress.
- The "agentic grep vs semantic index" debate turned empirical. May 2026 benchmarks favor semantic/indexed retrieval: an independent test measured 97% fewer Claude Code input tokens via grepai, vendor and independent tests measured 58–70% fewer tool calls with CodeGraph, and Augment claims 70%+ agent quality gains. Anthropic itself still ships grep-only retrieval.
- Context engines are unbundling from IDEs: Augment Code ($252M raised) spun its proprietary context engine out as a standalone MCP server in February 2026, and Zilliz ships claude-context as a funnel to its vector cloud. The durable primitive is the index, not the agent.
FAQ
What is code intelligence for AI agents?
These are tools that give AI coding agents structural understanding of a codebase (dependencies, call chains, blast radius, symbols, semantic retrieval) so they can make informed edits instead of blind changes. The category ranges from lightweight context packing to full knowledge graph engines and commercial context engines.
Which code intelligence tool should I use with Claude Code?
CodeGraph (MIT, 47.4k stars) is the easiest local graph to adopt — one SQLite file, eight agent integrations. GitNexus has the deepest Claude Code integration (16 MCP tools + skills + hooks) but a noncommercial license. Serena is the standard LSP-backed symbol layer. For lightweight context, Aider's built-in repo-map or Repomix work without extra setup.
Do I need a knowledge graph or is context packing enough?
For small repos (under 10k files), context packing tools like Repomix often suffice. For large codebases with complex dependency chains, a knowledge graph (CodeGraph, GitNexus, CodeGraphContext) provides blast radius analysis and impact detection that flat context cannot. Semantic search tools (claude-context, grepai) are the middle path for fuzzy "where is the code that does X" retrieval.
Are these tools safe to use with proprietary code?
CodeGraph, GitNexus, Serena, grepai, Aider repo-map, and Repomix all run entirely local. claude-context sends code chunks to a cloud embedding API and vector store by default (self-hosted Milvus + Ollama avoids it). Augment's remote mode hosts your index on its cloud; Sourcegraph Cody and Greptile have cloud components. Always check the data flow before indexing proprietary code.
Executive Summary
AI coding agents have a structural awareness problem. They can read code, generate code, and even reason about code — but they routinely break things because they do not understand how code connects. An agent edits a function without knowing that 47 other functions call it. It renames a class without tracing the import chain. It refactors a module without checking the blast radius.
Code intelligence tools solve this by building a structural understanding layer (knowledge graphs, symbol indexes, semantic search, dependency maps) and exposing it to agents via MCP, CLI, or API. In the three months since this category was first mapped, it went vertical: the two graph leaders now hold 47.4k and 42k GitHub stars respectively, a well-funded vendor unbundled its context engine as a standalone product, and the retrieval debate acquired actual benchmarks.
Key Findings:
- The category went vertical — CodeGraph launched January 18, 2026 and hit 47.4k stars in under five months (the biggest tool in the category); GitNexus broke out from ~1.2k stars in April to 42k by June and added an enterprise/commercial tier via Akon Labs
- Local-first graphs are the winning pattern — both breakouts pre-compute structure entirely on-device (CodeGraph: embedded SQLite + tree-sitter; GitNexus: LadybugDB, native or in-browser WASM) and serve it over MCP with zero code egress
- The "agentic grep vs semantic index" debate turned empirical — May 2026 measurements favor indexed retrieval: 97% fewer Claude Code input tokens (grepai, independently benchmarked), 58–70% fewer tool calls (CodeGraph, vendor + independent), 88% fewer tool calls in a 17-agent production audit (GitNexus), and 70%+ agent quality gains (Augment, vendor-run)
- Context engines are unbundling from IDEs — Augment Code ($252M raised, $977M valuation) shipped its proprietary engine as a standalone MCP server in February 2026; Zilliz maintains claude-context (11.8k stars) as a funnel to its vector cloud
- The symbol-level tier has a standard — Serena (25.2k stars, MIT, LSP-over-MCP) was a pre-existing omission from this comparison and is the default answer for symbol-level retrieval and editing
Market Definition
Code intelligence tools for AI agents are systems that give coding agents structural understanding of a codebase — beyond what raw file reading provides — so agents can make informed, safe edits.
Inclusion Criteria:
- Provides structural or semantic code understanding (dependencies, call chains, symbols, semantic retrieval, or comprehensive context)
- Designed to work with AI coding agents (MCP, API, or agent-native integration)
- Active development (updates in last 6 months) — flagged where status has changed
Exclusion Criteria:
- Pure code editors/IDEs without dedicated intelligence layers
- Static analysis tools that only report findings without agent integration
- Documentation generators without structural analysis
Tier 1: Knowledge Graph Engines
Build a full graph of codebase relationships — every import, call, definition, extension. Expose the graph via MCP for agents to query before making changes. This tier produced both 2026 breakouts.
Market Map
| Tool | Stars | Created | Language | License | Status | Key Differentiator |
|---|---|---|---|---|---|---|
| CodeGraph | 47,413 | Jan 2026 | TypeScript | MIT | Very active (pre-1.0) | Biggest in category. Local SQLite symbol/call graph over MCP. 21 languages, 8 agent integrations, file-watcher incremental sync |
| GitNexus | 41,958 | Aug 2025 | TypeScript | PolyForm NC | Very active | Deepest MCP integration (16 tools incl. cross-repo groups). Zero-server. New enterprise/commercial tier via Akon Labs |
| CodeGraphContext | 3,702 | Aug 2025 | Python | MIT | Active | Pluggable graph backends (FalkorDB Lite, KuzuDB, Neo4j). 22 languages, ~31k PyPI downloads/month |
| Axon | 711 | Feb 2026 | Python | MIT | Stalled — no commits since Mar 25, 2026 | Best visualization: WebGL force-directed graph, coupling heatmaps, health scores |
What Makes This Tier Different
Knowledge graph engines do not just search code — they model structural relationships:
- IMPORTS — which modules depend on which
- CALLS — which functions call which functions
- DEFINES/IMPLEMENTS/EXTENDS — class hierarchies and interface contracts
- Clusters — functional groups detected via community algorithms (Leiden)
- Processes — execution flows traced through call chains
This enables capabilities that text search cannot provide:
- Blast radius analysis — "if I change this function, what breaks?"
- Impact detection — "these git changes affect these execution flows"
- Safe rename — "rename this symbol across all 23 files that reference it"
- Execution tracing — "trace this request from API endpoint to database query"
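A blast-radius query can be sketched as a transitive walk over CALLS edges. The example below is a minimal illustration, assuming a hypothetical one-table schema (`calls`) and made-up function names; it is not the schema or query of any tool listed here. It stores the edges in an in-memory SQLite database and uses a recursive common table expression to collect every direct and indirect caller of a symbol.

```python
import sqlite3


def build_graph(edges):
    """Create an in-memory SQLite call graph from (caller, callee) pairs."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per CALLS edge.
    conn.execute("CREATE TABLE calls (caller TEXT NOT NULL, callee TEXT NOT NULL)")
    conn.executemany("INSERT INTO calls VALUES (?, ?)", edges)
    return conn


def blast_radius(conn, symbol):
    """Return every function that directly or transitively calls `symbol`."""
    rows = conn.execute(
        """
        WITH RECURSIVE affected(name) AS (
            SELECT caller FROM calls WHERE callee = ?
            UNION  -- UNION removes duplicates, so call cycles still terminate
            SELECT c.caller FROM calls AS c JOIN affected AS a ON c.callee = a.name
        )
        SELECT name FROM affected ORDER BY name
        """,
        (symbol,),
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical call edges, for illustration only.
EDGES = [
    ("handle_request", "load_user"),
    ("load_user", "query_db"),
    ("send_report", "load_user"),
    ("health_check", "ping"),
]

if __name__ == "__main__":
    conn = build_graph(EDGES)
    print(blast_radius(conn, "query_db"))
```

Changing `query_db` in this toy graph touches `load_user` and, through it, `handle_request` and `send_report`, while `health_check` is unaffected. A text search for the string `query_db` would surface only the direct caller.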
CodeGraph vs GitNexus vs CodeGraphContext
| Dimension | CodeGraph | GitNexus | CodeGraphContext |
|---|---|---|---|
| Stars (Jun 2026) | 47.4k | 42k | 3.7k |
| License | MIT | PolyForm Noncommercial | MIT |
| Language | TypeScript | TypeScript | Python |
| Database | Embedded SQLite (FTS5) | LadybugDB (custom) | Pluggable: FalkorDB Lite, KuzuDB, Neo4j, more |
| Incremental updates | Yes — OS-native file watchers | Roadmap (chunk cache only) | Re-index |
| MCP depth | 8 agent integrations | 16 tools, 7 resources, skills, Claude Code hooks | MCP server + CLI |
| Benchmarked savings | 58% fewer tool calls (vendor); 70% (independent) | 88% fewer tool calls, 74% token savings (production audit) | Project-reported only |
| Maintainer base | Solo (~91% of commits) | Single core maintainer, 23 contributors/cycle | Community, small |
| Commercial use | Free (MIT); hosted platform on waitlist | Requires separate license (Akon Labs) | Free (MIT) |
Bottom line: CodeGraph is the lightest to adopt (one SQLite file, MIT) and now the category's star leader; GitNexus has the deepest agent integration and the only enterprise track, but the noncommercial license drove at least one team (LangWatch) to MIT-licensed CodeGraphContext. Both breakouts carry solo-maintainer concentration risk, and both maintainers acknowledge star velocity outrunning community depth.
Tier 2: Symbol-Level and Semantic Code Search
Lighter than a full knowledge graph. These tools provide symbol navigation, semantic code search, and focused analysis via MCP — without modeling every structural relationship.
Market Map
| Tool | Stars | Created | Language | License | Key Differentiator |
|---|---|---|---|---|---|
| Serena | 25.2k | Mar 2025 | Python | MIT | The symbol-level standard: LSP-over-MCP retrieval and editing/refactoring, 40+ languages, 170+ contributors |
| claude-context | 11.8k | Jun 2025 | TypeScript | MIT | Zilliz's hybrid BM25 + vector semantic search; AST chunking, Merkle-tree incremental indexing |
| grepai | 1,734 | Jan 2026 | Go | MIT | Privacy-first semantic search + call-graph tracing; 100% local via Ollama embeddings |
| Octocode MCP | 863 | Jun 2025 | TypeScript | MIT | 14 MCP tools. LSP navigation, PR archaeology, GitHub multi-repo research, ~16k npm downloads/month |
| CodePathFinder | 137 | Nov 2023 | Go | Apache-2.0 | Security-focused: cross-file taint analysis, 211 rules, relicensed from AGPL |
| mcp-vector-search | 47 | Aug 2025 | Python | Elastic 2.0 | LanceDB semantic search + knowledge graph, complexity analysis, dead-code detection |
The New Entrants
Serena is a corrected omission, not a new arrival: created March 2025 by Munich-based Oraios AI, it wraps language servers and exposes their semantics as MCP tools such as find_symbol, find_referencing_symbols, replace_symbol_body, and project-wide rename. It is the only widely-adopted tool in this category that does symbol-level editing, not just retrieval, and at 25.2k stars it is the default mention in agent-tooling threads. The live debate is whether improvements in first-party agent tools will erode its value.
claude-context (also a pre-existing omission, created June 2025) is Zilliz's open-source flagship: hybrid BM25 + dense-vector search over AST-chunked code, stored in Milvus or Zilliz Cloud, with Merkle-tree incremental re-indexing. It is the most-cited semantic code-search MCP server — and unapologetically a funnel to Zilliz's managed vector database, with code chunks leaving the machine by default.
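The idea behind Merkle-tree incremental indexing can be sketched in a few lines. This is a generic illustration under stated assumptions, not claude-context's implementation: the function names (`build_tree`, `changed_paths`), the hashing layout, and the sample files are all hypothetical. Each file and directory gets a hash, a directory's hash covers its children, and the comparison descends only into subtrees whose hashes differ.

```python
import hashlib


def sha(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_tree(files: dict[str, str]):
    """Return (hashes, children) for a {path: content} snapshot.

    A directory's hash covers its direct children, so one changed file
    alters every ancestor up to the root, stored under the empty path "".
    """
    hashes = {path: sha(content) for path, content in files.items()}
    children: dict[str, set[str]] = {}
    for path in files:
        node = path
        while node:
            parent = node.rpartition("/")[0]
            children.setdefault(parent, set()).add(node)
            node = parent

    def depth(directory: str) -> int:
        return directory.count("/") + 1 if directory else 0

    # Hash the deepest directories first so child hashes already exist.
    for directory in sorted(children, key=depth, reverse=True):
        listing = "\n".join(f"{c}:{hashes[c]}" for c in sorted(children[directory]))
        hashes[directory] = sha(listing)
    return hashes, children


def changed_paths(old, new, node: str = "") -> list[str]:
    """List new or modified files, skipping subtrees whose hashes match.

    Deleted files are ignored here to keep the sketch short.
    """
    old_hashes, _ = old
    new_hashes, new_children = new
    if node not in new_hashes or old_hashes.get(node) == new_hashes[node]:
        return []  # missing or identical subtree: nothing to re-index here
    if node not in new_children:
        return [node]  # a new or modified file
    result = []
    for child in sorted(new_children[node]):
        result.extend(changed_paths(old, new, child))
    return result


if __name__ == "__main__":
    before = build_tree({"src/a.py": "x = 1", "src/b.py": "y = 2", "docs/readme.md": "hi"})
    after = build_tree({"src/a.py": "x = 1", "src/b.py": "y = 3", "docs/readme.md": "hi"})
    print(changed_paths(before, after))
```

In the usage example only `src/b.py` is reported, and the `docs` subtree is never visited because its directory hash is unchanged. That pruning is what keeps re-indexing cost proportional to the size of the change rather than the size of the repository.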
grepai (January 2026, French solo maintainer Yoan Bernabeu) is the privacy-first counterpoint: embeddings run 100% locally through Ollama by default, with call-graph tracing and a file-watcher daemon in a single Go binary. Its headline claim has rare independent verification — a third-party benchmark measured a 97% reduction in Claude Code input tokens and 27.5% lower API cost.
Grep vs Semantic Index: The Debate Got Data
Anthropic ships grep-only retrieval in Claude Code after reportedly finding grep "just worked better" — and the existence of this entire tier is a bet against that position. As of May–June 2026 the bet has numbers: grepai's independently benchmarked 97% input-token cut, CodeGraph's independently measured 70% median tool-call reduction, GitNexus's production-audited 88% fewer tool calls, and Augment's vendor-run 70%+ quality claims all point the same direction — indexed retrieval beats raw agentic grep on token economics for non-trivial codebases. The honest caveats: the strongest numbers are tool-specific, several are vendor-reported, and grep still wins on zero setup and exact-pattern speed.
When to Use Tier 2 vs Tier 1
Tier 2 tools are best when you need:
- Quick setup — no full-graph indexing step (Serena, Octocode)
- Fuzzy retrieval — "where is the code that does X" via embeddings (claude-context, grepai)
- Symbol-level editing — atomic rename/replace through the same semantic layer (Serena)
- Cross-repo search — query across GitHub orgs, not just local repos (Octocode)
- Security analysis — taint analysis and vulnerability rules (CodePathFinder)
They are weaker when you need:
- Full dependency chain tracing
- Blast radius analysis with confidence scoring
- Execution flow mapping
- Community/cluster detection
Tier 3: Context Packing
The simplest approach: flatten your codebase into a single LLM-friendly format. No graph, no database — just comprehensive context in one prompt.
Market Map
| Tool | Stars | Created | Language | Status | Key Differentiator |
|---|---|---|---|---|---|
| Repomix | 26,188 | Jul 2024 | TypeScript | Active | XML-structured output. Tree-sitter compression (~70% token reduction). ~255k npm downloads/month |
| Context Hub | 13,556 | Mar 2026 | TypeScript | Active, slowing | Andrew Ng's curated, versioned API docs CLI for coding agents — 622 doc entries |
| code2prompt | 7,399 | Mar 2024 | Rust | Stable, slowing | Fast CLI. Handlebars templates, Python bindings; no tagged release since Dec 2025 |
| Aider repo-map | (built-in) | 2023 | Python | Active | Tree-sitter tag map. Dynamically optimized per chat context |
The Context Packing Philosophy
These tools take the opposite approach from knowledge graphs: instead of building a queryable structure, they pack everything the LLM might need into a single context window.
Repomix is the category leader (26.2k stars, ~255k npm downloads/month). It packs entire repos into XML-structured files optimized for Claude's XML parsing; Tree-sitter compression cuts tokens by ~70% while preserving structure, and the April 2026 v1.14.0 release cut pack time 58% in response to speed criticism. It also has an MCP server for dynamic packing.
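To make the packing idea concrete, here is a minimal sketch of flattening files into one XML-structured string. The tag names (`repository`, `file`), the `path` attribute, and the helper names are assumptions made for illustration; they are not Repomix's actual output format, and no compression step is attempted.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def pack(files: dict[str, str]) -> str:
    """Flatten {path: content} into a single XML document (hypothetical layout)."""
    lines = ["<repository>"]
    for path in sorted(files):
        # escape() protects <, > and & inside source code; quoteattr() quotes the path.
        lines.append(f"<file path={quoteattr(path)}>{escape(files[path])}</file>")
    lines.append("</repository>")
    return "\n".join(lines)


def unpack(packed: str) -> dict[str, str]:
    """Parse the packed document back, which also checks that it is well-formed."""
    root = ET.fromstring(packed)
    return {element.get("path"): element.text or "" for element in root.iter("file")}


if __name__ == "__main__":
    sample = {
        "src/app.py": "if a < b and c > 0:\n    run()",
        "README.md": "Build & run",
    }
    packed = pack(sample)
    print(packed)
    print(unpack(packed) == sample)
```

The explicit file boundaries are the point of the structure: the model can tell where one file ends and the next begins, which plain concatenation does not guarantee.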
code2prompt remains the solid second — Rust-fast, with a template system and Python bindings Repomix lacks — but it is slowing: 7.4k stars to Repomix's 26.2k, with no tagged release since December 2025 despite continued commits. The adoption gap is widening, not closing.
Aider's repo-map is the most sophisticated built-in approach. It uses Tree-sitter to extract a tag map of all definitions and references, then dynamically selects the most relevant context for each chat. It is not a separate tool — it is integrated into Aider's agent loop.
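A tag map of this kind can be approximated with the standard library. The sketch below is not Aider's code: it swaps Tree-sitter for Python's built-in `ast` module, ranks definitions by a plain reference count as a stand-in for whatever relevance ranking Aider applies, and measures the budget in entries rather than tokens. The function name `repo_map` and the sample sources are hypothetical.

```python
import ast
from collections import Counter


def repo_map(sources: dict[str, str], budget: int) -> list[str]:
    """Return up to `budget` "file: name" entries, most-referenced definitions first."""
    definitions = []        # (file, defined name)
    references = Counter()  # identifier -> number of uses across all files
    for path, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                definitions.append((path, node.name))
            elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                references[node.id] += 1
            elif isinstance(node, ast.Attribute):
                references[node.attr] += 1
    # Most-referenced first; ties broken by file and name for stable output.
    ranked = sorted(definitions, key=lambda d: (-references[d[1]], d[0], d[1]))
    return [f"{path}: {name}" for path, name in ranked[:budget]]


# Hypothetical three-file project, for illustration only.
SOURCES = {
    "db.py": "def query(sql):\n    return []\n",
    "users.py": (
        "from db import query\n\n"
        "def load_user(uid):\n    return query('select')\n\n"
        "def unused():\n    pass\n"
    ),
    "api.py": (
        "from users import load_user\nfrom db import query\n\n"
        "def handler(req):\n    query('x')\n    return load_user(req)\n"
    ),
}

if __name__ == "__main__":
    for entry in repo_map(SOURCES, budget=2):
        print(entry)
```

With a budget of two entries the map keeps `query` and `load_user` and drops the unreferenced definitions, which is the basic trade the repo-map approach makes: a small, ranked outline instead of the full text.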
Context Hub packs a different kind of context: curated, versioned API documentation rather than your own repo, attacking the hallucinated-API problem. It grew from 68 docs at its March 2026 launch to 622 entries and 13.6k stars by June, though release cadence has cooled.
Limitations
Context packing breaks down at scale:
- Token limits — even with compression, large monorepos exceed context windows
- No structural queries — you cannot ask "what calls this function?" without the graph
- No blast radius — changing code requires understanding that flat context does not provide
- Stale context — packed files are snapshots, not live indexes
Tier 4: Platforms and Commercial Context Engines
Enterprise, cloud-hosted, and paid code intelligence with AI integration. The 2026 development: dedicated context engines unbundling from the IDEs and assistants that built them.
Market Map
| Tool | Type | Key Differentiator |
|---|---|---|
| Augment Context Engine | Commercial MCP (closed source) | Augment Code's semantic engine unbundled as MCP, GA Feb 2026. Local + hosted cross-repo modes; vendor claims 70%+ agent quality gains. $252M-funded parent |
| Sourcegraph Cody | Enterprise SaaS | Code search + intelligence + Cody AI. RAG over entire codebase |
| DeepWiki | Free cloud tool | AI-generated documentation for any public GitHub repo |
| Greptile | YC-backed SaaS | AI code review with full codebase context. GitHub/GitLab/Bitbucket |
The Unbundling
Augment Context Engine is the signal event: a $977M-valuation coding-assistant vendor conceded the agent layer to Claude Code and Cursor and shipped its real asset, the semantic index, as a standalone MCP server any agent can call. It offers two modes (local Auggie CLI with real-time indexing; Augment-hosted cross-repo index via GitHub App) and token-based pricing with a 40% service fee, and it cites vendor-run benchmarks claiming 70%+ quality improvement (80% for Claude Code + Opus 4.5) on 300 Elasticsearch PRs. The numbers are self-reported and unreplicated, but the strategy, with context as the durable primitive and agents as interchangeable consumers, is the clearest articulation yet of where this category is heading.
Platform vs Open Source
| Dimension | Platform / Commercial (Tier 4) | Open Source (Tiers 1-3) |
|---|---|---|
| Setup | Minutes (cloud) | Manual indexing |
| Privacy | Code or index goes to cloud (Augment local mode excepted) | Everything local (claude-context's default excepted) |
| Scale | Massive monorepos, cross-repo | Varies by tool |
| Cost | Paid plans / per-query fees | Free |
| Customization | Limited | Full control |
| Agent integration | MCP (Augment, Cody) or API | MCP-native |
DeepWiki deserves special mention: it generates readable documentation and architecture diagrams for any public GitHub repo. GitNexus positions itself as "like DeepWiki but deeper" — DeepWiki describes code in natural language while GitNexus models structural relationships. They solve different problems.
Greptile is the most agent-oriented review platform — it indexes your entire codebase and uses that context for AI code review on every PR. YC-backed, growing fast in the enterprise segment.
Technical Comparison
| Dimension | Knowledge Graphs | Symbol/Semantic Search | Context Packing | Platforms/Commercial |
|---|---|---|---|---|
| Structural awareness | Full (relationships, clusters, flows) | Partial (symbols, references, semantics) | None (flat text) | Varies |
| Blast radius | Yes | Limited (call-graph tracing in grepai) | No | Sourcegraph: partial |
| Fuzzy semantic retrieval | Limited (CodeGraph: no) | Yes (claude-context, grepai, mcp-vector-search) | No | Yes (Augment) |
| Setup effort | Low–medium (indexing step) | Low (MCP config; claude-context needs vector DB) | Low (CLI) | Low (cloud) |
| Privacy | Local | Local (claude-context cloud by default) | Local | Cloud (Augment has local mode) |
| Real-time updates | CodeGraph: file watchers; GitNexus: re-index | Serena: live LSP; claude-context: Merkle incremental; grepai: watcher | Re-pack required | Continuous sync |
| Agent integration | MCP (deep) | MCP | Prompt injection / MCP | MCP or API |
| Best for | Complex refactors, impact analysis | Navigation, fuzzy search, symbol editing | Small/medium repos, one-shot context | Enterprise teams, cross-repo |
Competitive Dynamics
What Is Driving the Category
- MCP as the standard. Every tool that wants to serve AI agents needs MCP support. The Cambrian explosion of late 2025/early 2026 has matured into a layered stack (graphs, symbols, semantics, packing), all over the same transport.
- Token economics as the buying trigger. The 2026 benchmarks reframed the pitch from "fewer broken edits" to "measurably cheaper sessions": 97% input-token cuts and 58–88% tool-call reductions translate directly into more tasks per capped plan.
- The unbundling. Augment proved that a context engine can be a product independent of the assistant that built it. Expect more vendors to sell the index and let Claude Code and Cursor keep the seat.
- The IDE convergence. Cursor, Windsurf, and Claude Code are all building native code intelligence. Every tool in this category is racing the possibility that agents absorb its function; the sharpest community criticism of Serena is precisely that Claude Code's native tools got better.
The Concentration Problem
The category's new scale sits on remarkably thin foundations. CodeGraph is ~91% one person's commits at 47.4k stars; GitNexus's core decisions rest with one maintainer; grepai, Octocode, CodePathFinder, and mcp-vector-search are all solo projects. Both breakout star curves also outrun their community footprints — CodeGraph has 115 watchers and no HN launch thread, and GitNexus's maintainer concedes some growth may be bot-driven. Treat star counts as awareness signals, not durability signals.
MCP provides the transport layer but not the semantic layer. A "code intelligence protocol" that standardizes how agents query codebase structure — independent of the backend engine — would unlock composability. Nobody has shipped it.
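What a backend-neutral query surface might look like can be sketched with a structural interface. Everything below is hypothetical, since no such standard exists: the `CodeIntelligence` protocol, its two methods, the toy `DictEngine` backend, and the sample data are invented to show the shape of the idea, not to propose a specification.

```python
from typing import Optional, Protocol


class CodeIntelligence(Protocol):
    """Hypothetical engine-independent interface an agent could code against."""

    def callers(self, symbol: str) -> list[str]: ...

    def definition(self, symbol: str) -> Optional[str]: ...


class DictEngine:
    """Toy backend that keeps the graph in plain dictionaries."""

    def __init__(self, calls: dict[str, list[str]], locations: dict[str, str]):
        self._calls = calls          # callee -> list of callers
        self._locations = locations  # symbol -> "file:line"

    def callers(self, symbol: str) -> list[str]:
        return sorted(self._calls.get(symbol, []))

    def definition(self, symbol: str) -> Optional[str]:
        return self._locations.get(symbol)


def edit_plan(engine: CodeIntelligence, symbol: str) -> str:
    """Agent-side helper that depends only on the interface, not the backend."""
    where = engine.definition(symbol) or "unknown location"
    return f"{symbol} at {where} has {len(engine.callers(symbol))} direct caller(s)"


if __name__ == "__main__":
    engine = DictEngine(
        calls={"load_user": ["handler", "send_report"]},
        locations={"load_user": "users.py:3"},
    )
    print(edit_plan(engine, "load_user"))
```

Any engine exposing the same two methods, whether backed by SQLite, a graph database, or a language server, could be passed to `edit_plan` unchanged. That substitutability is the composability the paragraph above describes.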
What to Watch
Near-term (H2 2026)
- Whether CodeGraph ships 1.0, grows a second maintainer, and reveals hosted-platform terms — and whether its star velocity converts into contributor depth
- GitNexus's commercial execution: public pricing and real enterprise references for the Akon Labs track, plus incremental indexing (still roadmap) and the documented heap-OOM scaling fixes
- Independent replication of Augment's 70%+ benchmark — the whole paid tier's value proposition rests on it
- Whether Anthropic ships first-party semantic retrieval in Claude Code, which would validate the semantic-index thesis while absorbing much of the niche
- Axon: whether commits resume or it becomes a finished artifact
Medium-term (2026-2027)
- Knowledge graph tools merge with agent memory — understanding codebase structure plus remembering past changes and decisions
- Cross-repo intelligence matures (GitNexus group tools and Augment's hosted mode are the first real entries)
- IDE-native intelligence narrows the gap, but complex refactors still need dedicated tools
Long-term (2027+)
- Real-time, always-current code graphs become the norm (CodeGraph's file watchers and claude-context's Merkle sync are the early pattern)
- Structural awareness becomes an expected capability, not a separate tool
- The winners will be whoever defines the standard semantic layer for code intelligence
Bottom Line
Code intelligence for AI agents graduated from promising niche to breakout category in the spring of 2026: two tools north of 40k stars, a $252M-funded vendor unbundling its engine into the space, and benchmarks that finally put numbers on the grep-vs-index debate. The landscape is stratified:
- Need blast radius and impact analysis? → Knowledge graph: CodeGraph (MIT, biggest, zero-dependency SQLite) or GitNexus (deepest MCP integration, enterprise tier — but PolyForm Noncommercial). CodeGraphContext is the pluggable-backend MIT alternative.
- Need symbol-level navigation and editing? → Serena (25.2k stars, the de facto standard)
- Need fuzzy semantic search? → claude-context (cloud-indexed, monorepo scale) or grepai (100% local, independently benchmarked)
- Need to pack a repo into a prompt? → Repomix (26.2k stars, ~255k npm downloads/month, category leader)
- Enterprise team with cloud budget? → Augment Context Engine for cross-repo retrieval, Sourcegraph Cody, or Greptile
Status flags from this refresh: Axon is stalled (no commits since March 25, 2026), code2prompt is stable but slowing (no release since December 2025), and Context Hub's post-launch cadence has cooled. Everything else in the category is shipping.
The biggest risk is unchanged in kind but larger in degree: IDE-native intelligence (Cursor, Claude Code, Windsurf) absorbing the standalone market — now with the added twist that Anthropic's grep-only stance is the position the entire semantic tier exists to refute. The concentration risk is new: the category's two 40k-star leaders are effectively solo-maintained.
The gap that closed: incremental, real-time updates — last cycle's "biggest gap" — is now table stakes at the front of the pack (CodeGraph's file watchers, claude-context's Merkle sync, grepai's daemon, Serena's live LSP). The gap that remains: a standard semantic layer, so agents can query any engine the same way. The first to define it wins the category.
Research by Ry Walker Research