
claude-context

claude-context is Zilliz's open-source MCP server for semantic code search: hybrid BM25 + dense-vector indexing of entire codebases in Milvus or Zilliz Cloud, with AST chunking and Merkle-tree incremental sync. With 11.8K+ GitHub stars it is the most-cited tool in the category, and it doubles as a marketing wedge for Zilliz Cloud.

Key takeaways

  • The most-visible semantic code-search MCP server: 11.8K+ stars and 869 forks as of June 2026, a year after its June 2025 creation. It is an established 2025 project, so this profile corrects an earlier omission rather than flagging a new entrant
  • Architecture is the differentiator: hybrid BM25 + dense-vector search, AST-based chunking, and Merkle-tree incremental indexing, backed by Milvus or Zilliz Cloud and pluggable embeddings (OpenAI, VoyageAI, Gemini, Ollama)
  • Vendor-channel economics: the code is MIT-licensed and free, but the default path runs on Zilliz Cloud (free tier available) plus a paid embedding API, and the project is Zilliz's flagship funnel for its managed vector database
  • The premise is contested: Anthropic itself ships grep-only retrieval in Claude Code after reportedly finding grep "just worked better," and privacy-motivated local clones of claude-context have appeared on HN

FAQ

What is claude-context?

An open-source MCP server from Zilliz that indexes a codebase into Milvus or Zilliz Cloud and lets Claude Code and other AI coding agents run hybrid BM25 + vector semantic search over it instead of grepping.

How much does claude-context cost?

The software is MIT-licensed and free; running it requires an embedding API key (e.g., OpenAI, billed separately, or free local Ollama) and a vector database — Zilliz Cloud's free tier or a self-hosted Milvus.

Which agents and embedding providers does it support?

Claude Code, Codex CLI, Gemini CLI, Cursor, Windsurf, Cline, Roo Code, and other MCP clients; embeddings via OpenAI, VoyageAI, Gemini, or Ollama, stored in Milvus or Zilliz Cloud.

How is claude-context different from Serena?

Serena navigates code symbolically through language servers with no index or external services; claude-context builds a cloud vector index for natural-language semantic search, trading setup and data egress for fuzzy retrieval over millions of lines.

Executive Summary

claude-context is Zilliz's open-source MCP server that turns an entire codebase into a searchable hybrid index — BM25 full-text plus dense-vector similarity — stored in Milvus or Zilliz Cloud, so Claude Code and other coding agents can ask "find the functions that handle authentication" instead of grepping blindly and dumping whole files into context.[1][2] Indexing is AST-aware (language-sensitive chunking with fallback) and incremental via Merkle-tree change detection, so only modified files re-embed.[1] Zilliz's own side-by-side testing claims token usage drops "over 40%" without loss of recall — a vendor number, but the only published benchmark.[2]

This is a 2025 project, not a new entrant — it was created June 6, 2025 and had 2.6K stars by the time of Zilliz's launch-era blog post; its absence from this category until now was an omission in this research, not a gap in the market.[1][2] At 11.8K+ stars and 869 forks as of June 2026 it is the most-cited semantic code-search MCP server, and it remains active: v0.1.14 of the MCP package shipped June 8, 2026.[1][3] The strategic context matters — Zilliz is the company behind the Milvus vector database, and claude-context is its flagship developer funnel into Zilliz Cloud.[2][1]

Attribute | Value
Company | Zilliz (creator of the Milvus vector database)[2]
Created | June 6, 2025[1]
GitHub Stars | 11.8K+ (869 forks) as of June 2026[1]
License | MIT[1]
Latest Release | @zilliz/claude-context-mcp v0.1.14, June 8, 2026[3]
Funding | Corporate-backed (Zilliz); no separate funding for the project

Product Overview

The workflow: point the MCP server at a repo, it chunks the code along AST boundaries, embeds the chunks, and writes them to Milvus or Zilliz Cloud; from then on the agent calls a search tool that runs hybrid BM25 + vector retrieval and injects only the relevant snippets into context.[1] A Merkle tree over the file system detects changes so re-indexing touches only modified files.[1] The repo lists 15 supported clients, including Claude Code, OpenAI Codex CLI, Gemini CLI, Qwen Code, Cursor, Windsurf, Cline, Roo Code, and LangChain/LangGraph, and claims support for 13+ programming languages.[1]
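The first step of that workflow, chunking along AST boundaries with a fallback, can be sketched in a few lines. This is an illustration only: it uses Python's standard ast module on Python source as a stand-in for whatever parser the project actually uses, and the function name chunk_source and the fallback_lines parameter are hypothetical.

```python
import ast


def chunk_source(source: str, fallback_lines: int = 40) -> list[dict]:
    """Split Python source into one chunk per top-level function or class.

    If the source does not parse, fall back to fixed-size line windows,
    mirroring the "AST chunking with fallback" idea in spirit only.
    """
    lines = source.splitlines()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Fallback: plain line windows when no syntax tree is available.
        return [
            {
                "kind": "window",
                "name": None,
                "start": i + 1,
                "end": min(i + fallback_lines, len(lines)),
                "text": "\n".join(lines[i:i + fallback_lines]),
            }
            for i in range(0, len(lines), fallback_lines)
        ]

    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Include decorator lines so the chunk is a complete definition.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            end = node.end_lineno
            chunks.append(
                {
                    "kind": type(node).__name__,
                    "name": node.name,
                    "start": start,
                    "end": end,
                    "text": "\n".join(lines[start - 1:end]),
                }
            )
    return chunks
```

Each chunk keeps a whole definition together, which is the point of AST-aware splitting: an embedding of a complete function tends to be more meaningful than one of an arbitrary 40-line window that cuts a function in half.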

Key Capabilities

Capability | Description
Hybrid search | BM25 full-text + dense-vector similarity over the whole repo[1]
AST chunking | Language-aware code segmentation with fallback splitting[1]
Incremental indexing | Merkle-tree change detection; only modified files re-embed[1]
Pluggable embeddings | OpenAI, VoyageAI, Gemini, or local Ollama[1]
Vector backends | Milvus (self-hosted) or Zilliz Cloud (managed)[1]
Token economics | Vendor-claimed 40%+ token reduction at equivalent retrieval quality[2]
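The incremental-indexing row is easier to picture with a small example. The sketch below is a deliberately shallow, two-level hash tree (one leaf hash per file, one root hash over the sorted leaves), not the project's implementation; the names digest, snapshot, and changed_files are hypothetical, and files are passed as an in-memory mapping to keep the example self-contained.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 of a byte string."""
    return hashlib.sha256(data).hexdigest()


def snapshot(files: dict[str, bytes]) -> dict:
    """Hash every file (leaves), then hash the sorted leaf listing (root)."""
    leaves = {path: digest(content) for path, content in files.items()}
    listing = "".join(f"{path}:{h}\n" for path, h in sorted(leaves.items()))
    return {"root": digest(listing.encode("utf-8")), "leaves": leaves}


def changed_files(old: dict, new: dict) -> dict[str, list[str]]:
    """Compare two snapshots; identical roots short-circuit the per-file diff."""
    result: dict[str, list[str]] = {"added": [], "modified": [], "removed": []}
    if old["root"] == new["root"]:
        return result  # nothing changed, so nothing needs re-embedding
    for path, h in new["leaves"].items():
        if path not in old["leaves"]:
            result["added"].append(path)
        elif old["leaves"][path] != h:
            result["modified"].append(path)
    result["removed"] = [p for p in old["leaves"] if p not in new["leaves"]]
    for paths in result.values():
        paths.sort()
    return result
```

Only the paths under "added" and "modified" would need to be re-chunked and re-embedded. A deeper tree that hashes per directory would also let unchanged subtrees be skipped without visiting their files.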

Product Surfaces

Surface | Description | Availability
MCP server | @zilliz/claude-context-mcp, configured via env vars in any MCP client | GA (0.x)[3]
Core library | @zilliz/claude-context-core indexing engine for custom pipelines | GA (0.x)[1]
VSCode extension | Semantic search inside the editor | GA[1]

Technical Architecture

A TypeScript / Node.js 20+ monorepo of three packages (core engine, MCP server, VSCode extension).[1] The MCP server is configured entirely through environment variables (an embedding-provider API key plus a Milvus address or Zilliz Cloud endpoint/token) and added to Claude Code with a one-line claude mcp add command.[1] The project itself hosts nothing: the user supplies both the embedding API and the vector database. That is precisely the architecture critics flag, since in the default setup code chunks leave the machine.[1][4]
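Because everything is driven by environment variables, a pre-flight check is a cheap way to see which of the two external dependencies is still unconfigured. The sketch below is purely illustrative: the variable names are placeholders invented for this example, not the project's real ones, so the README remains the authority on actual names. It also assumes a cloud embedding provider; a local Ollama setup would not need an API key.

```python
# Placeholder variable names for illustration only; NOT the project's real ones.
REQUIRED = {
    "embedding": ["EMBEDDING_API_KEY"],
    "vector_db": ["VECTOR_DB_ADDRESS", "VECTOR_DB_TOKEN"],
}


def missing_settings(env: dict[str, str]) -> dict[str, list[str]]:
    """Return, per external dependency, the variables that are unset or blank."""
    missing: dict[str, list[str]] = {}
    for dependency, names in REQUIRED.items():
        absent = [name for name in names if not env.get(name, "").strip()]
        if absent:
            missing[dependency] = absent
    return missing
```

Calling missing_settings(dict(os.environ)) after importing os would report the gaps in the current shell before the MCP client ever launches the server.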

Key Technical Details

Aspect | Detail
Deployment | Local MCP server; vector storage in self-hosted Milvus or managed Zilliz Cloud[1]
Models | Bring-your-own embeddings: OpenAI, VoyageAI, Gemini, Ollama (local)[1]
Integrations | 15 listed MCP clients incl. Claude Code, Cursor, Gemini CLI, Codex CLI[1]
Open Source | MIT; 115 open issues as of June 2026[1]

Strengths

  • Category-leading visibility — 11.8K+ stars and 869 forks as of June 2026, the most-cited semantic code-search MCP server, regularly recommended by name in HN threads about agent retrieval.[1][4]
  • Real retrieval engineering, not a demo — hybrid BM25 + vector search, AST-aware chunking, and Merkle-tree incremental sync are the same techniques production code-search systems use.[1]
  • Corporate maintenance without corporate pricing — Zilliz staffs the project, the license is MIT, and the stack runs on free tiers (Zilliz Cloud free tier, or fully local with Ollama + self-hosted Milvus).[1][5]
  • Agent-agnostic by design — one index serves Claude Code, Cursor, Codex CLI, Gemini CLI, and a dozen other MCP clients.[1]
  • Still shipping in 2026: commits landed through June 8, 2026, with five MCP releases since late April.[1][3]
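To make the hybrid-retrieval strength above concrete: a hybrid system has to merge a keyword ranking and a vector ranking into one list. The sources cited here do not say which fusion method claude-context uses, so the sketch below shows reciprocal rank fusion (RRF), one common choice, purely as an assumption. The function name, the file names, and the constant k = 60 are illustrative.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists into one.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the query "functions that handle authentication".
bm25_hits = ["auth.py", "login.py", "util.py"]      # keyword (BM25) ranking
dense_hits = ["session.py", "auth.py", "login.py"]  # vector-similarity ranking
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

Here auth.py and login.py appear in both lists and outrank session.py, even though session.py was the top vector hit. That is the behavior hybrid search is meant to deliver: agreement between lexical and semantic signals counts for more than a single strong signal.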

Cautions

  • It is a marketing wedge — claude-context exists to funnel developers into Zilliz Cloud, and the headline 40% token-reduction benchmark comes from Zilliz's own blog with no independent replication.[2]
  • The premise is contested by Anthropic itself — Claude Code ships grep-only retrieval, and community recollection is that Anthropic "experimented a lot with Vector Search and grep just worked better."[4]
  • Code leaves the machine by default — the standard setup sends chunks to a cloud embedding API and stores vectors in Zilliz Cloud, a concern prominent enough that local clones (claude-context-local, LEANN) launched on HN specifically to remove it.[6][4]
  • Stars overstate active usage — the MCP package saw roughly 1,073 npm downloads in the last week of May 2026, modest against 11.8K stars.[3][1]
  • Pre-1.0 with a real issue backlog — versioning is still 0.x and 115 issues are open as of June 2026.[3][1]
  • Two external dependencies to stand up — an embedding API key plus a vector database before the first search, versus zero-config alternatives.[1]

What Developers Say

Community discussion happens in threads about agent retrieval rather than a single launch post; the tool is the reference point others position against.[4]

"You can get semantic search in Claude Code using this unofficial plugin … it's built by and uses a managed vector database called Zilliz Cloud." — simonw on Hacker News, November 2025[4]

"I think it was Boris from Anthropic that said they experimented a lot with Vector Search and grep just worked better. You can try it out using the Claude-Context MCP." — kroaton on Hacker News, January 2026[4]

"Unlike Claude-context, which uploads all data to the cloud, or Serena, which is heavy and limited to keyword search, our solution installs in just 1 minute." — yichuan (LEANN maintainer, Berkeley SkyLab — a competitor, so read adversarially) on Hacker News[4]

The existence of "Claude Context but local" clones on HN is itself a sentiment signal: demand for the capability, paired with resistance to the cloud-upload default.[6]


Pricing & Licensing

The software is free; the running costs are the embedding API and the vector database.[1]

Tier | Price | Includes
OSS (self-hosted) | Free | MIT-licensed code; self-hosted Milvus; local Ollama embeddings possible[1]
Zilliz Cloud Free | Free | Managed vector database free tier, the README's recommended starting point[1][5]
Zilliz Cloud paid | Usage-based | Serverless and dedicated cluster plans beyond free-tier limits[5]

Licensing model: MIT for all three packages on GitHub.[1]

Hidden costs: Embedding API spend on every index and re-index (OpenAI/VoyageAI/Gemini billed separately); Zilliz Cloud charges once a codebase outgrows the free tier.[1][5]


Competitive Positioning

Direct Competitors

Competitor | Differentiation
grepai | Local-first semantic grep; claude-context counters with a server-backed hybrid index that scales to millions of lines and serves any MCP client
Serena | Symbol-level LSP navigation with no index, no API keys, no egress; claude-context trades that self-containment for fuzzy natural-language retrieval
MCP Vector Search | Single-maintainer LanceDB tool with static-analysis extras (knowledge graph, dead-code detection); claude-context has ~250x the stars and corporate maintenance, but needs external services
claude-context-local / LEANN | Privacy-motivated local clones: same idea, fully on-device embeddings, no cloud dependency[6][4]

When to Choose claude-context Over Alternatives

  • Choose claude-context when: the codebase is large enough that grep round-trips dominate token spend, you want one shared index serving multiple agents/teammates, and a cloud vector DB (or self-hosted Milvus) is acceptable.
  • Choose Serena when: you want precise symbol-level navigation with zero external services and zero code egress.
  • Choose grepai when: you want semantic search that stays local and drops into existing grep-shaped workflows.
  • Choose MCP Vector Search when: you want local vector search plus static-analysis features and can accept a one-maintainer project.

Ideal Customer Profile

Best fit:

  • Teams running agents against large monorepos (millions of lines) where grep-based discovery burns context across multi-round searches
  • Multi-agent shops wanting one index shared across Claude Code, Cursor, and CLI agents
  • Organizations already on Milvus or Zilliz Cloud

Poor fit:

  • Security postures that prohibit sending code chunks to third-party embedding APIs or a managed vector store
  • Small repos where Claude Code's built-in grep is already fast and cheap
  • Teams that will not operate (or pay for) a vector database as the codebase grows

Viability Assessment

Factor | Assessment
Financial Health | Strong sponsor: maintained by Zilliz, the venture-backed company behind Milvus, as a strategic funnel[2]
Market Position | Category reference point: most-starred semantic code-search MCP server; competitors literally describe themselves relative to it[1][4]
Innovation Pace | Steady: 2.6K to 11.8K+ stars over the year since the launch blog; v0.1.14 in June 2026, though still pre-1.0[2][1][3]
Community/Ecosystem | Active but vendor-shaped: 869 forks and local-first clones, with much of the advocacy coming from Zilliz channels[1][6]
Long-term Outlook | Tied to Zilliz strategy and to whether semantic retrieval beats grep for agents, a question Anthropic has so far answered the other way[4]

The project will not die of neglect — Zilliz has every incentive to keep its best-known developer funnel healthy — but its trajectory depends on a thesis fight: vendor benchmarks say 40%+ token savings, while Anthropic ships grep-only retrieval after testing the alternative.[2][4] The ~1K weekly npm downloads against 11.8K stars suggest the audience is still far more curious than committed.[3][1]


Bottom Line

claude-context is the default answer to "how do I add semantic code search to Claude Code" — the most-cited, best-engineered, corporate-maintained option in the category, with hybrid BM25 + vector retrieval and incremental indexing that hold up at monorepo scale. The honest caveats: it is a Zilliz Cloud funnel whose headline benchmark is self-published, its default path ships your code to two external services, and the agent vendor it is named after chose grep instead.

Recommended for: Teams with genuinely large codebases where grep round-trips measurably burn tokens, and any shop standardizing one code index across multiple MCP agents.

Not recommended for: Code-egress-restricted environments (use a local alternative or fully self-hosted Milvus + Ollama), small repos, or anyone unwilling to operate a vector database.

Outlook: Watch for a 1.0 release, independent replication of the 40% token claim, and whether Anthropic ever ships first-party semantic retrieval — which would either validate the thesis or absorb the niche.


Research by Ry Walker Research