claude-context | Ry Walker Research

Key takeaways

The most visible semantic code-search MCP server: 11.8K+ stars and 869 forks as of June 2026, a year after its June 2025 creation; because the project dates to 2025, this profile corrects an earlier omission rather than flagging a new entrant
Architecture is the differentiator: hybrid BM25 + dense-vector search, AST-based chunking, and Merkle-tree incremental indexing, backed by Milvus or Zilliz Cloud and pluggable embeddings (OpenAI, VoyageAI, Gemini, Ollama)
Vendor-channel economics: the code is MIT-licensed and free, but the default path runs on Zilliz Cloud (free tier available) plus a paid embedding API; the project is Zilliz's flagship funnel for its managed vector database
The premise is contested: Anthropic itself ships grep-only retrieval in Claude Code after reportedly finding grep "just worked better," and privacy-motivated local clones of claude-context have appeared on HN

FAQ

What is claude-context?

An open-source MCP server from Zilliz that indexes a codebase into Milvus or Zilliz Cloud and lets Claude Code and other AI coding agents run hybrid BM25 + vector semantic search over it instead of grepping.

How much does claude-context cost?

The software is MIT-licensed and free; running it requires an embedding API key (e.g., OpenAI, billed separately, or free local Ollama) and a vector database — Zilliz Cloud's free tier or a self-hosted Milvus.

Which agents and embedding providers does it support?

Claude Code, Codex CLI, Gemini CLI, Cursor, Windsurf, Cline, Roo Code, and other MCP clients; embeddings via OpenAI, VoyageAI, Gemini, or Ollama, stored in Milvus or Zilliz Cloud.

How is claude-context different from Serena?

Serena navigates code symbolically through language servers with no index or external services; claude-context builds a vector index (in Zilliz Cloud by default, or in self-hosted Milvus) for natural-language semantic search, accepting extra setup and, on the default path, data egress in exchange for fuzzy retrieval over millions of lines.

Executive Summary

claude-context is Zilliz's open-source MCP server that turns an entire codebase into a searchable hybrid index — BM25 full-text plus dense-vector similarity — stored in Milvus or Zilliz Cloud, so Claude Code and other coding agents can ask "find the functions that handle authentication" instead of grepping blindly and dumping whole files into context.^[1]^[2] Indexing is AST-aware (language-sensitive chunking with fallback) and incremental via Merkle-tree change detection, so only modified files re-embed.^[1] Zilliz's own side-by-side testing claims token usage drops "over 40%" without loss of recall — a vendor number, but the only published benchmark.^[2]
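To make "hybrid" concrete: the sources cited here do not say how claude-context merges its BM25 and dense-vector rankings, so the sketch below uses reciprocal rank fusion (RRF), one common way to combine two ranked lists, purely as an assumption. The function name, the file names, and the constant k=60 are illustrative and not taken from the project.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in;
    documents that rank well in more than one list rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for "find the functions that handle authentication".
bm25_hits = ["auth.py", "login.py", "db.py"]        # keyword (BM25) ranking
dense_hits = ["login.py", "session.py", "auth.py"]  # embedding-similarity ranking
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

Files that appear in both lists (here `login.py` and `auth.py`) outrank files found by only one retriever, which is the practical argument for running both.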

This is a 2025 project, not a new entrant — it was created June 6, 2025 and had 2.6K stars by the time of Zilliz's launch-era blog post; its absence from this category until now was an omission in this research, not a gap in the market.^[1]^[2] At 11.8K+ stars and 869 forks as of June 2026 it is the most-cited semantic code-search MCP server, and it remains active: v0.1.14 of the MCP package shipped June 8, 2026.^[1]^[3] The strategic context matters — Zilliz is the company behind the Milvus vector database, and claude-context is its flagship developer funnel into Zilliz Cloud.^[2]^[1]

Attribute	Value
Company	Zilliz (creator of the Milvus vector database)^[2]
Created	June 6, 2025^[1]
GitHub Stars	11.8K+ (869 forks) as of June 2026^[1]
License	MIT^[1]
Latest Release	@zilliz/claude-context-mcp v0.1.14, June 8, 2026^[3]
Funding	Corporate-backed (Zilliz); no separate funding for the project

Product Overview

The workflow: point the MCP server at a repo, it chunks the code along AST boundaries, embeds the chunks, and writes them to Milvus or Zilliz Cloud; from then on the agent calls a search tool that runs hybrid BM25 + vector retrieval and injects only the relevant snippets into context.^[1] A Merkle tree over the file system detects changes so re-indexing touches only modified files.^[1] The repo lists 15 supported clients, including Claude Code, OpenAI Codex CLI, Gemini CLI, Qwen Code, Cursor, Windsurf, Cline, Roo Code, and LangChain/LangGraph, and claims support for 13+ programming languages.^[1]
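The chunking step can be illustrated with a minimal sketch. It uses Python's standard `ast` module on Python source only, whereas the project is written in TypeScript and covers many languages, so this shows the idea rather than claude-context's implementation. The function name, the chunk dictionary layout, and the 40-line fallback window are assumptions.

```python
import ast


def chunk_python_source(source: str, fallback_lines: int = 40) -> list[dict]:
    """Split source into chunks along top-level function and class boundaries.

    If the source does not parse, fall back to fixed-size line windows.
    Module-level statements outside functions/classes are skipped in this sketch.
    """
    lines = source.splitlines()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Fallback splitting: plain line windows with no syntax awareness.
        return [
            {
                "kind": "lines",
                "name": None,
                "start": i + 1,
                "end": min(i + fallback_lines, len(lines)),
                "text": "\n".join(lines[i:i + fallback_lines]),
            }
            for i in range(0, len(lines), fallback_lines)
        ]

    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "kind": type(node).__name__,
                "name": node.name,
                "start": node.lineno,      # 1-based first line of the definition
                "end": node.end_lineno,    # 1-based last line (Python 3.8+)
                "text": "\n".join(lines[node.lineno - 1:node.end_lineno]),
            })
    return chunks
```

Chunking on definition boundaries keeps each embedded snippet a complete unit (a whole function or class), so a search hit can be pasted into the agent's context without dragging in half of a neighbouring definition.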

Key Capabilities

Capability	Description
Hybrid search	BM25 full-text + dense-vector similarity over the whole repo^[1]
AST chunking	Language-aware code segmentation with fallback splitting^[1]
Incremental indexing	Merkle-tree change detection; only modified files re-embed^[1]
Pluggable embeddings	OpenAI, VoyageAI, Gemini, or local Ollama^[1]
Vector backends	Milvus (self-hosted) or Zilliz Cloud (managed)^[1]
Token economics	Vendor-claimed 40%+ token reduction at equivalent retrieval quality^[2]
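The incremental-indexing row can be sketched as follows. This is a simplified illustration, assuming a flat Merkle tree over per-file content hashes: the root hash gives a cheap "did anything change?" check, and a per-file comparison then lists what needs re-embedding. The function names and the in-memory `files` mapping are hypothetical; the project's actual tree layout is not described in the sources.

```python
import hashlib


def _h(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def leaf_hashes(files: dict[str, str]) -> dict[str, str]:
    """Map each path to a hash of its contents (the Merkle leaves)."""
    return {path: _h(content) for path, content in files.items()}


def merkle_root(leaves: dict[str, str]) -> str:
    """Fold the sorted leaf hashes pairwise until a single root hash remains."""
    level = [_h(f"{path}:{digest}") for path, digest in sorted(leaves.items())]
    if not level:
        return _h("")
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last node
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def changed_paths(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Return which files must be re-embedded or dropped from the index."""
    if merkle_root(old) == merkle_root(new):
        # Identical roots: nothing changed, skip the per-file comparison.
        return {"added": [], "modified": [], "removed": []}
    return {
        "added": sorted(p for p in new if p not in old),
        "modified": sorted(p for p in new if p in old and old[p] != new[p]),
        "removed": sorted(p for p in old if p not in new),
    }
```

Only the paths under "added" and "modified" would be sent to the embedding API again, which is where the saving over a full re-index comes from.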

Product Surfaces

Surface	Description	Availability
MCP server	`@zilliz/claude-context-mcp`, configured via env vars in any MCP client	GA (0.x)^[3]
Core library	`@zilliz/claude-context-core` indexing engine for custom pipelines	GA (0.x)^[1]
VSCode extension	Semantic search inside the editor	GA^[1]

Technical Architecture

A TypeScript / Node.js 20+ monorepo of three packages (core engine, MCP server, VSCode extension).^[1] The MCP server is configured entirely through environment variables (an embedding-provider API key plus a Milvus address or Zilliz Cloud endpoint/token) and added to Claude Code with a one-line `claude mcp add` command.^[1] claude-context itself is not offered as a hosted service: the user supplies both the embedding API and the vector database, which is precisely the architecture critics flag, since on the default setup code chunks leave the machine.^[1]^[4]
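As a rough illustration of the two-dependency setup, the sketch below checks that both pieces are present and builds a server entry in the `mcpServers` JSON shape that many MCP clients accept. The environment variable names (`OPENAI_API_KEY`, `MILVUS_ADDRESS`, `MILVUS_TOKEN`), the server label `claude-context`, and the `npx -y ...@latest` invocation are assumptions for this example; check the project README for the exact names. The sketch also assumes the OpenAI embedding provider.

```python
import json

EMBEDDING_KEY = "OPENAI_API_KEY"                    # assumed variable name
VECTOR_DB_KEYS = ("MILVUS_ADDRESS", "MILVUS_TOKEN")  # assumed variable names


def build_mcp_entry(env: dict[str, str]) -> dict:
    """Build an MCP client config entry, failing early if a dependency is missing."""
    if not env.get(EMBEDDING_KEY):
        raise ValueError(f"missing embedding API key: {EMBEDDING_KEY}")
    if not any(env.get(key) for key in VECTOR_DB_KEYS):
        raise ValueError("missing vector database address or token")
    # Pass through only the variables the server needs, nothing else.
    passed = {k: env[k] for k in (EMBEDDING_KEY, *VECTOR_DB_KEYS) if env.get(k)}
    return {
        "mcpServers": {
            "claude-context": {
                "command": "npx",
                "args": ["-y", "@zilliz/claude-context-mcp@latest"],
                "env": passed,
            }
        }
    }


# Placeholder credentials, for illustration only.
entry = build_mcp_entry({"OPENAI_API_KEY": "sk-placeholder", "MILVUS_TOKEN": "placeholder"})
config_json = json.dumps(entry, indent=2)
```

Failing before the server starts makes the setup cost visible: neither dependency is optional, and both must be provisioned before the first search.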

Key Technical Details

Aspect	Detail
Deployment	Local MCP server; vector storage in self-hosted Milvus or managed Zilliz Cloud^[1]
Models	Bring-your-own embeddings: OpenAI, VoyageAI, Gemini, Ollama (local)^[1]
Integrations	15 listed MCP clients incl. Claude Code, Cursor, Gemini CLI, Codex CLI^[1]
Open Source	MIT; 115 open issues as of June 2026^[1]

Strengths

Category-leading visibility — 11.8K+ stars and 869 forks as of June 2026, the most-cited semantic code-search MCP server, regularly recommended by name in HN threads about agent retrieval.^[1]^[4]
Real retrieval engineering, not a demo — hybrid BM25 + vector search, AST-aware chunking, and Merkle-tree incremental sync are the same techniques production code-search systems use.^[1]
Corporate maintenance without corporate pricing — Zilliz staffs the project, the license is MIT, and the stack runs on free tiers (Zilliz Cloud free tier, or fully local with Ollama + self-hosted Milvus).^[1]^[5]
Agent-agnostic by design — one index serves Claude Code, Cursor, Codex CLI, Gemini CLI, and a dozen other MCP clients.^[1]
Still shipping in 2026: commits continue through June 8, 2026, with five MCP package releases since late April.^[1]^[3]

Cautions

It is a marketing wedge — claude-context exists to funnel developers into Zilliz Cloud, and the headline 40% token-reduction benchmark comes from Zilliz's own blog with no independent replication.^[2]
The premise is contested by Anthropic itself — Claude Code ships grep-only retrieval, and community recollection is that Anthropic "experimented a lot with Vector Search and grep just worked better."^[4]
Code leaves the machine by default — the standard setup sends chunks to a cloud embedding API and stores vectors in Zilliz Cloud, a concern prominent enough that local clones (claude-context-local, LEANN) launched on HN specifically to remove it.^[6]^[4]
Stars overstate active usage: the MCP package saw 1,073 npm downloads in the last week of May 2026, modest against 11.8K stars.^[3]^[1]
Pre-1.0 with a real issue backlog — versioning is still 0.x and 115 issues are open as of June 2026.^[3]^[1]
Two external dependencies to stand up — an embedding API key plus a vector database before the first search, versus zero-config alternatives.^[1]

What Developers Say

Community discussion happens in threads about agent retrieval rather than a single launch post; the tool is the reference point others position against.^[4]

"You can get semantic search in Claude Code using this unofficial plugin … it's built by and uses a managed vector database called Zilliz Cloud." — simonw on Hacker News, November 2025^[4]

"I think it was Boris from Anthropic that said they experimented a lot with Vector Search and grep just worked better. You can try it out using the Claude-Context MCP." — kroaton on Hacker News, January 2026^[4]

"Unlike Claude-context, which uploads all data to the cloud, or Serena, which is heavy and limited to keyword search, our solution installs in just 1 minute." — yichuan (LEANN maintainer, Berkeley SkyLab — a competitor, so read adversarially) on Hacker News^[4]

The existence of "Claude Context but local" clones on HN is itself a sentiment signal: demand for the capability, paired with resistance to the cloud-upload default.^[6]

Pricing & Licensing

The software is free; the running costs are the embedding API and the vector database.^[1]

Tier	Price	Includes
OSS (self-hosted)	Free	MIT-licensed code; self-hosted Milvus; local Ollama embeddings possible^[1]
Zilliz Cloud Free	Free	Managed vector database free tier, the README's recommended starting point^[1]^[5]
Zilliz Cloud paid	Usage-based	Serverless and dedicated cluster plans beyond free-tier limits^[5]

Licensing model: MIT for all three packages on GitHub.^[1]

Hidden costs: Embedding API spend on every index and re-index (OpenAI/VoyageAI/Gemini billed separately); Zilliz Cloud charges once a codebase outgrows the free tier.^[1]^[5]
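The embedding spend can be estimated with back-of-envelope arithmetic. In the sketch below the price per million tokens is a placeholder rather than any provider's actual rate, and the four-characters-per-token ratio is a rough heuristic; the function name and parameters are hypothetical.

```python
def estimate_embedding_cost(
    total_chars: int,
    price_per_million_tokens: float,
    chars_per_token: float = 4.0,
    changed_fraction: float = 1.0,
) -> float:
    """Estimate the cost of embedding a codebase, or the changed part of it.

    changed_fraction=1.0 models a full index; a smaller value models an
    incremental re-index where only that share of the code is re-embedded.
    """
    tokens = total_chars / chars_per_token * changed_fraction
    return tokens / 1_000_000 * price_per_million_tokens


# Placeholder inputs: a 40 MB codebase at a made-up 0.10 per million tokens.
full_index = estimate_embedding_cost(40_000_000, 0.10)
incremental = estimate_embedding_cost(40_000_000, 0.10, changed_fraction=0.05)
```

With these placeholder inputs a full index embeds about 10 million tokens, and a re-index that touches 5% of the code costs one twentieth as much, which is why change detection matters for anyone paying per token.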

Competitive Positioning

Direct Competitors

Competitor	Differentiation
grepai	Local-first semantic grep; claude-context counters with a server-backed hybrid index that scales to millions of lines and serves any MCP client
Serena	Symbol-level LSP navigation with no index, no API keys, no egress; claude-context trades that self-containment for fuzzy natural-language retrieval
MCP Vector Search	Single-maintainer LanceDB tool with static-analysis extras (knowledge graph, dead-code detection); claude-context has ~250x the stars and corporate maintenance, but needs external services
claude-context-local / LEANN	Privacy-motivated local clones — same idea, fully on-device embeddings, no cloud dependency^[6]^[4]

When to Choose claude-context Over Alternatives

Choose claude-context when: the codebase is large enough that grep round-trips dominate token spend, you want one shared index serving multiple agents/teammates, and a cloud vector DB (or self-hosted Milvus) is acceptable.
Choose Serena when: you want precise symbol-level navigation with zero external services and zero code egress.
Choose grepai when: you want semantic search that stays local and drops into existing grep-shaped workflows.
Choose MCP Vector Search when: you want local vector search plus static-analysis features and can accept a one-maintainer project.

Ideal Customer Profile

Best fit:

Teams running agents against large monorepos (millions of lines) where grep-based discovery burns context across multi-round searches
Multi-agent shops wanting one index shared across Claude Code, Cursor, and CLI agents
Organizations already on Milvus or Zilliz Cloud

Poor fit:

Security postures that prohibit sending code chunks to third-party embedding APIs or a managed vector store
Small repos where Claude Code's built-in grep is already fast and cheap
Teams that will not operate (or pay for) a vector database as the codebase grows

Viability Assessment

Factor	Assessment
Financial Health	Strong sponsor — maintained by Zilliz, the venture-backed company behind Milvus, as a strategic funnel^[2]
Market Position	Category reference point: the most-starred semantic code-search MCP server, and the one competitors describe themselves relative to^[1]^[4]
Innovation Pace	Steady — 2.6K to 11.8K+ stars over the year since the launch blog; v0.1.14 in June 2026, though still pre-1.0^[2]^[1]^[3]
Community/Ecosystem	Active but vendor-shaped — 869 forks and local-first clones, with much of the advocacy coming from Zilliz channels^[1]^[6]
Long-term Outlook	Tied to Zilliz strategy and to whether semantic retrieval beats grep for agents — a question Anthropic has so far answered the other way^[4]

The project will not die of neglect — Zilliz has every incentive to keep its best-known developer funnel healthy — but its trajectory depends on a thesis fight: vendor benchmarks say 40%+ token savings, while Anthropic ships grep-only retrieval after testing the alternative.^[2]^[4] The ~1K weekly npm downloads against 11.8K stars suggest the audience is still far more curious than committed.^[3]^[1]

Bottom Line

claude-context is the default answer to "how do I add semantic code search to Claude Code" — the most-cited, best-engineered, corporate-maintained option in the category, with hybrid BM25 + vector retrieval and incremental indexing that hold up at monorepo scale. The honest caveats: it is a Zilliz Cloud funnel whose headline benchmark is self-published, its default path ships your code to two external services, and the agent vendor it is named after chose grep instead.

Recommended for: Teams with genuinely large codebases where grep round-trips measurably burn tokens, and any shop standardizing one code index across multiple MCP agents.

Not recommended for: Code-egress-restricted environments (use a local alternative or fully self-hosted Milvus + Ollama), small repos, or anyone unwilling to operate a vector database.

Outlook: Watch for a 1.0 release, independent replication of the 40% token claim, and whether Anthropic ever ships first-party semantic retrieval — which would either validate the thesis or absorb the niche.

Research by Ry Walker Research

Sources