mcp-vector-search

mcp-vector-search — CLI-first semantic code search with MCP integration. LanceDB-powered vector search with AST parsing, knowledge graph, complexity analysis, and dead code detection. 47 stars, Python, Elastic License 2.0.

Key takeaways

  • CLI-first semantic code search powered by LanceDB (default since v2.1, replacing ChromaDB) and AST parsing across 13 languages, with MCP integration for AI coding assistants
  • Analysis capabilities beyond search: knowledge graph, complexity analysis, dead code detection, SARIF output for CI, and interactive D3.js visualization
  • Relicensed under Elastic License 2.0 — free to use, but cannot be offered as a hosted or managed service
  • 47 stars but unusually active for its size: 90 releases, with v4.1.14 shipped May 21, 2026. Essentially a one-maintainer project with no independent community discussion yet

FAQ

What is mcp-vector-search?

A CLI-first semantic code search tool powered by LanceDB vector storage and AST parsing across 13 languages. Provides MCP integration, a knowledge graph, complexity analysis, dead code detection, and an interactive chat mode for codebase exploration.

Is mcp-vector-search free and open source?

It is free to install from PyPI, but as of 2026 it is licensed under the Elastic License 2.0, not MIT. ELv2 is a source-available license rather than an OSI-approved open source one: you can use the tool commercially but cannot provide it to third parties as a hosted or managed service.

How does it compare to other code search MCP tools?

More analysis-oriented than cross-repo search tools like Octocode. Smaller community than CodeGraphContext, but combines vector search with static analysis features like complexity scoring, dead code detection, and a knowledge graph.

Overview

mcp-vector-search is a CLI-first semantic code search tool written in Python, powered by LanceDB vector storage and AST parsing across 13 languages. It combines vector-based semantic search with static analysis capabilities — a knowledge graph, complexity analysis, dead code detection — and exposes everything via MCP for AI coding assistants.
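To make the "AST parsing plus vector search" idea concrete, here is a minimal sketch using only the Python standard library. It is not the project's implementation: the `chunk_functions`, `embed`, and `search` names are hypothetical, and the word-count "embedding" stands in for the neural embedding models and LanceDB index that a real tool would use.

```python
import ast
import math
import re
from collections import Counter

# Hypothetical sample source to index.
SOURCE = '''
def add_numbers(a, b):
    """Return the sum of two numbers."""
    return a + b

def read_config(path):
    """Load a config file from disk and return its text."""
    with open(path) as handle:
        return handle.read()
'''


def chunk_functions(source: str) -> dict[str, str]:
    """Split source into one chunk per function by walking the AST."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (an assumption, not a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: dict[str, str]) -> list[tuple[str, float]]:
    """Rank function chunks by similarity to the query, best first."""
    query_vec = embed(query)
    scored = [(name, cosine(query_vec, embed(body))) for name, body in chunks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


chunks = chunk_functions(SOURCE)
results = search("sum of two numbers", chunks)
print(results[0][0])
```

Chunking at function boundaries, rather than at fixed line counts, means each search hit maps to a complete unit of code. That is the usual motivation for pairing an AST parser with a vector index.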

LanceDB became the default vector backend in v2.1, replacing ChromaDB. The docs cite fewer index-corruption issues and better performance on large codebases; ChromaDB remains available as a legacy option via an environment variable. The interactive chat mode supports iterative refinement with up to 30 queries. Outputs include JSON, SARIF (for CI), and markdown, plus an interactive D3.js relationship visualization.
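For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 document of the kind a CI system can ingest. The finding fields, rule IDs, and the `example-analyzer` tool name are hypothetical; this shows the general shape of the format, not the exact output mcp-vector-search emits.

```python
import json


def to_sarif(findings: list[dict]) -> dict:
    """Wrap simple finding dicts in a minimal SARIF 2.1.0 log structure."""
    return {
        "version": "2.1.0",
        "runs": [
            {
                # Tool name is a placeholder, not the real driver name.
                "tool": {"driver": {"name": "example-analyzer"}},
                "results": [
                    {
                        "ruleId": finding["rule"],
                        "level": "warning",
                        "message": {"text": finding["message"]},
                        "locations": [
                            {
                                "physicalLocation": {
                                    "artifactLocation": {"uri": finding["path"]},
                                    "region": {"startLine": finding["line"]},
                                }
                            }
                        ],
                    }
                    for finding in findings
                ],
            }
        ],
    }


# Hypothetical findings, e.g. from complexity or dead code checks.
findings = [
    {"rule": "high-complexity", "message": "Function is too complex", "path": "src/app.py", "line": 42},
    {"rule": "dead-code", "message": "Function is never called", "path": "src/util.py", "line": 7},
]

sarif = to_sarif(findings)
print(json.dumps(sarif, indent=2))
```

Because SARIF is a standard JSON format, code-scanning dashboards and CI annotations can display results from any tool that emits it, with no tool-specific parser required.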

Key stats: 47 stars, 11 forks, Python, Elastic License 2.0. Created August 2025; 90 releases to date, the latest being v4.1.14 on May 21, 2026.


Status (as of June 2026)

Alive and actively maintained: last push May 21, 2026, not archived. Development pace is unusually high for the star count — 90 releases since August 2025, with 2026 work adding CodeT5+ and CodeRankEmbed embedding models, a knowledge graph with entity extraction, and indexing/query performance improvements (3.4ms median search latency claimed with IVF-PQ indexing). Adoption remains minimal: stars grew from 24 (March 2026) to 47 as of June 11, 2026, and the project's primary public advocacy is the maintainer's own blog.

Pricing & Licensing

Free to install from PyPI. The license changed from the MIT noted in earlier coverage to the Elastic License 2.0: commercial use is allowed, but the software "may not be provided to third parties as a hosted or managed service." Teams that treat ELv2 as non-open-source for policy purposes should note this.


Competitive Position

Strengths: Combines vector search with static analysis and a knowledge graph. SARIF output for CI. Interactive chat mode and D3.js visualization. Very fast release cadence; feature-rich for its size.

Weaknesses: Very small community (47 stars, 11 forks). Effectively a single-maintainer project. Elastic License 2.0 limits redistribution as a service and may complicate enterprise approval. No independent benchmarks or third-party reviews.

Cautions

  • Bus factor of one — driven by a single maintainer (Bob Matsuoka); 90 releases in ten months is impressive velocity but also a key-person risk
  • License shift — now Elastic License 2.0 rather than MIT; verify it fits your org's OSS policy before adopting
  • Backend migration churn — the ChromaDB-to-LanceDB default switch shows the storage layer is still evolving; pin versions
  • No independent validation — performance claims (3.4ms median latency, 4.9x faster queries) come from the project's own docs
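Since the latency figures are self-reported, one way to sanity-check them on your own codebase is a small timing harness like the sketch below. The `placeholder_search` function is a stand-in assumption: you would replace it with whatever callable wraps your actual search, and the numbers it produces here say nothing about mcp-vector-search itself.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(
    search: Callable[[str], object], queries: list[str], repeats: int = 5
) -> float:
    """Run each query `repeats` times and return the median latency in ms."""
    samples = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            search(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def placeholder_search(query: str) -> list[str]:
    """Hypothetical stand-in for a real search call (substring match only)."""
    corpus = ["parse config", "open socket", "render chart"]
    return [doc for doc in corpus if any(word in doc for word in query.split())]


latency = median_latency_ms(placeholder_search, ["config", "socket"])
print(f"median latency: {latency:.4f} ms")
```

The median is a better summary than the mean here because a few slow outliers (cold caches, first-query model loading) would otherwise dominate the result.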

What Developers Say

Independent developer commentary is essentially absent as of June 11, 2026 — no substantive Hacker News or Reddit threads about the tool were found. The only published account is from the maintainer himself:

"it's become one of the most useful tools in my development workflow—not because it's fancy, but because it's always there." — Bob Matsuoka (maintainer), Hyperdev blog, December 2025

The absence of third-party voices is the honest signal here: this is a personal tool being built in public at high velocity on GitHub, not yet a community project.


Bottom Line

mcp-vector-search packs an unusual amount of capability — semantic search, knowledge graph, complexity analysis, dead code detection, CI-friendly SARIF output — into a small single-maintainer project, and the May 2026 release activity confirms it is very much alive. But 47 stars, an ELv2 license, and zero independent discussion make it an experiment, not infrastructure.

Recommended for: Individual developers who want local, free semantic code search wired into MCP clients and are comfortable depending on a one-person project.

Not recommended for: Teams that need a straightforward enterprise license review (ELv2 may complicate approval), anyone planning to embed it in a hosted service, or organizations that require community-validated tooling.

Outlook: Velocity is the story — if the maintainer sustains the release pace and the LanceDB backend stabilizes, it could become the analysis-heavy alternative in the code-search MCP niche; if not, it stays a well-built personal tool.


Research by Ry Walker Research