Span | Ry Walker Research

Key takeaways

Tool-agnostic AI attribution by model, not telemetry: span-detect-1 classifies code chunks as AI-generated or human-authored across all coding tools, reporting 95% accuracy with a 5% abstain rate at 3,000-character chunks — extended in June 2026 by an AI Effectiveness suite built on agent traces with line-level attribution
$25M in Seed + Series A funding announced November 2025 from Alt Capital, Craft Ventures, SV Angel, BoxGroup, and Bling Capital, with named customers including Ramp, Vanta, Carvana, Intercom, Braze, Writer, and ClassPass
Model-based detection draws real skepticism: Hacker News testers showed the detector can be steered (deliberately "messy" AI code scored 0% AI), and commenters argued a 5% error rate is too high when results drive administrative consequences

FAQ

What is Span?

Span is an AI-native engineering intelligence platform that unifies code, tickets, and surveys to show engineering leaders where time goes, how much shipped code is AI-generated, and whether AI tooling is actually paying off.

How much does Span cost?

Pricing is not publicly listed. Span sells to engineering organizations through a demo-led sales motion with custom quotes.

How does span-detect-1 detect AI-generated code?

span-detect-1 is a proprietary model that semantically chunks code and classifies each chunk as AI-generated, human-authored, or abstain — reporting 95% accuracy with a 5% abstain rate at 3,000-character chunks across Python, TypeScript, JavaScript, Java, Ruby, and C#, regardless of which AI tool wrote the code.

How is Span different from GitClear?

GitClear attributes AI code via direct API integrations with specific tools (Copilot, Cursor, Claude Code) and line-level Diff Delta analysis; Span infers AI authorship from the code itself with a detection model, so it covers tools that expose no telemetry — at the cost of probabilistic rather than ground-truth attribution.

Executive Summary

Span is an AI-native engineering intelligence platform founded by J Zac Stein (CEO) and Henry Liu (CTO), both of whom previously ran large engineering organizations.^[1] It unifies metrics, surveys, and contextual data from code, tickets, and incidents into one system, then layers AI-specific measurement on top: the proprietary span-detect-1 model classifies shipped code as AI-generated or human-authored at the chunk level across all coding tools, and a June 2026 AI Effectiveness suite uses agent traces — "the new work artifact in the AI era" — for line-level attribution from prompt to production.^[2]^[3]

The company announced $25M in combined Seed and Series A funding in November 2025, backed by Alt Capital, Craft Ventures, SV Angel, BoxGroup, and Bling Capital, with named customers including Ramp, Vanta, Carvana, Intercom, Braze, Writer, URBN, and ClassPass.^[4]^[1] Its central bet — that AI authorship can be inferred from the code itself rather than from per-tool telemetry — is also its most contested claim: public testing of the free detector surfaced steerable results and false positives that matter when the numbers feed performance conversations.^[5]

Attribute	Value
Company	Span (span.app)
Founders	J Zac Stein (CEO), Henry Liu (CTO)^[1]
Funding	$25M Seed + Series A, announced November 2025 — Alt Capital, Craft Ventures, SV Angel, BoxGroup, Bling Capital^[4]
Customers	Ramp, Vanta, Carvana, Intercom, Braze, Writer, URBN, ClassPass^[1]
Open Source	No — proprietary SaaS

Product Overview

Span positions itself as a control panel for engineering organizations navigating AI transformation: it ingests code, tickets, incidents, and developer surveys, auto-classifies the work, and answers where time goes, how much code is AI-authored, and whether AI investment is converting into output.^[3]^[4] The platform claims that roughly 50% of engineering work is auto-mapped to projects before any engineering-manager involvement, and that customer Fin cut software-capitalization reporting from three weeks to three days.^[3]

Key Capabilities

Capability	Description
AI Effectiveness suite	Agent-trace analysis connecting AI-assisted work to shipped code with line-level attribution; org-level effectiveness scoring across sentiment, prompt quality, and satisfaction (shipped June 2026)^[3]
span-detect-1	Proprietary model detecting AI-generated code at the chunk level across all tools — 95% accuracy, 5% abstain at 3,000-character chunks^[2]
Work classification	Automated mapping of code, tickets, and incidents to projects and investment categories^[3]
DevFinOps	R&D cost-capitalization and tax-credit attribution automation^[3]^[1]
Surveys + sentiment	Developer surveys unified with system metrics rather than reported separately^[3]
Span AI agent	Conversational reporting via Claude, ChatGPT, and Slack, with scheduled reports and proactive responses^[3]

Product Surfaces

Surface	Description	Availability
Web app	Dashboards for allocation, AI adoption, and effectiveness scoring^[3]	GA
Free AI Code Detector	Public chunk-level detector for arbitrary code samples^[6]	Free
Slack / chat	Span AI agent answers questions and schedules reports where teams work^[3]	GA

Technical Architecture

Span is a managed SaaS platform that plugs into existing engineering systems — GitHub, GitLab, issue trackers, and AI coding tools — and applies AI reasoning to classify work even when the underlying data is messy.^[1]^[4] The detection layer, span-detect-1, semantically chunks code documents and classifies each chunk three ways (AI-generated, human-authored, abstain); it was trained on a curated corpus of public GitHub repositories paired with AI-generated code from multiple leading models, and evaluated on roughly 45,000 balanced samples each for TypeScript and Python plus 11,000 TSX samples.^[2] Detection works at 2,000–3,000-character chunks because individual lines rarely carry enough signal; below 500 characters the model degrades rapidly.^[2] The June 2026 AI Effectiveness suite adds agent traces as a first-class data source, pushing attribution to the line level for agent-written code.^[3]
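The chunk-and-classify flow described above can be sketched in a few lines of Python. This is an illustration only, not Span's implementation: span-detect-1 is proprietary, so the line-based chunker, the `score_fn` callback, and the `ABSTAIN_BAND` threshold are hypothetical stand-ins. Only the 3,000-character target and the 500-character floor come from the published figures.

```python
from typing import Callable, List, Optional

TARGET_CHUNK_CHARS = 3000  # chunk size at which the headline accuracy is reported
MIN_RELIABLE_CHARS = 500   # below this, the source says the model degrades rapidly
ABSTAIN_BAND = 0.15        # hypothetical: scores this close to 0.5 are not decided


def chunk_by_lines(source: str, target_chars: int = TARGET_CHUNK_CHARS) -> List[str]:
    """Greedy line-boundary chunking, a simple stand-in for semantic chunking."""
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for line in source.splitlines(keepends=True):
        # Close the current chunk before it would exceed the target size.
        if current and size + len(line) > target_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks


def label_chunk(chunk: str, score_fn: Callable[[str], float]) -> str:
    """Map a hypothetical P(AI) score to 'ai', 'human', or 'abstain'."""
    if len(chunk) < MIN_RELIABLE_CHARS:
        return "abstain"  # too little signal to classify
    p_ai = score_fn(chunk)
    if abs(p_ai - 0.5) < ABSTAIN_BAND:
        return "abstain"  # score too close to the decision boundary
    return "ai" if p_ai > 0.5 else "human"


def ai_share(labels: List[str]) -> Optional[float]:
    """Share of decided chunks labeled 'ai'; None if every chunk abstained."""
    decided = [label for label in labels if label != "abstain"]
    if not decided:
        return None
    return decided.count("ai") / len(decided)
```

Excluding abstained chunks from the denominator is one possible design choice; treating them as unknown and reporting a range would be another.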

Key Technical Details

Aspect	Detail
Deployment	Managed SaaS; SOC 2 Type II, GDPR, zero data retention for LLMs, RBAC, SSO/SCIM^[3]
Model(s)	Proprietary span-detect-1 for code detection; LLM reasoning for work classification^[2]
Languages detected	Python, TypeScript, JavaScript, Java, Ruby, C# — more in development^[2]
Integrations	GitHub, GitLab, issue trackers, Claude, ChatGPT, Slack^[3]^[1]
Open Source	No

Strengths

Tool-agnostic AI attribution — because span-detect-1 infers authorship from the code itself, coverage extends to every AI tool, including ones that expose no usage APIs; per-tool telemetry approaches go dark the moment an engineer adopts an unmeasured tool^[6]^[2]
Agent traces as a data source — the June 2026 AI Effectiveness suite treats agent traces as the work artifact, connecting prompts to shipped lines, which is where measurement is heading as agents write more of the code^[3]
Beyond dashboards into automation — DevFinOps automates R&D capitalization and tax-credit attribution (Fin: three weeks to three days), giving the platform a CFO-legible payback story most engineering-metrics tools lack^[3]^[1]
Named logo density for its stage — Ramp, Vanta, Carvana, Intercom, Braze, Writer, URBN, and ClassPass disclosed at the $25M raise^[1]^[4]
Honest model documentation — Span publishes chunk-size accuracy curves, abstain rates, and evaluation methodology rather than a single headline number^[2]

Cautions

Detection is probabilistic, and steerable — a Hacker News tester had an AI write deliberately "messy undergraduate" code that scored 0% AI-generated while clean AI code scored 100%; commenters noted the adversarial dynamic means detection accuracy erodes as evasion becomes routine^[5]
5% error is high-stakes at org scale — commenters objected that "95% accuracy is very low" when results feed administrative consequences, and users reported fully AI-generated code scoring 40% and a 10-year-old JavaScript project flagged 50% AI^[5]
Headline metric scrutiny — when pressed for precision and recall rather than accuracy, the team cited 91.5% recall and a 93.3% F1 score without clearly defining the positive class^[5]
No public pricing — demo-led sales with custom quotes; budgeting requires a sales conversation
Young company, broad surface — engineering intelligence, AI detection, surveys, and finance automation add up to a wide product footprint for a team roughly two years into building^[1]
Measurement-only posture — Span observes and classifies; it does not change how code is captured, so attribution fidelity is bounded by what can be inferred after the fact
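Precision was not quoted in the thread, but it can be backed out of the recall and F1 figures in the list above using the standard definition F1 = 2PR / (P + R). The sketch below assumes both numbers are percentages for the same positive class, which the thread did not confirm; the function name is illustrative.

```python
def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("F1 and recall are inconsistent: no valid precision exists")
    return f1 * recall / denominator


# Figures cited in the Hacker News thread, read as fractions (assumption).
precision = implied_precision(recall=0.915, f1=0.933)
print(f"implied precision: {precision:.3f}")  # prints "implied precision: 0.952"
```

Under those assumptions precision comes out near 95%. If AI-generated is the positive class, roughly one in twenty chunks flagged as AI on a balanced evaluation set would be human-written; the rate in a real codebase depends on its actual share of AI code.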

Pricing & Licensing

Tier	Price	Includes
AI Code Detector	Free	Public chunk-level detection of pasted code^[6]
Platform	Custom (not publicly listed)	Engineering intelligence, AI Effectiveness suite, DevFinOps, Span AI agent

Licensing model: Proprietary closed-source SaaS, sales-led contracts.

Hidden costs: Integration and rollout across repos and trackers, plus the organizational cost of acting on probabilistic AI-attribution data — disputes over flagged code are a process cost, not a license line item.

Competitive Positioning

Direct Competitors

Competitor	Differentiation
GitClear	GitClear attributes AI code via direct per-tool API integrations and line-level Diff Delta quality analysis; Span infers authorship with a detection model, covering tools without telemetry but trading ground truth for probability
Oobo	Oobo captures AI provenance at write time (capture-time attribution) rather than detecting it after the fact; Span requires no workflow change but inherits detection uncertainty
DX	DX leads with developer-experience research frameworks and surveys; Span leads with model-based AI attribution and agent traces, treating surveys as one input
Jellyfish / LinearB	Incumbent engineering-management platforms anchored on allocation and DORA-style delivery metrics; Span is AI-native, with detection and agent traces as the core primitive

When to Choose Span Over Alternatives

Choose Span when: you need AI-adoption measurement across every tool — including ones with no usage API — plus allocation and capitalization automation in one platform
Choose GitClear when: you want deterministic per-tool attribution and code-quality trend analysis over probabilistic detection
Choose Oobo when: you can change the capture workflow and want provenance recorded at write time instead of inferred later
Choose DX when: developer experience and research-backed survey programs are the primary lens, not AI code attribution

Ideal Customer Profile

Best fit:

Engineering organizations (hundreds of engineers) where AI tool spend is material and leadership wants impact evidence across heterogeneous tools^[4]
Companies with R&D capitalization or tax-credit reporting burdens that DevFinOps can automate^[3]
Teams adopting coding agents who want agent traces, not just commits, as the unit of measurement^[3]

Poor fit:

Organizations that will use AI-authorship scores for individual performance consequences — the error modes documented publicly make that misuse-prone^[5]
Small teams without a metrics or finance-reporting burden
Buyers requiring deterministic, auditable attribution rather than model inference

Viability Assessment

Factor	Assessment
Financial Health	Solid for stage — $25M Seed + Series A (November 2025) from Alt Capital, Craft Ventures, SV Angel, BoxGroup, Bling Capital^[4]
Market Position	Early leader in model-based AI code attribution; strong logos (Ramp, Vanta, Intercom, Carvana) for a ~2-year-old company^[1]
Innovation Pace	Fast — span-detect-1 (September 2025) to agent-trace AI Effectiveness suite (June 2026) in nine months^[6]^[3]
Community/Ecosystem	Thin — closed source; the free detector is the main public surface^[6]
Long-term Outlook	Promising but contingent on detection staying ahead of evasion and on agent traces becoming the standard attribution substrate

Span has picked the right problem at the right time — every engineering leader is being asked what AI spend is returning — and shipped a differentiated answer fast. The structural risk is that its founding primitive, model-based detection, faces an adversarial ceiling that capture-time and telemetry-based approaches do not; the June 2026 pivot toward agent traces suggests the company sees that and is building the more durable substrate.

Bottom Line

Span is the most credible model-based entrant in AI engineering intelligence: a published, methodologically transparent detection model, an agent-trace attribution suite shipped ahead of incumbents, and a customer list that punches above its funding stage.^[2]^[3]^[1] Buyers should treat its AI-authorship numbers as a directional, org-level signal (public testing shows the detector can be fooled in both directions) and lean on the agent-trace layer, where attribution is grounded in artifacts rather than inference.^[5]

Recommended for: Engineering organizations that want org-level AI adoption and impact measurement across all tools, plus capitalization automation, and can accept probabilistic attribution.

Not recommended for: Teams needing deterministic per-line provenance, anyone tying detection scores to individual performance decisions, or small teams without a reporting burden.

Outlook: Strong momentum with a real methodological caveat. The $25M raise, marquee customers, and nine-month cadence from detection model to agent-trace suite are all positive signals; whether detection accuracy holds as evasion spreads is the open question that will decide if Span's moat is the model or the platform around it.^[4]^[2]^[5]

Research by Ry Walker Research

Sources