I don't love thinking about developer measurement, and most engineers don't either. We didn't get into this work to be quantified. But the question isn't going away — and as AI tools become standard, the data exhaust from our work is only going to grow. The real question isn't whether to measure. It's how, and to what end.
The honest answer is that measuring developer performance requires a portfolio. Output metrics, with all their flaws. Peer feedback, with all its biases. Manager judgment, with all its subjectivity. Business impact, with all its attribution problems. Any one of these in isolation is wrong. All four together are enough to inform real decisions without pretending you've solved a fundamentally unsolvable problem.
AI tool usage is going to slot into this mix as another input. Maybe useful. Probably overweighted at first. Eventually understood as one signal among many. It won't be the answer any more than lines of code or story points were. The companies that figure this out fastest will save themselves a cycle of building dashboards, watching their best people leave, and quietly walking it back.
The deeper distinction is what the measurement is for. There are two postures, and they look identical from the outside until you watch what gets done with the data. Posture one: measurement to develop people. The data flows into 1:1s, learning programs, studies of what top performers do, team retros. Nobody gets fired because of a metric. Posture two: measurement to rank people. The data flows into stack rankings, comp decisions, performance improvement plans. Everybody games. Trust collapses. The best engineers — the ones with options — leave first.
I've argued elsewhere about the gaming problem and why even well-designed metrics get distorted the moment they touch compensation. The fix isn't smarter metrics. It's a clearer separation between development data and decision data, and a leadership team disciplined enough to hold the line.
I'm still thinking about all of this. I don't have the answer. Nobody does, and the people claiming they do are usually selling something. But I'm pretty sure the conversation is about to get a lot more interesting, and the companies that handle it with care are going to compound an advantage the rest of the industry will spend years trying to close.
Related Essays
The Gaming Problem Never Goes Away
Any developer performance metric can be gamed. AI tools just give us new things to measure — and new ways to get it wrong.
Top Performer Analysis: The Real Opportunity in AI Tool Telemetry
The interesting use of AI coding tool data isn't ranking. It's understanding how your best engineers actually work — and helping the rest of the team catch up.
The GTM vs. R&D Measurement Gap
Sales has revenue. Engineering has hand-waving. The asymmetry in how we measure go-to-market versus R&D is a real problem, not a feature.
Key takeaways
- Developer performance is a portfolio problem, not a single metric.
- AI tool data is one input, not the answer.
- Companies that use this data to develop people will out-compete the ones that build leaderboards.
FAQ
What's the honest answer to measuring developer performance?
A combination of output metrics, peer feedback, manager judgment, and business impact. Each is flawed in isolation. Together they're enough to inform good decisions without pretending to objectivity.
Will AI tool data become part of this mix?
Yes, but as one signal among many — not the primary one. The orgs that treat it as the answer will overweight it and drive their best people out.