I have a game tape dashboard. It tracks constraints and throughput across my engineering team ... PR cycle times, review patterns, where work slows down and where it moves. I built it because I believe engineering is a team sport, and team sports need game film. You can't coach what you can't see.
A few years into leading my current team, I opened that dashboard and saw something I couldn't ignore. Pull requests with over a thousand lines of code were taking, on average, five minutes longer to review than PRs with two hundred and fifty to five hundred lines. Five minutes. For four times the code.
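For the curious, here is a minimal sketch of the rollup behind that number ... bucket PRs by size, average the review time in each bucket. The field names and records are hypothetical, not my actual dashboard schema.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical PR records exported from your Git host: total lines changed
# plus the minutes of reviewer attention each PR actually received.
prs = [
    {"lines_changed": 320, "review_minutes": 41},
    {"lines_changed": 1450, "review_minutes": 46},
]

def size_bucket(lines: int) -> str:
    """Group PRs into the size bands being compared."""
    if lines <= 250:
        return "<=250"
    if lines <= 500:
        return "251-500"
    if lines <= 1000:
        return "501-1000"
    return ">1000"

# Average review time per size bucket ... the comparison that jumped out of the dashboard.
by_bucket = defaultdict(list)
for pr in prs:
    by_bucket[size_bucket(pr["lines_changed"])].append(pr["review_minutes"])

for bucket, minutes in sorted(by_bucket.items()):
    print(f"{bucket:>9}: {mean(minutes):.1f} min average over {len(minutes)} PRs")
```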
The math told the story before anyone had to say a word out loud. PRs were sitting for days. When reviews did happen, they were almost always handled by the same one or two people. Everyone else had their own features to ship.
The System I Inherited
I hadn't built this team from scratch. I walked into an incentive structure that had been in place long enough to become invisible ... one of those systems that nobody designed on purpose and nobody questions because it's just always been there.
Engineers were measured by ticket throughput. How many you closed. How fast you moved from one to the next. Whoever did the most tickets won. That was the implicit scoreboard, even if nobody said it in a meeting or wrote it in a doc.
It's a seductive way to measure a team when you're under delivery pressure. Tickets closed feels like progress. Velocity trending upward feels like health. And for a while, it is. But a system designed to maximize individual throughput produces exactly one thing when it meets team-level work like code review ... engineers who are very good at protecting their time from other people's problems.
When someone else's PR sits in the queue, that's not your problem. Your problem is the ticket in front of you. The one after that. The sprint goal attached to your name. Someone else being blocked is unfortunate, but it doesn't change your scoreboard. So you keep moving.
Earlier in my career I was the manager who pulled up a velocity chart in planning and pushed for the number to go higher. The team complied. Velocity climbed. Leadership was happy. And quietly, underneath all of it, the codebase was accumulating debt that didn't show up until something broke. I set the expectation and got exactly what I asked for.
Incentives work. Just not always in the direction you intended.
Why Every Process Fix Misses
The standard playbook for slow PR reviews follows a familiar script. Break your work into smaller PRs. Create better review templates. Set up a rotation schedule so the same two people aren't carrying the load. Add SLAs with automated reminders. Some of these things have limited value. None of them address the actual problem.
If your performance is still measured by what you ship individually, a review rotation schedule just means it's your turn to pull time away from shipping. The calendar entry exists. The incentive hasn't moved. So the review gets done ... technically. But it gets done the way you'd expect. Skim. Approve. Move on. The queue clears and the same pattern comes back within a quarter.
When you treat an incentive problem as a process problem, you get process-shaped results. The symptom quiets down. The cause stays exactly where it was.
What Actually Changed
My first instinct when I saw that dashboard data was to reach for a new metric. Measure review participation alongside ticket throughput. Track who was reviewing, how often, how quickly. Put the numbers somewhere visible.
The problem is that metrics can be bluffed. Give people a number to hit and they will find the path of least resistance. If review participation becomes a tracked metric, you'll get reviews. You won't necessarily get reviews that actually catch anything. You'll get the five-minute rubber stamp dressed up as engagement, and your dashboard will look healthy while quality stays exactly where it was.
So instead of a new metric, we built a behavior.
When an engineer opens a PR, they drop it in a dedicated channel. The team sees it. Not as an item on a rotation list or an obligation attached to a schedule, but as a ball in play. I described it to the team as recovering a fumble in football. Somebody put something valuable on the field. The whole team swarms.
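The mechanical part is deliberately boring. Here is a minimal sketch of what "drop it in the channel" can look like, assuming a Slack incoming webhook pointed at the shared review channel; the webhook URL, the names, and how you trigger it (a CI step when a PR opens, a bot, or by hand) are placeholders, not a prescription.

```python
import os

import requests  # third-party: pip install requests

# Hypothetical setup: SLACK_WEBHOOK_URL is an incoming webhook for the
# team's shared review channel.
WEBHOOK_URL = os.environ["SLACK_WEBHOOK_URL"]

def announce_pr(author: str, title: str, url: str) -> None:
    """Put the ball in play: tell the whole team a PR is waiting for eyes."""
    message = f"{author} opened *{title}* and needs reviewers: {url}"
    response = requests.post(WEBHOOK_URL, json={"text": message}, timeout=10)
    response.raise_for_status()

if __name__ == "__main__":
    announce_pr("dana", "Add retry logic to the billing worker",
                "https://example.com/org/repo/pull/123")
```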
What made it work wasn't the channel. It was what the channel represented ... a shift from "your work is yours until it merges" to "the work belongs to all of us until it ships." Reviewing a teammate's PR wasn't taking time away from your sprint anymore. It was part of your sprint. Getting someone unblocked counted the same as closing your own ticket, because the thing we were actually measuring was how fast the team moved work through the system.
The senior engineers who had been too busy to review suddenly had time. Because now someone else being blocked was their problem too.
We went from review cycles that ran three days to same day.
The Number That Actually Matters
I come back to the game tape dashboard here. The metric I actually care about is team throughput, not individual velocity. How fast does work move from opened to merged to deployed, across the whole team, over a rolling window?
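As a rough sketch of what that rolling view computes ... again with hypothetical timestamps per work item, not my actual schema ... it is nothing more exotic than counting what cleared the whole system recently and how long each item took end to end.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical work items with the two timestamps the rolling view needs.
items = [
    {"opened": datetime(2024, 5, 1, 9, 0), "deployed": datetime(2024, 5, 2, 15, 0)},
    {"opened": datetime(2024, 5, 6, 10, 0), "deployed": datetime(2024, 5, 6, 17, 0)},
]

def team_throughput(items, as_of: datetime, window_days: int = 14) -> dict:
    """How much work the whole team shipped in the window, and how long it
    took end to end. No individual's count appears anywhere in here."""
    cutoff = as_of - timedelta(days=window_days)
    shipped = [i for i in items if i["deployed"] >= cutoff]
    cycle_hours = [(i["deployed"] - i["opened"]).total_seconds() / 3600 for i in shipped]
    return {
        "items_shipped": len(shipped),
        "median_cycle_hours": round(median(cycle_hours), 1) if cycle_hours else None,
    }

print(team_throughput(items, as_of=datetime(2024, 5, 10)))
```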
Individual velocity can look healthy while team throughput quietly collapses. One engineer closing six tickets while four of their teammates' PRs sit unreviewed is not a win. It's a transfer. You moved the constraint from one column to another and called it productivity.
The fastest teams I've built had one thing in common. Helping a teammate ship counted as shipping. Not rhetorically. Not in the values doc. In actual performance conversations, in how growth was talked about, in what got recognized out loud.
That's where the behavior change lives. Not in the channel, not in the dashboard, not in any rotation schedule. In what your engineers believe will happen in their next performance conversation if they spend an afternoon doing thorough, careful reviews instead of closing their own tickets.
If the honest answer is "nothing good" ... you already know what to fix.
It's not the process.

Top comments (6)
The PR cycle time data cuts through a lot of the usual team health debates. When the incentive is personal velocity - closed tickets, merged PRs - review quality is a cost, not a contribution. The 1000-line PR getting five extra minutes is the incentive structure expressing itself honestly. Hard to fix with culture talks; easier to fix when the dashboard makes it visible to the whole team the way you are describing.
That was the signal for me too. A review pattern like that tells you people are protecting their own scoreboard first. Once that is true, “be a better teammate” usually stays rhetorical. The real shift happened when helping someone else ship started carrying the same weight as shipping your own work.
yeah and that incentive shift is harder to pull off than it sounds. i've seen teams try to add 'review points' to sprint metrics and it just became a new scoreboard to game. the real version of what you're describing needs trust that leadership actually values it - not just in retros but in perf reviews
I worry about metrics applied at the individual level, especially when there's a fear of stack-ranking. Measuring time to finish a ticket? Quality suffers, unit testing suffers, leaving-the-code-better-than-you-found-it suffers, and nobody wants to do the bigger stories anymore. Measuring the number of comments left on a pull request (where more comments = bad)? Now people rubber stamp everything. Measuring the number of story points done in a sprint? Estimates go up. Unit test coverage must increase? Sloppy unit tests that don't assert anything get generated in bulk by AI. As you said, metrics are so easy to bluff.
I don't blame the developers - as you said, it's the wrong incentive.
Interesting that dropping the MR in the shared channel worked for you - seems like it would increase the "bystander effect." Maybe that's because you incentivized it the right way.
That is exactly the failure mode.
The moment a metric gets tied to individual judgment in a low trust environment, people stop optimizing for the work and start optimizing for survival. You get faster ticket closure, quieter PRs, prettier dashboards ... and a worse system.
On the shared channel point, I think the difference is that we treated review as a team contribution, not as invisible extra credit. Once people know good review work actually matters, the bystander effect drops because the behavior is socially reinforced instead of taken for granted.
A lot of leaders say they want accountability, but what they really create is gaming. The design of the incentive decides which one shows up.
I get why large companies "manage by dashboard," but it definitely feels low-trust.
When I was a manager, at first I got frustrated when my devs gamed the system. Then I realized it was the natural consequence of the incentives in place. Don't hate the player, hate the game, or whatever.