Going from Outputs to Outcomes.
Independent VSM consultant
12 May 2026
AI multiplies output, but it's not multiplying outcomes. How do we close the gap?
Themes from across the panel, the talks, and the coffee-line conversations at Prompt 2026, ranked loosely by how often they surfaced.
Output is multiplying. Outcomes aren’t.
An engineer shipped five microservices with AI instead of one. Nobody pays per microservice. How do we know we’re building the right things?
The exception that proved the rule: Intercom was the only talk that tied engineering effort to a single business metric — support cases resolved with the user’s problem actually solved.
Picking what to build.
Getting the why and the what before the how. Working closer with customers to decide what actually matters.
Burnout and overload.
Context switching is up. More agents, more concurrent work. One CEO prescribed longer hours. Others pushed back.
Harness engineering (the weeds of agents, PRs, CD).
Everyone is building a version of this: orchestration, guardrails, feature flags, evals, the loop that keeps agents shipping safely. Different names, same problem.
Quality is degrading.
More code, more bugs. Speakers named agents addressing quality in production as an unsolved problem.
Three-month window. The questions you’ll get — and what you now have answers for.
Best guess at your CTO’s quarterly targets. The last row is there for the one we haven’t guessed.
Local optimisation lives on Outputs. CTO targets are Outcomes.
Outputs:
- code shipped
- PRs merged
- deploys / week
- dev velocity
Outcomes:
- revenue-bearing features
- bad bets killed early
- end-to-end TTM
- quality (fewer incidents, faster recovery)
▸ This deck moves the conversation to the right.
Feature hit rate — what % of your bets paid off?
- ~700 engineers × ~€100k fully-loaded ≈ ~€70M / yr
- ~100 features shipped per year
- ~5–10 generate significant revenue
- Lead time idea-to-production: ??? — Why’s this the single most outcome-predicting metric?
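The arithmetic behind those bullets can be written out as a rough sketch. The inputs are the slide's own approximate figures; the function name `hit_rate_summary` is hypothetical, and the sketch makes the simplifying assumption that the whole engineering budget is attributed to feature work.

```python
# Rough figures from the slide; every one of them is an estimate ("~").
ENGINEERS = 700
COST_PER_ENGINEER_EUR = 100_000   # fully-loaded, per year
FEATURES_SHIPPED = 100            # per year
REVENUE_FEATURES = (5, 10)        # low and high estimate of "hits"


def hit_rate_summary(engineers, cost_per_engineer, shipped, revenue_features):
    """Return annual spend, hit rate, and cost per revenue-generating feature.

    Simplifying assumption: all engineering cost is attributed to features.
    """
    annual_spend = engineers * cost_per_engineer
    return {
        "annual_spend_eur": annual_spend,
        "cost_per_shipped_feature_eur": annual_spend / shipped,
        "hit_rate": revenue_features / shipped,
        "cost_per_revenue_feature_eur": annual_spend / revenue_features,
    }


for hits in REVENUE_FEATURES:
    s = hit_rate_summary(ENGINEERS, COST_PER_ENGINEER_EUR, FEATURES_SHIPPED, hits)
    print(
        f"{hits} hits: hit rate {s['hit_rate']:.0%}, "
        f"EUR {s['cost_per_revenue_feature_eur'] / 1e6:.0f}M per revenue-generating feature"
    )
```

Under these assumptions, moving from 5 to 10 hits halves the cost per revenue-generating feature with the same team and the same budget.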
Same team. Same budget. That’s the prize this deck is chasing.
What % of your bets paid off? What can we do to increase it?
Option (c) is the one we don’t want to assume. Reorgs and restructuring often show up at growth-stage companies but rarely make the slide. Worth naming what’s actually in scope before this conversation closes.
Same capacity, raise hit-rate and throughput = more revenue-bearing features. Two things need to move at the same time.
Lever 1:
- faster product discovery to experiment results
- engineering enables product leaders to experiment faster
- Look at: lead time from idea to production
Lever 2:
- more code = more quality problems
- no way around it except improve the system
- Look at: incidents, time to recover, rework eating capacity
Both are measurable. Both respond to the same intervention: seeing the value stream clearly and managing it together.
Lever 1 in action.
One pattern; yours may be different. Mapping the value stream is how you find out which structural waste is actually costing you. Local dev speed is not the bottleneck; the cadence between phases is a cross-stream constraint.
Same items, no sprint trap. Below the Gantt: where the measurement hooks would attach to a generic delivery stack.
Start by talking to teams about what data matters and how to collect it (this will change over time and is never finished). You’ll likely need to build custom aggregations on top of the tools you already have. Improve them continuously. Start small. Big-bang adoption fails.
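A custom aggregation can start very small. The sketch below assumes you can export timestamped stage events from the tools you already have; the field names (`item`, `stage`, `at`), the stage labels, and the sample data are all hypothetical.

```python
from datetime import datetime
from statistics import median

# Hypothetical events exported from existing tools (tracker, CI/CD).
# Field names, stage labels, and dates are assumptions for illustration.
events = [
    {"item": "F-1", "stage": "idea", "at": "2026-01-05"},
    {"item": "F-1", "stage": "production", "at": "2026-03-16"},
    {"item": "F-2", "stage": "idea", "at": "2026-01-12"},
    {"item": "F-2", "stage": "production", "at": "2026-02-23"},
    {"item": "F-3", "stage": "idea", "at": "2026-02-02"},  # not shipped yet
]


def lead_times_days(events, start="idea", end="production"):
    """Days from the first `start` event to the first `end` event, per finished item."""
    firsts = {}
    for e in events:
        key = (e["item"], e["stage"])
        ts = datetime.fromisoformat(e["at"])
        # Keep the earliest timestamp seen for each (item, stage) pair.
        if key not in firsts or ts < firsts[key]:
            firsts[key] = ts
    result = {}
    for (item, stage), started in firsts.items():
        if stage == start and (item, end) in firsts:
            result[item] = (firsts[(item, end)] - started).days
    return result


lead = lead_times_days(events)
print(lead, "median days:", median(lead.values()))
```

Items that have not reached production are left out rather than counted as zero, so the median only reflects finished work.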
Lever 2 in action. How more AI-generated code, without system-level feedback, leads to more incidents and harder recovery.
This is a system problem. You can’t fix it by reviewing harder. You fix it by making the system visible: where defects enter, how long they take to find, what the rework costs. That’s a value stream conversation.
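One way to make those three questions concrete, as a sketch only: group defect records by the stage where they entered, then look at how long they took to find and what the rework cost. The record fields and all values below are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical defect records; fields and values are illustrative only.
defects = [
    {"entered": "coding", "introduced": date(2026, 3, 2),
     "found": date(2026, 3, 20), "rework_hours": 16},
    {"entered": "coding", "introduced": date(2026, 3, 9),
     "found": date(2026, 3, 11), "rework_hours": 3},
    {"entered": "requirements", "introduced": date(2026, 2, 16),
     "found": date(2026, 4, 1), "rework_hours": 40},
]


def defect_summary(defects):
    """Per entry stage: defect count, mean days to find, total rework hours."""
    grouped = defaultdict(list)
    for d in defects:
        grouped[d["entered"]].append(d)
    summary = {}
    for stage, items in grouped.items():
        days = [(d["found"] - d["introduced"]).days for d in items]
        summary[stage] = {
            "count": len(items),
            "mean_days_to_find": sum(days) / len(days),
            "rework_hours": sum(d["rework_hours"] for d in items),
        }
    return summary


for stage, row in defect_summary(defects).items():
    print(stage, row)
```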
A real company. Complex release involving six teams in sequence. Touch time per stage is a few hours. Cycle time per stage is weeks. That gap is the waste.
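The gap between touch time and cycle time is often summarised as flow efficiency (touch time divided by cycle time). A minimal sketch follows; the per-stage numbers are invented to match "a few hours" and "weeks" and are not the real company's data.

```python
HOURS_PER_DAY = 24  # calendar time; an assumption, working hours would give a different ratio

# (stage, touch time in hours, cycle time in calendar days): illustrative values only
stages = [
    ("team 1", 3, 14),
    ("team 2", 4, 21),
    ("team 3", 2, 10),
    ("team 4", 5, 14),
    ("team 5", 4, 7),
    ("team 6", 6, 18),
]


def flow_efficiency(stages):
    """Share of elapsed time in which someone is actually working on the item."""
    touch_hours = sum(touch for _, touch, _ in stages)
    cycle_hours = sum(days * HOURS_PER_DAY for _, _, days in stages)
    return touch_hours / cycle_hours


print(f"flow efficiency: {flow_efficiency(stages):.1%}")
```

With numbers of this shape, almost all of the elapsed time is waiting between teams, which is why speeding up any single stage barely moves the total.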
The business complains: “Development and testing take too long.” “Deployments always go wrong.” But we are all causing the problems we complain about.
Why we invite all these teams to talk to each other.
Each team optimising locally was hurting the whole company, even while seeming to help itself.
- faster feedback
- kill bad bets earlier
- team time goes to features, not firefighting
- fewer production incidents
- less rework eating capacity
- happier, returning customers
- necessary tech investments become visible + justified
Siloed visibility: each team thinks they’re doing fine.
End-to-end visibility: teams pursuing local goals are hurting each other and the company.
Value stream metrics + talking to your teams show you where the problems are — and help teams find the solutions. I’m not a product coach. You need to get great at product discovery. But the value stream shows you where to look.
The path from a VSM conversation with your teams to the numbers your CTO is asking about.
Flow metrics aren’t a dashboard you admire. They’re how you kill bad bets earlier, ship the good ones faster, and walk into the quarterly review with the receipts.
Two phases to instrument first. Let’s pick them before our next session.
Two modes. A 90-day shape. How outcomes get measured. No price on this slide — that’s a conversation.
- remap one stream with your team
- agree the hooks together
- define what gets measured
- ~3 days/week in your standups
- instrument the hooks across your stack
- run the loop with one team · handoff at the end
Outcomes measured by:
- Lead time idea-to-production (leading)
- Quality: incident rate, time to recovery (leading)
- % features generating revenue (lagging, 12-month window)
- The band of hooks itself (the artifact you keep)
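The two quality measures can be sketched the same way. The definitions here are assumptions to agree with your teams, not a standard: incident rate is taken as incidents per deploy, and time to recovery as the mean of resolved minus opened. The sample records are hypothetical.

```python
from datetime import datetime

# Hypothetical incident log for one period; timestamps are illustrative.
incidents = [
    {"opened": "2026-04-02T10:00", "resolved": "2026-04-02T12:30"},
    {"opened": "2026-04-15T08:00", "resolved": "2026-04-15T09:00"},
    {"opened": "2026-04-28T22:00", "resolved": "2026-04-29T01:30"},
]
deploys_in_period = 60  # assumed count of production deploys in the same period


def quality_metrics(incidents, deploys):
    """Incidents per deploy and mean hours to recover (assumed definitions)."""
    hours = [
        (datetime.fromisoformat(i["resolved"])
         - datetime.fromisoformat(i["opened"])).total_seconds() / 3600
        for i in incidents
    ]
    return {
        "incidents_per_deploy": len(incidents) / deploys,
        "mean_hours_to_recover": sum(hours) / len(hours) if hours else 0.0,
    }


print(quality_metrics(incidents, deploys_in_period))
```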
No price on the slide. The conversation about scope, sequence, and commercials lives in the call.
Closing figure — the mechanism slide 02 promised, drawn. Step 4 is the bridge.