Undercover at an exclusive AI conference for Leaders in Tech in Berlin.
Independent VSM consultant
16 Apr 2026
A day of listening at an AI engineering conference. Five problems people kept naming, three patterns I'd push back on — and the real question underneath, which isn't only mine: I keep hearing the same signal from colleagues far more experienced than me — technology consultants, agile and Kanban coaches, DevOps experts. What are they buying? For which problems? And why, when the answer looks obvious, is the cheque still so hard to get?
Across the panel, the talks, and the coffee-line conversations. Ranked loosely by how often they surfaced.
Output is multiplying. Outcomes aren't.
An engineer shipped five microservices with AI instead of one. Nobody pays per microservice. How do we know we're building the right things?
The exception that proved the rule: Intercom was the only talk that tied engineering effort to a single business metric — support cases resolved with the user's problem actually solved.
Picking what to build.
Getting the why and the what before the how. Working closer with customers to decide what actually matters.
Burnout and overload.
Context switching is up. More agents, more concurrent work. One CEO prescribed longer hours. Others pushed back.
Harness engineering (the weeds of agents, PRs, CD).
Everyone is building a version of this: orchestration, guardrails, feature flags, evals, the loop that keeps agents shipping safely. Different names, same problem.
Quality is degrading.
More code, more bugs. Speakers named agents that address quality in production as an unsolved problem.
Everyone agreed outcomes matter more than output.
Nobody said how.
My take, not consensus. I'd love to be wrong on any of them.
"More concurrent agents means more throughput."
Little's Law says the opposite. More work in progress, more context switching — longer lead times, longer feedback loops, worse customer outcomes. A concrete test: ask your team how many things each engineer has in flight right now. If the answer is more than two, your lead times are longer than they need to be.
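Little's Law makes the arithmetic concrete. A minimal sketch with hypothetical numbers (mine, not from any talk):

```python
def lead_time_days(wip_items: float, throughput_per_day: float) -> float:
    """Little's Law: average lead time = WIP / throughput."""
    return wip_items / throughput_per_day

# Same team, same throughput -- only work in progress changes:
print(lead_time_days(2, 1))  # 1 item finished/day, 2 in flight -> 2.0 days
print(lead_time_days(6, 1))  # 1 item finished/day, 6 in flight -> 6.0 days
```

Adding concurrent agents raises `wip_items` immediately; it only raises `throughput_per_day` if the delivery pipeline keeps pace. Until it does, lead times get longer.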
"We need to work longer hours to compete with the US."
This one actually gets under my skin. Everyone in the room agrees we should favour outcomes over output. When push comes to shove, we keep valuing output — and staying vague on how we'd even move from one to the other. "Work longer hours" assumes more hours produce better outcomes. They don't. It's the same output-over-outcome trap. The better question: what feedback loops would tell us we're working on the right thing in the first place?
The exciting bet of 2026 — and what it actually means underneath the buzzword. My take, informed by Dave Mangot and Bryan Finster.
Harness engineering = continuous delivery + the agent loop that keeps agents running the software delivery cycle.
— My working definition
The exciting part isn't building features. It's building the loop — and the continuous delivery primitives that let agents run it safely. Get this right and you get more features with fewer developers, faster time to market, and (if you invest in the infrastructure) genuinely good quality.
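To make "the loop" less abstract, here is a minimal sketch of the shape I mean. The names and gates are mine, not any vendor's API; assume `propose` wraps an agent generating a candidate change:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Change:
    description: str
    passes_evals: bool   # stand-in for running the real test/eval suite
    behind_flag: bool    # shipped dark behind a feature flag

def agent_loop(propose: Callable[[], Change], max_attempts: int = 3) -> Optional[Change]:
    """Harness sketch: every agent-proposed change must clear the same
    gates a human commit would before becoming a production candidate."""
    for _ in range(max_attempts):
        change = propose()
        if not change.passes_evals:
            continue  # reject: evals are the safety net, not human review
        if not change.behind_flag:
            continue  # reject: no dark launch, no deploy
        return change
    return None
```

The point isn't the ten lines. It's that the gates — evals, flags, the pipeline underneath — exist and are trusted enough to run unattended.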
Supporting pillars (table stakes).
Platform engineering. Self-serve environments, evals, CI. Without this, individual productivity gains don't translate into organisational throughput.
Good DevOps practices. Trunk-based development, small batches, fast feedback, blameless incident response — the boring stuff.
Continuous delivery. Every commit is a candidate for production. If the pipeline isn't safe, the agent loop amplifies your dysfunction.
I've run my VSM business for a year. VSM addresses every one of the five problems above. People nod. Nobody buys. That might not be a discovery problem so much as a buying problem.
People don't buy what hurts. They buy what their company already values enough to fund — what they can get a budget line for without a fight.
— A hypothesis, held loosely
If people see the problems, agree VSM solves them, and still don't buy — the pattern fits one of three explanations. The goal of discovery now is to figure out which.
Category mismatch. They'd buy "something" but not "that thing." VSM doesn't fit their mental model of what they purchase.
Budget mismatch. They believe in it, but their org doesn't value it enough to fund it. There's no internal budget line it naturally lands in.
Pain mismatch. They complain the way people complain about the weather. The hurt isn't excruciating enough to fund.
The same five problems, re-examined through a buying lens instead of a pain lens.
I made this table sitting in the back of the last talk. It's uncomfortable. The pains closest to my heart are the ones with no procurement category. The pains with real budget are exactly the ones I used to solve as a DevOps engineer — not as a VSM consultant. Harness engineering sits squarely in the first column.
| Pain | Do they buy for this? | Who sells it today |
|---|---|---|
| Output → outcomes | Rarely | Product coaches. Marty Cagan books. Hard to monetise. |
| Picking what to build | Indirectly | They buy "product strategy" or hire a CPO instead. |
| Burnout / too busy | No | Nobody buys their way out of burnout. They hire headcount. |
| Harness engineering | Yes | Thoughtworks. Platform engineering hires. Harness. CircleCI. Now: agentic-CD vendors. |
| Quality / more bugs | Yes | Testing vendors. SRE contractors. QA platforms. |
VSM is a meta-solution. People don't buy "find your bottleneck." They buy "migrate from manual Kubernetes to Terraform" or "product management training". The object-level thing is what goes on the purchase order. The meta-level thing can be what I deliver inside it.
Holding the pivot lightly. The discovery questions below are designed to falsify, not confirm.
Is this a positioning problem I can reframe my way out of — or is it a product problem I have to rebuild my way out of? And underneath both: is harness engineering something companies will buy as a service, or something every serious team builds in-house?
"Are you buying harness engineering as a service, or building it in-house — and who owns it?"
The load-bearing question. If nobody buys it, I'm not investing in building the capability. If they do, I want to know who signs the cheque.

"If you were going to solve this, what would you Google?"

Tests whether VSM is in anyone's category. If zero people say anything close, I have a language problem.

"Last time you brought in outside help for something like this — what was it called on the invoice?"

Literally tells me the line item that gets approved. The procurement-shaped answer.

"Is this a tool problem, a training problem, or a people-in-the-room problem?"

If they say "tool," VSM is dead on arrival. If "people-in-the-room," I'm alive.

"If I showed up Monday and fixed one thing, what would it be?"

The answer is the product I should be selling — not the one I want to be selling.

"What's your CTO asking you to prove about the AI spend, and what are you actually able to show them?"

The gap between ask and answer is where flow measurement consulting lives.

The most useful exchange of the day didn't happen on a stage. A Nordic engineering leader named the problem "moving from outputs to outcomes" and asked me what I'd do about it.
I gave her the answer I keep coming back to in my own work:
Align on a common goal that has clear benefits for both the organisation and the customer.
Align on how we're delivering that outcome — and on what's currently preventing us from delivering it.

Make the preventing forces visible, then remove the most impactful one. Rinse and repeat.

This is where flow measurement earns its keep: the gap between the outcome we said we wanted and the work we're actually doing.
Concrete next steps out of the day. The headline one sits at the bottom — everything else feeds it.
Follow up with the Nordic attendee to continue the outputs-to-outcomes conversation.
Test the three-step framing against someone who already named the problem out loud.

Research the companies and follow up with the people I met. Understand their problems, the metrics they actually track, and any events they're running next.

Turn warm hallway contact into structured discovery.

Run a lean-coffee-style focus group on these findings with my network plus a few event attendees.

Cheaper than ten 1:1s. Surfaces disagreement faster.

Collect industry signal: DORA 2025, McKinsey on developer productivity, Bryan Finster on harness engineering and agentic CD, Dave Mangot on agentic proficiency as the new SaaS premium.

Ammunition for the outputs-vs-outcomes conversation. Links in Further reading below.

Visit the community / incubator the growth guy mentioned. Also the crypto community he flagged.

Different rooms, possibly different problems, possibly different budgets.

Scan open roles at event sponsors and the companies whose speakers I heard.

Reveals where they're actually investing — often a better signal than what they say on stage.

Explore Merantix itself — open positions, partnerships, other angles worth a conversation.

They hosted. They have reach. Worth understanding what they're actually building.

Keep doing sales. Keep talking to people. Keep diagnosing their problems. Keep evolving the service offering until it matches something they'll actually fund.

The only one that matters. Everything above is in service of this.

Threads to pull on while I test the hypotheses above. Held lightly — I haven't digested all of them yet.
The clearest current framing of why harness engineering is a premium bet, not a feature. Mangot names four pillars underneath it — quality internal platforms, a healthy data ecosystem, AI-accessible internal data, and strong version control — and argues agentic proficiency is what now commands the private-equity SaaS multiple. Complementary to Finster's CD canon: Finster tells you how the pipeline has to behave; Mangot tells you why the market is starting to price it in.
Analysis of over 28 million CI workflows. The headline: teams are writing dramatically more code, but fewer changes are reaching production. AI is accelerating development while exposing the delivery bottleneck — validation, integration, and recovery aren't keeping pace. A small group of top performers turn higher change volume into faster, more reliable releases; most teams fall further behind. The empirical backbone for the harness-engineering argument.
Three decades of CI/CD experience, distilled into a minimum-viable definition — and, more recently, applied directly to agentic workflows. The clearest articulation I know of what "harness engineering" is actually rediscovering.
- Clarity Was Always the Bottleneck — on why AI didn't create the output-vs-outcome problem, it just made it impossible to hide.
- Agentic Architecture Patterns — single responsibility, explicit interfaces, and separation of concerns applied to skills, agents, and commands.
- minimumcd.org and bryanfinster.com — the underlying canon.
Nearly 5,000 responses. Headline finding: AI amplifies the organisational system you already have — good or bad. Useful ammo for the outputs-vs-outcomes conversation.
The productivity-gains framing that's driving CTO expectations right now. Worth reading so I know what the ask from above actually looks like — and where the measurement gap opens up.
425 responses across 37 countries, from a friend's survey. Flags strategy being sidelined by reactivity and discovery being under-prioritised — the same gap I keep hitting in sales calls.