The journal
Field notes · March 2026 · 12 min

Why most enterprise AI pilots never reach production — and the four conditions that change that.

Eighteen months of pilot work across three sectors taught us that the gap between a working demo and a deployed system is not a technical gap. It is an organisational one. Here is what we have learned to look for before we agree to build.

By
Graham Head
Chief Executive

There is a number that gets repeated at every AI conference: roughly 80 per cent of enterprise pilots never reach production. The figure is broadly correct. The diagnosis is almost always wrong. People assume the failure is technical — the model was not good enough, the data was too messy, the cost did not work. In our experience the technical work is rarely the thing that fails. What fails is everything that surrounds it.

We have spent the last eighteen months conducting post-mortems on stalled pilots — both ours, in the early years, and engagements we have inherited from other firms. The pattern is depressingly consistent. So is the prescription.

§02 Condition one: a single owner who is paid to ship

Pilots stall when no one in the building has authority to put the system into production. There is usually a sponsor who funded the experiment and a team that built it, but the route to live operation requires a third person — typically the operational owner of the workflow the system would change — and that person has not been engaged from week one.

The fix is unglamorous. Before we agree to a build, we ask who will be holding the new headcount budget when the pilot ends. If the answer is 'we will work that out later', we know what is going to happen later. We will not start an engagement without that person in the room.

§03 Condition two: a measurable definition of better

A pilot without a metric is a hobby. The metric must exist before the work starts; it cannot be invented retrospectively to justify the investment. It must be a number the business already cares about — not a derivative of a derivative. And it must be measurable on the timeline of the pilot, not 'eventually'.

We have killed three of our own engagements early because we could not define this number to the satisfaction of the client's CFO. That is the right outcome. Killing a project at week four is a victory; killing it at week thirty is a disaster.

§04 Condition three: a path through compliance, drawn before we build

In regulated industries — and increasingly outside them — the path to production runs through legal, compliance, security, and data protection. If those teams meet the system for the first time at user acceptance, the project is over. We have learned to insist on a 'go-live walk-through' with each of those functions during the design phase, before any code that will end up in production has been written.

It costs us six weeks of calendar time. It saves us six months of legal review. We have not had a deployment blocked by a compliance objection in three years; we have had eleven deployments where the design changed materially because of one.

§05 Condition four: an interface the actual user does not resent

This is the least technical and most important of the four. Most pilot interfaces are built for the demo, not for the desk. They make the system's capability visible because that is what gets the budget unlocked. Production interfaces have the opposite job: they should make the system's capability invisible, because the user is busy and has a job to do.

We have stopped accepting 'we will productionise the UI later' as an answer. The interface is the system. Six seconds of latency, two clicks too many, a label that uses the wrong word for the team: those are the things that decide whether the model ever gets used.

§06 What this means in practice

When a board asks us why their pilot has stalled, we ask four questions. Is there a single owner with the budget to ship? Is there a number the business already cares about? Has the compliance path been walked? Does the user actually want to open the tool?

If the answer to any of those is no, the model quality is not the problem. Fix the four conditions and a great many pilots quietly become production systems. Ignore them and the best model in the world will not help.
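
For teams that like a checklist they can run, the four questions can be sketched in a few lines of Python. This is a minimal illustration, not a tool described elsewhere in this piece: the `PilotReadiness` class and its field names are hypothetical, chosen only for this example. Each field simply records a yes or no answer to one of the four questions.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class PilotReadiness:
    """Hypothetical checklist: one yes/no answer per condition."""

    single_owner_with_budget: bool      # Is there a single owner with the budget to ship?
    metric_business_cares_about: bool   # Is there a number the business already cares about?
    compliance_path_walked: bool        # Has the compliance path been walked?
    user_wants_the_tool: bool           # Does the user actually want to open the tool?

    def unmet(self) -> List[str]:
        """Return the names of the conditions answered 'no'."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def ready(self) -> bool:
        """A pilot counts as ready only when every condition is met."""
        return not self.unmet()


# Example: everything is in place except the compliance walk-through.
pilot = PilotReadiness(
    single_owner_with_budget=True,
    metric_business_cares_about=True,
    compliance_path_walked=False,
    user_wants_the_tool=True,
)
print(pilot.ready())
print(pilot.unmet())
```

The sketch deliberately uses "all four or nothing" rather than a weighted score, because the argument above is that a single missing condition is enough to stall a pilot.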

About the author
Graham Head
Chief Executive

Pieces in the Journal are written by senior practitioners on the work that prompted them. If a paragraph here resonates with a problem you are looking at, the author is the person to reply to — direct lines beat anonymous inboxes.
