From pilot to production: the bridge most companies never build
The pilot graveyard is bigger than the vendor decks suggest
IBM's 2024 research found that the average enterprise AI initiative delivers a return on investment of around 5.9%, well below the cost of capital for most organisations. The RAND Corporation reported that more than 80% of AI projects fail, twice the rate of non-AI IT projects. Boston Consulting Group's 2024 AI at Work survey of 1,800 executives found that only 26% had moved beyond pilots to generate tangible value.
Pick whichever number you find most credible. The picture is the same. Most pilots don't make it.
The reason isn't the model. It isn't the platform. It isn't the data, although the data is rarely clean. The reason is that an AI pilot and an AI production deployment are completely different beasts, and most organisations resource them as if they were the same project at different stages.
They aren't. A pilot is a research exercise with a deadline. A production deployment is an operational system with users, SLAs, audit trails, change processes, and a roadmap. The skills, the team, the governance, and the budget are different. If you don't acknowledge that, you don't get from one to the other.
What "production" actually means
When a vendor says your pilot is "ready for production," ask them what they mean. In most cases, they mean the model works on a sample dataset and the demo runs without crashing. That's not production. That's a working prototype.
Production means:
The system runs reliably for real users on real data, every day
There is monitoring for accuracy, latency, cost, and drift
There is a defined process for handling failures, including who gets paged at 2am
The output is integrated into a workflow that someone is accountable for
There are controls for data privacy, IP, and regulatory exposure
There is a budget line for ongoing operation, not just build
There is a person, or ideally a team, whose job is to keep it running and improving
If your pilot can't tick those boxes, it isn't ready. And the work to get from "demo runs cleanly" to "all of the above" is usually two to four times the work of the pilot itself. This is the bridge most companies never build, because they didn't budget for it and didn't realise it existed.
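The monitoring item on that list is the one teams most often under-specify, so here is a minimal sketch of what a daily health check across accuracy, latency, cost, and drift might look like. Every threshold, metric name, and figure below is hypothetical; a real deployment would pull these from its own observability stack and agree the thresholds with the production owner.

```python
# Hypothetical daily health check for a deployed model.
# All thresholds and metric sources are illustrative, not prescriptive.
from dataclasses import dataclass

@dataclass
class DailyMetrics:
    accuracy: float         # share of sampled outputs judged correct (0-1)
    p95_latency_ms: float   # 95th-percentile response time
    cost_per_day_usd: float
    drift_score: float      # e.g. population stability index vs. training data

THRESHOLDS = {
    "accuracy_min": 0.85,        # below this, quality is degrading
    "p95_latency_ms_max": 2000,
    "cost_per_day_usd_max": 450.0,
    "drift_score_max": 0.2,      # PSI above ~0.2 usually warrants a retrain review
}

def health_check(m: DailyMetrics) -> list[str]:
    """Return a list of breached checks; empty means healthy."""
    breaches = []
    if m.accuracy < THRESHOLDS["accuracy_min"]:
        breaches.append(f"accuracy {m.accuracy:.2f} below floor")
    if m.p95_latency_ms > THRESHOLDS["p95_latency_ms_max"]:
        breaches.append(f"p95 latency {m.p95_latency_ms:.0f}ms over ceiling")
    if m.cost_per_day_usd > THRESHOLDS["cost_per_day_usd_max"]:
        breaches.append(f"daily cost ${m.cost_per_day_usd:.0f} over budget")
    if m.drift_score > THRESHOLDS["drift_score_max"]:
        breaches.append(f"drift score {m.drift_score:.2f} suggests retraining")
    return breaches

if __name__ == "__main__":
    today = DailyMetrics(accuracy=0.82, p95_latency_ms=1400,
                         cost_per_day_usd=510.0, drift_score=0.11)
    for breach in health_check(today):
        print("ALERT:", breach)  # in production, this is where someone gets paged
```

The code itself isn't the point. The point is that each of the four dimensions has an agreed threshold and a defined consequence, which is what "who gets paged at 2am" means in practice.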
Why pilots get stuck
In our AI implementation work with enterprise clients, the same patterns show up again and again. We've covered some of these in why most enterprise AI pilots stall, but the production-specific ones are worth pulling out.
No production owner. The pilot was run by an innovation team or a data science group. Nobody in the operating business has signed up to own the system once it's live. When you ask "who runs this in production?", you get blank looks.
No integration plan. The pilot output sits in a standalone interface. Getting it into the CRM, the case management system, or the agent's daily workflow is a separate project that nobody scoped or funded.
No change management. The people who will use the system day-to-day weren't involved in the pilot. They first hear about it when they're told to start using it next Tuesday. Adoption craters.
No governance signoff. Risk, legal, privacy, and security weren't engaged early. When the pilot is ready to scale, they raise concerns that should have been answered six months ago, and the project goes back into the queue.
No cost model. The pilot ran on credits or a discounted licence. Nobody calculated what 5,000 users at production volume actually costs, and when finance sees the real number, the business case collapses.
Each of these is a separate failure mode, but they share a root cause. The pilot was treated as a destination, not a stepping stone.
The bridge: five things to do before you start the pilot
The single biggest predictor of whether an AI pilot reaches production is what you decide before you start it. Once the pilot is running, it's almost too late to fix structural problems.
1. Name the production owner on day one. Not the sponsor. The person whose team will operate the system once it's live. They sit in pilot reviews. They have veto rights on scope. If no operating leader will own it, you don't have a real use case.
2. Define the production success metric, not the pilot success metric. "The model achieves 85% accuracy on the test set" is a pilot metric. "Average claim handling time drops from 14 minutes to 9 minutes for the customer service team" is a production metric. If you can't write the production metric down, the pilot won't have a target to aim at.
3. Engage governance before the pilot, not after. Bring risk, privacy, legal, and security into the room when you're scoping the use case. Get their early read on what would block production. We covered the practical version of this in our AI risk management framework. Most governance teams will work with you if you bring them in early. If you bring them in late, they'll block you.
4. Cost the production deployment, not the pilot. Build a model for what 12 months of production operation looks like at full user volume. Include licences, infrastructure, integration work, monitoring, retraining, and the full-time equivalents needed to run it. If the production cost kills the business case, you've learned that before you spent six months on a pilot.
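To make the discipline concrete, here is a back-of-the-envelope version of that model. Every figure is a hypothetical placeholder, not a price list; real numbers vary wildly by vendor, volume, and architecture. What matters is that the model forces the full-scale cost conversation before the pilot starts.

```python
# Back-of-the-envelope 12-month production cost model.
# Every figure is a hypothetical placeholder; substitute your own quotes.

users = 5000
requests_per_user_per_day = 20
working_days = 250

licence_per_user_per_month = 30.0     # vendor seat licence
cost_per_1k_requests = 0.50           # inference / API cost
infra_per_month = 8000.0              # hosting, networking, monitoring stack
integration_one_off = 150_000.0       # wiring into CRM / case management
ftes = 2.5                            # MLOps plus change lead, blended
fte_fully_loaded_annual = 140_000.0

annual_requests = users * requests_per_user_per_day * working_days
inference = annual_requests / 1000 * cost_per_1k_requests
licences = users * licence_per_user_per_month * 12
infra = infra_per_month * 12
people = ftes * fte_fully_loaded_annual

total_year_one = licences + inference + infra + integration_one_off + people
print(f"Annual requests:       {annual_requests:,}")
print(f"Licences:              ${licences:,.0f}")
print(f"Inference:             ${inference:,.0f}")
print(f"Infrastructure:        ${infra:,.0f}")
print(f"Integration (one-off): ${integration_one_off:,.0f}")
print(f"People:                ${people:,.0f}")
print(f"Year-one total:        ${total_year_one:,.0f}")
```

Even with these made-up numbers, the shape of the answer is instructive: seat licences dwarf inference costs at 5,000 users, which is exactly the line item that surprises finance when the pilot ran on credits.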
5. Plan the workforce change. Who needs to learn what for this system to actually be used? When? How? This is where most rollouts quietly fail. The system goes live, the training is a 20-minute video, adoption is 12%, and six months later someone quietly turns the project off. We've written about how to avoid that in our work on enterprise AI training.
The team you need is different from the team you had
The pilot team is usually three to five people: a data scientist or ML engineer, a domain expert, a product or project manager, and maybe a designer. The production team needs more.
You need an operating owner. You need MLOps or platform engineering capability to run the system, not just build it. You need an integration engineer to wire the model into the workflow tools your users actually use. You need a change lead to drive adoption. You need a measurement function that reports to both the CIO and the CFO on whether the thing is working. We've laid this out in detail in the five roles every enterprise AI initiative actually needs.
If you don't staff the production team, the pilot team ends up doing it. They burn out, they can't run new pilots, and the pipeline collapses. The insurer with nineteen pilots and three production systems? That's exactly what happened. Their data team was running everything, and there were only so many of them.
What to do this quarter
If you're sitting on pilots that haven't shipped, run a triage. For each one, answer four questions in writing.
Who is the production owner in the operating business?
What is the measurable production outcome and by when?
What does it cost to run for 12 months at full scale?
What capability does the workforce need to use it?
If you can't answer any of those for a given pilot, kill it or pause it. Don't keep it on life support.
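If it helps to make the triage mechanical, the rule can be written down in a few lines. The field names and the example pilot below are ours, invented for illustration, not a standard template.

```python
# Pilot triage: any unanswered question means pause or kill.
# Field names and the example are illustrative, not a standard template.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PilotTriage:
    name: str
    production_owner: Optional[str]    # named person in the operating business
    production_outcome: Optional[str]  # measurable outcome and deadline
    annual_run_cost: Optional[float]   # 12 months at full scale
    workforce_plan: Optional[str]      # who learns what, when, how

    def verdict(self) -> str:
        answers = [self.production_owner, self.production_outcome,
                   self.annual_run_cost, self.workforce_plan]
        return "proceed" if all(a is not None for a in answers) else "pause or kill"

pilot = PilotTriage("claims-summariser",
                    production_owner="Head of Claims Operations",
                    production_outcome="handling time 14min -> 9min by Q3",
                    annual_run_cost=None, workforce_plan=None)
print(pilot.name, "->", pilot.verdict())  # pause or kill: cost and training unanswered
```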
For pilots that aren't running yet, don't start them until those four questions are answered. The discipline of answering them upfront is worth more than the speed of starting fast and stalling out. We see this in our implementation work every month: the projects that move quickest to production are the ones that moved slowest at the start.
The bridge between pilot and production is not technology. It's accountability, measurement, governance, cost, and capability. Build the bridge first. Then walk across it.
Ijan Kruizinga
Co-founder of Better People. 20+ years across technology and marketing leadership. Previously CEO of Crucial, CEO/COO of OMG and Jaywing.