Most large organisations have now approved an AI budget. The pressure to do so arrived from multiple directions at once: board expectations, competitor announcements, vendor roadmaps, and a genuine belief that the window for advantage was closing. Approving the budget felt like the decision. In many cases, it turned out to be the easier one.
The harder one, deciding whether the data is actually in the state the AI initiative depends on, often surfaces later. Sometimes much later, and usually when the project is already in motion.
Budget approval opens a familiar sequence. A tool is selected, a vendor is engaged, and a timeline is set. The team is motivated and there is executive visibility. Delivery starts well.
The data work is where the picture typically changes. The tool requires inputs that are cleaner than what currently exists. The pipeline that was built to feed it was designed for a different purpose. A dataset that looked suitable turns out to include test accounts, historical anomalies, or fields that are no longer actively maintained. The people who know why the data looks the way it does have moved on, or are hard to get time with.
The timeline adjusts. The first milestone slips. The team that was building the capability is now rebuilding the foundation underneath it. The vendor is cooperative, but the core constraint is not in the vendor's product; it is in the data the product depends on.
By the time twelve months have passed, the organisation has spent its AI budget and has something that partially works, a better understanding of what its data actually looks like, and a revised sense of how long this kind of work takes.
The direct costs are visible: delayed timelines, extended vendor contracts, additional engineering time spent on remediation that was not in the original scope.
The indirect costs tend to be larger but take longer to surface.
When AI outputs are not trusted, decisions do not change. Teams continue using the manual processes they relied on before, while also managing the AI project alongside them. The efficiency gain the initiative was meant to deliver does not materialise, and the period during which both processes run in parallel is expensive in ways that do not show up cleanly in a project budget.
Remediation also tends to expand. What starts as a focused effort to fix the data inputs for one use case typically uncovers downstream dependencies that were built on assumptions that no longer hold. Each fix reveals the next one. The scope that was meant to be contained becomes broader than anyone initially anticipated.
There is also a longer-term credibility effect. When a project delivers less than the timeline suggested it would, the data team's ability to secure support for the next initiative is more constrained, even if the next initiative is better scoped and more realistic. Scepticism built in one cycle carries into the next.
The budget conversation and the data readiness conversation tend to happen in the wrong order, partly because they involve different kinds of decisions.
The budget conversation is legible to leadership. It has a proposal, a vendor, a timeline, and a business case. It fits into existing procurement and approval processes. It is the kind of decision organisations are structured to make.
The data readiness conversation is harder to surface in a budget discussion. It involves honest assessments of quality gaps, unclear ownership, undocumented pipelines, and decisions that were reasonable at the time they were made and have not been revisited since. It does not fit neatly into a slide. It also tends to produce answers that complicate rather than accelerate the timeline, which makes it easy to defer.
Organisations that are getting consistent results from AI projects are not necessarily the ones with the largest budgets or the most sophisticated tools. They tend to be the ones that asked the data readiness questions before the project scope was set: which datasets does this depend on, who owns them, how do they move, and what are the known quality issues. When the answers to those questions are documented and addressed before the build begins, the project has a realistic foundation to work from.
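For teams that want to make those four questions concrete, one option is to keep a small inventory with one record per dataset and list what is still unanswered. The sketch below is only an illustration under our own assumptions: the names DatasetReadiness and open_gaps, the field names, and the example datasets are all hypothetical, and a real assessment would cover far more than this.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetReadiness:
    """One record per dataset the project depends on (hypothetical structure)."""
    name: str                                 # which dataset the project depends on
    owner: Optional[str] = None               # who owns it
    movement: Optional[str] = None            # how it moves (pipeline, export, API)
    known_issues: List[str] = field(default_factory=list)  # known quality issues
    issues_reviewed: bool = False             # has anyone actually looked for issues?


def open_gaps(record: DatasetReadiness) -> List[str]:
    """Return the readiness questions that are still unanswered for one dataset."""
    gaps = []
    if not record.owner:
        gaps.append("no named owner")
    if not record.movement:
        gaps.append("data movement undocumented")
    if not record.issues_reviewed:
        gaps.append("quality issues not yet reviewed")
    # Documented issues remain gaps until they are addressed and removed.
    gaps.extend(f"unresolved: {issue}" for issue in record.known_issues)
    return gaps


# Illustrative entries only; these datasets are invented for the example.
inventory = [
    DatasetReadiness(
        name="customer_accounts",
        owner="CRM team",
        movement="nightly batch export",
        known_issues=["includes test accounts"],
        issues_reviewed=True,
    ),
    DatasetReadiness(name="billing_history"),
]

for record in inventory:
    print(record.name, open_gaps(record))
```

An empty list for every dataset is a reasonable signal that the scope conversation can start; anything else is a list of items to resolve or to plan time for.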
For organisations already mid-project where the data work is taking longer than expected, slowing down to assess the foundation clearly and resetting the scope around what the data can actually support usually produces better outcomes than continuing to push for the original timeline.
For organisations earlier in the process, where the budget has been approved but the build has not started, there is a genuine opportunity to get ahead of the pattern. A data readiness assessment at this stage costs a fraction of what course correction costs twelve months in, and produces a much more accurate picture of what the project timeline should actually be.
That is the kind of work we do, both before organisations start building and, often, with teams that are partway through and want to understand what they are actually dealing with.