
Why your AI strategy needs a data strategy first

Most AI projects fail not because of the model, but because the data isn't ready. Here's why the data strategy has to come first, and what building it actually involves.

Stewart Masters · 29 Mar 2026 · 6 min read
[Figure: the relationship between data strategy and AI strategy, with data as the foundation layer]

The majority of organisations approaching AI adoption start in the wrong place. They start with the model: which AI tool to use, which vendor to partner with, which use cases to prioritise. This is understandable. The model is the exciting part. The data is the unglamorous part that nobody wants to fund or talk about. But getting the order wrong is why most AI projects deliver so little.

The order matters more than the technology

AI, whether you're using foundation models, building custom systems, or deploying off-the-shelf tools, depends entirely on data. The quality of your AI output is a direct function of the quality, completeness, and structure of your data. A world-class model with mediocre data produces mediocre results. A well-designed AI application with clean, consistent, well-governed data can produce remarkable results.

Most organisations don't have good data. They have data spread across disconnected systems, with inconsistent definitions, poor governance, and significant gaps. They've never had to confront this because the previous generation of reporting tools could be configured to work around these problems. AI can't. It exposes them immediately.

What data readiness actually means

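Data readiness is not just about volume. Bigger datasets don't automatically produce better AI. What matters is the quality, completeness, and structure of the data, and whether it is governed well enough to be used.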

The four data problems that kill AI projects

Across multiple AI implementations, the same failure patterns recur. Understanding them early saves significant time and money.

Siloed systems with no single truth. The most common problem: customer data in CRM, financial data in ERP, operational data in a separate system, and no reliable way to connect them. Every AI project that requires a unified view of the business, which is most of the valuable ones, hits this wall immediately.

Data that's collected but not structured for use. Organisations often have vast quantities of data they can't use because it was collected without thinking about downstream applications. Unstructured notes, inconsistent tagging, free-text fields where structured fields were needed. Cleaning this data is laborious and expensive.

Governance gaps that create legal and ethical risk. AI that uses personal data creates compliance obligations that many organisations haven't addressed. Understanding what data you hold, where it came from, whether you have the right to use it for AI, and how to manage it appropriately is a legal requirement, not just good practice.

Data pipelines that don't exist yet. Some AI use cases require data that simply isn't being collected. Before you can build the AI, you have to build the infrastructure to capture the data, which may take months and require changes to operational systems.
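The first two problems can be measured before any AI work starts. What follows is a minimal sketch, not a definitive method: it assumes each system can export its records as a list of dictionaries, and the function names, the `customer_id` key, and the `segment` field are all hypothetical. It reports what share of CRM records can be linked to the ERP system, and how often a given field is actually filled in.

```python
def link_rate(crm_records, erp_records, key="customer_id"):
    """Share of CRM records whose key also appears in the ERP extract.

    `key` is a hypothetical shared identifier; real systems may need
    fuzzy matching on names or emails instead.
    """
    if not crm_records:
        return 0.0
    erp_keys = {r.get(key) for r in erp_records if r.get(key)}
    linked = sum(1 for r in crm_records if r.get(key) in erp_keys)
    return linked / len(crm_records)


def completeness(records, field):
    """Share of records where `field` is present and not blank."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if str(r.get(field) or "").strip())
    return filled / len(records)


if __name__ == "__main__":
    # Illustrative records only, not real data.
    crm = [
        {"customer_id": "C1", "segment": "retail"},
        {"customer_id": "C2", "segment": ""},
        {"customer_id": None, "segment": "wholesale"},
    ]
    erp = [{"customer_id": "C1"}, {"customer_id": "C9"}]
    print(f"CRM records linked to ERP: {link_rate(crm, erp):.0%}")
    print(f"Segment filled in: {completeness(crm, 'segment'):.0%}")
```

A low link rate points to the silo problem; a low completeness score points to data that was collected but not structured for use. Neither number says anything about governance, which still needs a separate review.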

What building a data foundation looks like

A data strategy for AI doesn't need to be comprehensive from day one. It needs to be sequenced correctly. Start with the use cases you're planning to pursue first, and work backwards to understand what data those use cases require. This is a much more tractable problem than trying to fix all your data before starting.

The practical steps: audit what data you have and where it lives; identify the gaps relative to your priority AI use cases; establish clear data ownership and governance; build or procure the infrastructure to move, clean, and serve data to AI systems; and define the data quality standards the organisation will hold itself to.
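The first three steps can be sketched as a simple gap analysis. This is an illustration under stated assumptions rather than a prescribed tool: the use case names, dataset names, and the shape of the inventory (a dictionary recording the source system and owner of each dataset) are all hypothetical.

```python
def find_gaps(use_cases, inventory):
    """For each use case, list required datasets that are missing or unowned.

    use_cases: {use_case_name: [required_dataset, ...]}
    inventory: {dataset_name: {"system": str, "owner": str or None}}
    """
    gaps = {}
    for name, required in use_cases.items():
        problems = []
        for dataset in required:
            entry = inventory.get(dataset)
            if entry is None:
                problems.append(f"{dataset}: not collected")
            elif not entry.get("owner"):
                problems.append(f"{dataset}: no owner")
        gaps[name] = problems
    return gaps


if __name__ == "__main__":
    # Hypothetical priority use cases and data audit results.
    use_cases = {
        "demand_forecast": ["sales", "weather"],
        "churn_model": ["crm_contacts"],
    }
    inventory = {
        "sales": {"system": "ERP", "owner": "finance"},
        "crm_contacts": {"system": "CRM", "owner": None},
    }
    for use_case, problems in find_gaps(use_cases, inventory).items():
        print(use_case, "->", problems or "ready")
```

Working from use cases back to datasets keeps the audit focused on the data the first projects need, rather than on everything the organisation holds.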

How long it takes, and why that matters

This is where most AI roadmaps underestimate the timeline. Data foundation work, depending on the complexity of your systems landscape, takes months, not weeks. Organisations that haven't done this work and are promising AI outcomes in 60 days are either working on very narrow, low-risk use cases or setting themselves up for disappointment.

The honest planning assumption for a meaningful AI implementation is 3–6 months of data foundation work before you can reliably build on top of it. That doesn't mean you can't run experiments or learn during that period. But it does mean that the AI use cases that matter to the business will take longer than the vendor demos suggest.

Starting before you're fully ready

The data foundation work and the AI strategy work don't have to be sequential. The most effective approach is to run them in parallel: start scoping your priority use cases while simultaneously auditing your data. Use the use case scoping to inform which data problems to fix first. Use the data audit to inform which use cases are actually viable in a realistic timeframe.

The key is to be honest about what's a proof of concept and what's a production system. Proofs of concept can work with imperfect data. Production systems can't. Build the data foundation before you commit to production timelines, and include the data investment in your AI budget from the start.

Stewart Masters
Chief Digital Officer · Honest Greens · Barcelona

20 years building and running digital operations inside real businesses. I write about AI, digital systems, and the leadership decisions that determine whether transformation actually happens.
