I’ve been in the game for a long time. I even had the opportunity to work for the first true digital analytics platform. From my earliest years as a digital analyst, I have loved data layers. I understood the power of having clean, governed, organized, contextually relevant data collected at every point in my clients’ ecosystems. That was my job, and I loved it.
Now the conversations have changed. First as CTO of Choreograph, working with many of the world’s largest advertising brands, and now with prospects as Global Field CTO at Tealium, I have had the privilege of watching so many different clients attempt, rather ambitiously, to keep up with their competitors in every way possible through AI.
In nearly every client meeting I’ve had this year, the conversation is the same: AI. Everyone is investing in AI-powered personalization, AI-influenced advertising, AI-driven agents, and AI-native analytics. And in almost every follow-up conversation a few months later, I’m hearing the same frustrated question: Why isn’t it working?
The demos were stunning. The promise was revolutionary. But the reality is that the “hyper-personalized” emails are still saying “Hello, [FNAME],” the predictive audiences are underperforming, and the generative agents are confidently producing nonsense.
Here’s the hard truth: this isn’t an AI failure. It’s a data failure.
We are trying to run sophisticated, Formula 1-level AI engines by filling their tanks with mud.
For the last decade, my clients and prospects have been buying new tools and stacking them on top of a data foundation made of digital sand. That foundation, the “data layer,” has always been the most critical and most neglected piece of the entire marketing stack. Now, as AI exponentially increases the demand for clean, contextual data, that foundation is collapsing.
A data layer isn’t just your database or CDW. It also isn’t just a JavaScript object that lives on your site or app. It’s the living, breathing infrastructure—the plumbing and wiring—that collects, standardizes, and transmits data between all your systems in real time. It’s the universal translator that ensures “user purchased a product” means the exact same thing to your website, your CRM, your ad platform, and your analytics.
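In practice, that translator often starts as exactly one agreed-upon shape for every event. Here’s a minimal sketch, loosely following the common GTM-style `dataLayer` convention (the field names are illustrative, not a prescribed standard):

```javascript
// A single, canonical "purchase" event. Every downstream system --
// analytics, CRM, ad platforms -- reads this one shape instead of
// inventing its own.
window.dataLayer = window.dataLayer || [];
window.dataLayer.push({
  event: "purchase",      // one agreed-upon event name
  order_id: "ORD-10482",
  currency: "USD",        // explicit, never implied
  value: 149.0,
  items: [
    {
      sku: "123-BLU",
      name: "Men's Blue V-Neck Cashmere Sweater",
      quantity: 1,
      price: 149.0,
    },
  ],
});
```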
For years, everyone has been papering over the cracks of a bad data layer. Clients who haven’t taken a strategic view of their data collection and organization have simply accepted these failures as the “cost of doing business.”
I see it constantly in e-commerce. A customer adds a “Men’s Blue V-Neck Cashmere Sweater” to their cart.
- The website’s event is `event: 'add_to_cart', sku: '123-BLU'`.
- Your analytics tool, set up by a different team, records it as `event: 'cart_add', product_name: 'Cashmere V-Neck (Blue)'`.
- Your email platform just gets `event: 'item_in_basket', product_id: '123-BLU'`.
Because no two systems can agree, the customer gets a generic “You left something in your cart!” email instead of a targeted ad for that specific sweater. Or, worse, no message triggers at all because the journey errors out on the mismatch. The impression is wasted and the sale is lost.
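A first step toward fixing this is a translation layer that reconciles the divergent names into one canonical event before anything fans out downstream. A hypothetical sketch, using the three payloads above:

```javascript
// Hypothetical translation layer: reconcile three divergent "add to cart"
// events into one canonical shape before fanning out to downstream systems.
const EVENT_ALIASES = {
  add_to_cart: "add_to_cart",
  cart_add: "add_to_cart",
  item_in_basket: "add_to_cart",
};

function normalizeCartEvent(raw) {
  return {
    event: EVENT_ALIASES[raw.event] || raw.event,
    sku: raw.sku || raw.product_id || null, // two names for the same field
    product_name: raw.product_name || null, // can be enriched from the catalog
  };
}

// All three systems' payloads now resolve to the same event.
console.log(normalizeCartEvent({ event: "item_in_basket", product_id: "123-BLU" }));
// -> { event: 'add_to_cart', sku: '123-BLU', product_name: null }
```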
The stakes are even higher in B2B.
A high-value prospect from a target account downloads a critical “Cybersecurity Implementation Guide.” This is a burning-hot buying signal. But the form submission from jane.doe@bigcorp.com can’t be automatically matched to the “BigCorp Inc.” account in the CRM because of a data-entry typo from six months ago. So instead of routing this lead to the Account Executive, the system creates a new, orphaned lead. That multi-million-dollar opportunity gets dropped into a generic newsletter sequence and goes cold.
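One hedge against exactly this failure is to match on something typo-resistant, like the email’s corporate domain, rather than on a hand-typed account name. A hypothetical sketch (the account record and its `domain` field are invented for illustration):

```javascript
// Hypothetical fallback: match leads to CRM accounts by email domain,
// so a typo in the hand-entered account name can't orphan the lead.
const crmAccounts = [
  { id: "acct-042", name: "BigCrop Inc.", domain: "bigcorp.com" }, // note the typo in "name"
];

function matchAccount(email) {
  const domain = email.split("@")[1].toLowerCase();
  return crmAccounts.find((acct) => acct.domain === domain) || null;
}

console.log(matchAccount("jane.doe@bigcorp.com"));
// -> { id: 'acct-042', ... } despite the "BigCrop Inc." typo
```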
These problems were always costly, and always avoidable with a consistent, well-formed data layer. Now, AI has turned them from small, costly problems into catastrophic, system-wide failures.
Let’s look at what actually happens when your new AI tools try to run on this broken data.
First, you get the silent failure of predictive audiences.
A marketer asks their new AI, “Build me an audience of high-value customers who are at risk of churning.” The AI tries to query the data, but the `customer_ltv` values are a mix of USD and CAD with no currency field. The `last_session_date` from your analytics is a raw timestamp, and the support-ticket data it needs to join against is tied to an email address, not a unified `user_id`. The AI doesn’t stop. It makes a “best guess.” It improperly joins the data, treats 500 CAD as 500 USD, and builds you an “audience” of 10,000 users. You trust it and launch a $100,000 retention campaign, only to discover you gave steep discounts to a random assortment of low-value, happy customers. The AI gave you a confident, expensive, and completely wrong answer.
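The antidote is a data layer that refuses to guess. A hypothetical guardrail (the exchange rate is a placeholder) that fails loudly instead of silently treating 500 CAD as 500 USD:

```javascript
// Hypothetical guardrail: LTV conversion that fails loudly on a missing
// currency instead of letting a model make a silent "best guess."
const RATES_TO_USD = { USD: 1.0, CAD: 0.73 }; // placeholder rates

function ltvInUsd(amount, currency) {
  if (!(currency in RATES_TO_USD)) {
    throw new Error(`Unknown or missing currency: ${currency}`);
  }
  return amount * RATES_TO_USD[currency];
}

console.log(ltvInUsd(500, "CAD")); // 365 -- not silently treated as 500 USD
// ltvInUsd(500, undefined);       // would throw, surfacing the data gap
```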
Second, you get the contextual hallucination from generative AI.
You want to send a push notification, so you tell the AI: “Send a note to users who viewed the ‘Men’s Wool-Blend Cardigan’ but didn’t buy.” But your data payload is a mess: `product_name: 'MWBCardigan_v2_FINAL'` and `category: 'null'`. The AI, forced to fill in the gaps, sends a message that damages your brand: “Still thinking about that MWBCardigan_v2_FINAL? Check out our other items in null.” It looks robotic and broken, and it erodes customer trust instantly.
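A simple pre-send check on the payload would have caught this. A hypothetical sketch (the “clean” example values are invented):

```javascript
// Hypothetical pre-send check: block generative copy when required context
// fields are missing, literal "null" strings, or internal-looking names.
function isRenderable(payload) {
  const looksBroken = (v) =>
    v == null || v === "null" || /_v\d+|_final/i.test(String(v));
  return !looksBroken(payload.product_name) && !looksBroken(payload.category);
}

console.log(isRenderable({ product_name: "MWBCardigan_v2_FINAL", category: "null" }));
// -> false: hold the message, flag the data

console.log(isRenderable({ product_name: "Men's Wool-Blend Cardigan", category: "Knitwear" }));
// -> true: safe to generate and send
```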
Most damagingly, you get total agent failure.
This is where the entire promise of an AI-powered future falls apart. You give a marketing agent what seems like a simple command: “Find all ‘Gold’ tier users in New York or Chicago, check the weather, and if it’s dropping below 40°F, send them the ‘Winter Coat’ campaign.” The agent fails immediately. It can’t find ‘Gold’ because the `loyalty` field is an integer, not a string. It can’t query by “New York” because the `city` field is free text, with “NY,” “NYC,” and “new york” all stored as different values. It can’t check the weather because the weather API needs a postal code, which you only have for half your users. The agent’s response? “I’m sorry, I cannot complete that request.” Your powerful, expensive AI has been reduced to a useless error message.
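Every one of those failures is a schema problem the agent inherits. Normalize the data once and the same command suddenly becomes answerable, as the sketch below shows (the tier mapping and city aliases are assumptions for illustration):

```javascript
// Hypothetical normalization that makes the agent's query answerable:
// integer loyalty tiers get names, free-text cities get canonical values.
const TIER_NAMES = { 1: "Bronze", 2: "Silver", 3: "Gold" }; // assumed mapping
const CITY_ALIASES = {
  ny: "New York",
  nyc: "New York",
  "new york": "New York",
  chicago: "Chicago",
};

function normalizeUser(user) {
  return {
    ...user,
    loyalty_tier: TIER_NAMES[user.loyalty] || "Unknown",
    city: CITY_ALIASES[String(user.city).trim().toLowerCase()] || user.city,
  };
}

console.log(normalizeUser({ loyalty: 3, city: "NYC" }));
// -> { loyalty: 3, city: 'New York', loyalty_tier: 'Gold' }
```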
If clients want AI to “just work” and their investments to pay off, they have to fix their data layer. And context is as important as data quality: AI-powered tools don’t have your business context unless you give it to them through your data layer.
We have to stop blaming the tools.
AI isn’t the problem. It’s the plumbing. The data layer.
Your first-party data is your single most valuable, defensible asset in a cookieless world. Unfortunately, for most companies right now, that asset is a liability: an unusable, toxic mess that is actively sabotaging your most expensive new investments.
The most innovative, high-ROI “AI project” you can fund this year is not to buy another tool. It’s to do the “boring” work. It’s to appoint an owner for your data layer. It’s to build a cross-functional governance team. It’s to finally define your taxonomy—your business’s core language—and enforce it across every single system.
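Enforcement can be mechanical once the taxonomy exists. As a toy illustration (field names and rules are assumptions, not a standard), every system validates an event against the same shared definition before accepting it:

```javascript
// Toy schema check: one shared definition of "purchase" that every
// system validates against before accepting the event.
const PURCHASE_SCHEMA = {
  event: (v) => v === "purchase",
  order_id: (v) => typeof v === "string" && v.length > 0,
  currency: (v) => /^[A-Z]{3}$/.test(v), // ISO 4217: "USD", "JPY", ...
  value: (v) => typeof v === "number" && v >= 0,
};

function failingFields(payload, schema) {
  return Object.entries(schema)
    .filter(([field, check]) => !check(payload[field]))
    .map(([field]) => field);
}

console.log(
  failingFields(
    { event: "purchase", order_id: "ORD-1", currency: "usd", value: 10 },
    PURCHASE_SCHEMA
  )
); // -> ['currency'] -- caught at the door, not inside an AI model
```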
Allow me to share a recent, practical example.
I worked with a client who, after migrating to server-side GTM (sGTM), hard-coded their Floodlight tags directly in JavaScript. This decision had severe financial consequences.
Google’s DV360 platform uses AI-driven bidding that relies entirely on clean, accurate measurement from these Floodlights. If the data is flawed, the AI’s bidding decisions—which impact millions of dollars in ad spend—will also be flawed.
The specific problem was a failure to capture currency codes correctly. To give just one example, a purchase of ¥154,000 (approximately $1,000 USD) was mistakenly recorded as $154,000 USD. This was not an isolated case; this type of error was replicated across multiple global currencies. Before the issue was caught, it had resulted in over $10,000,000,000 (ten billion dollars) worth of incorrectly recorded transactions.
This massively distorted data completely invalidated the performance metrics being fed to the AI, causing their DV360 bidding strategies to fail. This catastrophic, multi-billion-dollar mistake was entirely preventable, and it highlights the critical importance of a properly developed data layer.
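To make the failure mode concrete, here is a simplified, hypothetical reconstruction (not the client’s actual tag code) of what dropping the currency code does:

```javascript
// Simplified, hypothetical reconstruction of the failure: a conversion
// tag that omits the currency code defaults everything to USD.
function recordConversion(value, currency = "USD") {
  return { value, currency };
}

console.log(recordConversion(154000));        // { value: 154000, currency: 'USD' }
                                              // -> ¥154,000 recorded as $154,000
console.log(recordConversion(154000, "JPY")); // { value: 154000, currency: 'JPY' }
                                              // -> correct, roughly $1,000 USD
```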
Fix the foundation and the fuel. Only then can you win the race using a Formula 1-level AI engine.
I love data layers.
-Nick Albertini
Global Field CTO, Tealium