In the last blog post in our series about ML and AI Readiness, we considered the fundamental principles of a strong foundation for an independent data supply chain that will ultimately feed Machine Learning (ML) and Artificial Intelligence (AI) capabilities.
We established that there are five mandates for enabling a functional data supply chain.
To develop a successful data supply chain it is important to:
In this blog, we will look closer at how companies are building these concepts into a repeatable and timely process that seamlessly creates a unified data onboarding strategy.
Data onboarding is often one of the biggest bottlenecks for organizations that want to make their interactions (electronic or in person) more fluid and responsive. Precious resources and time are lost to constant one-off builds that move data from each of a growing number of sources into a structure and place where it is ready for distribution anywhere.
We are facing unprecedented growth in data sources. A framework that reduces the load on the business while introducing uniformity and control is essential, especially as consumers, legislators, and competitors increasingly recognize the value of that data. A single, flexible approach to collecting and normalizing data across all areas of human engagement is what makes it possible to achieve the scale and speed our businesses demand.
To build a solid ML/AI Readiness framework, you first need to build a uniform understanding of all sources. This framework is a process by which you move from a channel-centric to a customer-centric dataset.
It may be helpful to start by defining the terms we use throughout this post, and in this space in general, so that everyone is speaking the same language:
To give you a view of where we are heading with this:
The data layer concept developed for websites is fundamental to data momentum and data democracy. Rather than relying on the traditional, centralized ETL process at the point of unification, a data layer brings newfound momentum, removing the ETL bottleneck by applying standardization at the point of data creation.
The use of business language in the data layer definition makes it accessible to all (data democracy), which is ultimately needed for scale. The data layer also decouples the data from the data source so that it can be managed as an independent entity, separated from your current toolset, which will surely change. Best practices for its definition have already been well documented by Tealium in our TLC here.
The first step in data source onboarding is the initial definition of the data layer, based on the early data sources to be onboarded and the events they emit.
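To make the idea of standardizing at the point of data creation more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the variable names, source names, and field mappings are hypothetical and are not taken from Tealium's documentation or any real product API.

```python
from typing import Any, Dict

# Hypothetical data layer definition: business-language variable names
# mapped to the field each source happens to use for the same concept.
# Every name below is an assumption made up for this illustration.
DATA_LAYER: Dict[str, Dict[str, str]] = {
    "customer_email": {"web": "email", "call_center": "caller_email_addr"},
    "event_name": {"web": "evt", "call_center": "interaction_type"},
    "order_total": {"web": "cart_total", "call_center": "sale_amount"},
}


def to_data_layer(source: str, raw_event: Dict[str, Any]) -> Dict[str, Any]:
    """Rename one source's raw fields to the shared data layer names."""
    standardized: Dict[str, Any] = {"data_source": source}
    for business_name, source_fields in DATA_LAYER.items():
        raw_field = source_fields.get(source)
        # Only copy values the source actually emitted.
        if raw_field is not None and raw_field in raw_event:
            standardized[business_name] = raw_event[raw_field]
    return standardized


# Two different sources describing the same kind of event.
web_event = to_data_layer(
    "web", {"email": "a@example.com", "evt": "purchase", "cart_total": 42.5}
)
call_event = to_data_layer(
    "call_center",
    {
        "caller_email_addr": "a@example.com",
        "interaction_type": "purchase",
        "sale_amount": 42.5,
    },
)
```

Both events now carry the same business-language keys regardless of where they originated, so downstream tools can depend on the data layer rather than on any one source's field names.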
Once you have a data layer on which to onboard each first-party data source, you need to build an agile schedule for streaming data to the Customer Data Hub, which manages all customer data in real time. Data source onboarding should no longer be viewed as a set of independent projects but as an ongoing series of agile projects. Depending on your number of sources, that series could stretch out over two years and, in truth, is simply part of your new Business as Usual (BAU). Priorities in this schedule come from an assessment of the downstream business outcomes each source supports and their comparative return to the organization.
Let’s be clear: this is the start of a new discipline. The processes we are discussing here are becoming a new speciality. Some businesses are even developing a concept where data is delivered as a service (Data as a Service, DaaS) by a specialist team, a Data Center of Excellence if you will. We’ll provide details of this in a future blog series.
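One way to picture the prioritization described above is as a ranked backlog. The sketch below is an assumption-laden illustration, not a prescribed method: the scoring rule (expected return divided by onboarding effort), the source names, and the scores are all hypothetical placeholders for whatever assessment your organization actually uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSource:
    name: str
    expected_return: float    # assumed relative score for downstream business value
    onboarding_effort: float  # assumed relative score for effort, e.g. in sprints


def onboarding_schedule(sources: List[DataSource]) -> List[DataSource]:
    """Order sources by return per unit of effort, highest first.

    This ratio is an illustrative assumption; any comparative-return
    assessment could be substituted as the sort key.
    """
    return sorted(
        sources,
        key=lambda s: s.expected_return / s.onboarding_effort,
        reverse=True,
    )


# Hypothetical backlog with made-up scores.
backlog = [
    DataSource("call_center", expected_return=6.0, onboarding_effort=3.0),
    DataSource("website", expected_return=8.0, onboarding_effort=2.0),
    DataSource("mobile_app", expected_return=9.0, onboarding_effort=3.0),
]

schedule = [source.name for source in onboarding_schedule(backlog)]
```

Because onboarding is ongoing, the backlog can simply be re-ranked each time a new source appears or a business outcome is reassessed.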
To recap, building a data supply chain will feed a consistent, unified CX and ultimately Machine Learning and Artificial Intelligence projects.
After reading this blog, we hope you feel confident about putting together a scalable framework for bringing on all data sources in a uniform and timely manner, as the foundation of a neutral data supply chain.
Stay tuned for our next blog on how to structure a strategic approach to Identity Resolution across all sources, so you can build a stateful profile across any device or property in real time.