In the previous post in our series on ML and AI Readiness, we considered the fundamental principles of a strong foundation for an independent data supply chain that will ultimately feed Machine Learning (ML) and Artificial Intelligence (AI) capabilities.
We established five mandates for enabling a functional data supply chain:
- The Customer at the Center
- Real-time
- Access and Ownership
- Governance by Design
- ML/AI Ready
To develop a successful data supply chain, it is important to:
- Build a scalable framework by which you can repeatedly introduce new data sources in an efficient manner
- Develop an agile onboarding schedule
In this post, we will look more closely at how companies are building these concepts into a repeatable, timely process that creates a unified data onboarding strategy.
Data onboarding is often one of the biggest bottlenecks for organizations that want to make their interactions (electronic or in person) more fluid and responsive. Precious time and resources are lost to constant one-off builds that move data from each of a growing number of sources into a structure and place ready for distribution anywhere.
We are facing unprecedented growth in data sources. A framework that reduces the load on the business while introducing uniformity and control is essential, especially as consumers, legislators, and competitors increasingly recognize the value of that data. A single, flexible approach to collecting and normalizing data across all areas of human engagement is now a must for achieving the scale and speed our businesses demand.
To build a solid ML/AI Readiness framework, you first need a uniform understanding of all sources. This framework is the process by which you move from a channel-centric to a customer-centric dataset.
It may be helpful to start with definitions of the common terms we use throughout this post, and in the space in general, so everyone is speaking the same language:
- Event – something that happens that changes a person’s state; a moment that matters. Some events are electronic, others happen in the real world: the view of a page, entry to a store, a purchase at a till, a touch of a screen.
- Data Source – the emitter of events; generally, anywhere someone experiences your brand in a way that can be tracked.
- Data Layer – a central definition of all data, from all sources, in the language of the business, applied at the point of data creation. Think of it as a two-column spreadsheet: one column holding the description, the other the actual value (a minimal sketch follows this list). This is a living definition that grows with each new source, but experience has shown that investing time early to develop it well minimizes the time needed for later revisions and ultimately optimizes the usability of the collected data for all downstream processes.
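To make the two-column idea concrete, here is a minimal sketch in TypeScript of what a data layer might look like at the point of data creation. The attribute names (`event_name`, `customer_loyalty_tier`, and so on) are hypothetical examples of business-language definitions, not a prescribed schema:

```typescript
// A data layer is conceptually a flat set of business-language
// name/value pairs, populated at the moment an event occurs.
type DataLayer = Record<string, string | number | boolean>;

// Hypothetical example: the data layer for a "product viewed" event,
// described in the language of the business rather than of any tool.
const dataLayer: DataLayer = {
  event_name: "product_viewed",       // the moment that matters
  product_name: "Trail Running Shoe", // what the customer saw
  product_price: 129.99,              // value, in local currency
  customer_loyalty_tier: "gold",      // who this customer is to the business
  channel: "web",                     // the data source that emitted the event
};
```

Notice that nothing in this structure is tied to a particular vendor or tool; the same name/value pairs can describe an event from any source.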
The data layer concept developed for websites is fundamental to data momentum and data democracy. Rather than the traditional, centralized ETL process applied at the point of unification, a data layer creates new momentum by applying standardization at the point of data creation, removing the ETL bottleneck.
Using the language of the business in the data layer definition makes it accessible to all (data democracy), which is ultimately needed for scale. The data layer also decouples the data from the data source, so it can be managed as an independent entity, separate from your current toolset, which will surely change. Best practices for its definition have already been well documented by Tealium in our TLC.
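One way to picture that decoupling is sketched below: the event is defined once, in business language, and any number of downstream tools consume the same standardized payload. The `DataLayerEvent` and `Destination` types here are illustrative assumptions, not a specific product API:

```typescript
// The event is defined once, in business language,
// independent of whichever tools consume it today.
interface DataLayerEvent {
  event_name: string;
  attributes: Record<string, string | number | boolean>;
}

// Downstream tools implement a common interface, so swapping a
// vendor changes nothing about how data is created or described.
interface Destination {
  send(event: DataLayerEvent): void;
}

class AnalyticsDestination implements Destination {
  send(event: DataLayerEvent): void {
    console.log(`analytics <- ${event.event_name}`, event.attributes);
  }
}

// Standardization happens at the point of creation; distribution
// to any number of destinations is then a simple fan-out.
function publish(event: DataLayerEvent, destinations: Destination[]): void {
  destinations.forEach((d) => d.send(event));
}
```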
The first step in data source onboarding is the initial definition of the data layer, based on the early data sources to be onboarded and the events they emit.
Once you have a data layer on which to onboard each first-party data source, you need an agile schedule for streaming those sources to the Customer Data Hub that manages all customer data in real time. Data source onboarding should no longer be viewed as a set of independent projects, but as an ongoing series of agile projects that, depending on your number of sources, could stretch over two years and, in truth, is simply part of your new Business as Usual (BAU). The priority for this schedule comes from an assessment of the downstream business outcomes and the comparative return each delivers to the organization.
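As a rough illustration of how that assessment might drive the schedule, the sketch below ranks candidate sources by a simple value-over-effort score. The fields, scales, and numbers are assumptions made for the example, not a formal methodology:

```typescript
// Hypothetical scoring model: rank data sources by the business value
// of their downstream outcomes relative to the effort to onboard them.
interface CandidateSource {
  name: string;
  businessValue: number;    // estimated return of downstream outcomes (1-10)
  onboardingEffort: number; // estimated cost and complexity to onboard (1-10)
}

// Sort descending by value-per-effort, so the schedule starts with
// the sources that return the most to the organization soonest.
function prioritize(sources: CandidateSource[]): CandidateSource[] {
  return [...sources].sort(
    (a, b) =>
      b.businessValue / b.onboardingEffort -
      a.businessValue / a.onboardingEffort
  );
}

const backlog: CandidateSource[] = [
  { name: "web", businessValue: 9, onboardingEffort: 3 },
  { name: "point-of-sale", businessValue: 8, onboardingEffort: 6 },
  { name: "mobile app", businessValue: 7, onboardingEffort: 4 },
];

console.log(prioritize(backlog).map((s) => s.name));
// -> [ "web", "mobile app", "point-of-sale" ]
```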
Let’s be clear: this is the start of a new discipline. The processes we are discussing here are becoming a new specialty. Some businesses are even developing a model in which data is delivered as a service (Data as a Service, DaaS) by a specialist team, a Data Center of Excellence if you will. We’ll provide details of this in a future blog series.
To recap, a well-built data supply chain will feed a consistent, unified customer experience (CX) and, ultimately, your Machine Learning and Artificial Intelligence projects.
We hope that after reading this post you feel confident taking the steps to build a scalable framework for onboarding all of your data sources in a uniform and timely manner, as the foundation of a neutral data supply chain.
Stay tuned for our next blog on how to structure a strategic approach to Identity Resolution across all sources, building a stateful profile across any device or property in real time.