While the Experience Data Model (XDM) schema within Adobe Experience Platform (AEP) has proven powerful at enabling Adobe’s products to share data from the platform in a common data model, its open-ended design has thrust marketers and analysts into the unfamiliar role of amateur data modeler. With limited experience modeling data, the task can be intimidating and result in mistakes that can impact the value of data for years to come.
Adobe has documentation to help with the exercise; however, it can feel a little like being thrown into the deep end of the pool. Additionally, web and mobile digital analytics data can be a unique beast that can cause even more complexities. With this article, we hope to equip you with the knowledge and best practices to help you be a good steward and maximize the value of your organization’s digital data in Adobe Experience Platform.
A data modeler designs databases to efficiently access and use data for business processes. It may not always seem clear in AEP’s self-service user interface, but under the hood, this is the role enabled by XDM. In fact, with support for objects and nested arrays, the resulting tables can be complex enough to give even the most experienced data modeler heartburn. Without standards and best practices, AEP’s data lake can quickly become the dreaded data swamp.
Adobe has taken a unique approach that doesn’t fully adopt some of the traditional principles of modeling data with SQL. Instead, they have applied a framework called XDM. To understand the XDM schema, we’ll first need to cover some key terms.
At the most basic level, a schema is a defined list of fields. The fields should combine to describe a real-world object or experience as a data structure. Additionally, it serves as a contract to guarantee that incoming data follows a consistent and defined format with basic validation for each field (data types, lengths, etc.).
AEP additionally applies the concept of a class to assign schemas a set of required fields that inform Adobe products whether the data contained within that dataset will be compatible with that product. For example, Customer Journey Analytics (CJA) expects at least one dataset with a schema class of XDM Experience Event due to the required timestamp utilized for reconstructing journeys. Classes are the primary way to identify the three main types of datasets expected to exist in AEP:
Also, unique to AEP, fields are created inside field groups to encourage the reuse of commonly combined fields and to standardize schema creation. You may see documentation or API references to “mixins”, which was an early name for field groups that was eventually replaced in the user interface.
Adobe has made a number of pre-built, standard field groups available. While these field groups may help with some very basic attributes that you will collect along a customer journey, they are not comprehensive and will miss critical fields needed for many web and mobile experiences (e.g., forms, navigation, errors, etc.). Additionally, they will often contain extraneous fields that will be unnecessary for your implementation. As a result, almost every implementation will involve a considerable amount of custom field group creation, which is where the bulk of your newfound data modeling skills will become handy.
The hardest decision point when designing XDM schema is if and when to leverage the Adobe standard field groups. What are the differences between standard and custom field groups?
Additionally, there is an ability to extend a standard Field Group with custom fields, creating a hybrid field group. This approach allows you to add custom fields to standard objects without having to use the “_tenant” namespace.
It can be helpful to think of field groups as data building blocks that can be stacked to create schemas. In the below example, we have a retail experience in which a customer can navigate a website and make purchases. There are two datasets intended to be used in Adobe CJA: website clickstream data and transactional data from a back-end order management system.
You can see a standard Commerce Details field group block that exists across both schemas. The common field group is essential for reporting on fields across these datasets in a continuous way. For example, a use case may be to analyze whether products with certain behaviors on the website (e.g., viewing a video, reading reviews, etc.) are less likely to cancel or return an order. If the schemas are constructed with mismatched blocks, they won’t be able to correlate the product activity from the website to the outcomes in the transaction dataset.1
1 There are options within Adobe CJA to work around the ‘mismatched blocks’ scenario with Derived Fields, but it should only be considered as temporary until the data model can be updated appropriately.
As you dive into creating your first schema, it is important to be aware of breaking changes and their impact on schema design choices. As your business grows, your schema can and should grow, too. Any modification that could cause existing data stored against the schema to become non-conforming is considered a breaking change and is strictly not allowed. If you haven’t collected data yet, you can make any changes you need. Once that dataset is populated, you will be subject to breaking changes, which include, but are not limited to:
If you need to make one of these breaking changes, you will have to create an entirely new schema and reprocess your data to conform to the new schema—a painful process no one wants to endure. For that reason, you will want to be deliberate about additions to your schema, as once data is stored against it, there are limited options for going back.
While following standards and best practices provides a foundation for governance and compliance, it’s just as much about improving the user experience for data consumers. The more uniform and consistent the data is, the easier it is to explore and contextualize. Just as you want a digital experience to be effortless and intuitive to use, working with data should be no different.
Now that you have a basic background on AEP schemas let's get down to some best practices. As this guide is more focused on web and mobile behavioral data, these recommendations will be aimed at those types of data structures. That said, a lot of these principles do apply to any type of data in AEP.
When it comes to naming conventions in data modeling, the most important thing is that you have one, regardless of the details of what it is. The conventions below primarily take a nod from the out-of-the-box field groups provided by Adobe while applying some other common conventions to help with readability. Here are some suggestions, but feel free to add on with your own:
Data modeling with XDM can be overwhelming. To make it more manageable, break it into pieces. Do a trial run of your data models with sample data, and it can help you see how they come together. Most importantly, know that there is no perfect data model, but with a good foundation and focus on semantics, you can ensure that your organization’s Adobe CJA use cases can be met.
This article is the first in a series to help you lay the foundation you need to be successful with Adobe Customer Journey Analytics. Additionally, we geek out about data models and would be excited to schedule a free 30-minute consultation to discuss your implementation of Adobe CJA.