Giving the Modern Data Stack a Boost With Adobe CJA


Being in the data and analytics field rarely inspires excitement. You can only rearrange data, charts and tables in so many ways before things start to get repetitive. When the latest and greatest hot new skill everyone should learn is SQL (btw happy 50th birthday!), it becomes obvious that real change and innovation are rare in this field.

To that point, a big part of why I get so excited about Adobe Customer Journey Analytics (CJA) is that it may be the most innovative analytics product on the market right now. To be clear, that’s not limited to digital analytics and adjacent tools; I mean all analytics and BI tools.

The State of the Modern Data Stack

Why do I feel so strongly? It has a lot to do with the current state of the buzzwordy “modern data stack” and where data teams are currently experiencing challenges. To oversimplify, the modern data stack is the set of tools enabling data warehousing in the cloud.

With the migration to the cloud, many teams took the opportunity to reassess the classic Extract-Transform-Load pattern, or ETL, to bring data into a data warehouse for analytics use. Instead, many opted to flip the last two steps into ELT. This shortened the path to getting data loaded into the cloud and took advantage of the expanded amount of storage and compute offered by the cloud. Additionally, it offered flexibility by decoupling transformations from the loading of data. In the ETL days, combining columns and applying logic to create a new column involved complicated redeployments of entire pipelines. With ELT, using a tool like dbt, it can be a few SQL updates.
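
To make the ELT idea concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a cloud warehouse. The table, column names and the order-tier threshold are all hypothetical. The raw data is loaded untouched, and the "combine columns and apply logic" step lives in a SQL view, which plays roughly the role a dbt model would in a real project.

```python
import sqlite3

# In-memory database standing in for a cloud warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")

# "E" and "L": land the raw data as-is, with no transformation on the way in.
conn.execute(
    "CREATE TABLE raw_orders ("
    "order_id INTEGER, first_name TEXT, last_name TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [(1, "Ada", "Lovelace", 1250), (2, "Grace", "Hopper", 99900)],
)

# "T": the transformation is just SQL layered on top of the loaded table.
# Changing the logic means editing this statement, not redeploying a pipeline.
conn.execute(
    """
    CREATE VIEW orders_model AS
    SELECT
        order_id,
        first_name || ' ' || last_name AS customer_name,
        CASE WHEN amount_cents >= 50000 THEN 'high' ELSE 'standard' END AS order_tier
    FROM raw_orders
    """
)

rows = conn.execute(
    "SELECT order_id, customer_name, order_tier FROM orders_model ORDER BY order_id"
).fetchall()
for row in rows:
    print(row)
```

Because the view reads from the raw table, adjusting the tier threshold or the name formatting is a one-statement change that applies to every row already loaded.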

In the modern data stack, turning raw data into usable data models has become the new gold, but because it requires a high level of domain expertise, it’s also people- and process-intensive. This has moved the bottleneck downstream to the SQL and analytics engineers preparing data for BI use. As data in the cloud proliferates, the questions increase, the mashups of datasets become more complex and there are only so many SQL developers. The reality remains that not everyone who needs insights can learn SQL. Because of this, data exploration, discovery and insight generation haven’t been able to scale with the cloud warehouses.

What does any of this have to do with CJA? CJA’s predecessor, Adobe Analytics, is a self-contained, purpose-built digital analytics tool with a built-in data model optimized to rapidly explore and visualize a specific type of data: streaming events from digital experiences, AKA clickstream. Its biggest strength has been a self-service, no-code Workspace user interface that allows analysts and business users to rapidly generate and test hypotheses about user behaviors, uncovering insights across a sprawling clickstream dataset without any knowledge of SQL.

With CJA, Adobe has blown out the backend of the Adobe Analytics data model and opened it to event datasets of any schema. As long as there is a consistent Person ID and a timestamp, even physical events can be brought in, including appointments, phone calls, in-store purchases, etc. Here, you can start to see where the two worlds of the modern data stack and CJA may converge. By constructing a highly interactive and optimized data model with the sole purpose of rapidly finding relationships between dimensions and metrics, CJA solves a lot of transformation problems. Those relationships are pre-calculated as they’re processed into the data model to keep reading the results performant.
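
As a rough illustration of why a shared Person ID and timestamp are enough to combine very different event sources, the sketch below merges three hypothetical datasets into one ordered timeline per person. The Event class, field names and sample data are invented for this example; this is not Adobe's data model or the XDM schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    person_id: str       # the consistent Person ID shared across sources
    timestamp: datetime  # when the event happened
    source: str          # hypothetical label for the originating dataset
    detail: str          # free-form payload; real schemas would differ per source


# Three hypothetical datasets with nothing in common except person_id and timestamp.
web = [Event("p1", datetime(2024, 1, 1, 9, 0), "web", "viewed pricing page")]
calls = [Event("p1", datetime(2024, 1, 2, 14, 30), "call_center", "asked about plans")]
store = [
    Event("p1", datetime(2024, 1, 5, 11, 15), "store", "in-store purchase"),
    Event("p2", datetime(2024, 1, 3, 10, 0), "store", "in-store purchase"),
]


def build_journeys(*datasets):
    """Group events from all datasets by person and sort each group by time."""
    journeys = defaultdict(list)
    for dataset in datasets:
        for event in dataset:
            journeys[event.person_id].append(event)
    for events in journeys.values():
        events.sort(key=lambda e: e.timestamp)
    return dict(journeys)


journeys = build_journeys(store, calls, web)
for event in journeys["p1"]:
    print(event.timestamp.isoformat(), event.source, event.detail)
```

The order in which the datasets are passed in does not matter, since each person's events are sorted by timestamp after merging.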

Not all data are events, and CJA is not an answer for non-event data. However, the innovation that it signals is the idea that purpose-built data models can be placed over cloud data warehouses and, without much more than some schema mapping, suck up data and create use case-specific tools to aid rapid data exploration. Additionally, with Amplitude announcing a Snowflake-native version of its tool, it becomes obvious that this is a space that analytics vendors see as an opportunity. Outside of event data and customer journeys, I could see tailored tools for other domains, such as supply chain, people/HR, health care and more.

Transformational Derived Fields

Where CJA further differentiated itself this past year was with the introduction of derived fields. The feature allows data ingested into the tool’s data model to be transformed on the fly. If you think about the earlier task of combining a few columns and adding some logic, this is the perfect application of derived fields.

In SQL, a simple enough change might be handled with a view, but it will more likely require changes to the data model and a replay of the data. In Adobe Analytics, you can achieve this through some combination of tag manager changes and processing rules, but the result only applies going forward from when you made the change.
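
The difference between a forward-only change and a retroactive one can be sketched in a few lines. This is a toy model under stated assumptions: the normalize function, the rule_enabled flag and the page values are hypothetical and do not represent how Adobe Analytics processing rules are implemented.

```python
def normalize(page):
    """Hypothetical cleanup rule: trim whitespace and lowercase the page name."""
    return page.strip().lower()


# --- Collection-time rule: only affects events ingested after it is enabled ---
stored = []
rule_enabled = False


def ingest(page):
    # The stored value is fixed at ingest time; earlier rows are never revisited.
    stored.append(normalize(page) if rule_enabled else page)


ingest(" Home ")
ingest("HOME")
rule_enabled = True   # the rule is switched on partway through the timeline
ingest("Home ")

forward_only = list(stored)

# --- Query-time transformation: raw values are kept, logic is applied on read ---
raw = [" Home ", "HOME", "Home "]
retroactive = [normalize(page) for page in raw]

print(forward_only)
print(retroactive)
```

In the first case the two events collected before the rule was enabled keep their messy values, while in the second every event, old or new, reflects the current logic.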

Using derived fields in Adobe CJA

In CJA, using derived fields, this type of transformation is constructed in a drag-and-drop user interface and is fully retroactive and applied instantaneously. You can create your derived field by manipulating other fields within an event using a set of predefined functions and see a live update of how it affects the output. Then, publish it as a dimension or metric for immediate use across the full timeline of events contained within CJA for any user to pick up and use in their analysis. It’s an innovation that’s truly groundbreaking. Use cases that stand out include:

  • Data cleansing
  • Applying business logic (e.g., marketing channels, product finding method, etc.)
  • Classifications or data grouping
  • Combining or parsing data (e.g., URL query strings)
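
To give a feel for the kind of logic these use cases involve, here is a small Python sketch of two of them: pulling a value out of a URL query string and assigning a marketing channel. CJA builds this in a drag-and-drop interface rather than in code, so the function names, parameter names and channel rules below are purely illustrative assumptions.

```python
from urllib.parse import parse_qs, urlparse


def query_param(url, name):
    """Return the first value of a query string parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get(name)
    return values[0] if values else None


def marketing_channel(url, referrer):
    """Classify a hit into a channel using hypothetical business rules."""
    medium = (query_param(url, "utm_medium") or "").lower()
    if medium in {"cpc", "ppc", "paid_search"}:
        return "Paid Search"
    if medium == "email":
        return "Email"

    referrer_host = urlparse(referrer).netloc.lower() if referrer else ""
    if any(engine in referrer_host for engine in ("google.", "bing.", "duckduckgo.")):
        return "Organic Search"
    if referrer_host:
        return "Referral"
    return "Direct"


print(query_param("https://example.com/p?sku=123&color=red", "sku"))
print(marketing_channel("https://example.com/?utm_medium=CPC", "https://www.google.com/"))
print(marketing_channel("https://example.com/", "https://www.bing.com/search?q=x"))
print(marketing_channel("https://example.com/", None))
```

Rules like these tend to need several rounds of tuning against real data, which is why applying them on the fly across the full event history matters.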

In the modern data stack context, this feature alone significantly closes the gap in time-to-insight. Data transformations are rarely right the first time and often need to be applied to an entire dataset before an analyst can really see whether they behave as expected. This results in a lot of iterations and time wasted. With the ability to adjust transformations on the fly, fine-tune them and uncover an insight without an engineer or SQL developer, data and analytics teams can both enjoy efficiencies as a result. As a repeatable pattern, it has incredible potential power.

Opportunities for Iteration

As this space matures, I expect competitors to start studying how each other approach these problems and to adapt. For example, there is an opportunity in how CJA is currently architected. It requires data to be loaded into Adobe Experience Platform in the form of Adobe’s JSON Schema-based XDM specification. There is a plethora of connectors to facilitate that ingestion, but it remains a necessary step. To slot nicely into the ideal data stack, CJA would connect directly to the cloud data warehouse and consume data from it into the CJA data model.

Additionally, those slick, fully retroactive derived fields have limits on the number of times you can apply a function to a single derived field, the number of fields you can evaluate and the number of derived fields in a connection. These limits are to be expected to keep a feature like this performant and don’t significantly impact the majority of transformations. The good news is that the full weight of Adobe’s product and engineering is behind CJA and it’s iterating quickly with new features and optimizations each month.

Lastly, this doesn’t replace the need for hardened and vetted data models that serve as the semantic layer for BI tools. Those are necessary for financial and business reporting that requires a higher level of rigor. Instead, CJA can relieve the strain on the teams supporting them, keep them focused on what matters, and divert many of the ad hoc, deep-dive and exploratory tasks to CJA.

Conclusion

Going forward, a tool like Adobe CJA has the potential to be the tip of the spear, facilitating self-serve, immediate exploration within a purpose-built data model that consumes from the cloud. Tuned for speed and on-the-fly transformations, it can become a complementary component of the modern data stack. It’s still in a nascent stage and has some kinks to iron out, but the move toward more rapidly visualizing and exploring data in the cloud can be the next big advancement in the field of analytics.

