An Intro to ML Process Design: Framing the Problem to Solve
Machine learning (ML) practitioners used to come with a Ph.D. Today, thanks to technologies like BigQuery ML (BQML), your developers and business analysts can quickly train and deploy ML models. However, the vast majority of models will never make it to a production environment. We hope this article will help improve the success rate of your models by ensuring that they align with business value. ML process design is the first step to success.
For context, Google asked us to test a new BQML feature they announced at NEXT ’22. They hoped we could develop a production modeling pipeline using unstructured data (images) in a few weeks’ time. We identified our client Twiddy as a great candidate with in-house BQML and SQL skills, a wonderful visual search appliance, and—most importantly—a test-and-learn culture. We implemented the ML process design that Google recommended and delivered the model on time!
This article will guide you through a rubric that we used and that you can apply to your own business challenge to accelerate ML process design and delivery. The rubric consists of four sections with a total of ten steps. We’ll explain each step and then show how we used it to help Twiddy improve their website by selecting the best images to feature in search results (this is similar to optimizing product listing pages, or PLPs, for e-commerce). Each section ends with a peer review that you should complete before continuing.
Section 1: Problem Definition
This section covers what you take into your executive meetings. It’s how you describe what you’re working on when the CEO walks into the elevator. If you can’t articulate the problem, outcome and what success looks like, then you shouldn’t be trying to solve it with machine learning.
Let’s get started!
In simple terms, what is it you’d like this machine learning model to do for your business?
|For all searches, the model should predict a listing’s probability of being clicked. This will allow us to deliver better search experiences, increasing clicks and bookings for the business.|
Pro tip: Address business objectives and initiatives and speak directly to the project stakeholders.
How will the predictions from the model improve your business?
|We’d like the team to be able to see a measurable lift in search clicks for A/B treated images. Also, we’d like the content creators to incorporate the model findings into their photo compositions and editorial strategy.|
Pro tip: Don’t describe the model output; that comes later. Focus on why you’re doing this over how.
You’re building the project’s executive dashboard. Success metrics are measurable business outcomes. They’re not the model evaluation metrics – no AUC scores! Be sure to include what failure looks like.
- Team NPS > baseline
- Incremental treatment lift > 0%
- Incremental booking revenue lift > 0%
- Inverse of success
Pro tip: Design your metrics to quantify success. Failure is typically the inverse.
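To make the lift metrics concrete, here is a minimal Python sketch of how an incremental treatment lift could be computed from A/B counts. The function names and the counts are hypothetical and used for illustration only; they are not Twiddy data or part of the original pipeline.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return clicks / impressions, or 0.0 when there were no impressions."""
    return clicks / impressions if impressions else 0.0


def incremental_lift(control_rate: float, treatment_rate: float) -> float:
    """Relative lift of treatment over control; 0.25 means +25%."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive to compute a relative lift")
    return (treatment_rate - control_rate) / control_rate


# Hypothetical counts, for illustration only.
control_ctr = click_through_rate(clicks=200, impressions=10_000)
treatment_ctr = click_through_rate(clicks=250, impressions=10_000)

lift = incremental_lift(control_ctr, treatment_ctr)
is_success = lift > 0  # success: lift > 0%; failure is the inverse
```

The same calculation applies to booking revenue if you swap the click-through rates for revenue per search.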
Design Review: Problem Definition
For each section, we ask that you go through a design review with your peers. To execute the review, walk through the section questions below and collect feedback on your answers.
Section 2: Outputs
This section details the model’s outputs and how the outputs will integrate into the business.
Clearly state a quantifiable output of the model. Be sure to define terms; giving examples is helpful.
|Given an input row, the model will output a probability to get a click – expressed as a float between 0 and 1.|
Pro tip: This is something the model itself will output; it is not subsequent analysis or interpretation.
Using the Output
How will the above output be integrated into the business? Where is the output made available?
|The model can run ad-hoc or periodically. We want to add click probability scores as an attribute to the `Warehouse.PropertyImages` table. From there, we’ll integrate the signal into the CMS.|
Pro tip: Consider the latency of data availability and how that impacts dependent business processes.
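As an illustration of how a downstream system might consume the score, the sketch below picks the highest-scoring image for each listing. It is written in Python for readability; the `ScoredImage` type, the helper name and the sample rows are hypothetical stand-ins for scored records in `Warehouse.PropertyImages`, not the actual integration.

```python
from typing import Dict, List, NamedTuple


class ScoredImage(NamedTuple):
    listing_id: str
    image_id: str
    click_probability: float  # model output, a float between 0 and 1


def best_image_per_listing(rows: List[ScoredImage]) -> Dict[str, ScoredImage]:
    """Keep the highest-scoring image for each listing."""
    best: Dict[str, ScoredImage] = {}
    for row in rows:
        current = best.get(row.listing_id)
        if current is None or row.click_probability > current.click_probability:
            best[row.listing_id] = row
    return best


# Hypothetical scored rows, for illustration only.
scored = [
    ScoredImage("unit-1", "img-a", 0.12),
    ScoredImage("unit-1", "img-b", 0.31),
    ScoredImage("unit-2", "img-c", 0.08),
]
featured = best_image_per_listing(scored)
```

Because the scores live in a table, the selection can be refreshed whenever the model is rerun, which is where the latency question above comes in.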
Without using ML, how would you solve this problem?
|Without ML, we would run A/B split tests through all possible image combinations for all listings. The split test could take a long time to reach statistical confidence.|
Pro tip: Assume you had to deliver this capability immediately. How could you do this without ML?
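To get a feel for why the split-test heuristic is slow, the sketch below estimates how many impressions each arm of a single A/B test would need, using the standard normal-approximation formula for comparing two proportions. The click rates, the 5% significance level and the 80% power are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def samples_per_arm(
    baseline_rate: float,
    expected_rate: float,
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Approximate impressions needed per arm for a two-sided test of two proportions."""
    effect = expected_rate - baseline_rate
    if effect == 0:
        raise ValueError("expected_rate must differ from baseline_rate")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = baseline_rate * (1 - baseline_rate) + expected_rate * (1 - expected_rate)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical rates: a 2.0% baseline click rate and a hoped-for 2.5%.
needed = samples_per_arm(baseline_rate=0.02, expected_rate=0.025)
```

Multiply that figure by the number of image variants and listings you would need to test, and the appeal of a model that scores every image at once becomes clear.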
Design Review: Outputs
Walk through the Outputs questions and collect feedback on your answers.
Section 3: Modeling
We now transition to the more technical part of the exercise. We need data scientists to weigh in on the specifics of the ML methods. We’ll first detail the modeling solution and then simplify it.
What is the specific model type you’ll be training? Will you use supervised, unsupervised or other methods? Be specific and include details around the number of classes or dimensions used. Are there additional details you could provide to help implement the model? Any novelties to consider?
|This is a binary classification problem. Given the search context, we’re predicting the likelihood of a click. In this case, the input includes unstructured data as property listing images (and their ML embeddings). There is a novelty in that we write and run the pipeline wholly using SQL.|
Pro tip: If multiple methods and models are at play, be sure to provide those details.
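For readers who want to see what binary classification on embeddings looks like mechanically, here is a toy logistic regression in plain Python. This is not the SQL pipeline described in the answer above; the two-dimensional "embeddings", labels and hyperparameters are made up for illustration, and real image embeddings have far more dimensions.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_click_probability(
    weights: Sequence[float], bias: float, features: Sequence[float]
) -> float:
    """Return a click probability between 0 and 1 for one input row."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic_regression(
    rows: List[List[float]],
    labels: List[int],
    epochs: int = 500,
    learning_rate: float = 0.5,
) -> Tuple[List[float], float]:
    """Fit weights with batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(rows, labels):
            error = predict_click_probability(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


# Toy data: 1 = click, 0 = no click (hypothetical).
rows = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8]]
labels = [1, 1, 0, 0]
weights, bias = train_logistic_regression(rows, labels)

p_likely = predict_click_probability(weights, bias, [0.85, 0.15])
p_unlikely = predict_click_probability(weights, bias, [0.15, 0.85])
```

The output of `predict_click_probability` matches the shape described in Section 2: a single float between 0 and 1 per input row.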
Time to simplify. You must ensure you can reduce the problem to its simplest form.
|For every search, we will predict whether a listing gets clicked. This is binary classification.|
Pro tip: This summary should be clear to non-technical business partners.
Design Review: Modeling
Walk through the Modeling questions and collect feedback on your answers.
Section 4: Data
Finally, we’ll prototype the data we’ll use to train our model, including features and prediction targets. Additionally, we’ll identify where we’ll get the data and assess its obtainability.
Design the Data
Identify the features that are critical to training the model and specify the target. Provide examples where possible, or do your best to summarize the underlying data. Remember that all features need to be available when you make the prediction.
|Image embedding||Search context||Unit context||TARGET|
|<ResNet 50, et al.>||<listing context>||<unit metadata>||0 or 1 (no click or click)|
Pro tip: The data teams responsible for ETL & delivery must understand the description.
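One way to share the data design with engineering is to write it down as a typed record. The Python sketch below mirrors the four columns above and checks that every feature is present before a prediction is made. The field names inside the context dictionaries (`nights`, `bedrooms`) are hypothetical examples, not the actual Twiddy schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TrainingRow:
    image_embedding: List[float]      # e.g. output of a pretrained image model
    search_context: Dict[str, float]  # hypothetical keys, e.g. "nights"
    unit_context: Dict[str, float]    # hypothetical keys, e.g. "bedrooms"
    clicked: Optional[int] = None     # TARGET: 1 = click, 0 = no click; None at prediction time


def to_feature_vector(
    row: TrainingRow, search_keys: List[str], unit_keys: List[str]
) -> List[float]:
    """Flatten a row into model inputs, failing loudly if a feature is missing."""
    missing = [k for k in search_keys if k not in row.search_context]
    missing += [k for k in unit_keys if k not in row.unit_context]
    if missing:
        raise ValueError(f"features unavailable at prediction time: {missing}")
    return (
        list(row.image_embedding)
        + [row.search_context[k] for k in search_keys]
        + [row.unit_context[k] for k in unit_keys]
    )


# Hypothetical prediction-time row: the target is not known yet.
row = TrainingRow(
    image_embedding=[0.1, 0.7],
    search_context={"nights": 7.0},
    unit_context={"bedrooms": 4.0},
)
vector = to_feature_vector(row, search_keys=["nights"], unit_keys=["bedrooms"])
```

Keeping the target optional makes the training and prediction rows share one definition, which helps enforce the rule that every feature must be available when you predict.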
Where do you get the data used for training?
|Image embedding||Search context||Unit context||TARGET|
|TF model on BQML||GA360 export tables||Data Warehouse||GA360 export tables|
Pro tip: Make sure you plan on establishing security and access controls to manage data ETL.
You have two ways to approach this step. You can estimate the difficulty of developing and deploying the pipelines to prepare the data for training and prediction. You can also reduce the number of features to those that are easiest to obtain while still being able to model signals to predict the target.
|Image embedding||Search context||Unit context||TARGET|
|Low effort; in SQL||Med. effort, requires data mining; in SQL||Low effort; in SQL||Med. effort, research data quality; in SQL|
Pro tip: Revisit the heuristics step and adjust as needed to tackle obtainability.
Design Review: Data
Walk through the Data questions and collect feedback on your answers.
Now that you’ve answered all the questions in the rubric, you’re ready to divide and conquer:
- Problem Definition – goes to the business team for validation and to the BI team to build out performance dashboards based on the provided KPIs (success & failure metrics).
- Outputs – product and experience teams begin prototyping and integrating modeling outputs into existing designs and processes.
- Modeling – data science team begins their iterative modeling function. They should work from the heuristics, then a simplified model and finally deliver a production model.
- Data – data engineering team develops the training and prediction pipelines from the identified lakes and warehouses.
Adswerve has found this framework useful as a first step in designing and delivering production machine learning models. This exercise shouldn’t take more than a few hours before you send it to design review. Questions? Feel free to contact us.