Data Science, Machine Learning and Strategy

Written by Pat Grady | Nov 7, 2022 7:00:00 AM

An Intro to ML Process Design: Framing the Problem to Solve

Machine learning (ML) practitioners used to come with a Ph.D. Thanks to technologies like BigQuery ML (BQML), your developers and business analysts can now train and deploy ML models quickly. However, the vast majority of models never make it to a production environment. We hope this article helps improve the success rate of your models by ensuring that they align with business value. ML process design is the first step to success.

For context, Google asked us to test a new BQML feature they announced at NEXT '22. They hoped we could develop a production modeling pipeline using unstructured data (images) in a few weeks' time. We identified our client Twiddy as a great candidate with in-house BQML and SQL skills, a wonderful visual search appliance, and—most importantly—a test-and-learn culture. We implemented the ML process design that Google recommended and delivered the model on time!

This article guides you through the rubric we used, which you can apply to your own business challenge to accelerate ML process design and delivery. The rubric consists of four sections with a total of ten steps. We'll explain each step and then show how we used it to help Twiddy improve their website by selecting the best images to feature in search results (similar to optimizing product listing pages, or PLPs, in e-commerce). Each section ends with a peer review that you should complete before continuing.

Section 1: Problem Definition

This section covers what you take into your executive meetings. It's how you describe what you're working on when the CEO walks into the elevator. If you can't articulate the problem, outcome and what success looks like, then you shouldn't be trying to solve it with machine learning.

Let's get started!

The Problem

In simple terms, what is it you'd like this machine learning model to do for your business?

Example answers:

For all searches, the model should predict a listing's probability of being clicked. This will allow us to deliver better search experiences, increasing clicks and bookings for the business.

Pro tip: Address business objectives and initiatives and speak directly to the project stakeholders.

Ideal Outcome

How will the predictions from the model improve your business?

We'd like the team to be able to see a measurable lift in search clicks for A/B treated images. Also, we'd like the content creators to incorporate the model findings into their photo compositions and editorial strategy.

Pro tip: Don't describe the model output; that comes later. Focus on why you're doing this over how.

Success Metrics

You’re building the project's executive dashboard. Success metrics are measurable business outcomes. They’re not the model evaluation metrics – no AUC scores! Be sure to include what failure looks like.

Success:
- Team NPS > baseline
- Incremental treatment lift > 0%
- Incremental booking revenue lift > 0%
Failure:
- Inverse of success

Pro tip: Design your metrics to quantify success. Failure is typically the inverse.
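
To make the "lift > 0%" criteria concrete, here is a minimal Python sketch of how relative lift could be computed from A/B test counts. The function names and the visitor and conversion counts are hypothetical, chosen only for illustration; they are not Twiddy's figures, and a real dashboard would also report statistical confidence.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who converted (clicked or booked)."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def relative_lift(control_rate: float, treatment_rate: float) -> float:
    """Relative lift of treatment over control; 0.10 means +10%."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (treatment_rate - control_rate) / control_rate


# Hypothetical A/B counts, for illustration only.
control = conversion_rate(conversions=400, visitors=10_000)
treatment = conversion_rate(conversions=460, visitors=10_000)

lift = relative_lift(control, treatment)
success = lift > 0  # mirrors the "Incremental treatment lift > 0%" criterion
print(f"lift = {lift:.1%}, success = {success}")
```

Because failure is defined as the inverse of success, the same calculation covers both outcomes: a lift at or below zero counts as failure.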

Design Review: Problem Definition

For each section, we ask that you go through a design review with your peers. To execute the review, walk through the section questions below and collect feedback on your answers.

Do we understand the purpose of the model?
Would an outside team be able to assess success or failure using the defined metrics?
Is there more to be said about what failure looks like?
Did we miss anything in defining and measuring success?

Section 2: Outputs

This section details the model's outputs and how the outputs will integrate into the business.

Model Output

Clearly state a quantifiable output of the model. Be sure to define terms; giving examples is helpful.

Given an input row, the model will output a probability to get a click – expressed as a float between 0 and 1.

Pro tip: This is what the model itself outputs, not subsequent analysis or interpretation.

Using the Output

How will the above output be integrated into the business? Where is the output made available?

The model can run ad-hoc or periodically. We want to add click probability scores as an attribute to the `Warehouse.PropertyImages` table. From there, we'll integrate the signal into the CMS.

Pro tip: Consider the latency of data availability and how that impacts dependent business processes.

Heuristics

Without using ML, how would you solve this problem?

Without ML, we would run A/B split tests through all possible image combinations for all listings. The split test could take a long time to reach statistical confidence.

Pro tip: Assume you had to deliver this capability immediately. How could you do this without ML?
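
To get a feel for why split tests can take a long time, here is a rough Python sketch that estimates how many visitors each variant needs for a two-sided test comparing two proportions. It uses the standard normal-approximation sample size formula; the baseline rate, lift, significance level and power are assumptions for illustration, not values from the Twiddy project.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline_rate: float, relative_lift: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per variant to detect the given lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must lie in (0, 1) and differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical inputs: 4% baseline click rate, hoping to detect a 10% relative lift.
needed = visitors_per_variant(baseline_rate=0.04, relative_lift=0.10)
print(f"visitors needed per variant: {needed}")
```

Multiply that figure by the number of image combinations and listings you would have to test, and the appeal of a model that scores every image at once becomes clear.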

Design Review: Outputs

Walk through the Outputs questions below and collect feedback on your answers.

Are the outputs usable and valuable?
Can we use heuristics to test the concepts ahead of modeling?
Are dependent business teams informed and included in this process?

Section 3: Modeling

We now transition to the more technical part of the exercise. We need data scientists to weigh in on the ML methods' specifics. We'll first detail the modeling solution and then simplify it.

ML Problem

What is the specific model type you'll be training? Will you use supervised, unsupervised or other methods? Be specific and include details around the number of classes or dimensions used. Are there additional details you could provide to help implement the model? Any novelties to consider?

This is a binary classification problem. Given the search context, we’re predicting the likelihood of a click. In this case, the input includes unstructured data in the form of property listing images (and their ML embeddings). There is a novelty in that we write and run the pipeline wholly using SQL.

Pro tip: If multiple methods and models are at play, be sure to provide those details.
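
For readers who have not met binary classification before, the toy Python sketch below trains a logistic regression with plain gradient descent and returns a click probability between 0 and 1. It is not the pipeline we built (that ran wholly in SQL on BQML); the two-dimensional "embeddings", labels and hyperparameters are made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1), stably."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_click_probability(x, weights, bias):
    """Probability of a click for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(rows, labels, epochs=200, lr=0.5):
    """Fit weights and bias with stochastic gradient descent on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = predict_click_probability(x, weights, bias) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical 2-dimensional "embeddings"; real image embeddings are much larger.
rows = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.7]]
labels = [1, 1, 0, 0]  # 1 = click, 0 = no click

weights, bias = train(rows, labels)
likely = predict_click_probability([0.85, 0.15], weights, bias)
unlikely = predict_click_probability([0.15, 0.80], weights, bias)
print(f"likely click: {likely:.2f}, unlikely click: {unlikely:.2f}")
```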

TL;DR

Time to simplify. You must ensure you can reduce the problem to its simplest form.

We will predict the click-through rates for all searches. This is binary classification.

Pro tip: This summary should be clear to non-technical business partners.

Design Review: Modeling

Walk through the Modeling questions below and collect feedback on your answers.

Will the proposed models solve the stated problem?
Is the TL;DR brief enough? Does it still convey the objective?
Could the process be simplified further?

Section 4: Data

Finally, we’ll prototype the data we'll use to train our model, including features and prediction targets. Additionally, we'll identify where we’ll get the data and assess its obtainability.

Design the Data

Identify the features that are critical to training the model and specify the target. Provide examples where possible, or do your best to summarize the underlying data. Remember that all features need to be available when you make the prediction.

Image embedding	Search context	Unit context	*TARGET*
<ResNet 50, et al.>	<listing context>	<unit metadata>	0|1 (no click|click)

Pro tip: The data teams responsible for ETL & delivery must understand the description.
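
One way to make that description unambiguous for the data teams is to write the row down as a typed record. The Python sketch below is an assumption about how that might look: the class, the field names and the example values are hypothetical, and only the four columns come from the table above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TrainingRow:
    image_embedding: List[float]   # vector from an image model such as ResNet 50
    search_context: Dict[str, str]  # listing context captured at search time
    unit_context: Dict[str, str]    # unit metadata
    clicked: int                    # TARGET: 0 = no click, 1 = click

    def __post_init__(self):
        if not self.image_embedding:
            raise ValueError("image_embedding must not be empty")
        if self.clicked not in (0, 1):
            raise ValueError("clicked must be 0 (no click) or 1 (click)")

    def features(self) -> dict:
        """Everything except the target: what must exist at prediction time."""
        return {
            "image_embedding": self.image_embedding,
            "search_context": self.search_context,
            "unit_context": self.unit_context,
        }


# Hypothetical example row; keys and values are illustrative only.
row = TrainingRow(
    image_embedding=[0.12, -0.40, 0.88],
    search_context={"position": "3"},
    unit_context={"bedrooms": "5"},
    clicked=1,
)
print(sorted(row.features()))
```

Keeping the target out of `features()` is a small guard for the rule above: anything the model sees must be available when you make the prediction.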

Data Sources

Where do you get the data used for training?

Image embedding	Search context	Unit context	*TARGET*
TF model on BQML	GA360 export tables	Data Warehouse	GA360 export tables

Pro tip: Make sure you plan on establishing security and access controls to manage data ETL.

Obtainability

You have two ways to approach this step. You can estimate the difficulty of developing and deploying the pipelines to prepare the data for training and prediction. You can also reduce the number of features to those that are easiest to obtain while still being able to model signals to predict the target.

Image embedding	Search context	Unit context	*TARGET*
Low effort; in SQL	Med. effort, requires data mining; in SQL	Low effort; in SQL	Med. effort, research data quality; in SQL

Pro tip: Revisit the heuristics step and adjust as needed to tackle obtainability.

Design Review: Data

Walk through the Data questions below and collect feedback on your answers.

Simple Inputs: Could these "easy features" realistically predict the target?
Generating Labels: Are you able to obtain the label for training?
Understanding Bias: Enumerate how your target label could be biased. Is it also fair and equitable?
Risks & Complexity: List design aspects that are difficult, risky or overly complicated.
Detecting Signal: Is it realistic to think your model will learn? Can you hypothesize model metrics?

Next Steps

Now that you've answered all the questions in the rubric, you're ready to divide and conquer:

Problem Definition - goes to the business team for validation and to the BI team to build out performance dashboards based on the provided KPIs (success & failure metrics).
Outputs - product and experience teams begin prototyping and integrating modeling outputs into existing designs and processes.
Modeling - data science team begins their iterative modeling function. They should work from the heuristics, then a simplified model and finally deliver a production model.
Data - data engineering team develops the training and prediction pipelines from the identified lakes and warehouses.

Conclusion

Adswerve has found this framework useful as a first step in designing and delivering production machine learning models. This exercise shouldn't take more than a few hours before you send it to design review. Questions? Feel free to contact us.
