Challenges in productionising machine learning models

There are three main steps in productionising a machine learning model:

  1. Getting the data into the model
  2. Running the model algorithm against the data
  3. Taking the outputs of the model into the operational process

However, actually productionising machine learning models into operational processes is often trickier than you might think.

To prove its value, machine learning needs to be used to drive decisions. One way that’s more prevalent today than five years ago is to integrate machine learning models directly into operational decisions.

This can mean recommending products (think Amazon Books), predicting customer churn or even predicting when to schedule services for machines. Whatever the scenario, you will need to integrate an analytical algorithm into a business process, an endeavour that can be entirely new to a number of stakeholders in your organisation.

[Figure: a flowchart displaying the operational process of a productionised machine learning model]

1. Getting the Data into the Model

When designing the data interface between the operational system and the machine learning model, we assume the model has already been tested with appropriate data. Extensive testing during model development is therefore a must, to ensure the model either behaves or fails predictably and appropriately for all possible data inputs.

Somehow, the model needs to get the data it requires: the input payload. Let’s say the model needs the following data to make a prediction:

  • Customer date of birth;
  • Customer state of residence; and
  • The customer’s account balance.

An example payload will look like:

                {
                    "id": 5973628,
                    "date-of-birth": "1966-12-08",
                    "state": "VIC",
                    "account-balance": 1200
                }

We could hard-code the process to extract that data and call a model’s API with this payload to obtain a customer segment. However, if we do this, we have hard-coded the model into the operational process. What would happen if we wanted to change to the next version of the model that requires postcode in addition to the other data?

We would need to change the operational process—this has implications for maintaining the model and maintaining the process.

A better approach is to call an intermediate process that abstracts getting data for the model, calling the model and returning the model outputs to the operational process. Doing this also means we can better track which models are called and track model version history. The call will now look like:

                {
                    "model-id": "customer-segmentation",
                    "customer-id": 5973628
                }

The intermediate process looks up which data the model needs in a store of metadata, extracts that data, then makes the model API call with the original payload shown earlier.

model-id | data input | data output | API

From the table above, if a new segmentation model includes postcode, we simply add it to the metadata along with the new API address. No changes are needed to the operational process. 
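
As a rough illustration of the intermediate process, the sketch below keeps the model metadata in a Python dictionary and builds the model payload from it. The registry layout, the in-memory customer store, the function names `build_payload` and `handle_request`, the API labels, the postcode value and the stand-in model are all assumptions made for this example; they are not taken from any particular tool.

```python
from typing import Any, Callable, Dict

# Hypothetical metadata store: one entry per model-id.
MODEL_REGISTRY: Dict[str, Dict[str, Any]] = {
    "customer-segmentation": {
        "version": 1,
        "inputs": ["date-of-birth", "state", "account-balance"],
        "api": "segmentation-v1",  # placeholder label, not a real address
    }
}

# Hypothetical customer data store keyed by customer id.
# The postcode is an illustrative value only.
CUSTOMER_STORE: Dict[int, Dict[str, Any]] = {
    5973628: {
        "date-of-birth": "1966-12-08",
        "state": "VIC",
        "account-balance": 1200,
        "postcode": "3000",
    }
}


def build_payload(model_id, customer_id,
                  registry=MODEL_REGISTRY, store=CUSTOMER_STORE):
    """Collect only the fields that the registered model asks for."""
    record = store[customer_id]
    payload = {"id": customer_id}
    for field in registry[model_id]["inputs"]:
        payload[field] = record[field]
    return payload


def handle_request(request, model_apis: Dict[str, Callable],
                   registry=MODEL_REGISTRY, store=CUSTOMER_STORE):
    """Look up the model, fetch its inputs, call it and tag the result."""
    model_id = request["model-id"]
    meta = registry[model_id]
    payload = build_payload(model_id, request["customer-id"], registry, store)
    result = model_apis[meta["api"]](payload)
    # Returning the version makes it possible to track which model was called.
    return {"model-id": model_id, "model-version": meta["version"],
            "result": result}


def fake_segmentation_model(payload):
    """Stand-in for a real model API call."""
    return {"segment": "A"}


response = handle_request(
    {"model-id": "customer-segmentation", "customer-id": 5973628},
    {"segmentation-v1": fake_segmentation_model},
)
print(response)
```

Under these assumptions, moving to a model version that needs postcode only means updating the registry entry (adding "postcode" to its inputs and pointing it at the new API). The operational process keeps sending the same two-field request.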

2. Running the Model Against the Data

[Figure: a flowchart showing the scoring module of a productionised machine learning model]

For most productionised machine learning models, three phases need to be included:

1. Feature preparation

Take the raw data (such as customer’s date of birth) and transform it to something the model can consume (such as customer’s age). This transformed data is known as a feature. Features can be simple like age from date of birth, or machine-learning specific, such as principal components.

2. Applying the algorithm

This phase takes the features and applies the saved modelling algorithm over them. An example is to retrieve the weights (parameters) of a trained neural network and apply them to features from new data to produce model scores.

3. Applying the decision logic

The final post-processing stage takes the model score and applies decision logic to it. This could be a threshold: if the score is greater than the threshold, perform action A (accept application, for example); if it is less than the threshold, perform action B (decline application). Or it could provide the SKU details of the product to be recommended to the customer.
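
The three phases can be sketched end to end in a few lines of Python. This is only an illustration: the weights, the threshold and the function names are made up for the example, a simple logistic model stands in for the neural network mentioned above, and a score exactly equal to the threshold is treated as a decline, which is a choice made here rather than a rule.

```python
import math
from datetime import date

# Hypothetical saved parameters of a trained model (illustrative values).
SAVED_WEIGHTS = {"intercept": -1.0, "age": 0.02, "account-balance": 0.001}
THRESHOLD = 0.5  # hypothetical decision threshold


def prepare_features(payload, as_of):
    """Phase 1: turn raw data (date of birth) into features (age)."""
    dob = date.fromisoformat(payload["date-of-birth"])
    # Subtract one year if the birthday has not yet occurred in the as_of year.
    had_birthday = (as_of.month, as_of.day) >= (dob.month, dob.day)
    age = as_of.year - dob.year - (0 if had_birthday else 1)
    return {"age": age, "account-balance": payload["account-balance"]}


def apply_algorithm(features, weights):
    """Phase 2: apply the saved weights to the features to get a score."""
    linear = weights["intercept"] + sum(
        weights[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))  # logistic function, in (0, 1)


def apply_decision(score, threshold):
    """Phase 3: turn the score into an action."""
    return "accept" if score > threshold else "decline"


payload = {"id": 5973628, "date-of-birth": "1966-12-08",
           "state": "VIC", "account-balance": 1200}
features = prepare_features(payload, as_of=date(2020, 1, 1))
score = apply_algorithm(features, SAVED_WEIGHTS)
print(features["age"], round(score, 3), apply_decision(score, THRESHOLD))
```

Keeping the three phases as separate functions means each can be tested and versioned on its own, for example changing the threshold without retraining the model.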

Anyone who has looked at examples of machine learning models from the Internet will have found that it can be quite involved to replicate results when trying them yourself in your own environment. Why? There are different versions of R, Python, libraries and even operating systems to contend with.

This problem manifests as an operational and productivity risk when applied to production models. Making sure the model runs as intended can be a nightmare! The model may be executing on a different machine than the one it was developed on, or maybe it was developed a few months ago. For whatever reason it occurs, this scenario can lead to weeks of wasted development and testing time. 
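
A mismatch like this can at least be made visible. The sketch below is one possible approach, not a standard recipe: it records the Python version, platform and package versions when a model is built, and compares them with the environment the model is about to run in. The function names and the fingerprint layout are assumptions for this example; `importlib.metadata` requires Python 3.8 or later.

```python
import platform
import sys
from importlib import metadata


def environment_fingerprint(packages):
    """Record interpreter, platform and package versions for later comparison."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "python": platform.python_version(),
        "platform": sys.platform,
        "packages": versions,
    }


def environment_differences(recorded, current):
    """List every difference between the recorded and current environments."""
    diffs = []
    for key in ("python", "platform"):
        if recorded[key] != current[key]:
            diffs.append(f"{key}: {recorded[key]} -> {current[key]}")
    for name, version in recorded["packages"].items():
        now = current["packages"].get(name)
        if version != now:
            diffs.append(f"{name}: {version} -> {now}")
    return diffs


# At training time: save this alongside the model.
recorded = environment_fingerprint(["numpy", "scikit-learn"])
# At scoring time: compare before running the model.
current = environment_fingerprint(["numpy", "scikit-learn"])
print(environment_differences(recorded, current))
```

A check like this only detects drift between environments; it does nothing to prevent it.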

There is a solution: containers, of which the most popular and best supported is Docker.

The promise of Docker in machine learning is this: replicate the environment that you develop models in to the environment that production models will run in. Containers replicate the environment independently of operating system and hardware.

A year or so ago doing this still required a lot of messing around with Docker containers and orchestration. However, a new breed of tools has made it straightforward. A few are Azure Machine Learning Workbench, Domino Data Lab and Dataiku.

With these tools, the data scientist builds models using standard languages such as Python and deploys them to a virtual environment in Azure (or another cloud). This virtual instance replicates the modelling environment so that models run as intended, regardless of their complexity, without recoding. This makes deploying models easier and quicker, so the same number of data scientists can deploy more models. It also allows the productionised models to scale, as resources can be added.

3. Applying the Outputs

This part of the process integrates the output of the model decisioning back into the operational system. One ‘gotcha’ here is that the model needs to supply the same type of information consistently back to the process. This is not usually a problem, as we don’t expect this to change often if at all.

An example is:

                {
                    "model-id": "customer-segmentation",
                    "customer-id": 5973628,
                    "score": 0.8653,
                    "application-decision": "accept"
                }

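One way to guard against the ‘gotcha’ of inconsistent outputs is to check each model response against the fields and types the operational process expects before handing it back. The sketch below assumes the four fields from the example output; the `EXPECTED_OUTPUT` mapping and the `validate_output` function are hypothetical names for this illustration.

```python
# Field names follow the example output; the expected types are an assumption.
EXPECTED_OUTPUT = {
    "model-id": str,
    "customer-id": int,
    "score": float,
    "application-decision": str,
}


def validate_output(output, expected=EXPECTED_OUTPUT):
    """Return a list of problems; an empty list means the output is consistent."""
    problems = []
    for key, expected_type in expected.items():
        if key not in output:
            problems.append(f"missing field: {key}")
        elif not isinstance(output[key], expected_type):
            problems.append(
                f"wrong type for {key}: expected {expected_type.__name__}, "
                f"got {type(output[key]).__name__}"
            )
    for key in output:
        if key not in expected:
            problems.append(f"unexpected field: {key}")
    return problems


output = {
    "model-id": "customer-segmentation",
    "customer-id": 5973628,
    "score": 0.8653,
    "application-decision": "accept",
}
print(validate_output(output))
```

Returning a list of problems rather than raising on the first one lets the orchestration layer log everything that changed when a new model version alters its output.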

Reliably productionising machine learning models is definitely achievable with the right disciplines and tools. It is made easier by:

  • Using Azure Machine Learning Workbench or similar with Docker containers
  • Good data governance and communication between IT, business and data science team members
  • Keeping metadata on models, versions, inputs, APIs and outputs required
  • Having an orchestration layer that abstracts model data requirements from operational processes.

Happy modelling!

written by: James Pearce | Senior Advanced Analytics Advisor