Why we do — and don’t — need explainable AI

Why explainable AI (known as XAI) is becoming a must-have component of data science and why we may not have come as far as we think.

A black-box model is no longer good enough for your data scientists, your business or your customers.

A data science history perspective

Back in the old days (really just a few years ago for some), building models for classification was simple. Generalised linear models (GLMs), and logistic regression models in particular, were the only algorithms we used.

And this is the real, statistical version of logistic regression as you would find in the Python library statsmodels, rather than the scikit-learn version. In the statistical version, you get neat statistical parameter information including coefficient estimates, standard errors and estimates of fit such as deviance or log-likelihood estimates. Our models contained a few, essential features. Those features we did use were handcrafted to be meaningful, stable and predictable.

The new kid in town

Then things changed. They got more sophisticated and more complex (possibly two words meaning the same thing). We were given a choice of many different algorithms, such as neural networks, random forests, gradient boosting machines, support vector machines and ensemble methods. We could now use hundreds — or even thousands — of features. In many cases our algorithms would select the best features and even pre-process them, meaning we no longer had to hand-craft them. We started to use a whole bunch of techniques that, when combined, would wring every drop of predictive power from our models and data.

A mantra sometimes heard was: if the model predicts well, why should we care how it arrived at its predictions?

And our debugging and diagnostic techniques evolved accordingly. They did not try to explain why the model produced a prediction. Rather, they quantified the extent to which the algorithm would generalise to a set of unseen data.

Constraints can be good for creativity

Admittedly, in the scenario I presented in the old days, a lot of these constraints were thrust upon us. Our software did not support advanced algorithms. The lack of computational power made thousands of features impractical and time-consuming. Putting models into operational systems was hard; it often required hand-coding the algorithm into a mainframe (and yes, this is something I have done in the last decade). Altogether this pushed the data scientist towards simpler models and implementations that were easier to test.

So, with all these constraints in place, the models and algorithms we developed put understanding and interpretability first and foremost. Part of the reason was what the models were used for: determining whether to grant an applicant a credit card, for example. It just felt safer that humans could understand these models, regardless of the checks and balances in the process. We were risking hundreds and thousands of dollars if we got it wrong. But for now, enough blah, blah, blah. I will show you what I mean using the classic Titanic data set.

Spoiler alert

The ship Titanic hits an iceberg and sinks.

We want to build a model that will predict who would survive on the Titanic.¹


To build a model, I fit a logistic regression in R using the glm function. You can see my code here.

Comparing our predictions with what transpired, we get the following misclassification table (or confusion matrix).

                          Actual
Predicted          did not survive   survived
did not survive                480         84
survived                        69        258

If you add up the numbers on the diagonal (480 + 258 out of 891 passengers), you can see this model has an accuracy of about 82.8% on the data set it was trained on. This is the proportion of predictions that matched what happened.
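As a minimal sketch of that arithmetic, the snippet below recomputes accuracy from the four counts in the table. The dictionary layout and the function name are my own choices for illustration. Only the diagonal matters for accuracy, so the result does not depend on which axis holds the predictions.

```python
# Counts copied from the confusion matrix above.
# Keys are (row label, column label); accuracy only needs the diagonal.
confusion = {
    ("did not survive", "did not survive"): 480,
    ("did not survive", "survived"): 84,
    ("survived", "did not survive"): 69,
    ("survived", "survived"): 258,
}


def accuracy(matrix):
    """Share of passengers for whom the two labels agree."""
    correct = sum(n for (row, col), n in matrix.items() if row == col)
    total = sum(matrix.values())
    return correct / total


print(f"accuracy = {accuracy(confusion):.3f}")  # prints "accuracy = 0.828"
```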

Great! Our model is somewhat accurate. But it does not help me in my quest for surviving when Titanic II hits an iceberg. For that we must delve into the internals of the fitted model and do some simple maths.

Analysis of Deviance

First, I look at the analysis of deviance table. This tells me that all the variables except Fare, Parch and Embarked contribute to reducing the variation or error. It also tells me the bulk of predictive power lives in two variables: Sex and Pclass.


Table of Coefficients

Fitting a generalised linear model also gives us a table of coefficients. We can use this to see which values lead to greater or lesser chances of survival.


This tells us:

  • Pclass of 3 is associated with non-survival;
  • Sex of ‘female’ is associated with survival;
  • Children (Age less than 18) are more likely to survive; and
  • Cabin designations beginning with G are associated with non-survival,

as well as some other, less predictive, nuances.

Explaining Predictions

Now we have managed to understand what is happening in our Titanic model. The next step is to understand why the model makes individual predictions.

Let’s start by looking at the passenger who was predicted to be most likely to survive. Here are the details of that passenger, who has been predicted as 99.2% likely to survive.


All we need to do is a bit of mathematics to understand why this was predicted.


So, it is because she is female, in Pclass 1, under 18 and paid a fare of over 40, along with some other details.
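The "bit of mathematics" for a logistic regression is a sum on the log-odds scale followed by the logistic function. Here is a rough sketch of it. The coefficients, the intercept and the feature encoding are hypothetical values chosen for illustration, not the fitted values from my model.

```python
import math

# HYPOTHETICAL log-odds coefficients; not the fitted values from the article.
intercept = -1.5
coefficients = {
    "sex_female": 2.5,
    "pclass_1": 1.0,
    "age_under_18": 0.5,
    "fare_over_40": 0.3,
}

# One illustrative passenger, encoded as 0/1 indicator features.
passenger = {"sex_female": 1, "pclass_1": 1, "age_under_18": 1, "fare_over_40": 1}


def explain(intercept, coefficients, row):
    """Return per-feature log-odds contributions and the predicted probability."""
    contributions = {name: coefficients[name] * row[name] for name in coefficients}
    log_odds = intercept + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return contributions, probability


contributions, probability = explain(intercept, coefficients, passenger)

# List the features from largest to smallest absolute contribution.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>14}: {value:+.2f}")
print(f"predicted probability of survival: {probability:.3f}")
```

Because the contributions simply add up on the log-odds scale, ranking them shows which attributes pushed the prediction up or down the most.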

Now let’s look at the passenger predicted to be least likely to be a survivor. Here are the details of that passenger, predicted with a 1.0% chance of surviving.


Again, if we use a bit of maths, we can see why this prediction was made.


It is because

  • the passenger was male;
  • he was aged 65 or over;
  • he was in Pclass 3; and
  • paid a low fare,

as well as other attributes that contribute to the low prediction.

Interpretable and explainable

It is interesting to note that the old-school approaches grounded in statistics gave us models that were interpretable and produced predictions that were explainable.

Modern Machine Learning

Move forward to today. Machine learning is moving to the mainstream and there is a plethora of tools we can use to build a predictive model. We are no longer confined to ‘just’ the regression family (although there is a new, improved machine-learning version of regression, too).

The data scientist can choose from a dazzling array of algorithms. As an example, lazypredict will automatically fit LinearSVC, SGDClassifier, LogisticRegression, RandomForestClassifier, GradientBoostingClassifier, GaussianNB and twenty-five more!

A list of classifiers from lazypredict’s documentation.

What a long way we have come …

… Or have we?

My original analytical training was as a statistician. This means I have a bias, so please keep that in mind.


With the classical, statistical techniques the focus was on inference; the new machine learning techniques focus on prediction. For a while there was a sentiment—and indeed, a junior aspiring data scientist said this to me—that so long as a model predicts well, it does not matter what happens inside the algorithm’s black box.

The end justifies the means.

Of course, now we know that it does matter what happens; we do need to be able to understand what we are doing.

First, we need to be sure that an algorithm predicts well. If we do not understand the algorithm, how can we know this?

We can look at the predictions made against our validation data set. (I am sure some business stakeholders relying on models performing well would be horrified to learn they might have been tested only against a couple of hundred records to make sure they behaved as expected.) But this will not tell us if we have made a catastrophic error in the selection of our data. It will not tell us if there are subsegments of our customer base for which the predictions do not work well. When the algorithm is a ‘black box’ which simply takes inputs and returns outputs with no clue as to its inner workings, the only debugging possible is to run many cases through and make sure the prediction is close to the truth.
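An aggregate score can hide a segment where the model performs badly. One partial check, even with a black box, is to slice the validation results by segment. The sketch below does this with made-up records. The field names and the segments are hypothetical.

```python
from collections import defaultdict


def accuracy_by_segment(records, segment_key):
    """Score each segment separately instead of reporting one overall number."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for record in records:
        segment = record[segment_key]
        counts[segment] += 1
        hits[segment] += int(record["actual"] == record["predicted"])
    return {segment: hits[segment] / counts[segment] for segment in counts}


# Made-up validation records, purely to exercise the function.
validation = [
    {"segment": "new customers", "actual": 1, "predicted": 1},
    {"segment": "new customers", "actual": 0, "predicted": 1},
    {"segment": "new customers", "actual": 1, "predicted": 0},
    {"segment": "existing customers", "actual": 0, "predicted": 0},
    {"segment": "existing customers", "actual": 1, "predicted": 1},
    {"segment": "existing customers", "actual": 0, "predicted": 0},
]

overall = sum(r["actual"] == r["predicted"] for r in validation) / len(validation)
per_segment = accuracy_by_segment(validation, "segment")

print(f"overall: {overall:.2f}")
for segment, score in per_segment.items():
    print(f"{segment}: {score:.2f}")
```

In this toy data the overall figure looks tolerable while one segment is mostly wrong. Slicing only helps for the segments you think to check, which is part of why understanding the model still matters.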

Second, we need to be able to explain why we made individual predictions. I am sure you would agree that it is not satisfying to explain a refusal to accept a loan application on the grounds of ‘computer says no’. Neither the customer nor the customer service agent would have a reason to trust the algorithm. They are left guessing why the application was not accepted.


But that is what we are asking people to do when we say ‘trust the machine’. We are asking humans in a real process making operational decisions to use the predictions from a machine learning algorithm. Surely it would be in the interest of everyone invested in the model’s success to ensure the users trust the predictions and understand why they are made. They need to be given reasons why this customer has been selected to receive this discount, or why a particular machine has been selected for maintenance. If they cannot trust it, adoption rates and compliance rates will remain low.

Third, there is an increasing focus on models being well governed and responsible. Regulations such as GDPR prescribe how machine learning is governed, how users’ consent is managed and how interpretable a model’s predictions are. In parallel, there is increasing focus on ethical AI and responsible AI, which include a set of principles to ensure machine learning models used by organisations are

  • fair;
  • reliable and safe;
  • inclusive;
  • transparent;
  • private and secure; and
  • clearly accountable.

What have we lost?

With the classical statistical techniques of yore, we had measures of uncertainty around the parameters of the algorithm. We could tell where it was accurate and where it was less so. We could calculate the predictions with a modest amount of matrix mathematics.
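To give a flavour of those measures of uncertainty, here is a small sketch of an approximate 95% Wald confidence interval built from a coefficient estimate and its standard error. The numbers are hypothetical, and the function name is my own.

```python
import math


def wald_interval(estimate, std_error, z=1.96):
    """Approximate 95% Wald confidence interval for a coefficient."""
    return estimate - z * std_error, estimate + z * std_error


# HYPOTHETICAL estimate and standard error on the log-odds scale;
# not taken from the article's fitted model.
estimate, std_error = 2.5, 0.2
low, high = wald_interval(estimate, std_error)

print(f"coefficient: {estimate:.2f}, 95% CI ({low:.2f}, {high:.2f})")
# Exponentiating moves from log-odds to an odds ratio.
print(f"odds ratio: {math.exp(estimate):.1f}, "
      f"95% CI ({math.exp(low):.1f}, {math.exp(high):.1f})")
```

A wide interval is a warning that the coefficient is poorly determined by the data. The usual gradient boosting outputs offer nothing comparable.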

The new techniques kind of forgot about these things. Or they added them as an afterthought.

Now, though, there is something of a renaissance brewing under the name of XAI, or explainable AI. ‘XAI’ typically refers to a suite of tools and techniques that let us interpret fitted models and explain the predictions of almost any model.


The new hero of the day is a set of libraries called SHAP (SHapley Additive exPlanations).²

What SHAP does, in essence, is to run a series of test observations through a model’s prediction algorithm to see what happens. The reason its use is increasing so rapidly is that it breaks each prediction down into a series of additive outputs. (You know, like the regression models do.)

And additive outputs are easy to interpret. You can add them up, take the mean (a statistician’s way of saying ‘average’); they behave sensibly and intuitively.
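To make the additivity concrete, the sketch below computes exact Shapley values for a toy scoring function with two features. It is not the SHAP library: it replaces absent features with a single baseline row, whereas SHAP averages over background data and uses faster approximations. The toy model and the feature names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition take their baseline value, which is a
    simplification of what the SHAP library estimates.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        row = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(row)

    phi = {}
    for feature in names:
        others = [k for k in names if k != feature]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {feature})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        phi[feature] = total
    return phi


def toy_model(row):
    """Toy, non-additive score standing in for a black-box model."""
    interaction = 0.1 * row["female"] * row["first_class"]
    return 0.1 + 0.5 * row["female"] + 0.2 * row["first_class"] + interaction


instance = {"female": 1, "first_class": 1}
baseline = {"female": 0, "first_class": 0}
phi = shapley_values(toy_model, instance, baseline)

print(phi)
# The contributions plus the baseline prediction recover the prediction.
print(sum(phi.values()) + toy_model(baseline), toy_model(instance))
```

Even though the toy model contains an interaction, the contributions add up to the gap between the prediction and the baseline, with the interaction shared between the two features.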

Back on Board the Titanic

So to show you how we can apply SHAP to the same data set as before, I used the Python XGBClassifier against the data. I used the default parameters and the same feature transformations I used with the logistic regression model.

Instead of an analysis of deviance table, all the model fitting process gives me is a series of sequential trees that reduce the variation or error at each iteration. This is what GBMs do. I can find out which features contribute to the model using one of the ‘afterthought’ techniques included with the algorithm.


The most important features as shown by the GBM afterthought.

Note that in what is produced above, we get no sense of the bounds of error or variation.

To get some more information, we can turn to SHAP. It turns out that SHAP has a similar plot that also shows the individual points in the set of test observations we have used.


This is a bit more informative. We can see perfect separation based on Sex, that some values of Pclass are bad for survival, and that being old was not a good thing.

Explaining Predictions

As we did before, we can look at explaining individual predictions. SHAP can explain the prediction for the passenger most likely to survive.


Now that chart is easy to understand (for a data scientist).

Similarly for the unluckiest passenger.

Note: with a bit more mathematics, we could have represented the outputs of the logistic regression identically.
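For a linear model on the log-odds scale, that extra mathematics is short: each feature's contribution is its coefficient times the distance of the feature value from its average. This closed form treats the features as independent. The sketch below uses hypothetical coefficients and feature means, not my fitted values.

```python
# HYPOTHETICAL log-odds coefficients and average feature values.
intercept = -0.4
coefficients = {"sex_female": 2.5, "pclass_3": -1.2, "age_under_18": 0.8}
feature_means = {"sex_female": 0.35, "pclass_3": 0.55, "age_under_18": 0.12}


def log_odds(row):
    """Linear predictor of the logistic regression."""
    return intercept + sum(coefficients[k] * row[k] for k in coefficients)


def linear_contributions(row):
    """Per-feature contribution relative to the average passenger, in log-odds."""
    return {k: coefficients[k] * (row[k] - feature_means[k]) for k in coefficients}


passenger = {"sex_female": 1, "pclass_3": 0, "age_under_18": 1}
phi = linear_contributions(passenger)
base_value = log_odds(feature_means)  # log-odds at the average feature values

print(phi)
# Base value plus contributions equals the passenger's log-odds.
print(base_value + sum(phi.values()), log_odds(passenger))
```

These are the same ingredients a SHAP chart displays: a base value plus one additive contribution per feature.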

Comparing the old and the new

So now you have seen what we used to do and can contrast that with what we can do now with SHAP and modern techniques.

“So,” I hear you saying, “you are building a second, additive and interpretable model to explain a model that used a technique that is inherently difficult to explain.”

Yes, that is right—now we have two models. One to predict, and one to explain the predictions and interpret the model.

“Wow,” you must be thinking, “these new-fangled machine learning techniques must be really something for you to add the complexity of having two models instead of one.”

Back to you, Kaggle

To show you how far we have come, I tested these models on Kaggle (and there are many, many models on Kaggle that predict better on this data set). Kaggle provides a test set of data that contains all the variables of the training data set, but it does not include the information on whether an individual survived or not. Kaggle uses this unseen information to calculate a performance score. It uses a measure of accuracy, and higher means better.

First, I ran the XGBClassifier model’s predictions through Kaggle. The result: 0.734 accuracy.

Next, I tried the logistic regression. The result: 0.754 accuracy.

Looks like we need to do more work to explain our new, fancy machine learning techniques but get little gain in this instance.

The takeaway might be to think about your model development lifecycle:

  1. Build an interpretable model using classical, interpretable techniques.
  2. Once you are happy with this, use a modern machine learning technique to see if you get a significant gain in performance.
  3. If you do, consider whether you want to spend more time tweaking the interpretable model or explaining the machine learning model. Most times you won't see such a gain.


Increasingly, data scientists and the systems that use their models are under pressure to be well-governed, interpretable and well-understood. Their predictions need to be, well, predictable.

The SHAP library is very useful and gives us insight into models that would otherwise have been obscured from our view. But for some use cases, you are better off using the classical techniques and understanding your model as you develop it. The alternative road can be a lot of complexity and effort for less understanding and no gain in prediction accuracy.

¹ A prediction that will be really helpful if you want to book safe passage on Titanic II’s maiden voyage.

² An in-depth treatment of this can be found in Christoph Molnar’s e-Book Interpretable Machine Learning.