A recent working paper released by the Bank of England (BoE) discusses the importance of explainable AI (xAI) in finance, particularly in relation to mortgage default risk. Dr Scott Zoldi writes:

Although AI is able to perform analytic tasks with much greater efficiency and accuracy than humans, longstanding concerns about its usability centre on AI’s reputation as a “black box” technology unable to explain its decision-making processes. xAI has been around for decades, but it only recently entered mainstream discussion alongside Europe’s General Data Protection Regulation (GDPR), which mandates that customers be given clear reasons when they are adversely affected by an AI-driven decision.

Naturally, each xAI method has its strengths and drawbacks. However, none of the methods mentioned in the BoE paper is able to expose the non-linear relationships between data variables – which is problematic given that machine learning models draw on highly non-linear data features and interactions to make their decisions. Let’s take a closer look at each.

Unpacking the BoE’s report: xAI techniques

The methods described in the paper either add random noise to input variables or explore their distributions and monitor the impact on machine learning model scores, or they test multiple models on the same instances to determine the relative importance of each input variable. They are:

Sensitivity-based measures

Sensitivity measures how the model score changes as the model inputs vary. These measurements suffer most when the perturbed inputs fall in regions of the data space where the model has not been trained. In addition, they do not explain the non-linear relationships used by the model: a variable may be sensitive only because of the precise values of the other inputs, and if those values change, the original variable may no longer be sensitive. As a result, sensitivity-based measures can calculate the average importance of an input, but cannot provide a detailed explanation for their observations.
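
To make the idea concrete, here is a minimal sketch of a perturbation-based sensitivity measure, assuming a toy gradient-boosted model and two hypothetical inputs whose effect on “default” is driven entirely by their interaction. It illustrates the general technique, not the specific method used in the BoE paper.

```python
# Minimal sketch of a perturbation-based sensitivity measure (illustrative only;
# the data, model and perturbation scale are hypothetical stand-ins).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy data standing in for mortgage features: the "default" label depends on
# a non-linear interaction between the two inputs.
X = rng.normal(size=(1000, 2))
y = ((X[:, 0] * X[:, 1]) > 0).astype(int)
model = GradientBoostingClassifier().fit(X, y)

def sensitivity(model, X, feature, scale=0.1, n_repeats=20):
    """Average absolute change in score when one input is randomly perturbed."""
    base = model.predict_proba(X)[:, 1]
    deltas = []
    for _ in range(n_repeats):
        X_pert = X.copy()
        X_pert[:, feature] = X_pert[:, feature] + rng.normal(0.0, scale, size=len(X))
        deltas.append(np.abs(model.predict_proba(X_pert)[:, 1] - base).mean())
    return float(np.mean(deltas))

for j in range(X.shape[1]):
    print(f"input {j}: average sensitivity = {sensitivity(model, X, j):.4f}")
```

The measure returns one averaged importance number per input; it does not reveal that, in this toy example, each input matters only through its interaction with the other.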

Quantitative input influence (QII)

QII is an inspection approach that claims to detect bias in machine learning models. Here, model inputs are randomised to determine possible causal relationships. Like sensitivity-based measures, this method suffers from artificial inspection of the model with non-observed data combinations. Although QII tries to understand how different combinations of variable relationships collectively influence the final scores, it does so at a global level, and is therefore unable to explain the record of a single customer.
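
As a rough illustration of the underlying idea (not the exact procedure in the paper), a unary QII-style influence can be computed by resampling one input from its marginal distribution – breaking its relationship with the other inputs – and measuring how much the model’s scores move on average. The data and model below are hypothetical toys.

```python
# Rough sketch of a unary QII-style influence measure (illustrative only).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)

# Toy stand-in data and model, not those used in the BoE paper.
X = rng.normal(size=(1000, 2))
y = ((X[:, 0] * X[:, 1]) > 0).astype(int)
model = GradientBoostingClassifier().fit(X, y)

def unary_qii(model, X, feature, n_repeats=20):
    """Average absolute shift in model score when `feature` is drawn from its
    marginal distribution, independently of the other inputs."""
    base = model.predict_proba(X)[:, 1]
    shifts = []
    for _ in range(n_repeats):
        X_int = X.copy()
        X_int[:, feature] = rng.permutation(X[:, feature])  # marginal resampling
        shifts.append(np.abs(model.predict_proba(X_int)[:, 1] - base).mean())
    return float(np.mean(shifts))

for j in range(X.shape[1]):
    print(f"input {j}: QII-style influence = {unary_qii(model, X, j):.4f}")
```

The result is a single global number per input (or per set of inputs); it cannot say why a particular customer’s record received the score it did.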

Shapley values

Although Shapley values are mentioned, it is unclear how the technique was utilised specifically in the paper. In brief, Shapley values use different combinations of models, both with and without each input feature, to determine the average importance of each, but they do not explicitly explain the non-linear relationships in the model data. They focus on the importance of data variables in the wider decision landscape rather than the specific relationships that drive an outcome for a subset of customers.
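
As a toy illustration of the “with and without” idea, the exact Shapley value of each feature can be computed by training a model on every feature subset and averaging each feature’s marginal contribution to accuracy. Real tools approximate this rather than retraining 2^n models, and the three features below are hypothetical.

```python
# Hand-rolled exact Shapley values for three features, with the coalition
# "value" defined as the accuracy of a model trained on that feature subset.
# Purely illustrative; not the computation used in the BoE paper.
from itertools import combinations
from math import factorial
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(1500, 3))
y = ((X[:, 0] + X[:, 1] * X[:, 2]) > 0).astype(int)   # interaction between features 1 and 2
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def value(subset):
    """Accuracy of a model trained only on the features in `subset`."""
    if not subset:
        return max(y_te.mean(), 1 - y_te.mean())      # majority-class baseline
    cols = list(subset)
    clf = LogisticRegression().fit(X_tr[:, cols], y_tr)
    return clf.score(X_te[:, cols], y_te)

n = X.shape[1]
for i in range(n):
    others = [j for j in range(n) if j != i]
    phi = 0.0
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi += weight * (value(S + (i,)) - value(S))
    print(f"feature {i}: Shapley value = {phi:.4f}")
```

With this setup, features 1 and 2 can come out with small individual values even though their interaction drives the label, illustrating how an averaged importance number can miss a non-linear relationship.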

Understanding each AI model’s structure and the relationships it creates between variables is hugely important, given that these relationships drive both model outputs and model explanations.

Unfortunately, the techniques detailed above do not account for the specific non-linear relationships learned by models in their training data, which are crucial to identifying and studying potential biases formed by the specific interactions between different variables. An example would be the non-linear relationship between an account’s oldest credit line and its credit limit, which might indicate the age of the account holder. Furthermore, all these methods work on a global scale in terms of sensitivity and the importance of variable combinations. As a result, they are unable to explain a specific customer data record and unsuitable for use under GDPR.

That’s why at FICO, we focus on explainability first and predictability second. This means that we build machine learning models that are completely interpretable, in which specific driving interactions can be identified, reviewed and tested for bias and spuriousness, and, if needed, unacceptable interactions can be specifically removed.

Our studies have shown that other sensitivity-based methods, such as Local Interpretable Model-agnostic Explanations (LIME), are incorrect in their explanations more than a third of the time, making them unsuitable for generating explanations compared with alternative methodologies.
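
For context, a LIME-style explanation fits a simple local surrogate around a single record. The sketch below shows that general idea on hypothetical data (it is not the LIME library itself, nor the study referred to above): because the underlying model relies on an interaction, the surrogate’s coefficients can look very different for records that sit in different parts of the data space.

```python
# Minimal sketch of a LIME-style local surrogate: perturb one record, weight the
# perturbations by proximity, and fit a weighted linear model whose coefficients
# serve as the "explanation". Toy data only; not the LIME library itself.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 2))
y = ((X[:, 0] * X[:, 1]) > 0).astype(int)              # label driven by an interaction
model = GradientBoostingClassifier().fit(X, y)

x0 = np.array([1.0, 1.0])                              # the record to explain
Z = x0 + rng.normal(0, 0.5, size=(500, 2))             # local perturbations
scores = model.predict_proba(Z)[:, 1]
weights = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.5) # proximity kernel

surrogate = Ridge(alpha=1.0).fit(Z, scores, sample_weight=weights)
print("local linear 'explanation':", surrogate.coef_)
# Explaining a record at (1, -1) instead can flip the sign of the coefficients,
# because the model's behaviour depends on the product of the two inputs.
```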

What does the future hold for explainability?

In light of these shortcomings, two promising developments are under way that should improve the explainability of AI both now and in the years to come.

First are models that change their entire form to expose all latent features. This requires completely redesigning the training methods for AI models. By constraining the connections of the neural network so that they are sparse, these models can be made interpretable – that is, explanation is embedded in the model architecture itself. Such an approach is entirely different from how neural network models are natively built today. Once trained, these new models are directly deployable, as the sparse nature of the network allows specific relationships to be inspected, tested for bias or spuriousness and, if necessary, banned from neural network training by human domain experts. Although advanced, these models are already being used in production systems where xAI has been made compulsory.
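
One way to picture this kind of constraint – purely as a hedged sketch, not FICO’s production architecture – is to fix a sparse mask so that each hidden unit of a network may draw on only a couple of inputs, making every latent feature a small, auditable interaction.

```python
# Hedged sketch of a sparsity-constrained network: each hidden unit is wired to
# at most two inputs, so every latent feature is a small, inspectable interaction.
# Illustrative of the general idea only.
import torch
import torch.nn as nn

class SparseInteractionNet(nn.Module):
    def __init__(self, n_inputs, n_hidden, max_fan_in=2, seed=0):
        super().__init__()
        g = torch.Generator().manual_seed(seed)
        # Fix a mask so each hidden unit connects to at most `max_fan_in` inputs.
        mask = torch.zeros(n_hidden, n_inputs)
        for h in range(n_hidden):
            idx = torch.randperm(n_inputs, generator=g)[:max_fan_in]
            mask[h, idx] = 1.0
        self.register_buffer("mask", mask)
        self.hidden = nn.Linear(n_inputs, n_hidden)
        self.out = nn.Linear(n_hidden, 1)

    def forward(self, x):
        # Masked weights: each latent feature depends only on its few known inputs.
        z = torch.tanh(nn.functional.linear(x, self.hidden.weight * self.mask,
                                            self.hidden.bias))
        return torch.sigmoid(self.out(z))

net = SparseInteractionNet(n_inputs=10, n_hidden=6)
for h in range(6):
    used = torch.nonzero(net.mask[h]).flatten().tolist()
    print(f"latent feature {h} uses inputs {used}")    # auditable interactions
```

Because the inputs feeding each latent feature are known by construction, a domain expert can review them and retrain with an offending connection removed.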

Second are Ethical AI models. As the name suggests, Ethical AI is AI that is transparent and is designed and tested to make unbiased decisions. It works by closely examining each dataset with respect to the latent features driving model outcomes. In contrast to conventional machine learning models, where the complex relationships between data points remain hidden, Ethical AI algorithms expose latent features, allowing them to be tested for bias and remediated.

Making AI models explainable is a very positive development, and we are glad that it is finding a voice and resonance with more groups than ever before. However, now is also the time to broaden awareness: although more xAI methods are available than ever, it is important to choose a model architecture at the outset that will allow you to provide (not merely infer) explanations for non-linear and latent relationships, to ensure full compliance with xAI regulations.

Dr Scott Zoldi is Chief Analytics Officer at analytics software firm FICO.