Explainability and Interpretability in AI

Bijula Ratheesh
Jan 26, 2023

When it comes to implementing any ML model, the most difficult question asked is how to explain it. If you are a data scientist working closely with stakeholders or customers, even explaining the performance and feature selection of a deep learning model is quite a task.

How do we deal with this? How can we explain it in simple terms?

Feature importance and context

For any model, feature selection and feature importance play a major role in explainability. Some models expose feature importance directly, while others need methods such as LIME or SHAP to explain feature contributions and model performance.

LIME and SHAP are very popular feature attribution methods in interpretable machine learning. These methods compute the attribution of each input feature to represent its importance. In this article I will take you through another interesting method, Contextual Importance and Utility, based on the paper “Explanations of Black-Box Model Predictions by Contextual Importance and Utility”. LIME and SHAP differ from Contextual Importance and Utility (CIU) in their underlying approach.

In 1996, Kary Främling proposed the idea of explaining black-box neural network models using Contextual Importance and Utility in the context of Multiple Criteria Decision Making (MCDM). MCDM is a domain where mathematical models are used as Decision Support Systems (DSS) for human decision makers. CIU is model-agnostic and provides uniform explanation concepts for all possible DSS models, ranging from linear models such as the weighted sum, to rule-based systems, decision trees, fuzzy systems, neural networks and any machine learning-based models.

SHAP and LIME belong to the family of Additive Feature Attribution (AFA) methods, with SHAP originating from game theory. The idea of CIU comes from decision theory and Multi-Attribute Utility Theory (MAUT).

Let’s discuss both the methods here.

Additive Feature Attribution Methods

For complex models, such as ensemble methods or deep networks, we cannot use the original model as its own best explanation because it is not easy to understand. Instead, we must use a simpler explanation model, which could be defined as an interpretable approximation of the original model.

Let f be the original prediction model to be explained and g the explanation model. Here, we focus on local methods designed to explain a prediction f(x) based on a single input x. Explanation models often use simplified inputs x′ that map to the original inputs through a mapping function x = hx(x′).

Local methods try to ensure that

g(z′) ≈ f(hx(z′))

whenever z′ ≈ x′.

Additive feature attribution methods have an explanation model that is a linear function of binary variables:

g(z′) = φ0 + Σi φi z′i,   where z′ ∈ {0, 1}^M, M is the number of simplified input features, and φi ∈ ℝ.

The variable φi here is called the effect or influence of feature i.

The Shapley value is an AFA method originating from cooperative game theory. The method distributes the difference between the prediction output f(x) and the reference level φ0 to the input feature influences φi according to the equation above.
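
In Shapley value terms, each influence φi is the feature's average marginal contribution over all subsets of the other features:

φi = Σ over S ⊆ F \ {i} of [ |S|! (|F| − |S| − 1)! / |F|! ] · ( fS∪{i}(xS∪{i}) − fS(xS) ),

where F is the set of all input features and fS denotes the model evaluated with only the features in S present.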

Local Interpretable Model-agnostic Explanations (LIME) is a popular AFA method that creates a linear surrogate model g which locally approximates the behaviour of the model around the neighbourhood of the instance being explained. The sign of φi determines whether the influence of input feature i is negative or positive, and the magnitude of φi expresses how great the influence is.

Decision Theory and Multi-Attribute Utility Theory

A decision problem should be formulated in terms of initial conditions and outcomes or courses of action, with their consequences. Each outcome is assigned a utility value based on the preferences of the decision maker(s). An optimal decision is one that maximizes the expected utility.

It was proven already in 1947, by von Neumann and Morgenstern, that any individual whose preferences satisfy four axioms has a utility function u by which that individual's preferences can be represented on an interval scale.

If preferences over choices on attributes or input features 1, . . . , n depend only on their marginal probability distributions, then the n-attribute utility function is additive:

u(y1, . . . , yn) = Σi ki ui(yi),

where u and the ui are normalised to the range [0, 1], and the ki are normalisation constants.

CIU estimates the values ki and ui(yi) in the equation above for one or more input features {i} in a specific context C and for any black-box model f, where the context is defined by the instance or situation to be explained.

Contextual Importance

The Contextual Importance of a set of input features {i} for output j, relative to a set of features {I} and in a context C, is defined as

CIj(C, {i}, {I}) = ( umaxj(C, {i}) − uminj(C, {i}) ) / ( umaxj(C, {I}) − uminj(C, {I}) ).

For clarity, {i} is the set of indices studied and {I} is the set of indices relative to which we calculate CI. When {I} = {1, . . . , n}, CI is calculated relative to the output utilities uj. For instance, CIj(C, {2}, {1, . . . , n}) is the contextual importance of input x2, whereas CIj(C, {1, 2, 3}, {1, . . . , n}) is the joint contextual importance of inputs x1, x2, x3 and CIj(C, {1, . . . , n}, {1, . . . , n}) is the joint contextual importance of all inputs. uminj() and umaxj() are the minimal and maximal utility values uj observed for output j over all possible x{i} and x{I} values in the context C, while keeping the other input values as given by C.

When uj(yj) = A·yj + b, CI can be calculated directly as

CIj(C, {i}, {I}) = ( ymaxj(C, {i}) − yminj(C, {i}) ) / ( ymaxj(C, {I}) − yminj(C, {I}) ),

where yminj() and ymaxj() are the minimal and maximal yj values observed for output j. The values of uminj and umaxj can only be calculated exactly if the entire set of possible values for the input features {i} is available and the corresponding uj values can be calculated in reasonable time. For categorical input features this is feasible as long as the number of possible values does not grow too big.

Contextual Utility

The Contextual Utility (CU) corresponds to the factor ui(xi). CU expresses to what extent the current value of a given input feature contributes to obtaining a high output utility uj:

CUj(C, {i}) = ( uj(C) − uminj(C, {i}) ) / ( umaxj(C, {i}) − uminj(C, {i}) ).

When uj(yj) = A·yj + b, CU can be calculated directly as

CUj(C, {i}) = | ( yj(C) − yuminj(C, {i}) ) / ( ymaxj(C, {i}) − yminj(C, {i}) ) |,

where yumin = ymin if A is positive and yumin = ymax if A is negative. This definition of CU differs from CI by handling negative A values correctly.

CIU versus LIME/SHAP

  1. LIME and SHAP come from the AFA family of methods, whereas CIU comes from decision theory and utility theory.
  2. CIU does not use or create any intermediate explanation model g.
  3. CI and CU can be used for calculating φ, but not vice versa.
  4. CI and CU provide absolute values in the range [0, 1], whereas φ values express relative influence between features.

Implementation of CIU, SHAP and LIME in IRIS classification dataset
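
The full code is in the notebook linked at the end of the article. The sketches below are minimal, illustrative versions rather than the notebook verbatim; they assume a scikit-learn RandomForestClassifier trained on the Iris dataset, roughly as follows:

    # Shared setup assumed by the sketches below:
    # train a simple classifier on the Iris dataset.
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, test_size=0.2, random_state=42
    )

    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)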

LIME Implementation
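
A minimal sketch of applying LIME with the lime package and the setup above (the linked notebook may differ in details):

    # Explain a single test instance with LIME's tabular explainer.
    from lime.lime_tabular import LimeTabularExplainer

    explainer = LimeTabularExplainer(
        X_train,
        feature_names=iris.feature_names,
        class_names=iris.target_names,
        mode="classification",
    )

    pred = int(model.predict(X_test[:1])[0])
    exp = explainer.explain_instance(
        X_test[0], model.predict_proba, num_features=4, labels=(pred,)
    )
    # (feature, weight) pairs: the local phi values for the predicted class
    print(exp.as_list(label=pred))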

SHAP Implementation
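
A similar sketch with the shap package; TreeExplainer is one option for the random forest assumed above (the notebook may use a different explainer):

    # Compute SHAP values for the test set and plot a global summary.
    import shap

    explainer = shap.TreeExplainer(model)
    # For multi-class models this returns one set of SHAP values per class;
    # the exact format varies slightly across shap versions.
    shap_values = explainer.shap_values(X_test)
    shap.summary_plot(shap_values, X_test, feature_names=iris.feature_names)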

CIU Implementation
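
There is a py-ciu package, but to keep this self-contained here is a from-scratch sketch that follows the CI and CU equations above directly, using the class probability as the utility uj and the training data to define each feature's range (an illustrative assumption, not the notebook's exact code):

    import numpy as np

    def ciu_for_feature(model, x, i, X_ref, n_samples=200, target=None):
        """Estimate CI and CU of feature i for instance x (the context C)
        by varying feature i over its observed range while keeping the
        other features fixed at their values in x."""
        x = np.asarray(x, dtype=float)
        if target is None:
            target = int(model.predict(x.reshape(1, -1))[0])

        # Sample feature i over its min-max range in the reference data.
        lo, hi = X_ref[:, i].min(), X_ref[:, i].max()
        perturbed = np.tile(x, (n_samples, 1))
        perturbed[:, i] = np.linspace(lo, hi, n_samples)

        probs = model.predict_proba(perturbed)[:, target]   # uj over the samples
        cur = model.predict_proba(x.reshape(1, -1))[0, target]

        umin, umax = probs.min(), probs.max()
        # CI relative to all inputs, approximating umaxj(C,{I}) - uminj(C,{I})
        # by the full [0, 1] range of the class probability.
        ci = umax - umin
        cu = (cur - umin) / (umax - umin) if umax > umin else 0.5
        return ci, cu

    for i, name in enumerate(iris.feature_names):
        ci, cu = ciu_for_feature(model, X_test[0], i, X_train)
        print(f"{name}: CI={ci:.3f}, CU={cu:.3f}")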

https://github.com/bijular/datascience/blob/master/Explainable%20AI.ipynb

References:

Anjomshoae, S., Främling, K., Najjar, A.: “Explanations of Black-Box Model Predictions by Contextual Importance and Utility” (2019).
