This Changes to That: Combining Causal and Non-Causal Explanations to Generate Disease Progression in Capsule Endoscopy

The need to understand the decision-making mechanisms of deep learning networks has led to a growing effort in exploring both model-dependent and model-agnostic research methods. Although both of these ideas provide transparency for automated decision making, most methodologies focus on either using the model gradients (model-dependent) or ignoring the model's internal states and reasoning with the model's behavior/outcome on instances (model-agnostic). In this work, we propose a unified explanation approach that, given an instance, combines both model-dependent and model-agnostic explanations to produce an explanation set. The generated explanations are not only consistent in the neighborhood of a sample but can also highlight causal relationships between image content and the outcome. We use the Wireless Capsule Endoscopy (WCE) domain to illustrate the effectiveness of our explanations. The saliency maps generated by our approach are competitive on the softmax information score.


INTRODUCTION
There has been a rapid integration of deep learning based models in real-world applications, including high-risk ones such as healthcare and defence, owing to their unparalleled predictive performance [5]. Such real-world deployment and usage of models carries with it the moral obligation to make their decision processes transparent. This is necessary not only for accountability in high-stakes decisions but also for the identification and mitigation of algorithmic or societal bias [14,7]. This has led researchers to continue attempts at opening the black boxes to gain insight into decision-making processes [4,20,10], while also considering that useful explanations could emerge through model-agnostic explainer methods [15,23,6,24]. Although both of these approaches are suited to explaining model predictions, dominant explanation approaches today focus on one or the other.
Factual explainers that reason with gradients [19,10,20] aim to identify regions or pixels within an image that most significantly contributed to the prediction and thereafter visualize these attribution weights in saliency maps [4,20,19]. For example, given an endoscopic image with an ulcer, a saliency map would highlight the ulcer region in response to a question such as "Why did you make that decision?". However, although popular saliency methods [19,20,10] are fairly easy to implement, they also have limitations.
Thanks to the Research Council of Norway for funding (Project no: 300031). Authors contribute equally.
One limitation of gradient-based saliency methods is the use of a baseline image and the sensitivity to the choice of that baseline [13]. Here the baseline serves as a reference scenario against which the query is contrasted; it typically marks an "absence" against which the "presence" can be measured, e.g., a black image. Since attribution maps are then accumulated over a classical linear path from the baseline to the query, the choice of baseline is crucial to the success of the explanation. A second limitation is that the pixel perturbations used to move from the baseline to the query are typically blind to image content. We argue against both such pixel perturbations for creating images between the baseline and the query, and against such a baseline itself: not only are the in-between images unnatural, they are also prone to abnormal gradient behaviour from irrelevant pixels, as identified in [10].
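To make the baseline sensitivity concrete, the following is a minimal sketch of path-integrated attribution on a straight line from baseline to query, in the spirit of [20]. The logistic "classifier", its weights `w`, and the 16-pixel "image" are toy assumptions introduced purely for illustration; they are not the model used in this work.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w):
    """Toy differentiable 'classifier': a logistic unit over flattened pixels."""
    return sigmoid(np.dot(w, x))

def model_grad(x, w):
    """Analytic gradient of the toy model with respect to the input x."""
    p = model(x, w)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, steps=200):
    """Midpoint-rule approximation of attributions accumulated on a linear path."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([model_grad(baseline + a * (x - baseline), w) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

rng = np.random.default_rng(0)
w = rng.normal(size=16)          # hypothetical model weights
x = rng.uniform(size=16)         # hypothetical query "image"

black = np.zeros(16)             # black-image baseline
gray = np.full(16, 0.5)          # mid-gray baseline
attr_black = integrated_gradients(x, black, w)
attr_gray = integrated_gradients(x, gray, w)
```

For the same query, `attr_black` and `attr_gray` generally differ, while each set of attributions sums (approximately) to the difference between the model output at the query and at its own baseline. Every intermediate point on the path is a uniform pixel-wise blend, which is the content-blind perturbation discussed above.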
Consider our ulcer example from before: perturbations with respect to clinical biomarkers relating to the ulcer abnormality are more meaningful than perturbations of individual pixels. Take, for example, perturbations that cause "more or less inflammation around a suspected ulcer". Knowing that an ulcer is often accompanied by inflammation is important, as a lack of it might incorrectly suggest to the doctor that the suspected ulcer is just intestinal debris stuck to the surface. Such perturbations are not only more meaningful, but every image resulting from them is directly interpretable. This also implies that a more apt baseline would be one that marks the absence of the biomarker (here the ulcer) and not the complete absence of signal. A third limitation of such methods is their single-pointwise mode of explanation [3,1], whereby an explanation for a given image is made in isolation from its locality, i.e., without considering its neighborhood (how the explanation changes as the input changes slightly).
Counterfactual reasoning has gained popularity [6] as a locality-aware explainer that is model-agnostic, i.e., it does not need access to a network's internal mechanisms (gradients, layer activations, etc.). Often these provide causally understandable explanations, which have been argued to be GDPR compliant [23] and help address questions of fairness, trust and robustness [8]. These explanations generate a counterfactual as an alternative scenario with a desirable outcome that counters the observed (real) outcome. As such, they generate explanations through relationships like: "If the ulcer had not been present, this image would not be abnormal." In other words, a counterfactual pinpoints how the input must change to flip the outcome. It is clear how such explanations might seem intuitive and interesting [6] to a doctor in our context. In fact, counterfactual thinking is very natural to how humans reason, especially in response to negative outcomes, in order to prevent them in the future [16].
In vision, explaining an instance with its corresponding counterfactual [8,2] has become common for highlighting the changes that would most easily flip the prediction. In [8], the authors perform minimal edits by swapping regions of a query image with regions from a distractor image until a decision flip occurs. However, the choice of a suitable distractor image is crucial for quick convergence, and this choice can be unintuitive for some domains, such as the medical domain, or when little information is available about the dataset. Further, the image resulting from such edits can be unnatural looking at times and therefore lack explainability. For such explanations to be efficient, the changes applied to the image for a different prediction must be minimal and human interpretable [23]. Alipour et al. [2] use the latent space of a pretrained StyleGAN to retrieve counterfactual latent codes. Their approach is similar to ours in idea but differs in implementation: their method produces causal explanations only, unlike ours, and employs pretrained attribute detectors in latent space that are largely unavailable for medical domains. Recently, semi-factuals have been argued to offer advantages similar to counterfactuals [12]. As opposed to counterfactuals, which propose explanations as an 'if only' clause, semi-factuals propose explanations of the type 'even if', i.e., what changes to the situation would still lead to the same outcome. In our earlier example, a semifactual image might illustrate the inflammatory changes that occur right before an ulcer starts forming, as at this point the doctor will still identify the image as abnormal.
Despite these advantages, one of the biggest challenges in using counterfactual and semi-factual explanations (together referred to as contrastive explanations) lies in generating instances that not only expose realistic and progressive visual changes smoothly (so as to be directly understandable), but also keep the progression aligned with the expected class prediction behaviour (a congruous change in softmax score) [2]. Addressing this need for aligned progression in both the image and the classifier space is precisely the problem we propose to solve in this work. We argue that, in favor of human interpretability and algorithmic transparency, explanations that support both of the aforementioned modes (causal and non-causal) are better than either one alone. We demonstrate the effectiveness of our explanations in the domain of WCE with a focus on Ulcerative Colitis (UC). We use the UC biomarkers used by experts in diagnosis, such as inflammation and ulcerations, as progression attributes to manage counterfactual explanations. The main contributions are:
• a unified framework that generates both causal and non-causal explanations for each decision;
• a method to control progression along a specific UC biomarker such that the counterfactual relationships inferred are causal as opposed to being ad hoc; and
• a formal algorithm to generate saliency maps that are comparable to (or better than) others on the Softmax Information Curve (SIC) metrics (Section 3).

METHODOLOGY
Given an attribute of choice (e.g., a UC biomarker like inflammation, vascular pattern, etc.) and the query image $i^q_a$, the goal is to retrieve the two instances closest to the decision boundary, a semifactual (on the same side as the query) and a counterfactual (on the other side), while preserving visual interpretability along a path of images directed by the attribute (Figure 2). Regions of importance are highlighted by a saliency map, which can be generated for each image on the path, including the query.
Given a classifier $C$ that outputs a label $y \in \{0, 1\}$ through a prediction function $f : \mathbb{R}^n \to [0, 1]$ for an image $x_i \in \mathbb{R}^{512 \times 512}$, an explanation set $X = \{i_{sm}, i_{cf}, i_{sf}\}$ is produced along attribute $a$. Here $i_{sm}$ is the saliency map, $i_{cf}$ is the nearest counterfactual and $i_{sf}$ the semifactual along $a$. We use this set to generate an explanation: "Image $i^q_a$ is abnormal with probability $p$ due to the signs/regions highlighted by the saliency map $i_{sm}$. The least amount of abnormality required for the prediction to be abnormal is seen in $i_{sf}$ (semifactual). However, if the abnormal signs change to those in $i_{cf}$ (counterfactual), the image would no longer be classified as abnormal". Importantly, the changes along the single attribute $a$ are also directly visually interpretable by a user (e.g., a doctor).
Attribute discovery in latent space: We use StyleGAN2 [11] and train it on WCE images (discussed under Dataset and Training Details). StyleGAN2 uses a mapping network between a latent variable and the generator network, $G$, which transforms the latent variable to an intermediate $d$-dimensional space, $W$, of latent vectors, $w \in \mathbb{R}^d$, where style attributes are known to be more amenable to control. We use SeFA [17] for the unsupervised discovery of attributes in the intermediate $W$ space. In the natural image domain, pretrained attribute detectors can be utilized for labeling these attributes; however, for our case of pathological and anatomical variations of the colon, such attribute detectors are not available a priori. We perform clustering on images using t-SNE [21] to isolate attributes relevant to pathological changes. This is done by planting seed images, identified by a doctor as good representatives of UC pathological changes, before clustering. Upon clustering, we sampled the attributes closest to the seed images and used these as explanation attributes.

Generating the explanation set $X$: Once relevant attributes are identified, to explain a query $i^q_a$ with latent $w^q_a \in \mathbb{R}^d$ such that $i^q_a = G(w^q_a)$ along attribute $a$, a set of $k$ local images $i_a = \{i^1_a, i^2_a, \dots, i^k_a\}$ is created from latents $w_a = \{w^1_a, w^2_a, \dots, w^k_a\}$, where $w^j_a = w^q_a - \alpha_j \cdot a$, $\alpha_j$ varies linearly in $[A, B]$, and $a \in \mathbb{R}^{512}$ is the aforementioned attribute vector. In Figure 3, attribute $a$ corresponds to (reddish) inflammatory regions, and the set $i_a$ can be understood as images with decreasing severity of such inflammation as $\alpha$ progresses from $A = 0$ to $B = 30$. The elements $i_{cf}$ and $i_{sf}$ of $X$ are retrieved based on the classifier output for $i_a$, such that $i_{cf} = \arg\max_{i^j_a} \sigma(C(i^j_a))$ over all $j$ with $\sigma(C(i^j_a)) < 0.5$, and $i_{sf} = \arg\min_{i^j_a} \sigma(C(i^j_a))$ over all $j$ with $\sigma(C(i^j_a)) > 0.5$, where $\sigma$ is the softmax function.

For the saliency map, to avoid the shortcomings observed in previous literature, we use the latent space to curate a neighborhood such that every image in the neighborhood of a query varies only along the chosen attribute. In other words, the pixel changes that occur in this neighborhood are neither uniform nor content blind [10,20], but targeted towards those pixels that most strongly affect the attribute/biomarker. We use directional derivatives in $i_a$ along attribute $a$ to identify these regions, and weight them based on semantic similarity with $i^q_a$ to generate the saliency map. The directional derivatives $\mathrm{Diff}(i^q_a, i^j_a)$ between the query and each $i^j_a = G(w^j_a)$, $j = 1, \dots, k$, expose pixels with consistent change in the direction of increasing/decreasing attribute (see Figure 1); in other words, they are a measure of semantic similarity to the image being explained. We use these derivatives to measure the contribution of each image in $i_a$. A formal algorithm is described in Algorithm 1.
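The latent traversal and the retrieval of the semifactual and counterfactual can be sketched as follows. This is a minimal illustration using NumPy only: `toy_abnormal_probability` is a hypothetical stand-in for $\sigma(C(G(w)))$ (no StyleGAN2 generator or classifier is involved), and the function names, `k = 16`, and the randomly drawn attribute vector are assumptions made for the example.

```python
import numpy as np

def latent_path(w_q, a, A=0.0, B=30.0, k=16):
    """Return k latents w_j = w_q - alpha_j * a with alpha_j linear in [A, B]."""
    alphas = np.linspace(A, B, k)
    return alphas, w_q[None, :] - alphas[:, None] * a[None, :]

def select_contrastive(probs, threshold=0.5):
    """Pick (semifactual, counterfactual) indices from abnormal-class probabilities.

    Semifactual: lowest probability still above the threshold.
    Counterfactual: highest probability below the threshold.
    Either index is None when no image lies on that side of the boundary.
    """
    probs = np.asarray(probs, dtype=float)
    above = np.flatnonzero(probs > threshold)
    below = np.flatnonzero(probs < threshold)
    sf = int(above[np.argmin(probs[above])]) if above.size else None
    cf = int(below[np.argmax(probs[below])]) if below.size else None
    return sf, cf

def toy_abnormal_probability(latents, a):
    """Hypothetical stand-in for sigma(C(G(w))): depends only on the projection onto a."""
    return 1.0 / (1.0 + np.exp(-0.5 * (latents @ a)))

rng = np.random.default_rng(0)
a = rng.normal(size=512)
a /= np.linalg.norm(a)                      # unit-norm attribute direction (assumption)
w_q = rng.normal(size=512)
w_q = w_q + (3.0 - w_q @ a) * a             # place the query on the "abnormal" side

alphas, latents = latent_path(w_q, a)
probs = toy_abnormal_probability(latents, a)
sf, cf = select_contrastive(probs)
```

With these toy settings the probability decreases monotonically along the path, so the semifactual and counterfactual are adjacent images on either side of the 0.5 threshold. With a real generator and classifier the probabilities need not be monotone, which is why the selection is written over the whole set rather than as a first-crossing search.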
Dataset and Training Details: The dataset consists of approximately 200k unlabeled WCE images. The majority of the images come from WCE examinations of 10 patients with varying UC activity, as well as other pathologies, acquired with the PillCam Colon 2 capsule (Medtronic). The images are 576x576 in resolution with varying degrees of bowel cleanliness. In addition, we use the PS-DeVCEM dataset [22] with 80k images of the same capsule modality. The remaining images come from the OSF-Kvasir dataset [18], with 3478 images from seven classes taken with the Olympus EC-S10 capsule. We use StyleGAN2 without progressive growing and work exclusively on the original intermediate latent space $W$ and not the extended space $W^+$. The model was trained on twin Titan RTX GPUs for 30 days.

Algorithm 1: Saliency generation
Input: Classifier $C$, query $i^q_a$; $i_a = \{i^1_a, i^2_a, \dots, i^k_a, i^q_a\}$
Output: $i_{sm}$
foreach $i^q_a$ do
    // predict output class probabilities for $i_a$
    output ← $C(i_a)$
    // backpropagate and collect gradients w.r.t. $i_a$
    $[grad^1_a, grad^2_a, \dots, grad^q_a]$ ← $i_a$.grad()
    // directional derivatives along attribute $a$
    foreach [...]
    $i_{sm}$ ← meanThresholding($S(i_q, a)$)
end
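One possible reading of the saliency step can be sketched in NumPy as below. Several choices here are assumptions rather than details given in the text: the pixel difference between a neighbor and the query is used as a finite-difference proxy for the directional derivative, neighbors are weighted by a simple inverse-distance similarity, and `mean_threshold` keeps only values at or above the map's mean. The gradients are random placeholders rather than gradients of a trained classifier.

```python
import numpy as np

def mean_threshold(s):
    """Zero out entries below the map's mean (one reading of 'meanThresholding')."""
    return np.where(s >= s.mean(), s, 0.0)

def attribute_saliency(query, neighbors, grads):
    """Aggregate a saliency map over a neighborhood that varies along one attribute.

    query:     (H, W) image being explained
    neighbors: (k, H, W) images generated along the attribute
    grads:     (k, H, W) gradients of the class score w.r.t. each neighbor
    """
    diffs = neighbors - query[None]                 # finite-difference proxy for Diff
    dist = np.abs(diffs).mean(axis=(1, 2))          # how far each neighbor moved
    weights = 1.0 / (1.0 + dist)                    # assumed similarity weighting
    weights = weights / weights.sum()
    contrib = np.abs(grads * diffs)                 # gradient restricted to changed pixels
    s = np.tensordot(weights, contrib, axes=1)      # weighted sum over neighbors -> (H, W)
    return mean_threshold(s)

rng = np.random.default_rng(0)
query = rng.uniform(size=(8, 8))
patch = np.zeros((8, 8))
patch[2:5, 2:5] = 1.0                               # region affected by the toy attribute
step_sizes = np.linspace(0.1, 0.5, 5)
neighbors = np.stack([query - s * patch for s in step_sizes])
grads = rng.normal(size=neighbors.shape)            # placeholder gradients
saliency = attribute_saliency(query, neighbors, grads)
```

Because the toy neighbors differ from the query only inside `patch`, the resulting map is zero everywhere else, regardless of how large the placeholder gradients are outside the patch. This mirrors the stated intent that irrelevant pixels should not contribute.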

RESULTS
Qualitative Comparison: GuidedIG [10] produces noisy saliency maps on our data. At each step it moves only pixels with low partial derivatives towards their original intensity, so as to avoid high-gradient regions and thus abnormal behavior; but when the pixels affecting the decision are not localized but spread globally across the image (as in WCE), the resulting saliency map can appear noisier. Similarly, while SmoothGrad [19] captures the right regions, its saliency maps are noisy overall. Integrated Gradients [20] correlates very closely with our maps. Figure 5 shows $X$ for various query images.

Fig. 4. Qualitative comparison of saliency maps between our approach and other approaches. Panels: Query, SmoothGrad [19], GuidedIG (Guided Integrated Gradients) [10], IG (Integrated Gradients) [20], Ours.

Quantitative Comparison: We use the area under the Softmax Information Curve (SIC AUC) [9] for quantitative comparison. SIC AUC measures the softmax score of a model against the salient regions indicated by the saliency map. Figure 6 shows the SIC AUC for different approaches averaged over 50 images. Integrated Gradients achieves the best score, followed by our approach. We suspect this to be due to the SIC score's preference for the smallest regions of effect (as in IG) over identifying all contributing regions (as in ours).

To the best of our knowledge, this is one of the first works to propose a single framework for generating both causal and non-causal explanations for deep learning based models.
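The idea behind this kind of evaluation can be sketched with a simplified reveal curve. Note the assumptions: the curve below uses the fraction of revealed pixels on the x-axis, which is a simplification (the SIC metric of [9] is, to our understanding, defined over an estimate of image information rather than a raw pixel fraction), the baseline is an all-zero image rather than a blurred one, and `toy_score` is a hypothetical classifier used only to make the example runnable.

```python
import numpy as np

def reveal_curve(image, baseline, saliency, score_fn, fractions):
    """Normalized score as the most salient pixels are copied from image into baseline."""
    order = np.argsort(saliency, axis=None)[::-1]      # most salient pixels first
    full = score_fn(image)
    scores = []
    for f in fractions:
        n = int(round(f * order.size))
        mixed = baseline.copy().ravel()
        mixed[order[:n]] = image.ravel()[order[:n]]
        scores.append(score_fn(mixed.reshape(image.shape)) / full)
    return np.clip(np.array(scores), 0.0, 1.0)

def auc(x, y):
    """Trapezoidal area under the curve."""
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

weight_map = np.zeros((8, 8))
weight_map[2:5, 2:5] = 1.0                             # pixels the toy model relies on

def toy_score(img):
    """Hypothetical classifier: logistic score driven by the 3x3 patch only."""
    z = 4.0 * (weight_map * img).sum() / 9.0 - 2.0
    return 1.0 / (1.0 + np.exp(-z))

image = np.full((8, 8), 0.2)
image[2:5, 2:5] = 1.0
baseline = np.zeros((8, 8))
fractions = np.linspace(0.0, 1.0, 11)

curve_good = reveal_curve(image, baseline, weight_map, toy_score, fractions)
curve_bad = reveal_curve(image, baseline, 1.0 - weight_map, toy_score, fractions)
auc_good, auc_bad = auc(fractions, curve_good), auc(fractions, curve_bad)
```

A map that ranks the decisive patch first recovers the full score after revealing only a small fraction of pixels, so its area under the curve is larger than that of a map highlighting the background. This also illustrates why compact attributions tend to be rewarded by such a metric.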

Fig. 1. The neighborhood $i_a$ and the corresponding directional derivatives. The derivatives expose the semantic similarity between the query and its neighbors. We use this similarity to weigh the contribution of each neighbor towards the saliency map.

Fig. 2. The approach explains a query image along the ulcer attribute path, together with a semifactual and a counterfactual along the same path. Here the query exhibits an abnormality with inflammation. Even with inflammation reduced to the level in (c), the prediction would still be abnormal (semifactual). However, if the visual signs changed from those in (c) to those in (b), the prediction would be normal (counterfactual).

Fig. 3. Images in $i_a$ along attribute $a$. The top left corner of each image shows the softmax score. Notice how, apart from the affected region (for attribute $a$), other regions in the image undergo only minimal changes. As a result, the generated explanations are consistent in the locality of a query.

Fig. 5. The explanation set $X$ generated with this approach for different query images (column 3). Best viewed in color.