<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Tova A</title>
    <description>The latest articles on Forem by Tova A (@tova501).</description>
    <link>https://forem.com/tova501</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3647054%2F23507e95-9b21-4581-9527-ab073e2fb1c9.png</url>
      <title>Forem: Tova A</title>
      <link>https://forem.com/tova501</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/tova501"/>
    <language>en</language>
    <item>
      <title>Cleaning Up Complexity: Preprocessing Attribution Maps for Better Evaluation</title>
      <dc:creator>Tova A</dc:creator>
      <pubDate>Tue, 10 Feb 2026 11:33:26 +0000</pubDate>
      <link>https://forem.com/tova501/cleaning-up-complexity-preprocessing-attribution-maps-for-better-evaluation-6oi</link>
      <guid>https://forem.com/tova501/cleaning-up-complexity-preprocessing-attribution-maps-for-better-evaluation-6oi</guid>
      <description>&lt;p&gt;I wanted to compare attribution maps from different XAI methods for vision models, using the Complexity metric from the Quantus library.&lt;br&gt;&lt;br&gt;
The idea was simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If a heatmap looks clean and focused, it should have lower complexity than a noisy, scattered one.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In practice, that’s not what happened.&lt;br&gt;
Some maps that were visually sharp and localised got &lt;strong&gt;high&lt;/strong&gt; (bad) Complexity scores.&lt;br&gt;
Other maps that looked messy or stretched over the whole image got &lt;strong&gt;surprisingly low&lt;/strong&gt; scores.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnkd6pgrzmjvt8nvh6nzw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnkd6pgrzmjvt8nvh6nzw.png" alt="Heatmaps Comparison" width="800" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the left is &lt;strong&gt;Guided Backprop&lt;/strong&gt;, which spreads activation all over the image.&lt;br&gt;
On the right is &lt;strong&gt;Fusion Grad&lt;/strong&gt;, which is much more sparse and focused on the relevant structures.&lt;br&gt;
But in our initial setup, the Quantus &lt;strong&gt;Complexity&lt;/strong&gt; metric actually gave Fusion Grad a &lt;em&gt;worse&lt;/em&gt; (higher) complexity score than Guided Backprop – a clear mismatch between what we see and what the metric reports.&lt;/p&gt;

&lt;p&gt;The metric was doing exactly what it was defined to do — but it was reacting to things like scale, padding, resolution, and sign conventions, not just to the “shape” of the explanation.&lt;/p&gt;

&lt;p&gt;That’s when it became clear: before evaluating attribution maps, you need to &lt;strong&gt;standardise them&lt;/strong&gt;. Otherwise, you’re mostly comparing formatting differences between methods, not their actual behaviour.&lt;/p&gt;

&lt;p&gt;In this post, I’ll show how I preprocess raw attribution maps into a canonical, evaluation-ready form before passing them to Quantus metrics.&lt;/p&gt;

&lt;p&gt;At first I tried to “fix” this by using Quantus’s built-in &lt;code&gt;normalise_func&lt;/code&gt;, but it didn’t change the ranking in a meaningful way.&lt;br&gt;
The real issue wasn’t the overall scale – it was the &lt;strong&gt;pedestal&lt;/strong&gt;:&lt;br&gt;
both methods produced a low but non-zero activation almost everywhere in the image.&lt;br&gt;
Guided Backprop had a noisy background plus a pedestal, while Fusion Grad had a very thin, sharp signal &lt;strong&gt;on top of its own pedestal&lt;/strong&gt;.&lt;br&gt;
Complexity only sees “how much structure lives above zero”.&lt;br&gt;&lt;br&gt;
If you keep the pedestal, Fusion Grad’s thin signal sits on a wide plateau and ends up looking &lt;em&gt;more&lt;/em&gt; complex numerically than the noisier Guided Backprop map.&lt;/p&gt;
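&lt;p&gt;To see why a pedestal inflates the score, here is a tiny NumPy sketch (toy data, not our real maps). Quantus’s Complexity, following Bhatt et al. (2020), is essentially the Shannon entropy of each pixel’s fractional contribution to the total attribution mass:&lt;/p&gt;

```python
import numpy as np

def entropy_complexity(attr_map):
    # Shannon entropy of each pixel's fractional contribution to the
    # total attribution mass (the idea behind the Complexity metric).
    p = np.abs(attr_map).ravel()
    p = p / p.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# Toy 32x32 map: a thin, sharp signal ...
sparse = np.zeros((32, 32))
sparse[15:17, :] = 1.0                 # 64 "hot" pixels, everything else zero

# ... versus the exact same signal sitting on a small pedestal.
pedestal = sparse + 0.05

print(entropy_complexity(sparse))      # low: mass concentrated on the line
print(entropy_complexity(pedestal))    # higher: the pedestal spreads the mass
```

&lt;p&gt;The two maps are visually almost identical, yet the pedestal version is numerically more complex, which is exactly the mismatch we saw between Guided Backprop and Fusion Grad.&lt;/p&gt;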

&lt;p&gt;That’s why the next step was not “better normalisation”, but &lt;strong&gt;explicitly removing or reducing the pedestal&lt;/strong&gt; before computing Complexity.&lt;/p&gt;
&lt;h2&gt;
  
  
  Baseline-Subtraction Normalization
&lt;/h2&gt;

&lt;p&gt;Instead of relying on the default &lt;code&gt;normalise_func&lt;/code&gt;, I implemented a custom one that does two things per attribution map:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Baseline removal (pedestal):&lt;/strong&gt;&lt;br&gt;
Compute a low percentile (for example, the 20th percentile) and treat it as a baseline.&lt;br&gt;
Subtract this baseline from all values and clamp negatives to zero. This removes the global “pedestal” while keeping the meaningful peaks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;0–1 normalisation:&lt;/strong&gt;&lt;br&gt;
After baseline removal, rescale the map to the [0, 1] range so that Complexity sees something closer to a probability distribution per sample, instead of raw arbitrary units.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

def baseline_subtraction_norm(attr_map: np.ndarray,
                              baseline_quantile: float = 0.2) -&amp;gt; np.ndarray:
    """
    Normalize an attribution map for evaluation:
    1) subtract a low quantile as baseline (pedestal removal),
    2) clamp to &amp;gt;= 0,
    3) rescale to [0, 1].
    """
    # 1. pedestal removal
    baseline = np.quantile(attr_map, baseline_quantile)
    x = attr_map - baseline
    x = np.clip(x, a_min=0.0, a_max=None)

    # 2. scale to [0, 1]
    max_val = x.max()
    if max_val &amp;gt; 0:
        x = x / max_val
    return x

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;You can then simply use the Quantus Complexity metric with your custom &lt;code&gt;normalise_func&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import quantus

complexity_metric = quantus.Complexity(
    abs=True,
    normalise=True,
    normalise_func=baseline_subtraction_norm,
)

scores = complexity_metric(
    model=model,
    x_batch=x_batch,      # input images
    y_batch=y_batch,      # targets
    a_batch=attr_maps,    # attribution maps
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Below you can see the value distribution of Fusion Grad before and after pedestal removal.&lt;br&gt;
After subtracting the baseline, most background pixels are exactly zero, and the Complexity metric reacts much more to the actual structure around the defect line and contact.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxd7u42ye8puv7xr4fql.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxd7u42ye8puv7xr4fql.png" alt="Distributions of fusion_grad heatmap, before vs after normalization" width="774" height="288"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Best practice&lt;/strong&gt;&lt;br&gt;
Before applying quantitative metrics to attribution maps, make preprocessing explicit and consistent. Remove method-specific pedestals, standardize sign conventions, and rescale per sample. Otherwise, metrics like Complexity primarily measure implementation artefacts (background mass, padding, resolution) rather than explanatory structure.&lt;/p&gt;
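&lt;p&gt;As a sketch of what “explicit and consistent” can look like (function name and defaults are mine, not a standard API), the whole checklist fits in one per-sample function:&lt;/p&gt;

```python
import numpy as np

def canonicalise(attr_map, signed=False, baseline_quantile=0.2):
    # 1. Standardise the sign convention: by default keep magnitudes only,
    #    so different methods' positive/negative conventions match.
    x = attr_map if signed else np.abs(attr_map)
    # 2. Remove the method-specific pedestal (low-quantile baseline).
    x = np.clip(x - np.quantile(x, baseline_quantile), 0.0, None)
    # 3. Rescale per sample to [0, 1].
    m = x.max()
    return x / m if m > 0 else x
```

&lt;p&gt;Running every method’s raw maps through one function like this, before any metric, is what makes the resulting scores comparable at all.&lt;/p&gt;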

</description>
      <category>explainableai</category>
      <category>ai</category>
      <category>analytics</category>
      <category>python</category>
    </item>
    <item>
      <title>Turning Any Model into an XAI-Ready Model: Formats and Gradient Flow</title>
      <dc:creator>Tova A</dc:creator>
      <pubDate>Tue, 10 Feb 2026 11:32:50 +0000</pubDate>
      <link>https://forem.com/tova501/turning-any-model-into-an-xai-ready-model-formats-and-gradient-flow-4a9j</link>
      <guid>https://forem.com/tova501/turning-any-model-into-an-xai-ready-model-formats-and-gradient-flow-4a9j</guid>
      <description>&lt;p&gt;This post is based on work done during a joint &lt;strong&gt;Applied Materials&lt;/strong&gt; and &lt;strong&gt;Extra-Tech&lt;/strong&gt; bootcamp, where I built an XAI platform. &lt;br&gt;
I’d like to thank &lt;strong&gt;Shmuel Fine&lt;/strong&gt; (team leader) and &lt;strong&gt;Odeliah Movadat&lt;/strong&gt; (mentor) for their guidance and support throughout the project.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why Gradient-Based XAI Sometimes “Works” but Tells You Nothing
&lt;/h2&gt;

&lt;p&gt;Gradient-based explainability methods (Grad-CAM, Guided Backprop, Integrated Gradients, etc.) are everywhere. &lt;br&gt;
In tutorials, you call a function, get a pretty heatmap, and move on. &lt;br&gt;
In a real project, it’s different. &lt;br&gt;
I was building an internal XAI platform that needed to work across: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Different ML frameworks (PyTorch, TensorFlow) &lt;/li&gt;
&lt;li&gt;Different vision tasks (classification, segmentation, regression, and custom industrial models) &lt;/li&gt;
&lt;li&gt;Different stored model formats and exports collected over time
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In theory, any gradient-based method should “just work” on top of these models. &lt;br&gt;
In practice, once we started running them, things got messy. Sometimes we got blank or obviously wrong heatmaps with no warning. Other times it failed loudly with: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;RuntimeError: element 0 of variables does not require grad and does not have a grad_fn &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The forward pass was correct, the predictions made sense — but the value we were backpropagating from was no longer connected to the gradient graph. &lt;/p&gt;

&lt;p&gt;That’s when it became clear: the main problem wasn’t any specific XAI algorithm, but the combination of model formats, conversions, and gradient flow.&lt;br&gt;
In other words, not every model file we could load was actually explainable or measurable. &lt;/p&gt;
&lt;h2&gt;
  
  
  What an XAI-Ready Model Actually Needs
&lt;/h2&gt;

&lt;p&gt;Very quickly, here’s the minimum a model needs in order to produce meaningful, measurable gradient-based explanations: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gradients that actually exist.&lt;/strong&gt; When we choose a scalar score (for example, a class logit) and call &lt;code&gt;backward()&lt;/code&gt;, the gradient must flow back to what we care about – either the input image or some internal layer. 
If that path is broken, the explainer can still return a heatmap-shaped tensor, but it’s not telling us anything real.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A score that stays inside the graph.&lt;/strong&gt; The value we backpropagate from has to be a tensor that’s still part of the computation graph. If it was turned into a Python number, passed through NumPy, or detached along the way, we’ve already lost the information XAI needs. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Access to internal features for CAM-style methods.&lt;/strong&gt; For methods like Grad-CAM, we also need a way to read activations and gradients from a chosen internal layer – but that comes after the basic gradient path is in place. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post is about how to work with real-world models &lt;strong&gt;without breaking these requirements&lt;/strong&gt;. &lt;/p&gt;
&lt;h2&gt;
  
  
  How Common Formats Behaved in Our Platform
&lt;/h2&gt;

&lt;p&gt;Once we knew what an XAI-ready model needs, we looked at what we actually had: ONNX exports, TorchScript files, and some legacy TensorFlow models. All of them were fine for inference. For gradient-based XAI, the picture was very different. &lt;/p&gt;

&lt;p&gt;This complexity really shows up when you’re not just loading your &lt;em&gt;own&lt;/em&gt; training code, but building a platform that has to accept &lt;strong&gt;unknown model architectures&lt;/strong&gt; from different teams.&lt;br&gt;
If you control the architecture, you can usually rebuild a clean eager model and just load a state_dict. If all you get is a stored artifact (ONNX, TorchScript, legacy TF graph), then the &lt;strong&gt;format itself&lt;/strong&gt; decides how much structure and gradient information you still have. That’s exactly the situation we were in. &lt;/p&gt;
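&lt;p&gt;For completeness, the “you control the architecture” path looks roughly like this (the tiny model and weight handling are illustrative, not our production code):&lt;/p&gt;

```python
import torch

class TinyNet(torch.nn.Module):
    # Stand-in architecture; in the real case this is your own training code.
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

# Rebuild a clean eager model, then load *weights only* into it.
# (state stands in for torch.load("weights.pt") of a saved state_dict.)
model = TinyNet()
state = TinyNet().state_dict()
model.load_state_dict(state)
model.eval()

# The result is a plain nn.Module: gradients and hooks both work.
x = torch.randn(1, 4, requires_grad=True)
model(x).sum().backward()
print(x.grad.shape)
```

&lt;p&gt;Because only tensors are deserialized, nothing about the graph or the module structure is lost along the way.&lt;/p&gt;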
&lt;h3&gt;
  
  
  ONNX – Great for Inference, Not for Gradients
&lt;/h3&gt;

&lt;p&gt;We had models already deployed as ONNX. It was tempting to reuse them for Grad-CAM and Integrated Gradients. In practice, ONNX runtimes are optimised for &lt;strong&gt;forward passes&lt;/strong&gt;, not for autograd: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You get fast, correct predictions. &lt;/li&gt;
&lt;li&gt;You don’t get a PyTorch/TF-style gradient graph or easy hooks into internal layers. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So ONNX became our conclusion: &lt;strong&gt;perfect for deployment, but not a reliable source for gradient-based explanations or metrics&lt;/strong&gt;. For XAI, we need the original framework model, not just the ONNX file. &lt;/p&gt;
&lt;h3&gt;
  
  
  TorchScript – fine for simple gradients, fragile for CAM-style methods
&lt;/h3&gt;

&lt;p&gt;In our experience, TorchScript models do support gradients: if the export wasn’t heavily frozen or over-optimised, we can reliably backpropagate from a scalar score to the input and obtain meaningful gradient-based heatmaps.&lt;/p&gt;

&lt;p&gt;The problems appear when CAM-style methods require access to internal convolutional features. Some TorchScript exports fuse layers, inline modules, or alter module boundaries, so the convolutional blocks we want to hook are no longer explicitly exposed. In these cases, forward and backward hooks become fragile, and optimisation steps can effectively make internal activations inaccessible even though gradients still exist.&lt;/p&gt;

&lt;p&gt;Because of this, we treat TorchScript as acceptable for &lt;strong&gt;gradient-w.r.t-input&lt;/strong&gt; methods, but for CAM-style explainers we require the original eager &lt;code&gt;nn.Module&lt;/code&gt;, where internal layers remain cleanly and reliably accessible.&lt;/p&gt;
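&lt;p&gt;On an eager &lt;code&gt;nn.Module&lt;/code&gt;, the access CAM-style methods need is just a forward hook plus a gradient hook on the chosen layer. A minimal sketch with a toy model (not one of our real architectures):&lt;/p&gt;

```python
import torch
import torch.nn as nn

# Toy model: conv features followed by a pooled linear head.
model = nn.Sequential(
    nn.Conv2d(1, 4, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 2),
)

feats = {}

def save_activation(module, inputs, output):
    # Keep the activation tensor (still in the graph) and attach a hook
    # that captures the gradient flowing back through it.
    feats["act"] = output
    output.register_hook(lambda g: feats.__setitem__("grad", g))

handle = model[0].register_forward_hook(save_activation)

x = torch.randn(1, 1, 8, 8, requires_grad=True)
score = model(x)[:, 0].sum()
score.backward()
handle.remove()

print(feats["act"].shape, feats["grad"].shape)  # both torch.Size([1, 4, 8, 8])
```

&lt;p&gt;In a fused or inlined TorchScript export, the equivalent of &lt;code&gt;model[0]&lt;/code&gt; may simply not exist as an addressable module, which is exactly why we fall back to the eager model for CAM-style explainers.&lt;/p&gt;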
&lt;h3&gt;
  
  
  Native PyTorch / TF – Our XAI-Ready Baseline
&lt;/h3&gt;

&lt;p&gt;After this, we decided that for explainability the “ground truth” formats are: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PyTorch &lt;code&gt;nn.Module&lt;/code&gt; in eager mode &lt;/li&gt;
&lt;li&gt;TF2/Keras models or SavedModels that work cleanly with &lt;code&gt;GradientTape&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All other artifacts (ONNX, unknown TorchScript, legacy TF graphs) are welcome for inference, but we don’t assume they are explainable until proven otherwise. &lt;/p&gt;

&lt;p&gt;We also learned that simply “converting back” from a non-differentiation-friendly format doesn’t magically fix things. &lt;br&gt;
You can end up with a PyTorch &lt;code&gt;nn.Module&lt;/code&gt; or a TF2 SavedModel that &lt;em&gt;looks&lt;/em&gt; clean, but was rebuilt from ONNX or an old TF1 graph using a script full of &lt;code&gt;.numpy()&lt;/code&gt; calls and manual tensor operations. &lt;br&gt;
On paper the format is now “good”, but the gradient path is still broken. &lt;br&gt;
For a deeper dive into how we converted legacy models without breaking gradients, see the companion post.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model format&lt;/th&gt;
&lt;th&gt;Gradient support&lt;/th&gt;
&lt;th&gt;CAM compatibility&lt;/th&gt;
&lt;th&gt;Recommended usage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ONNX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No autograd graph&lt;/td&gt;
&lt;td&gt;❌ Not supported&lt;/td&gt;
&lt;td&gt;Inference / deployment only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TorchScript&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Full (if loaded correctly)&lt;/td&gt;
&lt;td&gt;⚠️ Fragile (layers may be fused or hidden)&lt;/td&gt;
&lt;td&gt;Simple gradient methods when eager model is unavailable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Native PyTorch (eager)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Full&lt;/td&gt;
&lt;td&gt;✅ Full&lt;/td&gt;
&lt;td&gt;Gradient-based XAI and quantitative metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Native TF2 / Keras&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Full (&lt;code&gt;GradientTape&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;✅ Full&lt;/td&gt;
&lt;td&gt;Gradient-based XAI and quantitative metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Converted models (from ONNX / TF1)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Depends on conversion&lt;/td&gt;
&lt;td&gt;⚠️ Works if gradients are supported&lt;/td&gt;
&lt;td&gt;Treat as inference-only unless gradients are explicitly verified&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  Don’t Break Gradients in Your Own Code
&lt;/h2&gt;

&lt;p&gt;Even with an XAI-ready format, it’s still easy to kill gradients in the parts we control: preprocessing, adapters, and forward passes. We saw a few recurring “self-inflicted” problems: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wrapping the whole explanation call in &lt;code&gt;torch.no_grad()&lt;/code&gt; &lt;/li&gt;
&lt;li&gt;Calling &lt;code&gt;.detach()&lt;/code&gt; on tensors that are still needed for the XAI score &lt;/li&gt;
&lt;li&gt;Converting tensors to NumPy (&lt;code&gt;.cpu().numpy()&lt;/code&gt;) and then using those values to compute the score &lt;/li&gt;
&lt;li&gt;Using &lt;code&gt;.item()&lt;/code&gt; on logits and doing the rest of the logic in pure Python&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these are harmless for plain inference, but they quietly break the gradient path that explanations rely on. &lt;br&gt;
Our rule of thumb became: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The main path from &lt;strong&gt;input → model → XAI score&lt;/strong&gt; must stay inside the framework’s autograd system. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If we really need to log or serialize something, we do it &lt;em&gt;after&lt;/em&gt; we’ve computed the score we’ll backpropagate from. &lt;/p&gt;
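&lt;p&gt;The difference is easy to demonstrate in a few lines (toy model; &lt;code&gt;target_class&lt;/code&gt; is a stand-in index):&lt;/p&gt;

```python
import torch

# Toy model, not from the original post.
model = torch.nn.Linear(4, 3)
x = torch.randn(1, 4, requires_grad=True)
target_class = 1

logits = model(x)

# Broken: .item() returns a plain Python float, outside the autograd graph.
bad_score = logits[0, target_class].item()

# Intact: keep the score as a tensor, backpropagate, log only afterwards.
score = logits[0, target_class]
score.backward()
print("logged after backward:", score.item(), x.grad is not None)
```

&lt;p&gt;Both branches compute the same number; only the second one leaves anything for the explainer to differentiate.&lt;/p&gt;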
&lt;h3&gt;
  
  
  Bonus: A Tiny Sanity Check for New Models/Adapters
&lt;/h3&gt;

&lt;p&gt;Whenever we plug a new model or adapter into the platform, we run a quick check: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load the model in an XAI-ready format (eager PyTorch / TF2). &lt;/li&gt;
&lt;li&gt;Pick a simple score (for example, one class logit). &lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;backward()&lt;/code&gt;. &lt;/li&gt;
&lt;li&gt;Verify that gradients at the input (or at the adapter boundary) are non-zero. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In PyTorch, this is just a few lines of code. If this fails with &lt;code&gt;does not have a grad_fn&lt;/code&gt; or always gives zero gradients, we usually don’t look at the explainer first – we look at the model format or the forward path we’ve built around it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import torch

model = torch.load("model.pt", map_location="cpu", weights_only=False)
model.eval()

x = sample_image.to("cpu")
x.requires_grad_(True)

logits = model(x)
score = logits[:, target_class].mean()

model.zero_grad()
score.backward()

print("Grad norm on input:", x.grad.norm().item())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this prints a reasonable, non-zero gradient norm, the model is at least technically explainable for gradient-w.r.t-input methods.&lt;/p&gt;

&lt;p&gt;In practice, this became our first filter: if the model is fully differentiable, we keep going with gradient-based explanations and metrics. &lt;br&gt;
If not, we still allow inference – but we deliberately treat that model as &lt;strong&gt;inference-only, not explainable.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>explainableai</category>
      <category>captum</category>
    </item>
  </channel>
</rss>
