Among machine learning (ML) techniques, deep neural networks have become crucial components of applications such as image classification, audio recognition, and natural language processing (NLP). In many cases, these models reach accuracy on par with human performance. As a result, techniques for evaluating and understanding what a model has learned have become an essential component of a thorough validation process. In practice, it is critical to ensure that measured accuracy stems from an appropriate problem representation rather than from exploiting data artifacts.
Facebook AI has released a new version of Captum, a powerful, user-friendly model interpretability library for PyTorch. With Captum, researchers and engineers can swiftly design, develop, and debug advanced AI models. It also helps users understand how their AI models work, so they can assess whether the models reflect their values and provide accurate predictions that meet the needs of their organizations.
Version 0.4 introduces new tools for evaluating model robustness, new attribution methods, and enhancements to existing attribution methods.
Removing Biases With Concept-Based Interpretability
Concept activation vectors (CAVs) are widely used to explain a neural network’s internal state by linking model predictions with concepts that people can understand (such as “apron,” “cafe,” and so on).
Testing with concept activation vectors (TCAV) is now available in Captum 0.4, allowing researchers and engineers to see how different user-defined concepts affect a model’s predictions. TCAV can also be used to check for algorithmic and label bias in fairness analyses.
TCAV goes beyond currently available attribution methodologies, allowing researchers and engineers to quantify the importance of numerous inputs and the impact of concepts like gender and race on a model’s prediction. TCAV has been implemented generically in Captum 0.4, allowing users to define custom concepts with sample inputs for various modalities, such as vision and text.
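To make the idea concrete, here is a minimal from-scratch sketch of the TCAV computation in plain PyTorch (not Captum’s own API): a linear classifier is trained to separate concept activations from random counterexamples, its unit normal vector serves as the CAV, and the TCAV score is the fraction of inputs whose output increases along that direction. The tiny model and synthetic data are illustrative assumptions.

```python
import torch

# Hypothetical tiny network: `layer` produces the hidden activations we probe.
torch.manual_seed(0)
layer = torch.nn.Linear(8, 4)          # layer whose activations we inspect
head = torch.nn.Linear(4, 1)           # rest of the model (scalar logit)

# Step 1: learn a CAV -- a direction in activation space separating
# concept examples from random counterexamples (synthetic data here).
concept_acts = torch.relu(layer(torch.randn(64, 8) + 1.0)).detach()
random_acts = torch.relu(layer(torch.randn(64, 8) - 1.0)).detach()
w = torch.zeros(4, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([w, b], lr=0.1)
X = torch.cat([concept_acts, random_acts])
y = torch.cat([torch.ones(64), torch.zeros(64)])
for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.binary_cross_entropy_with_logits(X @ w + b, y)
    loss.backward()
    opt.step()
cav = (w / w.norm()).detach()          # unit normal of the separating plane

# Step 2: TCAV score -- fraction of inputs whose prediction increases when
# the layer activation moves along the CAV (directional derivative > 0).
inputs = torch.randn(32, 8)
acts = torch.relu(layer(inputs)).detach().requires_grad_(True)
head(acts).sum().backward()
tcav_score = (acts.grad @ cav > 0).float().mean().item()
print(f"TCAV score: {tcav_score:.2f}")
```

Captum’s implementation wraps these steps behind its concept interface, handling concept datasets, layer selection, and classifier training for you.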
The researchers used a sensitivity analysis methodology described in one of Captum’s tutorials to illustrate the distributions of TCAV scores. They used a list of movie reviews with positive sentiment as their data set. The findings show how important positive adjectives are in predicting positive sentiment: across all five neutral concept sets, the positive-adjectives concept was substantially more relevant for both convolutional layers.
Building Robust AI Models
Deep learning approaches can be affected by adversarial inputs that can fool an AI model yet go unnoticed by humans. Captum 0.4 offers robustness tooling to aid in a better understanding of a model’s limitations and vulnerabilities. Under prescribed parameters, a robust AI system should consistently produce safe and reliable results, and it should respond to unexpected situations by making the adjustments needed to avoid harming or negatively affecting people.
The library also provides novel model robustness tools, such as adversarial attack implementations (the fast gradient sign method and projected gradient descent) and robustness metrics to assess the impact of various attacks on a model.
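The fast gradient sign method can be sketched in a few lines of plain PyTorch (shown here from scratch rather than via Captum’s wrappers): take one gradient step of the loss with respect to the input and move each input element by exactly epsilon in the direction that increases the loss. The linear classifier below is a toy stand-in for a real model.

```python
import torch

def fgsm_attack(model, x, target, epsilon):
    """Fast gradient sign method: perturb each input element by exactly
    epsilon in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), target)
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()

# Toy demonstration on a hypothetical linear classifier.
torch.manual_seed(0)
model = torch.nn.Linear(10, 3)
x = torch.randn(4, 10)
target = torch.tensor([0, 1, 2, 0])
x_adv = fgsm_attack(model, x, target, epsilon=0.1)
print((x_adv - x).abs().max().item())  # perturbation bounded by epsilon
```

Projected gradient descent generalizes this by iterating several smaller steps and projecting the result back into an epsilon-ball around the original input.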
Robustness metrics in the new release include:
- Attack comparator, which lets users measure and compare the impact of any input perturbation (such as text augmentation) or adversarial attack on a model.
- Minimal perturbation, for determining the smallest amount of perturbation required to cause a model to misclassify a perturbed input.
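The minimal-perturbation idea can be illustrated with a simple sweep (a from-scratch sketch, not Captum’s own metric implementation): increase the perturbation magnitude along a fixed direction until the model’s predicted class flips. The toy two-class linear model and the chosen direction are assumptions for demonstration; the direction here deliberately pushes the logits toward the other class so a flip is reachable.

```python
import torch

def minimal_epsilon(model, x, noise_dir, eps_grid):
    """Return the smallest epsilon in eps_grid for which perturbing x
    along noise_dir changes the model's predicted class (or None)."""
    original = model(x).argmax(dim=-1)
    for eps in eps_grid:
        if model(x + eps * noise_dir).argmax(dim=-1) != original:
            return eps
    return None

torch.manual_seed(0)
model = torch.nn.Linear(5, 2)
x = torch.randn(1, 5)
orig_class = model(x).argmax(dim=-1).item()
other = 1 - orig_class
# Direction that pushes the logits toward the other class.
noise_dir = (model.weight[other] - model.weight[orig_class]).detach().unsqueeze(0)
eps = minimal_epsilon(model, x, noise_dir, [i / 10 for i in range(1, 201)])
print("smallest flipping epsilon:", eps)
```

Captum’s metric additionally supports arbitrary perturbation functions and parameter ranges, making the same search reusable across attack types.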
Model developers can use this robustness tooling to understand potential model flaws and examine counterfactual scenarios to better understand a model’s decision boundary.
Layer-wise relevance propagation (LRP), a new attribution technique developed by Facebook AI in partnership with Technische Universität Berlin, provides a new perspective for explaining model predictions.
LRP and a layer-attribution variant (layer LRP) are both included in Captum 0.4. In layer-wise relevance propagation, a backward propagation rule is applied sequentially to all layers of the model: the model’s output score serves as the initial relevance, which is decomposed into values for each neuron in the underlying layers.
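The backward decomposition can be sketched for a tiny two-layer ReLU network using the epsilon rule (a from-scratch illustration, not Captum’s LRP implementation): at each layer, the relevance arriving at a neuron is redistributed to its inputs in proportion to their contributions, so the total relevance is approximately conserved from the output down to the input.

```python
import torch

torch.manual_seed(0)
# Hypothetical bias-free two-layer network: x -> ReLU(W1 x) -> w2 . a
W1 = torch.randn(6, 4)
w2 = torch.randn(6)

def lrp_epsilon(x, eps=1e-6):
    """LRP epsilon rule: redistribute the output score backward, layer by
    layer, proportionally to each neuron's contribution."""
    a = torch.relu(W1 @ x)             # hidden activations
    out = w2 @ a                       # model output score (initial relevance)
    # Output -> hidden: each hidden neuron contributes w2_j * a_j.
    z = w2 * a
    r_hidden = z / (z.sum() + eps) * out
    # Hidden -> input: contributions W1_ji * x_i, normalized per neuron.
    z = W1 * x                         # (6, 4) contribution matrix
    r_input = (z / (z.sum(dim=1, keepdim=True) + eps) * r_hidden[:, None]).sum(dim=0)
    return out, r_input

x = torch.randn(4)
out, relevance = lrp_epsilon(x)
print(out.item(), relevance.sum().item())  # conservation: sums roughly match
```

The epsilon term stabilizes the division when contributions nearly cancel; Captum’s implementation applies such propagation rules automatically across the layers of a real PyTorch model.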
Finally, Captum 0.4 includes enhancements and bug fixes for existing attribution methods.
Captum is compatible with the Fiddler explainable AI platform, which allows engineers and developers to get actionable insights and evaluate the decision-making behavior of AI models.