Anomaly detection is one of the most common machine learning applications in various areas, from industrial defect identification to fraudulent financial detection.
One-class classification is beneficial for anomaly detection. It determines whether an instance belongs to the same distribution as the training data by assuming that the training data are all normal examples. However, representation learning is not available to these old methods. Furthermore, self-supervised learning has made significant progress in learning visual representations from unlabeled data, including rotation prediction and contrastive learning.
New Google AI research introduces a 2-stage framework that uses recent progress on self-supervised representation learning and classic one-class algorithms. This framework is simple to train and shows SOTA performance on various benchmarks, including CIFAR, f-MNIST, Cat vs. Dog, and CelebA. Following that, they offer a novel representation learning approach for a practical industrial defect detection problem using the same architecture. On the MVTec benchmark, the framework achieves a new state-of-the-art.
A Two-Stage Framework for Deep One-Class Classification
End-to-end learning for deep one-class classifiers often suffers from degeneration in which the model outputs the same results regardless of the input. Therefore, for one-class classifiers, the researchers applied their two-stage framework.
In the first step, the model was trained deep representation with self-supervision. The team employed one-class classification techniques (such as OC-SVM or kernel density estimator) in the second phase using the learning representations of the first stage.
This two-stage approach is robust to degeneration and also allows for the creation of more precise one-class classifiers. Additionally, the framework is not confined to specific representation learning and one-class classification algorithms.
Semantic Anomaly Detection
By testing with two sample self-supervised representation learning algorithms: rotation prediction and contrastive learning, the researchers examine the usefulness of the two-stage framework for anomaly detection.
The ability of a model to predict the rotated angles of an input image is referred to as rotation prediction. The current approach often employs the built-in rotation prediction classifier to build representations for anomaly detection. However, this approach is inefficient because such built-in classifiers aren’t trained for one-class classification.
In contrastive learning, a model learns to gather together representations from modified copies of the same image while pushing representations from distinct images away. As photos are drawn from the dataset during training, they are modified twice with simple augmentations. Typical contrastive learning converges to a solution in which all normal example representations are uniformly spread across a sphere. This is an issue as most one-class algorithms seek outliers by using normal training examples to compare the proximity of a tested example. Still, when all the normal instances are evenly distributed in space, outliers will always seem close to some of the normal examples.
The team proposes distribution augmentation (DA) for one-class contrastive learning to tackle this problem. Instead of learning representations just from the training data, the model learned from a combination of the training data and augmented training examples. The augmented instances were deemed distinct from the original training data. For distribution augmentation, they use geometric changes like rotation and horizontal flipping.
They evaluate the performance of one-class classification with regard to the area under the curve (AUC) on the widely used computer vision data sets such as CIFAR10 and CIFAR-100, Fashion MNIST, and Cat vs. Dog. On all of these benchmarks, the 2-stage architecture achieves state-of-the-art performance.
The framework’s performance remarkably improves 86 to 91.3 AUC by replacing the built-in rotation classifier employed in the first stage with a one-class classifier in the second stage.
Texture Anomaly Detection for Industrial Defect Detection
In many applications, the anomaly is frequently defined by localized faults rather than entirely different semantics. The detection of texture anomalies, for example, is beneficial for detecting a variety of industrial defects.
For texture anomaly detection, the team proposes a new self-supervised learning technique. The entire anomaly detection approach is two-staged. First, the model learns deep image representations and is specially trained to predict whether the image has been enhanced by simple CutPaste data augmentation.
CutPaste augmentation revolves around the idea that a given image is enhanced by cutting a local patch at random and pasting it back to a new spot inside the same image. Learning to differentiate between normal and CutPaste-enhanced instances promotes representations to be sensitive to picture local irregularity.
CutPaste method can also be used to locate the anomaly, i.e., “patch-level” anomaly detection. They visualized the patch anomaly scores in heatmaps that show where the anomaly occurs by upsampling with Gaussian smoothing, which improves anomaly localization significantly.