CLEANN: A Framework That Protects Artificial Neural Networks From Trojan Attacks

Demand for artificial intelligence tools and machine learning algorithms has grown across many sectors in the past few years. Despite the benefits AI models offer, critical threats endanger their safety and integrity. Because these models are often trained on large online datasets and third-party databases, they are vulnerable to cyberattacks. It is crucial to detect these attacks and mitigate their impact on the system.

The neural Trojan attack

One such threat is the neural Trojan attack. A neural Trojan is malicious behavior hidden inside an AI model, typically planted by tampering with the data or the model during training. The infected model behaves normally on clean inputs, but an input carrying the attacker's trigger activates the backdoor and causes the model to classify the data incorrectly. The name borrows from conventional Trojan malware, in which cyber-thieves use social engineering to trick users into loading and executing malicious software that lets an intruder steal data or upload further malware to the infected system.

CLEANN framework

CLEANN is an end-to-end framework designed by researchers at the University of California, San Diego. It is a lightweight and practical system that protects embedded artificial neural networks against Trojan attacks. It monitors the deployed AI model, ensuring that Trojan inputs do not trigger unwanted behavior. The framework first identifies the characteristics of safe input data. It then analyzes new data against these characteristics to spot Trojan triggers and correct the mistakes they cause in infected AI models.

CLEANN learns a sparse reconstruction of innocuous inputs. It then uses sparse recovery to project malicious samples into the learned benign space, which makes it possible to detect Trojans and stop their malignant effect. Applying sparse recovery techniques to selected signals in an AI model can therefore shield it from Trojan attacks.
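To make the idea of reconstruction-based detection concrete, here is a minimal Python sketch. It is not the CLEANN implementation. It assumes a small orthonormal dictionary standing in for one learned from benign data, uses Orthogonal Matching Pursuit as the sparse recovery step, and calibrates a simple threshold on reconstruction error. All names (`omp`, `reconstruct`, `make_benign`, `threshold`), the synthetic data, and the factor of 3 in the threshold are hypothetical choices made for illustration.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal Matching Pursuit: approximate x with at most k columns (atoms) of D."""
    support = []
    residual = x.copy()
    sol = np.zeros(0)
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Refit all selected atoms by least squares, then update the residual.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef

rng = np.random.default_rng(0)
DIM, N_ATOMS, SPARSITY = 64, 32, 4

# Assumption: an orthonormal dictionary stands in for one learned from benign data.
Q, _ = np.linalg.qr(rng.normal(size=(DIM, DIM)))
D = Q[:, :N_ATOMS]

def make_benign():
    """Synthetic benign sample: a sparse combination of atoms plus small noise."""
    code = np.zeros(N_ATOMS)
    idx = rng.choice(N_ATOMS, size=SPARSITY, replace=False)
    code[idx] = rng.uniform(1.0, 2.0, size=SPARSITY)
    return D @ code + 0.01 * rng.normal(size=DIM)

def reconstruct(x):
    """Project x onto the sparse benign model; return (reconstruction, error norm)."""
    recon = D @ omp(D, x, SPARSITY)
    return recon, float(np.linalg.norm(x - recon))

# Calibrate the detection threshold on benign samples only (no labels needed).
benign_errors = [reconstruct(make_benign())[1] for _ in range(50)]
threshold = 3.0 * max(benign_errors)

# A Trojan sample here is a benign sample with a localized trigger patch added.
clean = make_benign()
trojan = clean.copy()
trojan[:8] += 2.0

_, clean_err = reconstruct(clean)
_, trojan_err = reconstruct(trojan)
print("threshold:", threshold)
print("clean error:", clean_err, "flagged:", clean_err > threshold)
print("trojan error:", trojan_err, "flagged:", trojan_err > threshold)
```

In this toy setup the trigger cannot be expressed with a few benign atoms, so its reconstruction error stands out against the calibrated threshold. A real system would learn the dictionary from actual data and would operate on the signals the defense chooses to monitor.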

(a) Example Trojan data with watermark and square triggers, (b) reconstruction error heatmap, and (c) output mask from the outlier detection module. Credit: Javaheripi et al.

CLEANN has achieved positive results in initial evaluations on neural network-based image classification models. It is the first lightweight defense framework to achieve both high detection and high decision correction rates. Previously proposed Trojan mitigation methods incur a high execution overhead that hinders their use in embedded neural networks. Unlike most Trojan defense methods, CLEANN requires neither annotated data nor retraining of the targeted AI model, both of which can be costly and time-consuming.

The researchers have also developed specialized hardware that supports their framework. This hardware executes the framework efficiently in real time, mitigating the hazards caused by Trojan attacks.

