MIT Researchers Unveil a New Way to Use ‘Adversarial Attacks’ to Quantify the Uncertainty in Molecular Energies Predicted by Neural Networks


Neural networks are being used to predict new materials, chemical reactions, and drug-target interactions. For these applications, they’re faster than traditional methods like quantum mechanical simulations by orders of magnitude. However, the price for this agility is reliability: machine learning models only interpolate, and they may fail when used outside the domain of their training data.

What worried the MIT researchers was the tedious, labor-intensive task of establishing the limits of machine learning models. This is especially true when predicting potential energy surfaces (PES), maps that encode the complexity of a molecular system into flatlands, valleys, peaks, troughs and ravines. The most stable molecular configurations sit deep within these chasms: quantum-mechanical pits from which atoms and molecules rarely escape.

The researchers from MIT have presented a technique, described in a Nature Communications paper, that can help demarcate the “safe zone” of a neural network. They did so by mounting adversarial attacks on molecules, a strategy previously studied for other classes of problems such as image classification.

The Gómez-Bombarelli lab at MIT combines first-principles simulation and machine learning to greatly speed up this process. Actual simulations are run for only a small fraction of the molecules, and that data feeds into neural networks that learn to predict the same properties for all the others. The researchers have successfully demonstrated these methods for a growing class of novel materials, including catalysts that produce hydrogen from water, cheaper polymer electrolytes for electric vehicles, zeolites that act as molecular sieves, magnetic materials and more.

The problem is, neural networks are only as smart as the data they’re trained on. If 99 percent of the training data falls into one pit on the PES map, the model can entirely miss other valleys of greater interest. That’s as bad as a self-driving car that can’t see people crossing the street!

There are many ways to estimate the uncertainty of a model. One is to run the same data through multiple trained versions of it. The researchers used a committee of neural networks to predict the energy surface. Where the committee is confident, its members’ outputs converge and lie close to one another. Where it is uncertain, the outputs vary widely, which can mean that none of the models has produced an accurate answer, or that each captures only part of the surface correctly.
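The committee idea can be sketched in a few lines. The snippet below is a minimal illustration, not the paper’s implementation: simple perturbed polynomials stand in for independently trained neural network potentials, and the spread (standard deviation) of their predictions serves as the uncertainty signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an ensemble of trained neural-network
# potentials: each member is a slightly perturbed cubic fit to the
# same 1-D toy energy surface (illustration only).
base = np.array([0.1, -0.5, 1.0, 0.0])  # cubic coefficients, highest degree first
ensemble = [np.poly1d(base + rng.normal(0, 0.05, 4)) for _ in range(4)]

def ensemble_predict(x):
    """Mean energy and uncertainty (std. dev.) across ensemble members."""
    preds = np.array([m(x) for m in ensemble])
    return preds.mean(), preds.std()

mean_in, std_in = ensemble_predict(0.5)    # near the "training" region
mean_out, std_out = ensemble_predict(5.0)  # far outside it: members disagree
```

Far from the region the toy models agree on, the members diverge, so `std_out` exceeds `std_in`, which is exactly the signal used to flag configurations the networks cannot be trusted on.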

The new approach samples data points only from regions of low prediction confidence. These molecules are then stretched or deformed slightly so that the uncertainty of the committee is maximized. Reference simulations are computed for the deformed molecules and added to the initial training pool, and the neural networks are retrained. The process is repeated until the uncertainty across the surface can no longer be reduced, leaving well-defined uncertainty values at every point.

In this paper, the researchers present several examples of their approach for predicting complex supramolecular interactions in zeolites. These materials are cavernous crystals that act as molecular sieves with high shape selectivity, and they find applications in catalysis, gas separation, and ion exchange, among other fields.

The researchers show how their method can yield significant computational savings when simulating large zeolite structures. To predict the potential energy surfaces of these systems, the team first trained a neural network on more than 15,000 examples. The results were mediocre: despite the large computational cost of generating the dataset, only around 80 percent of the neural-network-based simulations were successful. Adding 5,000 more data points improved the performance of the neural network potentials to 92 percent.

In contrast, when they used adversarial training to retrain the neural networks with only 500 extra points, performance jumped to 97 percent. That’s a remarkable result, especially considering that each of these extra points costs hundreds of CPU hours to compute, according to the researchers.

This new method could be the most realistic way yet to probe the limits of the models researchers use to predict how materials will behave and how chemical reactions will progress.