Cornell Physicists And Computer Scientists Collaborated To Build An Unsupervised And Interpretable Machine Learning Algorithm, XRD Temperature Clustering (X-TEC)

Over the past ten years, advances in source brightness and detector technology have allowed modern X-ray facilities to capture a significantly larger fraction of the information contained in a material. These advances, however, generate massive amounts of data. According to the researchers, the data for a single piece of material can easily surpass 20 terabytes. Traditional analysis is mostly manual and cannot keep up with datasets of this size, so researchers have been looking for ways to extract scientific principles from them.

A group of Cornell physicists and computer scientists developed an unsupervised machine learning method called X-ray diffraction temperature clustering (X-TEC). The method can automatically extract charge density wave order parameters and detect intraunit cell ordering and its fluctuations from high-volume X-ray diffraction measurements taken at multiple temperatures. Using X-TEC, the researchers then examined key aspects of the pyrochlore oxide metal Cd2Re2O7.
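The core idea of grouping points in reciprocal space by how their intensity changes with temperature can be illustrated with a small sketch. This is not the X-TEC implementation: the synthetic data, the mean-based rescaling, the plain k-means clustering, and every function name below are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def make_synthetic_trajectories(n_order=30, n_flat=70, n_temps=20, t_c=0.5, seed=0):
    """Hypothetical data: one intensity-vs-temperature row per reciprocal-space point."""
    rng = np.random.default_rng(seed)
    temps = np.linspace(0.1, 1.0, n_temps)
    # Order-parameter-like peaks: intensity drops to zero above the transition t_c.
    order_shape = np.clip(1.0 - temps / t_c, 0.0, None)
    # Background-like points: intensity does not depend on temperature.
    flat_shape = np.ones(n_temps)
    order = rng.uniform(5.0, 50.0, size=(n_order, 1)) * order_shape
    flat = rng.uniform(5.0, 50.0, size=(n_flat, 1)) * flat_shape
    data = np.vstack([order, flat])
    data = data + rng.normal(0.0, 0.05, size=data.shape)
    return temps, data

def rescale(data):
    """Divide each trajectory by its mean so only the temperature dependence remains."""
    mean = data.mean(axis=1, keepdims=True)
    return data / mean - 1.0

def kmeans(points, k=2, n_iter=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

temps, raw = make_synthetic_trajectories()
labels, centers = kmeans(rescale(raw), k=2)
print("cluster sizes:", np.bincount(labels))
```

In this toy setting, the rows that follow an order-parameter-like curve end up in one cluster and the temperature-independent rows in the other, without any labels being supplied. Real diffraction data would need more careful preprocessing and a more robust clustering model.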

Their paper, “Harnessing Interpretable and Unsupervised Machine Learning to Address Big Data from Modern X-ray Diffraction,” demonstrates that machine learning can produce an unbiased and comprehensive analysis of such data, one that combines long-range and short-range structural correlations as a function of temperature.

Their findings show that X-TEC quickly examined 15,000 Brillouin zones (BZs, uniquely defined cells) spread across eight gigabytes of X-ray data. Manual inspection of that volume would not have been feasible, which is why an automated method such as X-TEC was needed to uncover this level of microscopic detail in the XRD data.

In complex crystalline materials, a particular arrangement of many atoms, known as the unit cell, repeats throughout the crystal, much like identical apartments in a high-rise residential complex. The research indicates that atoms are repositioned within each housing unit, and that this happens across the entire complex.

The researchers state that this repositioning is hard to spot from the outside because the arrangement of the units themselves remains the same. Because the repositioning nearly breaks a continuous symmetry, it produces a pseudo-Goldstone mode. Using X-TEC, the team was able to detect this pseudo-Goldstone mode, which is hard to see otherwise.


According to the team, this is the first time XRD has been used to detect a Goldstone mode. They believe that this atomic-scale understanding of fluctuations in a complex quantum material will pave the way for discoveries of new phases of matter using extensive, information-rich diffraction data.

The scientists have made X-TEC available to researchers as a software package, to facilitate studies at the Advanced Photon Source and the Cornell High Energy Synchrotron Source.

In future studies of such phase diagrams, X-TEC offers a way to use the complete data volume by clustering peak intensities from thousands of BZs, instead of determining critical exponents by fitting a small number of peaks. Once integrated into the experimental workflow at the beamline, X-TEC can guide measurements through real-time analysis of the temperature dependence. Another interesting prospect is to feed the data extracted by X-TEC into automated procedures that efficiently search for the underlying microscopic models in the inverse scattering problem. Given its generic structure, the researchers believe that X-TEC will find a wide range of applications beyond XRD.
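For contrast, the conventional route mentioned above, fitting an exponent to a single peak, can be sketched as follows. It assumes the peak intensity below the transition follows a power law, I(T) ∝ (1 − T/Tc)^(2β), with a known transition temperature. The data are synthetic, and the function name and all numerical values are hypothetical.

```python
import numpy as np

def fit_beta(temps, intensity, t_c):
    """Estimate beta from I ~ (1 - T/Tc)^(2*beta) with a straight-line fit in log-log space."""
    # Keep only points below the transition with positive intensity.
    below = (temps < t_c) & (intensity > 0)
    x = np.log(1.0 - temps[below] / t_c)
    y = np.log(intensity[below])
    slope, _intercept = np.polyfit(x, y, 1)
    # The slope of log(I) against log(1 - T/Tc) equals 2*beta.
    return slope / 2.0

# Synthetic single-peak data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
t_c = 200.0
true_beta = 0.5
temps = np.linspace(100.0, 198.0, 40)
clean = 3.0 * (1.0 - temps / t_c) ** (2.0 * true_beta)
noisy = clean * (1.0 + rng.normal(0.0, 0.01, size=temps.size))

beta_hat = fit_beta(temps, noisy, t_c)
print(f"estimated beta: {beta_hat:.3f}")
```

A fit like this uses one peak at a time, which is the limitation the clustering approach is meant to overcome by treating thousands of BZs together.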

This article was written as a summary by Marktechpost Staff based on the paper 'Harnessing interpretable and unsupervised machine learning to address big data from modern X-ray diffraction'. All credit for this research goes to the researchers on this project.
