Newswise — Machine learning method for harnessing large volumes of X-ray data will speed materials discovery.

Color coding makes aerial maps much more easily understood. Through color, we can tell at a glance where there is a road, forest, desert, city, river or lake.

Working with several universities, the U.S. Department of Energy’s (DOE) Argonne National Laboratory has devised a method for creating color-coded graphs of large volumes of data from X-ray analysis. This new tool uses computational data sorting to find clusters related to physical properties, such as an atomic distortion in a crystal structure. It should greatly accelerate future research on structural changes on the atomic scale induced by varying temperature.

“Our method uses machine learning to rapidly analyze immense amounts of data from X-ray diffraction,” said Raymond Osborn, senior physicist in Argonne’s Materials Science division. “What might have taken us months in the past now takes about a quarter hour, with much more fine-grained results.”

For over a century, X-ray diffraction, or XRD, has been one of the most fruitful of all scientific methods for analyzing materials. It has provided key information on the 3D atomic structure of innumerable technologically important materials.

In recent decades, the amount of data being produced in XRD experiments has increased dramatically at large facilities such as the Advanced Photon Source (APS), a DOE Office of Science user facility at Argonne. Sorely lacking, however, are analysis methods that can cope with these immense data sets.

The team calls their new method X-ray Temperature Clustering, or XTEC for short. It accelerates materials discoveries through rapid clustering and color coding of large X-ray data sets to reveal previously hidden structural changes that occur as temperature increases or decreases. A typical large data set would be 10,000 gigabytes, equivalent to roughly 3 million songs of streaming music.

XTEC draws on the power of unsupervised machine learning, using methods developed for this project at Cornell University. This kind of machine learning does not depend on initial training with data that have already been well studied. Instead, it learns by finding patterns and clusters in large data sets without such training. These patterns are then represented by color coding.
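The idea of grouping data by how it changes with temperature, and then assigning a color to each group, can be illustrated with a small sketch. This is not the XTEC algorithm, whose details are not given here. It is a minimal stand-in that assumes a plain k-means step, synthetic intensity values, and a hypothetical transition temperature `T_C`. All function and variable names are illustrative.

```python
import math

def rescale(traj):
    """Remove mean and scale so that only the shape of the
    temperature dependence is compared, not its magnitude."""
    mean = sum(traj) / len(traj)
    centered = [v - mean for v in traj]
    std = math.sqrt(sum(c * c for c in centered) / len(traj))
    return [c / std for c in centered] if std > 0 else centered

def dist2(a, b):
    """Squared Euclidean distance between two trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
            for p in points]

def kmeans(points, k, n_iter=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(n_iter):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centers)

# Synthetic data (assumption): intensity versus temperature at six detector points.
temperatures = [10, 20, 30, 40, 50, 60, 70, 80]
T_C = 45  # hypothetical transition temperature

def ordered_peak(scale):
    # Intensity that vanishes above T_C
    return [scale * max(0.0, 1 - t / T_C) for t in temperatures]

def background(scale):
    # Intensity that grows slowly with temperature
    return [scale * (1 + 0.01 * t) for t in temperatures]

trajectories = ([ordered_peak(s) for s in (1.0, 5.0, 20.0)]
                + [background(s) for s in (2.0, 3.0, 10.0)])

labels = kmeans([rescale(t) for t in trajectories], k=2)
palette = ["red", "blue"]  # one color per cluster
colors = [palette[l] for l in labels]
print(colors)
```

No labeled training data is supplied in this sketch. The two groups emerge only from the differing shapes of the temperature curves, which is the sense in which the approach is unsupervised.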

“For example, XTEC might assign red to data cluster one, which is associated with a certain property that changes with temperature in a particular way,” Osborn said. “Then, cluster two would be blue, and associated with another property with a different temperature dependence, and so on. The colors tell whether each cluster represents the equivalent of a road, forest or lake in an aerial map.”

As a test case, XTEC analyzed data from beamline 6-ID-D at the APS, taken from two crystalline materials that are superconducting at temperatures close to absolute zero. At this ultralow temperature, these materials switch to a superconducting state, offering no resistance to electrical current. More important for this study, other unusual features related to changes in the material structure emerge at higher temperatures.

By applying XTEC, the team extracted an unprecedented amount of information about changes in atomic structure at different temperatures. Those include not only distortions in the orderly arrangement of atoms in the material, but also fluctuations that occur when such changes happen.

“Because of machine learning, we are able to see materials’ behavior not visible by conventional XRD,” Osborn said. “And our method is applicable to many big data problems in not only superconductors, but also batteries, solar cells, and any temperature-sensitive device.”

The APS is undergoing a massive upgrade that will increase the brightness of its X-ray beams by up to 500 times. Along with the upgrade will come a significant increase in data collected at the APS, and machine learning techniques will be essential to analyzing that data in a timely manner.

The team published their findings in the Proceedings of the National Academy of Sciences in an article titled “Harnessing interpretable and unsupervised machine learning to address big data from modern X-ray diffraction.”

In addition to Osborn, Argonne authors include Matthew Krogstad, Daniel Phelan, Puspa Upreti, Michael Norman and Stephan Rosenkranz. The primary collaborating partners are Cornell University (Eun-Ah Kim, Jordan Venderley, Krishnanand Mallayya, Michael Matty, Geoff Pleiss, Varsha Kishore and Kilian Weinberger) and the Cornell High Energy Synchrotron Source (Jacob Ruff). Other partners include the University of Tennessee (David Mandrus), the University of Maryland (Lekh Poudel) and New York University (Andrew Gordon Wilson).

Argonne funding was provided by the DOE Office of Basic Energy Sciences and the National Science Foundation.

About the Advanced Photon Source

The U.S. Department of Energy Office of Science’s Advanced Photon Source (APS) at Argonne National Laboratory is one of the world’s most productive X-ray light source facilities. The APS provides high-brightness X-ray beams to a diverse community of researchers in materials science, chemistry, condensed matter physics, the life and environmental sciences, and applied research. These X-rays are ideally suited for explorations of materials and biological structures; elemental distribution; chemical, magnetic, and electronic states; and a wide range of technologically important engineering systems from batteries to fuel injector sprays, all of which are the foundations of our nation’s economic, technological, and physical well-being. Each year, more than 5,000 researchers use the APS to produce over 2,000 publications detailing impactful discoveries, and solve more vital biological protein structures than users of any other X-ray light source research facility. APS scientists and engineers innovate technology that is at the heart of advancing accelerator and light-source operations. This includes the insertion devices that produce extreme-brightness X-rays prized by researchers, lenses that focus the X-rays down to a few nanometers, instrumentation that maximizes the way the X-rays interact with samples being studied, and software that gathers and manages the massive quantity of data resulting from discovery research at the APS.

This research used resources of the Advanced Photon Source, a U.S. DOE Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.

The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.
