Materials that have specific properties are essential to progress in critical technologies like solar cells and batteries. A machine learning model designs new materials to order.
What’s new: Researchers at Microsoft and Shenzhen Institute of Advanced Technology proposed MatterGen, a diffusion model that generates a material’s chemical composition and structure from a prompt that specifies a desired property. The model and code are available under a license that allows commercial as well as noncommercial uses without limitation. The training data also is noncommercially available.
How it works: MatterGen’s training followed a two-stage process. In the first stage, it learned to generate materials (specifically crystals — no liquids, gasses, or amorphous solids like glass). In the second, it learned to generate materials given a target mechanical, electronic, magnetic, or chemical property such as magnetic density or bulk modulus (the material’s resistance to compression).
- MatterGen first learned to remove noise that had been added to 600,000 examples drawn from two datasets. Specifically, it learned to remove noise from three noisy matrices that represented a crystal’s shape (parallelepiped), the type of each atom, and the coordinates of each atom.
- To incorporate information about properties, the authors added to the diffusion model four vanilla neural networks, each of which took an embedding of the target property. The diffusion model added the output of these networks to its intermediate embeddings at different layers.
- Then the authors fine-tuned the system to remove added noise from materials that contained property information in their original dataset.
- At inference, given three matrices of pure noise representing crystal shape, atom types, and atom coordinates, and a prompt specifying the desired property, the diffusion model iteratively removed the noise from all three matrices.
Results: The authors generated a variety of materials, and they synthesized one to test whether it had a target property. Specifically, they generated over 8,000 candidates with the target bulk modulus of 200 gigapascals (a measure of resistance to uniform compression), then automatically filtered them based on a number of factors to eliminate material in their dataset and unstable materials. Of the remaining candidates, they chose four manually and successfully synthesized one. The resulting crystal had a measured bulk modulus of 158 gigapascals. (Most materials in the dataset had a bulk modulus of between 0 and 400 gigapascals.)
Behind the news: Published in 2023, DiffCSP also uses a diffusion model to generate the structures of new materials. However, it does so without considering their desired properties.
Why it matters: Discovering materials relies mostly on searching large databases of existing materials for those with desired properties or synthesizing new materials and testing their properties by trial and error. Designing new crystals with desired properties at the click of a button accelerates the process dramatically.
We’re thinking: While using AI to design materials accelerates an important step, determining whether a hypothesized material can be manufactured efficiently at scale is still challenging. We look forward to research into AI models that also take into account ease of manufacturing.