Math matters: how big data is building the future of everything


To speed up innovation, scientists are sequencing 'the materials genome'

Materials Genome Initiative

New materials lead to new innovations. Gorilla Glass is a big selling point for smartphones. Kevlar saves lives and has worked its way into consumer products. Lithium-ion batteries have enabled a host of energy-storage applications, from planes to cars to computers. But there's a problem.

Actually creating a new, game-changing material is a glacially slow process — especially when compared to the rate at which new products relying on those materials hit the market. It took just under nine years for the Boeing 787 Dreamliner to go from a concept to commercial flight. The development of the iPhone began in 2005; the phone was on store shelves by 2007. In contrast, the creation of new materials moves far more slowly, taking about 20 years for all of the necessary research and development.

An attempt to gain a deeper understanding of how the elements interact

In an effort to overcome this innovation bottleneck, the White House announced the Materials Genome Initiative in 2011. The venture aims to halve development time for new materials and slash the monetary investment required. And if the name sounds familiar, it should: in the same way the Human Genome Project set out to map the underlying structure of human genes, the Materials Genome Initiative is an attempt to gain a deeper understanding of how the elements interact to give us a diverse set of materials and materials properties. With that foundation of knowledge, scientists and engineers hope to create new materials tuned to the exact properties needed for a particular application, and to do it much, much faster.


A huge number of atomic combinations and arrangements may have useful properties. However, most arrangements won't be useful, or even possible to synthesize. Exploring that vast world of potential materials in a lab would be wildly impractical. So to map out that enormous number of possible materials, several research groups working on the Materials Genome Initiative are using computers to model known and unknown materials. They then mine the resulting data to find areas that deserve a more careful examination.

In the years since its inception, the initiative has brought together several successful ventures. Among them are the Materials Project at MIT and the Harvard Clean Energy Project. These two projects share similar theoretical underpinnings but pursue different end goals. MIT's Materials Project is focused on inorganic solids, especially battery materials, while the Clean Energy Project is examining molecules for solar cell applications. Both are powered by huge databases populated with information gleaned from density functional theory (DFT) calculations. DFT uses quantum mechanics to predict many properties of the real, physical substances being modeled.

A dataset of over 100,000 known and theoretical materials

MIT's Materials Project started about eight years ago, and was catalyzed by the work of Professor Gerbrand Ceder. As a consultant to several companies, Ceder would screen a large number of materials for particular applications. But working with individual companies left the data siloed and locked up. "People would be able to do really creative things with this if we gave this to the world, and this became Materials Project," he says. Now, MIT's dataset consists of over 100,000 known and theoretical materials. To make sense of the data and design new materials, MIT researchers use a combination of human intuition and machine learning designed to understand the laws of chemistry.

Similarly, the Harvard Clean Energy Project has created a huge database that can be explored by man and machine for potential solutions to materials problems. The venture started as a small proof of concept examining potential organic solar cell materials. Researchers calculated the properties of about 15 compounds to predict how well these new substances might perform in the real world without having to synthesize them first. These calculations eventually yielded a new compound with near record-breaking electrical properties. But that success was just from a few chemicals calculated by a single graduate student. What could be discovered if you increased available computing power by distributing the calculations to an army of volunteers?

Researchers have calculated millions of potential solar cell compounds

Today, the Harvard Clean Energy Project is doing just that: anyone across the globe can download a program that performs scientific calculations on their PCs and reports back the results. With this massive resource at their disposal, researchers have calculated millions of potential solar cell compounds — and they're only getting started. "Right now is an interesting time for the project," says Dr. Hachmann, a research associate involved in the Harvard project. "We are at the point where we can harvest the fruits of our hard work and hopefully get some nice results."

Currently, Harvard investigators have released 2.3 million compounds online for anyone to search through. And while these compounds have been calculated with solar cells in mind, other scientists who mine the data will be able to use the information to research other classes of materials. Similarly, researchers working on MIT's Materials Project have an online portal for anyone to explore their data.

"You cannot anticipate what people will do with it."

The greater Materials Genome Initiative aims to reduce costs and time for material development, and Ceder hopes to see it accomplish that bold goal. In fact, he's already seen it work: Ceder is in the process of patenting new materials for use in batteries, a big win for the burgeoning initiative and the field of materials discovery. And with heaps of data being shared in online databases, those successes are likely to keep coming. Ceder hopes the Materials Genome Initiative will lead to big innovations in materials science in the same unpredictable way the web transformed many facets of modern life. "When you make stuff like this available," he says, "you cannot anticipate what people will do with it."