AI for research: How machines can guide R&D

AI could be the discovery that generates countless future discoveries.
16 August 2019

Berkeley Lab’s Wang Hall (CRT/NERSC) – exterior photos at sunset – 02/04/2016. Source: Berkeley Lab

What if our inventions start inventing things? What if we imbue our curiosity into programs that seek out hidden patterns and truths? It’s a powerful idea. But there are countless problems that we need to solve on our own before we can get to the hypothesized artificial general intelligence (AGI) that has inspired so many dreams and debates.

Meaningful progress is being made. Convolutional neural networks have already exceeded expectations. Some evolutionary algorithms attempt to leverage the same mechanisms as our own biological evolution, with deliberately implemented selection pressures directing mutating code towards optimal design solutions. This technology can be fascinating. But let’s not forget that there is another mechanism of technological selection taking place, outside of the machines…

Research & Development

It’s the use case. In many instances, a viable business or even military use case is what connects interesting research with a funding stream.

Academics are constantly churning out discoveries that reveal how our world really works. Businesses and engineers are trying to figure out how they could make it work for us in more optimal, creative, and commercially viable ways. They seek to shepherd a basic discovery through applied scientific research and field testing and then onwards to deployment. The process sometimes takes decades. Patience and long-term thinking are required.

But it often begins with an initial scientific discovery. The good news: There are a lot of discoveries being made, across a range of disciplines, all around the globe. The bad news: There are too many discoveries to monitor and pursue.

But here’s one possible solution…

A recent study published in Nature suggests that materials science could benefit from AI input. Researchers at the Lawrence Berkeley National Laboratory used machine learning methods to capture latent knowledge from approximately 3.3 million abstracts, primarily focused on materials science, physics, and chemistry. By using nothing more than patterns in text, an algorithm made valid scientific predictions.

A solution for overwhelmed R&D

Academia is messy. Some journals restrict access to knowledge. Publisher consolidation has broad ripple effects. Article-processing charges, publication biases, and reputation concerns deter scientists from publishing null results. According to Franco et al., for example, strong results are 40 percentage points more likely to be published than null results and 60 percentage points more likely to be written up.

But even if incentives were realigned and systems were “purified,” if you will, there would still be practical limitations on human attention. There’s a heap of research papers, already written and available both in open-access journals and behind paywalls. There are untold amounts of data, never published for the reasons mentioned. All of this represents trillions of dollars in investment, and no human being has the time or cognitive power to go through it all, even within a relatively narrow field.

AI for research

Past studies aimed to retrieve information from existing scientific papers through supervised natural language processing, but this latest experiment relied on unsupervised word embeddings. The skip-gram variation of Word2vec was integral to this approach. The researchers didn’t explicitly insert their knowledge into the process. Rather, they deployed a model that could explore the words and their associations, create vector representations of those words, and, as a consequence, recommend thermoelectric materials with functional applications.
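For readers who want a feel for the mechanics, here is a minimal Python sketch of the two ideas in that paragraph: how a skip-gram model sees text as (center word, context word) pairs, and how word vectors can be used to rank candidate materials by their closeness to an application word. This is an illustration, not the researchers’ code. The function names are hypothetical, the three-dimensional vectors are invented for the example (real embeddings are learned from text and have far more dimensions), and the use of cosine similarity as the closeness measure is an assumption, since the description above does not name one.

```python
import math


def skipgram_pairs(tokens, window=2):
    """Return the (center, context) pairs a skip-gram model would train on."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(embeddings, query, candidates):
    """Sort candidate words by similarity to the query word, best first."""
    q = embeddings[query]
    scored = [(name, cosine(embeddings[name], q)) for name in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented toy vectors, for illustration only. They are NOT learned
# from any corpus and carry no scientific meaning.
toy_embeddings = {
    "thermoelectric": [0.9, 0.1, 0.3],
    "Bi2Te3": [0.8, 0.2, 0.4],
    "NaCl": [0.1, 0.9, 0.2],
    "SiO2": [0.2, 0.7, 0.6],
}

# What the model "sees": each word paired with its neighbours.
pairs = skipgram_pairs(
    ["Bi2Te3", "is", "a", "thermoelectric", "material"], window=1
)

# How a recommendation could be read off the vectors.
ranking = rank_candidates(
    toy_embeddings, "thermoelectric", ["NaCl", "Bi2Te3", "SiO2"]
)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these made-up vectors, the candidate whose vector points in nearly the same direction as “thermoelectric” comes out on top. In a real system the vectors would come from training on millions of abstracts, and a high-ranking material that had never been described as thermoelectric in the text would be the interesting lead.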

If scientific literature has embedded and neglected gold, this is a new way of mining it.

This method of using AI for research could guide R&D directions and assist materials scientists when they’re trying to design new chemical compositions for an application. It could increase the ingenuity and pace of a lab.

Some people may hear about this and rush to Twitter with the declaration that scientists everywhere will imminently lose their jobs to algorithms and robots. While this can’t be discounted as an eventuality, it’s probably a premature conclusion.

Dr. Anubhav Jain, one of the study’s co-authors, runs a research group at Lawrence Berkeley National Laboratory and aims to accelerate the process of materials design through new technologies.

When I spoke with him about these findings, he told me that the algorithm can suggest a chemical composition for a materials scientist to investigate on the way to a new functional material. But he pointed out that many other steps are needed to make a material work in practice.

“You need to synthesize that material, you might need to purify that material, you might need to build that material. You might have to integrate that into a device and test things,” said Dr. Jain.

Berkeley Lab’s Molecular Foundry Nanofabrication Clean Room. Source: Berkeley Lab – Roy Kaltschmidt

Apparently, humans haven’t been edged out just yet. Material properties, such as thermal stability and lifetime, still need to be examined. Also, an algorithm’s objectivity is affected by the data on which it’s trained, and biases within datasets can skew the insights that the AI generates.

However, Dr. Jain did not discount the possibility of technological augmentation or automation across some of these other aspects.

Dr. Jain added, “I will say that in materials science, it’s actually becoming more and more feasible to have robotic laboratories.”

He continued, “You could imagine that if you did pair this technology with some of the automated laboratories, you could have a system in the future where AI reads the research papers, figures out which experiments to do, actually conducts those experiments, and maybe even pick a materials formulation that is useful for some purpose, all on its own. I think it’s theoretically possible to do something like that but there’s still a lot of practical issues in all of these things that prevent it from being right around the corner.”

Historically, we have always been dwarfs perched on the shoulders of giants. Knowledge builds upon knowledge, but we have been proactive builders. That is fundamentally different from switching to the observer’s seat and letting technology sift through numbers and papers in order to flag points of interest and probable success. AI could transform the ways that businesses approach their R&D.

Dr. Vahe Tshitoyan, an author of the paper and machine learning engineer at Google, told me that people don’t usually think of natural language processing as a tool for making discoveries. It’s mostly a tool for extracting information that is already known. These findings could potentially broaden the applications of NLP.