Illustration of a scientist knitting an image of a galaxy. Illustration by Sandbox Studio, Chicago, with Abigail Malate.

Machine learning and experiment

For more than 20 years in experimental particle physics and astrophysics, machine learning has been accelerating the pace of science, helping scientists tackle problems of greater and greater complexity.

Every day in August of 2019, physicist Dimitrios Tanoglidis would walk to the Plein Air Café next to the University of Chicago and order a cappuccino. After finding a table, he would spend the next several hours flipping through hundreds of thumbnail images of white smudges recorded by the Dark Energy Camera, an instrument that at the time had observed 300 million astronomical objects.

For each white smudge, Tanoglidis would ask himself a simple yes-or-no question: Is this a galaxy? “I would go through about 1,000 images a day,” he says. “About half of them were galaxies, and the other half were not.”

After about a month, Tanoglidis—who was a University of Chicago PhD student at the time—had built up a catalogue of 20,000 low-brightness galaxies.

Then Tanoglidis and his team used this dataset to create a tool that, once trained, could evaluate a similar dataset in a matter of moments. “The accuracy of our algorithm was very close to the human eye,” he says. “In some cases, it was even better than us and would find things that we had misclassified.”

The tool they created was based on machine learning, a type of software that ‘learns’ as it digests data, says Aleksandra Ciprijanovic, a physicist at the US Department of Energy’s Fermi National Accelerator Laboratory who at the time was one of Tanoglidis’s research advisors. “It’s inspired by how neurons in our brains work,” she says—adding that the extra “brainpower” will be essential for analyzing exponentially larger datasets from future astronomical surveys. “Without machine learning, we’d need a small army of PhD students to go through the same type of dataset.”
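To make the idea concrete, here is a minimal sketch, assuming PyTorch, of the kind of binary image classifier such a team might train on hand-labeled thumbnails. The layer sizes, the 64-pixel cutouts, and the training details are illustrative assumptions, not the collaboration's actual model.

```python
# Minimal sketch of a binary "galaxy vs. not a galaxy" classifier.
# Architecture, image size and training details are illustrative only.
import torch
import torch.nn as nn

class GalaxyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
            nn.Linear(64, 1),  # single logit: galaxy vs. not a galaxy
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = GalaxyClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

def train_step(images, labels):
    """images: (batch, 1, 64, 64) tensor of thumbnails; labels: (batch, 1) of 0s and 1s."""
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A model along these lines is trained once on the hand-labeled catalogue and can then score new thumbnails in bulk.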

Today, the Dark Energy Survey collaboration has a catalogue of 700 million astronomical objects, and scientists continue to use (and improve) Tanoglidis’s tool to analyze images that could show previously undiscovered galaxies.

“In astronomy, we have a huge amount of data,” Ciprijanovic says. “No matter how many people and resources we have, we’ll never have enough people to go through all the data.”

Classification—this is probably a photo of a galaxy versus this is probably not a photo of a galaxy—was one of machine learning’s earliest applications in science. Over time, its uses have continued to evolve.

From math to machines

Machine learning, which is a subset of artificial intelligence, is a type of software that can, among other things, help scientists understand the relationships between variables in a dataset.

According to Gordon Watts, a physicist at the University of Washington, scientists traditionally figured out these relationships by plotting the data and looking for the mathematical equations that could describe it. “Math came before the software,” Watts says.

This math-only method is relatively straightforward when looking for the relationship between only a few variables: the pressure of a gas as a function of its temperature and volume, or the acceleration of a ball as a function of the force of an athlete’s kick and the ball’s mass. But finding these relationships with nothing but math becomes nearly impossible as you add more and more variables.
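For a sense of what that traditional, few-variable approach looks like, here is a small sketch in Python. The simulated "measurements" and the assumption of one mole of gas (so the true constant is the gas constant, about 8.31 J/(mol·K)) are purely illustrative.

```python
import numpy as np

# "Math before software": with only a few variables, you can posit a functional
# form and fit its constants directly. Here we recover P = k * T / V (the ideal
# gas law for one mole of gas) from simulated, noisy pressure measurements.
rng = np.random.default_rng(0)
T = rng.uniform(250.0, 350.0, size=200)             # temperature in kelvin
V = rng.uniform(1.0, 5.0, size=200)                 # volume in cubic meters
P = 8.314 * T / V + rng.normal(0.0, 5.0, size=200)  # noisy "measured" pressure in pascals

# Least-squares estimate of the single unknown constant k in P = k * (T / V)
x = T / V
k = float(np.sum(x * P) / np.sum(x * x))
print(f"fitted constant: {k:.2f} (true value: 8.31)")
```

With 50 or 100 interacting variables there is no such simple form to write down, which is where the approach described next takes over.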

“A lot of the problems we’re tackling in science today are very complicated,” Ciprijanovic says. “Humans can do a good job with up to three dimensions, but how do you think about a dataset if the problem is 50- or 100-dimensional?”

This is where machine learning comes in.

“Artificial intelligence doesn’t care about the dimensionality of the problems,” Ciprijanovic says. “It can find patterns and make sense of the data no matter how many different dimensions are added.”

Early days

Some physicists have been using machine-learning tools since the 1950s, but their widespread use in the field is a relatively new phenomenon.

The idea to use a type of machine learning called a neural network was proposed to the CDF and DZero experiments at the Tevatron in 1989, says Tommaso Dorigo, a physicist at the Italian National Institute for Nuclear Physics, INFN, who was a member of CDF. “People in the collaboration were both amused and disturbed by this.”

Amused because of its novelty; disturbed because it added a layer of opacity into the scientific process.

Machine-learning models are sometimes called "black boxes" because it is hard to tell exactly how they are handling the data put into them; their large number of parameters and complex architectures are difficult to understand. Because scientists want to know exactly how a result is calculated, many physicists have been skeptical of machine learning and reluctant to implement it into their analyses. “In order for a scientific collaboration to sign off on a new method, they first must exhaust all possible doubts,” Dorigo says.

Over the following decades, scientists at the Tevatron embraced the new methodology, refining the technique and using it to make several important measurements.

Scientists found a reason to work through remaining doubts when the Large Hadron Collider came online, an event that coincided with the early days of the ongoing boom in machine learning in industry.

Josh Bendavid, a physicist at the Massachusetts Institute of Technology, was an early adopter. “When I joined CMS, machine learning was a thing, but seeing limited use,” he says. “But there was a big push to implement machine learning into the search for the Higgs boson.”

Machine learning and the Higgs discovery

The Higgs boson is a fundamental particle that helps explain why some particles have mass while others do not. Theorists predicted its existence in the 1960s, but finding it experimentally was a huge challenge. That’s because Higgs bosons are both incredibly rare and incredibly short-lived, quickly decaying into other particles such as pairs of photons.

In 2010, when the LHC experiments first started collecting data for physics, machine learning was widely used in industry and academia for classification (this is a photo of a cat versus this is not a photo of a cat). Physicists were using machine learning in a similar way (this is a collision with two photons versus this is not a collision with two photons).

But according to Bendavid, simply finding photons was not enough. Pairs of photons are produced in roughly one out of every 100 million collisions in the LHC. But Higgs bosons that decay into pairs of photons are produced in only about one out of every 500 billion. To find Higgs bosons, scientists needed to find sets of photons that had a combined energy close to the mass of the Higgs. This meant they needed more complex algorithms—ones that could not only recognize photons, but also estimate the energy of photons based on how they interacted with the detector. “It’s like trying to estimate the weight of a cat in a photograph,” Bendavid says.
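The quantity being reconstructed here is the invariant mass of the photon pair, which depends on the two photon energies and their opening angle; a Higgs signal shows up as a small peak in that mass spectrum. The sketch below, in Python, shows the arithmetic. The 125 GeV reference value and the 10 GeV window are illustrative, since in the actual search the mass was not known in advance.

```python
import math

def diphoton_mass(e1, e2, opening_angle):
    """Invariant mass (in GeV) of two effectively massless photons with
    energies e1 and e2 (GeV) separated by opening_angle (radians):
    m = sqrt(2 * E1 * E2 * (1 - cos(angle)))."""
    return math.sqrt(2.0 * e1 * e2 * (1.0 - math.cos(opening_angle)))

def is_higgs_candidate(e1, e2, opening_angle, center=125.0, window=10.0):
    """Flag photon pairs whose combined mass lands near a chosen value (illustrative cut)."""
    return abs(diphoton_mass(e1, e2, opening_angle) - center) < window
```

The better the energy estimates that feed into this calculation, the sharper the peak becomes, which is why improving those estimates mattered so much.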

That became possible when LHC scientists created high-quality detector simulations, which they could use to train their algorithms to find the photons they were looking for, Bendavid says.

Bendavid and his colleagues simulated millions of photons and looked at how they lost energy as they moved through the detector. According to Bendavid, the algorithms they trained were much more sensitive than traditional techniques.

And the algorithms worked. In 2012, the CMS and ATLAS experiments announced the discovery of the Higgs boson, just two years into studying particle collisions at the LHC.

“We would have needed a factor of two more data to discover the Higgs boson if we had tried to do the analysis without machine learning,” Bendavid says.

After the Higgs discovery, the LHC research program saw its own boom in machine learning. “Before 2012, you would have had a hard time publishing something which used neural networks,” Dorigo says. “After 2012, if you wanted to publish an analysis that didn’t use machine learning, you’d face questions and objections.”

Neutrinos

Today, LHC scientists use machine learning to simulate collisions, evaluate and process raw data, tease signal from background, and even search for anomalies. While these advancements were happening at the LHC, scientists were watching closely from another, related field: neutrino research.

Neutrinos are ghostly particles that rarely interact with ordinary matter. According to Jessie Micallef, a fellow at the National Science Foundation’s Institute for Artificial Intelligence and Fundamental Interactions at MIT, early neutrino experiments would detect only a few particles per year. With such small datasets, scientists could easily reconstruct and analyze events with traditional methods.

That is how Micallef worked on a prototype detector as an intern at Lawrence Berkeley National Laboratory in 2015. “I would measure electrons drifting in a little tabletop detector, come back to my computer, and make plots of what we saw,” they say. “I did a lot of programming to find the best fit lines for our data.”

But today, their detectors and neutrino beams are much larger and more powerful. “We’re talking with people at the LHC about how to deal with pileup,” Micallef says.

Neutrino physicists now use machine learning both to find the traces neutrinos leave behind as they pass through the detectors and to extract their properties, such as their energy and flavor. These days, Micallef collects their data, imports it into their computer, and starts the analysis process. But instead of toying with the equations, Micallef says that they let machine learning do a lot of the analysis for them.

“At first, it seemed like a whole new world,” they say—but it wasn’t a magic bullet. “Then there was validating the output. I would change one thing, and maybe the machine-learning algorithm would do really good in one area but really bad in another.

“My work became thinking about how machine learning works, what its limitations are, and how we can get the most out of it.”

Today, Micallef is developing machine-learning tools that will help scientists with some of the unique challenges of working with neutrinos—including using gigantic detectors to study not just high-energy neutrinos blasting through from outside the Milky Way, but also low-energy neutrinos that could come from nearby.

Neutrino detectors are so big that the sizes of the signals they measure can be tiny by comparison. For instance, the IceCube experiment at the South Pole uses about a cubic kilometer of ice peppered with 5,000 sensors. But when a low-energy neutrino hits the ice, only a handful of those sensors light up.

“Maybe a dozen out of 5,000 detectors will see the neutrino,” Micallef says. “The pictures we’re looking at are mostly empty space, and machine learning can get confused if you teach it that only 12 sensors out of 5,000 matter.”

Neutrino physicists and scientists at the LHC are also using machine learning to give a more nuanced interpretation of what they are seeing in their detectors.

“Machine learning is very good at giving a continuous probability,” Watts says.

For instance, instead of classifying a particle in a binary way (this event is a muon neutrino versus this event is not a muon neutrino), machine learning can provide an uncertainty associated with its assessment.

“This could change the overall outcome of our analysis,” Micallef says. “If there is a lot of uncertainty, it might make more sense for us to throw that event away or analyze it by hand. It’s a much more concrete way of looking at how reliable these methods are and is going to be more and more important in the future.”
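A minimal sketch of what that might look like in practice, again assuming PyTorch: the network's raw output is turned into a continuous score, and events that land in the uncertain middle are set aside rather than forced into a yes-or-no bin. The threshold values are illustrative placeholders.

```python
import torch

def triage_events(logits, accept=0.9, reject=0.1):
    """Convert raw classifier outputs into continuous probabilities, then sort
    events into confident signal, confident background, and "too uncertain"
    (candidates to throw away or scan by hand). Thresholds are placeholders."""
    probs = torch.sigmoid(logits)            # continuous score between 0 and 1
    is_signal = probs >= accept              # e.g. confidently a muon neutrino
    is_background = probs <= reject          # confidently not
    needs_review = ~(is_signal | is_background)
    return probs, is_signal, is_background, needs_review
```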

Designing experiments of the future

Physicists use machine learning throughout almost all parts of data collection and analysis. But what if machine learning could be used to optimize the experiment itself? “That’s the dream,” Watts says.

Detectors are designed by experts with years of experience, and every new detector incrementally improves upon what has been done before. But Dorigo says he thinks machine learning could help detector designers innovate. “If you look at calorimeters designed in the 1970s, they look a lot like the calorimeters we have today,” Dorigo says. “There is no notion of questioning paradigms.”

Experiments such as CMS and ATLAS are made from hundreds of individual detectors that work together to track and measure particles. Each subdetector is enormously complicated, and optimizing each one’s design—not as an individual component but as a part of a complex ecosystem—is nearly impossible. “We accept suboptimal results because the human brain is incapable of thinking in 1,000 dimensions,” Dorigo says.

But what if physicists could look at the detector holistically? According to Watts, physicists could (in theory) build a machine-learning algorithm that considers physics goals, budget, and real-world limitations to choose the optimal detector design: a symphony of perfectly tailored hardware all working in harmony.

Scientists still have a long way to go. “There’s a lot of potential,” Watts says. “But we haven’t even learned to walk yet. We’re only just starting to crawl.”

They are making progress. Dorigo is a member of the Southern Wide-field Gamma-ray Observatory, a collaboration that wants to build an array of 6,000 particle detectors in the highlands of South America to study gamma rays from outer space. The collaboration is currently assessing how to arrange and place these 6,000 detectors. “We have an enormous number of possible solutions,” Dorigo says. “The question is: how to pick the best one?”

To find out, Dorigo and his colleagues took into account the questions they wanted to answer, the measurements they wanted to take, and the number of detectors they had available to use. This time, though, they also developed a machine-learning tool that did the same—and found that it agreed with them.

They plugged a number of reasonable initial layouts into the program and allowed it to run simulations and gradually tweak the detector placement. “No matter the initial layout, every simulation always converged to the same solution,” Dorigo says.
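In outline, that kind of optimization loop can be sketched as follows. The figure-of-merit function here is a stand-in for the collaboration's full simulation, and the random-nudge strategy is a simplification rather than the optimizer the real study uses.

```python
import random

def optimize_layout(initial_layout, figure_of_merit, n_steps=10_000, step_size=5.0):
    """Gradient-free sketch of detector-layout optimization: repeatedly nudge one
    detector's position and keep the change if the simulated science output
    improves. `figure_of_merit(layout)` stands in for a full detector simulation."""
    layout = [tuple(pos) for pos in initial_layout]   # list of (x, y) positions in meters
    best = figure_of_merit(layout)
    for _ in range(n_steps):
        i = random.randrange(len(layout))
        x, y = layout[i]
        trial = list(layout)
        trial[i] = (x + random.gauss(0.0, step_size), y + random.gauss(0.0, step_size))
        score = figure_of_merit(trial)
        if score > best:          # keep moves that improve the physics goals
            layout, best = trial, score
    return layout, best
```

Running a loop like this from several different starting layouts and checking that the answers agree is essentially the consistency test Dorigo describes.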

Even though he knows there is still a long way to go, Dorigo says that machine-learning-aided detector design is the future. “We’re designing experiments today that will operate 10 years from now,” he says. “We have to design our detectors to work with the analysis tools of the future, and so machine learning has to be an ingredient in those decisions.”

Editor's note: This article has been updated with an additional comment about the use of machine learning on experiments at the Tevatron.