Science today collects a mind-boggling amount of data. Particle physics experiments at CERN have collected more than 100 petabytes—that’s 104,857,600 gigabytes—of data about particle collisions in the Large Hadron Collider.
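That conversion uses binary units, where each step from gigabyte to terabyte to petabyte multiplies by 1,024. A quick sanity check of the arithmetic:

```python
# Sanity check for the figure above: 100 petabytes in gigabytes,
# using binary units (1 PB = 1,024 TB and 1 TB = 1,024 GB).
petabytes = 100
gigabytes = petabytes * 1024 * 1024
print(f"{petabytes} PB = {gigabytes:,} GB")  # prints: 100 PB = 104,857,600 GB
```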
Turning all that raw data into something meaningful requires enormous computing power, more than any single institution can provide. That’s where the Open Science Grid comes in.
Funded by the Department of Energy’s Office of Science and the National Science Foundation in 2006 to meet the US LHC community’s computational needs, the OSG ties together individual groups’ computing power, connecting their resources to create a large, robust computing grid. This grid allows scientists to tackle vast mountains of data in a logical, efficient way.
It also creates opportunities for non-LHC scientists.
“The LHC community has a large set of resources that work most efficiently when they’re linked together,” says OSG project manager Chander Sehgal. “But they’re not always completely busy. When they’re not, we can harvest unused cycles and make those available to other researchers.”
Scientists have used the OSG to simulate interactions between DNA and proteins, investigate the human body’s response to tuberculosis, and study the agricultural impacts of large-scale drought.
OSG scientists share computing hours with researchers in many fields, striving to keep the grid running at maximal efficiency. If a particular research group needs more computing resources than usual (say, in the lead-up to summer conferences), it can draw on other groups’ computing power through the OSG. Later, when that group is away at those conferences and running fewer calculations than usual, other groups can put its idle resources to work.
“When you have a big enough community and you harvest all the dips in usage, you can allow researchers to get their work done by using these ‘opportunistic cycles,’” says Sehgal. “These cycles would have otherwise been wasted, so we’re putting them to good use for science in the United States.”
Don Krieger, who studies brain trauma at the University of Pittsburgh, uses the OSG as part of a comprehensive group effort to better understand, diagnose and treat concussions. He says he is able to conduct his research more quickly and economically thanks to this powerful resource.
Krieger works with a large team to characterize brain anatomy and track brain function in patients with concussions and more severe head injuries. The work involves recording, outside the head, the magnetic fields produced by the brain, and then using the OSG to determine the neuroelectric activity that caused them. It takes about 150 CPU hours to analyze just one second of these magnetic field recordings.
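To get a feel for that rate, here is a back-of-the-envelope sketch. The 150-CPU-hours-per-second figure is the one quoted above; the recording lengths are purely illustrative:

```python
# Rough compute cost of analyzing brain magnetic-field recordings
# at the rate quoted above: about 150 CPU hours per second of data.
CPU_HOURS_PER_SECOND = 150  # figure quoted in the article

def cpu_hours_needed(recording_seconds: float) -> float:
    """Estimated CPU hours to analyze a recording of the given length."""
    return recording_seconds * CPU_HOURS_PER_SECOND

# Illustrative recording lengths (not from the article):
for seconds in (1, 60, 600):
    print(f"{seconds:>3} s of recording -> {cpu_hours_needed(seconds):,.0f} CPU hours")
# 1 s -> 150 | 60 s -> 9,000 | 600 s -> 90,000
```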
With its ability to dynamically portray neurological function on previously unachievable time and distance scales, the technique has already surpassed the power of functional MRI scans. And with more than 2 million concussion patients visiting emergency rooms in the United States each year, Krieger’s studies could have an enormous impact.
“The work we have accomplished in the past year would have literally required millennia on a single state-of-the-art workstation,” Krieger says. “And it would have cost at least $500,000 using commercial cloud computing. Instead we were able to run ‘opportunistically’ on the OSG and use cycles which otherwise would have been lost.”
In just the past 12 months, these opportunistic cycles have added up to more than 100,000,000 CPU hours. (That’s the equivalent of 100,000,000 computer cores each being used for one hour.)
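Another way to grasp that scale is to convert the total into core-years, the work of a single core running around the clock for a full year. The only input below is the 100,000,000-CPU-hour figure quoted above:

```python
# Convert the opportunistic-cycle total above into "core-years":
# the work of one core running 24 hours a day for a full year.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a (non-leap) year

cpu_hours = 100_000_000
core_years = cpu_hours / HOURS_PER_YEAR
print(f"{cpu_hours:,} CPU hours = {core_years:,.0f} core-years")
# prints: 100,000,000 CPU hours = 11,416 core-years
```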
“There is no comparable resource,” Krieger says.