NSF grants boost Big Data capabilities at MU
The Big Data boom at the University of Missouri College of Engineering has received another boost, courtesy of the National Science Foundation (NSF) and matching funds from the university.
Chi-Ren Shyu, chair of the MU Electrical and Computer Engineering Department and director of the MU Informatics Institute, recently received a one-time $600,408 grant from NSF through its Major Research Instrumentation Program. MU pledged a $257,318 match, to be paid over the course of the three-year project. The money will fund equipment for a supercomputer to enable larger-scale Big Data research.
Meanwhile, Prasad Calyam, assistant professor of computer science, received a $399,775 grant from NSF, to be paid over two years, to establish a new effort to develop cyberinfrastructure engineering expertise at MU. He will lead activities that investigate the proper roles, resources and policies to allow highly productive research collaborations. His main research focus will be to experiment with the management of campus network systems to make them adaptable and integrated with cloud computing architectures, thus maximizing their potential to meet the needs of Big Data researchers.
All told, the grants total more than $1 million toward creating a secure hybrid cloud network that will lessen the need for researchers to rely on supercomputer data centers to process their data. The new resources will allow for connections between all users regardless of location, as well as integration with public cloud infrastructures, including Amazon Web Services, IBM Bluemix and NSF iPlant, among others.
The cloud will be available not only to MU’s vast array of researchers, but also to other universities in the UM system and the state, including the University of Missouri-Kansas City, Missouri University of Science and Technology, University of Central Missouri and Truman State University, among others. In total, 16 faculty members and 147 students at various levels of study will be involved with the project.
“Having all of us work together to think about what’s the best set-up here [to transform campus supercomputing practices], I think that’s the part that got the NSF excited about it,” Shyu said.
Shyu said the supercomputing system will attempt to incorporate three different types of hardware in order to create a more efficient hybrid cloud environment to best serve large amounts of scientific data. Once the equipment is received, it should take about six months to become operational.
The goal is to eventually enable real-time, or close to real-time, results from the large swaths of uploaded data, as well as multi-modal and deeper data analysis. There’s also a desire to learn how the hybrid cloud’s user experience compares to the traditional high-performance computing experience. The system will additionally serve bioinformatics and computational biology and provide next-generation user services for high-performance computing.
“That’s going to fulfill the need for our high-performance computing work here,” Shyu said.
“If we can process even five percent of the data we collect every day, I think we are good. But we don’t even process that amount [currently]. So [the question is] ‘How are we going to use this Big Data environment to provide quick results?’”
Supercomputing with that much power has a multitude of applications. The possibilities have intrigued researchers of many different stripes and led to collaborations on how best to use such a resource and maximize its computational ability. Shyu said he and the engineering team have worked with MU researchers on data-intensive projects in a wide variety of fields, including health monitoring, eldercare, bioinformatics and genome sequencing in plants and humans, among others.
“Together, [the grants] will provide new hybrid cloud computing capabilities for our campus researchers to collaborate with their remote researchers and effectively access remote instruments,” Calyam said.
The equipment should provide future cost savings for the university and its researchers. A machine with the one terabyte of memory typically needed to sequence a genome can cost upward of $50,000, so the ability to design algorithms that enable a machine to crunch that amount of information and make it usable should give MU more bang for its buck.
The system will be housed in two floor-to-ceiling racks of processors in MU’s Telecom Building, and Shyu said he is thankful for campus support in housing the units.
“The campus was very supportive in terms of hosting it here and also providing the infrastructure to have that kind of equipment,” Shyu said. “Without Gary Allen’s [UM vice president for information technology and chief information officer] vision, support and commitment, it wouldn’t be possible.”