Researcher drives interest in cutting-edge field of parallel computing
High-powered graphics in video games like “Skyrim,” “Bioshock” and the “Grand Theft Auto” franchise depend on specialized processors that perform multiple separate functions side-by-side. These graphics processing units, or GPUs, perform the heavy lifting of rapidly building images for display in fast-paced, visually oriented games.
But that’s only one use for these processors. Researchers have extended their abilities to perform general-purpose computing, just like the central processing unit, or CPU, in your desktop computer. With this shift, researchers have found GPUs to be more efficient at processing large amounts of data than traditional CPUs.
Their massive hardware parallelism makes GPUs uniquely suited to many data-intensive scientific applications. They have proven to be particularly useful for bioinformatics applications as the amount of DNA, RNA and protein sequence data available continues to increase exponentially. Michaela Becchi, an assistant professor of electrical and computer engineering at the University of Missouri, received nearly $500,000 from the National Science Foundation for research on data-intensive uses for GPU clusters.
Big Data? Meet Big Computing.
The cutting edge of increased computing speed
Until a couple of decades ago, computer architects focused on achieving higher computing capabilities by building more powerful and sophisticated single-core general purpose processors. These types of units performed complex calculations in a single stream. They were fast, but researchers always wanted something faster.
The focus was on increasing the clock rate of the CPU, or the frequency at which it can run. The limitations of materials and heat, however, slowed efforts in that direction.
These old techniques are no longer feasible because of increasing power consumption and the huge gap in speed between processors and memory. As a result of this paradigm shift, scientists and engineers stopped building larger and larger processors. Instead, they began to integrate many small and simple cores within the same chip. This is the so-called many-core architecture.
“Evolution came to the point where increasing the clock rate was not feasible, so evolution to multi-core and many core machines was kind of inevitable,” Becchi said. “A few years ago people started using GPUs for other scientific applications.”
In addition to graphics applications, Becchi said GPUs began to be harnessed for complicated processes like those in computational finance, weather prediction and pattern recognition. The architecture of GPUs, with hundreds of cores rather than the single or few cores available on general purpose CPUs, means many such computations can be done in parallel, providing an advantage in the field of high-performance computing.
Da Li is pursuing his doctorate in computer engineering and works in Becchi’s lab. He compared the advantage of GPUs over CPUs to that of 1,000 smaller men over four big, strong men. The combined power of thousands of the less sophisticated cores in the GPU translates into higher computation capacity than the fewer cores in the more powerful CPU.
“It depends on what problem you’re trying to solve,” Li said. The GPU’s architecture is better for throughput-oriented problems, as the multiple cores allow the processor to execute several billions of small programs at once.
“Considering the applications for big data and high-performance computing, what people care is not getting a response immediately,” Li said. “On the contrary, the ultimate goal is to process a large amount of data with some time constraints.”
Some companies have already recognized the potential of the GPU. In 2010, Amazon launched a new cloud service with GPU clusters. Hardware companies also have been investing in developing many-core architecture (Intel’s Xeon Phi Coprocessor) and making it more programmable for researchers and programmers.
Since GPUs benefit from a different design philosophy, they have become a star in the high-performance computing community. Titan, which is the fastest supercomputer in the world according to the TOP500 list, relies on thousands of cutting-edge GPUs.
Harnessing speed for new research applications
Becchi has already shared her background in computer engineering to assist professors in other disciplines with the challenge of analyzing terabytes of data. She is one of the core faculty at the University of Missouri Informatics Institute (MUII), which is dedicated to creating interdisciplinary efforts to tackle the problems of information processing.
“Today there is a lot of emphasis on big data as there’s more and more data available in biology,” Becchi said. “Bioinformatics applications need to perform a lot of searches and comparisons among big data sets.”
Kittisak Sajjapongse, a doctoral student whose research is focused on distributed computing, said the ability to process these large data sets will help researchers in many fields.
“We want to be able to provide a framework to scientists which runs on a cluster with GPUs because the GPU is cost effective,” Sajjapongse said. “So we want to reduce cost and at the same time we can also speed up research for scientists.”
A paper published in February 2012, co-authored by Becchi, Dmitry Korkin, Chi-Ren Shyu and other College of Engineering researchers, presented the results of a new tool to analyze protein structures built on the power of GPUs. To demonstrate the superior performance of the multi-core GPUs, the researchers compared the speed of the new program running on a GPU card to existing solutions on CPUs.
The results are telling: the new program completed its analysis between 36 and 65 times faster than existing programs relying on the computing power of traditional CPUs.
The program, ppsAlign, is freely available to researchers interested in trying it. Although designed mainly for protein structure alignment — comparing the structures of proteins in order to discern their relationship or possible functions — the software could be repurposed to compare other biological data such as DNA or RNA sequences.
Becchi also has worked on applications for bioinformatics with her husband, Gavin Conant, assistant professor of animal sciences and a colleague at MUII. She said he had more than 30,000 DNA sequences he wanted to analyze — 450 million individual comparisons, a process that would have taken months with a CPU-based program.
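The figure of 450 million follows from comparing every sequence with every other sequence exactly once. The short Python check below is only an illustration: it assumes exactly 30,000 sequences (the article says "more than" 30,000), and pair_count is a name chosen here, not part of any software mentioned in the story.

```python
import math


def pair_count(num_sequences: int) -> int:
    """Number of distinct unordered pairs among num_sequences items."""
    # Choosing 2 items out of n gives n * (n - 1) / 2 pairs.
    return math.comb(num_sequences, 2)


# Assumed round figure; the real data set was somewhat larger.
comparisons = pair_count(30_000)
print(f"{comparisons:,} pairwise comparisons")
```

With 30,000 sequences this comes to 449,985,000 comparisons, which rounds to the 450 million cited. Because the count grows with the square of the number of sequences, doubling the data set roughly quadruples the work.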
“I wondered, ‘Why don’t we use the GPU to accelerate this computation?’” Becchi said.
Similar applications for varied bioinformatics purposes existed, but nothing that was applicable to Conant’s data. One of the challenges in harnessing GPUs for applications like DNA sequence comparisons, said Becchi, is the difference in structure that requires a complete rewrite of the algorithm.
“You cannot just take a serial algorithm and tweak it here and there,” she said. “Instead, the programmer has to identify where it can benefit from using the parallel structure of GPUs.”
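One way to picture that kind of restructuring is a sequence comparison computed two ways. The Python sketch below is an illustration only and is not the algorithm from the researchers' paper: it uses edit distance as a stand-in comparison, and the function names are invented for this example. The first version fills a table row by row, so each cell waits on the one before it. The second fills the same table along anti-diagonals, where every cell on a diagonal is independent of the others and could, in principle, be handed to a separate GPU thread.

```python
def _init_table(a: str, b: str) -> list[list[int]]:
    """Table whose first row and column hold the cost of pure insertions/deletions."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i
    for j in range(cols):
        table[0][j] = j
    return table


def _fill_cell(table: list[list[int]], a: str, b: str, i: int, j: int) -> None:
    """Each cell depends on its left, upper and upper-left neighbours."""
    cost = 0 if a[i - 1] == b[j - 1] else 1
    table[i][j] = min(table[i - 1][j] + 1,         # deletion
                      table[i][j - 1] + 1,         # insertion
                      table[i - 1][j - 1] + cost)  # match or substitution


def edit_distance_serial(a: str, b: str) -> int:
    """Row-by-row order: natural on a CPU, but inherently one cell at a time."""
    table = _init_table(a, b)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            _fill_cell(table, a, b, i, j)
    return table[len(a)][len(b)]


def edit_distance_wavefront(a: str, b: str) -> int:
    """Anti-diagonal order: all cells with the same i + j are independent."""
    table = _init_table(a, b)
    rows, cols = len(a) + 1, len(b) + 1
    for k in range(2, rows + cols - 1):  # k = i + j for interior cells
        i_low = max(1, k - (cols - 1))
        i_high = min(rows - 1, k - 1)
        # On a GPU, this inner loop is the part that would run in parallel.
        for i in range(i_low, i_high + 1):
            _fill_cell(table, a, b, i, k - i)
    return table[len(a)][len(b)]


print(edit_distance_serial("kitten", "sitting"))
print(edit_distance_wavefront("kitten", "sitting"))
```

Both functions return the same answer. The difference lies purely in the order of the work, which is the point Becchi makes: the parallelism has to be found in the structure of the algorithm before the hardware can exploit it.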
Their joint work led to a paper presented at the IEEE International Conference on Application-specific Systems, Architectures and Processors at George Washington University in Ashburn, Va., in June 2013. The work was co-authored by Li, Sajjapongse, Becchi, Conant and Huan Truong, an informatics doctoral student advised by Conant.
Beyond applications — laying the groundwork
While creating applications such as ppsAlign that use the unique structure of the GPU is a more applied research area, it is not what most intrigues Becchi. Her core research focus is more on the abstract — investigating how clusters of machines equipped with GPUs can be effectively used to maximize application performance and power efficiency, for example.
“They are projects that are not specific to a particular application problem,” Becchi said.
Rather, she wants to build a framework to utilize GPUs in clusters for high-speed computing, the focus of her NSF grant. The goal of that research project is to create a software package that will be able to use multiple nodes with both GPUs and CPUs to increase the computational power available to the user, all while providing a seamless interface on the front end.
Becchi’s background is in computer engineering, with a focus on the architecture of computer systems. She received both her doctoral and master’s degrees in computer engineering from Washington University in St. Louis and then worked as a researcher in the systems architecture department at NEC Laboratories.
She said she’s brought that architectural and structure-focused view with her to Mizzou, expertise that strengthens MU’s program.
Developing a strong GPU program at MU
Becchi currently works with two doctoral students and three master’s students in her lab, as well as some undergraduate students.
Li came to Mizzou because of a previous connection with Becchi and his interest in graph algorithms, a class of irregular applications for GPUs. He has had two papers on the topic published in his first two years as a doctoral student and another is in the works.
“I applied here because I knew about Dr. Becchi’s research and I know she’s really nice,” Li said. “She gave me lots of hands-on help in my first year.” This summer Li will be interning at the NEC Research Lab in Princeton, N.J.
Becchi has worked hard to attract outside funding and interest in her research area. In February, NVIDIA named MU as a CUDA Research Center. CUDA is a parallel computing platform and programming model for GPUs. The company is leading the field in developing extremely powerful GPUs, and the designation came with a gift of the newest hardware, which she hopes will increase interest in the field.
“Basically if there is new hardware we can have access,” Becchi said. “If we have questions about the GPUs we have a privileged way to get the information.”
Sajjapongse, who earned his master of science degree from MU in 2010, said he stayed on to work with Becchi, as her research interests overlapped with his. His focus has been on making groups of GPUs and CPUs work together to increase the computing power of a system and not let any component go to waste. He already has created a framework for that purpose, and has been working with computers in both Lafferre Hall and Engineering Building West, which has created some challenges.
“If we just integrate them in a naïve way, the slow computers over in Engineering Building West would slow down the faster servers in Lafferre,” Sajjapongse said.
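The problem Sajjapongse describes can be shown with a small calculation. The Python sketch below is not his framework. It is a toy model with made-up numbers (one fast server at 300 tasks per second, one slow machine at 100) and hypothetical function names, comparing an even split of the work with a split in proportion to each machine's speed.

```python
def equal_split(total_tasks: float, speeds: list[float]) -> list[float]:
    """Naive integration: every machine gets the same share."""
    return [total_tasks / len(speeds)] * len(speeds)


def proportional_split(total_tasks: float, speeds: list[float]) -> list[float]:
    """Give each machine work in proportion to how fast it is."""
    total_speed = sum(speeds)
    return [total_tasks * speed / total_speed for speed in speeds]


def finish_time(shares: list[float], speeds: list[float]) -> float:
    """The job is done only when the slowest machine finishes its share."""
    return max(share / speed for share, speed in zip(shares, speeds))


# Hypothetical figures: tasks per second for a fast and a slow machine.
speeds = [300.0, 100.0]
total_tasks = 1200.0

naive_time = finish_time(equal_split(total_tasks, speeds), speeds)
balanced_time = finish_time(proportional_split(total_tasks, speeds), speeds)
print(f"even split: {naive_time} s, proportional split: {balanced_time} s")
```

In this toy case the even split takes 6 seconds because the fast server sits idle while the slow one catches up, and the proportional split takes 3 seconds. A real cluster would also have to account for data transfer and for speeds that change from job to job, which this sketch ignores.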
This summer, Sajjapongse will be interning at AMD, a semiconductor company that develops computer processors. Becchi said GPU computing is a particularly high-demand, rapidly growing field and that Sajjapongse’s work in this field helped him secure the internship.
Becchi added that she hopes more students will become interested in the field by working at her lab and hearing about her research at Mizzou. Starting with repurposing algorithms to work on the parallel structure of GPUs, students can quickly learn the principles of parallel computing structures.
“They learn to deal with algorithms and computer architecture at the same time. They work at the intersection of two worlds,” Becchi said. “They learn how the architecture and design of the computer influences the algorithm.”