Researcher continues work to decode genome sequences

June 06, 2022

In the future, hospitals and clinics may be able to better manage diseases by pinpointing exactly how an individual’s body will respond to treatment. But first, they need a fast, efficient and secure way to analyze human genome sequences.

Enter Praveen Rao, an associate professor with joint appointments in Health Management & Informatics and Electrical Engineering & Computer Science. Rao has spent the past two years developing a software system for others to analyze and compare genomes more easily. Now, he has a two-year grant from the National Science Foundation (NSF) to expand upon that work.

Human genomes are essentially the blueprint of an individual’s biological makeup. Because of their size, decoding that information currently requires massive amounts of computational power and storage and comes with a hefty price tag. But the information is vital to treating, curing and even preventing disease.

“Once you’re able to analyze and know which genes are going to be affected, you can predict individual risks and prescribe or design better drugs and treatments based on genetic makeup,” Rao said. “This idea of precision health care revolves around taking into consideration not only factors such as demographics but also genetic makeup and lifestyle. The power of biology and genomics combined with the power of computer science can make a huge difference to human life and how we look at and prevent or treat diseases.”

In 2020, Rao received a RAPID grant to use NSF’s CloudLab testbed to begin to democratize genome sequence analysis. The current grant will allow him to leverage FABRIC, NSF’s new adaptive programmable research infrastructure, which consists of cutting-edge storage, computational and network hardware nodes connected by high-speed optical links.

Within FABRIC, Rao will take advantage of graphics processing units and sophisticated programmable hardware to help accelerate the analysis of massive amounts of genome data.

“We’re going to show how we can develop new techniques and algorithms to further reduce the computing time it takes to analyze human genome sequences,” Rao said. “Then, we plan to integrate that with MU research computing resources. Our eventual goal is that MU researchers and a broader community will be able to use the software platform to do large-scale, whole-genome sequence analysis.”

In addition to making the process more accessible, Rao will study ways to ensure these massive data sets are securely processed with minimal computational overhead. He is also developing ways to perform genome analysis securely and efficiently across the FABRIC and CloudLab testbeds.

By the end of the project, Rao will release open-source software for large-scale, cost-effective genome sequence analysis. He also plans to develop new coursework around his findings and conduct a high school camp to introduce younger students to using computer science for genomics.
