Computer science team receives $1.4M from NIH for protein structure prediction

Computer science professors Yi Shang (left) and Dong Xu have collaborated on protein structure prediction research for the last seven years. Their research group, which includes MU Physics Professor Ioan Kosztin and a team of graduate student research assistants, has placed at the top in the Critical Assessment of Protein Structure Prediction (CASP) Experiment, a biennial, international competition.

Over the past seven years, University of Missouri Computer Science Professors Dong Xu and Yi Shang have made a name for themselves in the emerging scientific field of protein structure prediction. This summer, the pair received a $1.4 million grant from the National Institutes of Health to build upon their success in what has been labeled one of science’s biggest unsolved problems. It is the third time they have been funded by this country’s leading health research institute.

“It’s a very challenging problem that’s different from many traditional computer science projects,” said Shang. “It’s messy because the fundamentals are not fully understood and the computational complexity is very high.”

Protein molecules exist within the cells of every living organism. The approximately 2 million proteins in the human body’s trillions of cells are responsible for most of the body’s functions. These include structure, movement, digestion, transmission of hormonal messages and transportation of molecules from one place to another, such as carrying oxygen in the blood, and much more.

Each protein in the human body is a chain built from 20 kinds of amino acids, small molecules consisting of carbon, oxygen, nitrogen, sulfur and hydrogen that join together in unbranched chains. Some proteins may be composed of hundreds of amino acids, but rather than remaining in single-file lines, each protein’s amino acids fold into a unique structure, placing some in close proximity while others are spaced far apart. Shape dictates function, and that is what makes this scientific puzzle so important.

“There’s a lot of significance,” Xu said. “Once you can determine or predict the structure, you can see how the protein works.”

Scientists believe that the computational work of researchers like Xu and Shang to predict protein structures from their sequences may allow for the design of proteins that will combat some of humankind’s most dreaded diseases.

The two computer scientists have a long affiliation; both attended graduate school at the University of Illinois. They first discussed the problem of protein structure prediction in the late 1990s.

While working at the Xerox Palo Alto Research Center in Silicon Valley in 2002, Shang did some groundbreaking work in wireless sensor networks. His investigations into multidimensional scaling, a family of data analysis techniques that use multiple algorithms and display data as a geometrical picture, brought a new approach to protein structure prediction.
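For readers curious about what multidimensional scaling does, here is a minimal sketch of the classical version of the technique in Python with NumPy. It is not the MUFOLD code; the function names and the toy coordinates are assumptions made for illustration. Given only the pairwise distances between points (think of distances between amino acids), it recovers a set of 3D coordinates that reproduce those distances.

```python
import numpy as np


def pairwise_distances(points):
    """Return the matrix of Euclidean distances between all pairs of points."""
    p = np.asarray(points, dtype=float)
    diff = p[:, None, :] - p[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(distances, dims=3):
    """Classical multidimensional scaling (hypothetical helper, for illustration).

    Takes an n x n matrix of pairwise distances and returns n points in
    `dims` dimensions whose mutual distances approximate the input.
    """
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to obtain a Gram (inner product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # The largest eigenvalues and their eigenvectors give the coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy example: five made-up points standing in for amino acid positions.
toy_points = [[0, 0, 0], [3.8, 0, 0], [3.8, 3.8, 0], [0, 3.8, 3.8], [1, 2, 5]]
recovered = classical_mds(pairwise_distances(toy_points))
```

The recovered coordinates may be rotated or mirrored relative to the originals, because distances alone do not fix an orientation, but the distances between the recovered points match the input.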

“I had a basic scheme, an algorithm design, and he had a student who could work on it,” Shang said. “No one had ever used it for this before. I gave them the basic code, and he developed the software in 2005, which started this strike of collaboration.”

This branch of bioinformatics has its own biennial, international competition. The Critical Assessment of Protein Structure Prediction (CASP) Experiment is the Olympics of protein structure prediction.

The research group, which also includes MU Physics Professor Ioan Kosztin and a team of graduate student research assistants, first participated in CASP8 in 2008 with MUFOLD.

At the 2010 CASP9, MUFOLD was first in both the human/server prediction and quality assessment categories. The group is awaiting results from CASP10.

The research group refines their prediction methods and model quality assessment methods by accessing the database of known structures — to which their own research contributes — to collect information from the structural pool.

“One of the successful quality assessment methods is a popularity contest,” said Xu. “It’s like polling the audience for an answer and the one that is selected by the majority is correct. Majority rules. We collect the best structures and refine them a little bit.”
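The "popularity contest" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the group's actual method: the function names are hypothetical, the candidate structures are assumed to be already superimposed, and similarity is measured simply as the fraction of positions that lie within a cutoff distance of each other. Each candidate is scored by how much it agrees, on average, with all the others, and the one with the most agreement wins.

```python
import math


def similarity(model_a, model_b, cutoff=2.0):
    """Fraction of corresponding positions within `cutoff` of each other.

    Assumes both models have the same length and are already superimposed.
    """
    close = sum(1 for a, b in zip(model_a, model_b) if math.dist(a, b) <= cutoff)
    return close / len(model_a)


def consensus_scores(models, cutoff=2.0):
    """Score each model by its average similarity to every other model."""
    scores = []
    for i, model in enumerate(models):
        others = [
            similarity(model, other, cutoff)
            for j, other in enumerate(models)
            if j != i
        ]
        scores.append(sum(others) / len(others))
    return scores


def pick_consensus_model(models, cutoff=2.0):
    """Return the index of the model that agrees most with the rest."""
    scores = consensus_scores(models, cutoff)
    return max(range(len(models)), key=scores.__getitem__)


# Toy example: two candidates that nearly agree and one outlier.
candidates = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
    [(0, 0.5, 0), (1, 0.5, 0), (2, 0.5, 0)],
    [(0, 0, 0), (10, 0, 0), (20, 0, 0)],
]
best = pick_consensus_model(candidates)
```

In this toy case the outlier receives the lowest score, and one of the two agreeing candidates is selected.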

So passionate is the research team about the vital importance of protein structure prediction that they plan to make MUFOLD available via the Web as a tool for other researchers doing the same work.

“Each protein is unique,” said Xu. “You get to know them on a personal level. They are like friends. They have an intrinsic beauty. It’s like they are alive.”

To find out more about protein structure prediction and do a little of your own prediction, visit on the web. You can play the Foldit computer game and actually contribute to the research process.