New protein prediction software put to the test
“Bioinformatics is one of the most exciting fields in science because it is interdisciplinary: the marriage of computer science and biology,” said Jianlin Cheng, an assistant professor in computer science at the University of Missouri, and a faculty member of the MU Informatics Institute. “Students in this field have good job opportunities and great research opportunities in higher ed and beyond.”
New to the University last year, Cheng is doing collaborative research with Charles “Bill” Caldwell, director of the University’s Ellis Fischel Cancer Center.
“Cancer cells and normal cells function differently. We are looking at the mechanisms of cancer at the molecular level using one thread of my research, which is protein structure prediction,” said Cheng.
“We know the sequences of millions of proteins, but we only know the structure of about 40,000,” said Cheng, explaining that the amino acid sequences that make up proteins must fold into three-dimensional structures in order to function, and that it is the shapes themselves that determine function.
“How we digest food, how we defend against viruses, how we see light—all are functions that are dependent on proteins,” said Cheng. “In 2005, the Journal of Science named this one of the top 125 greatest unsolved scientific problems. If we can solve this problem, one day we may help cure diseases like cancer, Alzheimer’s and Parkinson’s.”
Cheng and his research group have developed a protein structure prediction program called MULTICOM and are currently putting it to the test in an international competition that has been held every two years since 1994. The Eighth Critical Assessment of Techniques for Protein Structure Prediction (CASP8) is sponsored by the National Institutes of Health. This year, 230 teams are competing.
“CASP is unique in the scientific world because it is a blind test of real predictions,” said Cheng.
From May through August, CASP sends protein sequences out to research groups around the world, two proteins per working day, for a total of 128 proteins. Participating teams must return up to five structure predictions within three days.
Independent experts will evaluate the submissions, and results will be released at the post-CASP8 conference to be held in Italy in December. The top groups will be invited to give a talk at the conference and will have a paper on their work published. Cheng has been invited to speak about his prediction methods and performance.
Participants are already reviewing results and self-ranking their work. Of the 128 targets, 113 are known, and in the preliminary analyses Cheng’s predictions rank in the top five, alongside and in some cases ahead of some of the top experts in the field.
“Personally, I like the challenge and the fundamental scientific significance of protein folding. It has huge implications for science, the economy and technology. The tools we are developing can apply to any protein in any species, from soybeans to mice to humans,” said Cheng.