Engineer uses advanced deep learning to predict where proteins will localize within cells

February 22, 2022

Dong Xu

A Mizzou Engineer is developing computational tools that can be used to predict where proteins will localize within a cell. Using highly advanced deep learning, the resource could help researchers better understand how proteins function or, if positioned incorrectly within a cell, misfire and cause problems.

Dong Xu, a Curators’ Distinguished Professor in electrical engineering and computer science, has received nearly $650,000 from the National Science Foundation for the work. Ultimately, he hopes to create informatics infrastructure such as open-source software and a web server that can be used for other protein localization studies.

Proteins are the basic building blocks of life. They’re made up of strings of amino acids that fold into three-dimensional structures. They then localize, or settle, in a part of the cell such as the cell membrane, the nucleus or the mitochondria.

Over the past couple of years, scientists have developed effective neural networks to predict what shapes proteins will fold into. However, it’s also important to know where a protein will be located within a cell once it has folded into its structure.

“Localization plays a key role in protein function,” Xu said. “If a protein somehow localizes in a different position or incorrectly, it may cause diseases.”

Current experimental methods used to determine the subcellular location of proteins, such as tagging them with fluorescent biomarkers, are costly and time-consuming.

Xu’s system is the first to use graph-based neural network techniques to provide interpretable results for protein localization. Using cutting-edge machine learning technology and protein sequence data, protein-protein interaction information and single-cell data, the system is expected to provide more accurate, higher resolution insights into the localization process.

Specifically, the framework will help predict localization at single-cell resolution. That will allow researchers to quantitatively predict the impact of protein mutations and altered interactions in different cell types.

The work outlines general methods that can be used in other biological studies.

“We believe that by using the latest single-cell data and state-of-the-art machine learning methods, this project will provide a new generation of methodologies and bioinformatics tools for protein localization predictions, as well as a coding-free web platform for a wide range of studies including those involved in animal pathology and plant traits research,” Xu said.

Additionally, the web tools can be used for education and training. Xu envisions high school and college courses using the resources to give students the opportunity to explore machine learning techniques as they’re used in biological settings. The educational module will contain video lectures and online practice exercises.

Xu is collaborating with Yuexu Jiang, a postdoctoral fellow in electrical engineering and computer science, on the project. The research team outlined findings on their deep-learning framework last year in the Computational and Structural Biotechnology Journal.