Mizzou Engineers Develop Algorithms to Organize Streaming Data

May 26, 2021

Mizzou Engineers have written new algorithms to better organize, or “cluster,” streaming data. The work could help businesses make better use of information that arrives constantly, and help medical providers respond more effectively to ongoing health changes.

The goal of clustering is to find meaningful structure within a set of data. But the rise of continuous information generated by sensors, the internet and other sources makes it tough to rely on traditional clustering processes to understand the data.

“The data comes in — like clicks on the internet — and it’s not like you can collect a billion points and then do something with them,” said Jim Keller, Curators’ Distinguished Professor Emeritus in Electrical Engineering and Computer Science (EECS). “You need to process them on the fly so you can make decisions.”

Keller worked with Omar Ibrahim, a former PhD student and post-doctoral fellow, and James Bezdek, a visiting professor in EECS, to develop the algorithms and the novel incremental indices used to monitor the clustering process.

“What we discovered is that we can monitor what’s going on in this high dimensional space as it’s happening,” Keller said. “As you plot this incremental index, what you’re really seeing is how well the data coming in fits the model you’ve started to build and when it starts to deviate from the model. It really is giving you a look at what’s happening to your data in real time.”
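The idea Keller describes can be illustrated with a minimal sketch. This is not the algorithm or the index from the paper: it assumes a generic sequential k-means update, and it uses an exponentially weighted average of each point's squared distance to its nearest cluster center as a hypothetical stand-in for an incremental index. The class name, the `alpha` parameter and the simulated data are all invented for illustration.

```python
import random


class StreamingClusterer:
    """Sequential k-means with a running fit index (illustrative only)."""

    def __init__(self, initial_centers, alpha=0.1):
        self.centers = [list(c) for c in initial_centers]
        self.counts = [1] * len(self.centers)  # each seed counts as one point
        self.alpha = alpha   # smoothing weight for the index (assumed value)
        self.index = None    # hypothetical incremental index

    def update(self, point):
        # Squared Euclidean distance from the new point to every center.
        dists = [sum((p - c) ** 2 for p, c in zip(point, center))
                 for center in self.centers]
        k = min(range(len(dists)), key=dists.__getitem__)

        # Update the index before the model adapts, so it reflects how well
        # the incoming point fits the model built so far.
        if self.index is None:
            self.index = dists[k]
        else:
            self.index += self.alpha * (dists[k] - self.index)

        # Move the winning center toward the point (running mean).
        self.counts[k] += 1
        rate = 1.0 / self.counts[k]
        self.centers[k] = [c + rate * (p - c)
                           for c, p in zip(self.centers[k], point)]
        return k


def simulate(seed=0):
    """Stream points from two stable clusters, then from a new region."""
    rng = random.Random(seed)
    model = StreamingClusterer([(0.0, 0.0), (5.0, 5.0)])

    # Phase 1: data that fits the model (clusters near (0, 0) and (5, 5)).
    for i in range(200):
        cx = 0.0 if i % 2 == 0 else 5.0
        model.update((rng.gauss(cx, 0.5), rng.gauss(cx, 0.5)))
    before = model.index

    # Phase 2: the stream drifts to a region the model has never seen.
    for _ in range(50):
        model.update((rng.gauss(20.0, 0.5), rng.gauss(20.0, 0.5)))
    after = model.index
    return before, after


before, after = simulate()
print(f"index before drift: {before:.2f}")
print(f"index after drift:  {after:.2f}")
```

In this toy setup the index stays low while incoming points fall near the existing clusters and rises sharply once the stream moves away from them, which is the kind of signal a plotted index would make visible as the data arrives.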

Keller envisions the work will be useful in eldercare technology that relies on sensors to collect daily health information.

“You can’t wait to collect all of the data before you make a decision,” he said. “You want to be able to see what is happening when a resident starts to deteriorate physically. These are the types of things we’re looking for in this streaming data analysis.”

The algorithms are outlined in a paper recently published in the Institute of Electrical and Electronics Engineers (IEEE) journal Transactions on Emerging Topics in Computational Intelligence. The paper was also featured in the IEEE Computational Intelligence Society’s May newsletter.