Researchers training machines to recognize vocal fatigue

Even before COVID-19 had them speaking up in online classrooms or projecting their voices from behind masks, teachers were at high risk of vocal fatigue. This condition can cause persistent hoarseness, throat pain and permanent damage to the vocal cords. Currently, diagnosing vocal fatigue requires an in-person consultation. But someday, a wearable device or smart app could detect vocal fatigue early and help sufferers prevent further problems.

Before that happens, though, a machine has to learn how to recognize the difference between a healthy voice and a fatigued voice. That’s where Gui DeSouza comes in. DeSouza — an associate professor of electrical engineering and computer science — and a collaborator from Germany have spent years training a computer to detect vocal issues by providing the system with hundreds of samples from student teachers and control groups.

“Student teachers are affected by vocal fatigue a lot more than other professionals,” DeSouza said. “We are addressing the diagnostic side because early detection can warn a person to change their habits or take corrective action.”

With funding from the National Institutes of Health, the research team has collected 160 voice samples from 90 participants. The team uses surface electromyography (EMG) sensors placed on the neck to record muscle activity. Each participant is asked to pronounce certain vowels and consonants that tend to reveal problems in the vocal cords.

Researchers then use that data to train the system to detect changes that indicate vocal fatigue.

Finding a reliable system

Originally, the team tested the model using simulated samples, and the results showed promise. However, in a more recent study, the team intentionally left out voice samples from one human participant and saw a drop in accuracy.

“If you look at the literature, no one has done that before,” DeSouza said, referring to the “leave one out” method. “That indicates that the machine is good at learning people’s voices but not necessarily learning to recognize fatigue.”
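The idea behind leaving one participant out can be illustrated with a short Python sketch. This is not the team's code; the function name, the participant IDs and the toy data are all hypothetical, and the article does not describe the exact procedure the researchers followed. The sketch only shows the general technique: every sample from one person is held back for testing, so the model is never scored on a voice it has already heard.

```python
from collections import defaultdict


def leave_one_participant_out(participant_ids):
    """Yield (held_out_id, train_indices, test_indices), once per participant.

    participant_ids holds one ID per recorded sample (hypothetical layout).
    """
    # Group sample indices by the participant who produced them.
    by_participant = defaultdict(list)
    for index, pid in enumerate(participant_ids):
        by_participant[pid].append(index)

    for held_out in sorted(by_participant):
        # All samples from the held-out participant form the test set.
        test = by_participant[held_out]
        # Samples from everyone else form the training set.
        train = sorted(
            i
            for pid, indices in by_participant.items()
            if pid != held_out
            for i in indices
        )
        yield held_out, train, test


# Hypothetical toy data: six samples recorded from three participants.
participant_ids = ["p1", "p1", "p2", "p3", "p3", "p3"]

splits = list(leave_one_participant_out(participant_ids))
for held_out, train, test in splits:
    print(held_out, "train:", train, "test:", test)
```

If accuracy drops sharply under this kind of split compared with a random split of samples, that suggests the model has been relying on who is speaking rather than on signs of fatigue, which is the concern DeSouza describes.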

Another complication is that there is no consistent standard by which to classify fatigue. Right now, physicians use patient surveys to collect that information. However, one person may have a high tolerance for discomfort and report a low rating, while someone more sensitive could give a higher rating for essentially the same level of pain.

“One big problem for us is how to make sense of the data when it’s very subjective,” DeSouza said. “The data set is not labeled in a way that’s reliable. Ultimately, we want to have a system that reliably says it is fatigue or it is not fatigue independent of a subjective measurement or self-assessment.”

The research team outlined their findings in the journal Applied Sciences early this year. More recently, they presented their test results at the 14th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research.

They also have additional funding from the National Institutes of Health to see whether stress induces vocal problems.

“We’re now finalizing our original study and also looking at MRI data to see if brain activities have any correlation with the phenomena happening in the voice,” DeSouza said. “The idea is we’ll subject the patient to some sort of stressor to see whether that manifests in the voice. A lot of the fatigue in the voice could be related to emotional stress.”

Be part of an engineering program that solves real-world problems. Learn more about electrical engineering and computer science at Mizzou.
