December 22, 2020
Mizzou Engineers are taking to Twitter to track COVID-19 and analyze the virus’s impact on individual health.
Yijie Ren, Jiacheng Xie and Lei Jiang are using Twitter’s built-in programming interface to search tweets for key phrases such as “I tested positive.” From there, they’re delving deeper into the Twitter user’s account to log symptoms and recovery experiences.
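As a rough illustration of the key-phrase step, the sketch below checks a handful of tweet texts for a first-person diagnosis phrase. It is only a sketch under stated assumptions: the article does not describe the team's code or their Twitter API calls, so the function names, the second phrase in the list and the sample tweets are all hypothetical, and the tweets are hard-coded rather than fetched from Twitter.

```python
import re

# "i tested positive" comes from the article; the second phrase is a
# hypothetical addition to show how the list could be extended.
KEY_PHRASES = ["i tested positive", "i have covid"]

def normalize(text):
    """Lowercase, straighten curly apostrophes and collapse whitespace."""
    text = text.lower().replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip()

def matches_key_phrase(text, phrases=KEY_PHRASES):
    """Return True if any key phrase appears as whole words in the tweet."""
    cleaned = normalize(text)
    return any(
        re.search(r"\b" + re.escape(phrase) + r"\b", cleaned)
        for phrase in phrases
    )

# Made-up sample tweets, for illustration only.
sample_tweets = [
    "I tested   positive for COVID-19 last week. Fever and chills so far.",
    "The senator tested positive, according to reports.",
    "Stay safe, everyone!",
]

# Keep only tweets that look like a first-person report.
candidates = [t for t in sample_tweets if matches_key_phrase(t)]
```

Matching on whole words keeps a tweet such as "Ravi tested positive" from being picked up by the phrase "I tested positive". A phrase match is only a first pass, though; it cannot tell a genuine report from a joke or a quotation.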
Over the years, Twitter has become a popular database for researchers who want to mine real-time information—making it an ideal spot to follow an unpredictable virus, said Ren, who earned her master’s degree in May.
“COVID-19 is a very important topic, and we’re still being impacted by it,” she said. “That’s why we wanted to take a look at this. Twitter is the world’s biggest database. And it’s natural: people are willing to share information.”
The research is in its early stages, but so far, the information they’ve gleaned from the microblog mirrors medical findings. COVID-19 patients have tweeted about their experiences with fever, chills and fatigue, and have reported some less common symptoms, such as eye pain and ringing in the ears.
“So far, the results we’ve found are pretty similar to medical news, but we expect to find some surprising results,” Ren said. “And that’s what our project is about. We want to provide some information to the public or, if it’s very surprising, we can provide some useful information to the medical community.”
The research team is also discovering cases of reinfection and symptoms lasting beyond recovery.
“I think our project is very useful because you can get some new information about the virus,” said Xie, who’s working on a PhD in computer science. “For instance, some people who say they’ve recovered from COVID-19 continue to have symptoms.”
An Automatic Approach
Currently, the researchers are manually filtering results to ensure the data includes only tweets from those personally diagnosed. So, for instance, any political or editorial tweets are scrapped from their database.
“We actually download all of the positive diagnosis data, and then manually select those related to the patient,” Ren said. “After we identify a real positive, we track back to the person’s feed.”
That provides additional information such as a person’s physical activities, diet and access to protective equipment. Later in the project, they hope to take a closer look at demographics such as age.
Once the team has enough data, they plan to train a machine learning model to identify and track COVID-19 patients on Twitter automatically.
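To give a sense of what training on hand-labeled tweets could look like, here is a minimal sketch of a text classifier. The article does not say which model the team plans to use, so a small naive Bayes classifier is swapped in here purely as a simple stand-in. The function names and the four labeled tweets are invented for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split a tweet into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(labeled_tweets):
    """Count words per label. Each item is (text, label); True = personal diagnosis.

    Assumes at least one example of each label is present.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labeled_tweets:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return True if the tweet scores higher as a personal diagnosis."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (True, False):
        # Log prior plus log likelihoods with add-one smoothing.
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return scores[True] > scores[False]

# Invented training examples standing in for manually labeled tweets.
labeled = [
    ("I tested positive today and have a fever", True),
    ("My test came back positive, I have chills and fatigue", True),
    ("The governor tested positive says the news", False),
    ("Positive cases are rising in the state says report", False),
]
model = train(labeled)
is_personal = predict(model, "I have a fever and chills")
is_news = predict(model, "The governor says cases are rising")
```

Four examples are far too few for a usable model. The sketch only shows the shape of the workflow: manually labeled tweets go in, and a function that sorts new tweets comes out.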
Ren, Xie and Jiang, a PhD student, are working under the supervision of Shumaker Professor Dong Xu. The team is currently seeking funding for the project and ultimately hopes to publish its findings.