
Grad student selected as finalist for worldwide TEXATA competition

Photo courtesy of Hongfei Cao.

MU computer science graduate student Hongfei Cao was selected from over 2,000 participants as one of 12 finalists for the inaugural 2014 TEXATA Big Data Analytics World Championships, which took place Nov. 21 through 23 in Austin, Texas.

Cao, one of only three student finalists, competed in the Big Data Showdown against other world-class data analysts for the title of Big Data Analytics World Champion. The series of competitions tested participants’ ability to apply problem solving, mathematical modeling, machine learning and programming within Big Data ecosystems to solve large-scale problems.

“I think this is a really good opportunity for me and also to get to know the people in the industry,” said Cao. “Also, from a business point of view, the cofounder and partners are all from the business world (IBM, Amazon, Microsoft), so you do see a lot of job opportunities and also the big data market is there.”

Cao first learned of the competition from an email forwarded by his adviser, Chi-Ren Shyu, MU Electrical and Computer Engineering Department chair and director of the MU Informatics Institute. “He always encourages us to participate in these kind of events and competitions,” Cao said of Shyu, who has been leading the campus-wide effort for Big Data initiatives.

“We strive to provide our students, both undergraduates and graduates, up-to-date training in Big Data technology, just-in-time case studies, and state-of-the-art computing infrastructure in order to stay on the top of the Big Data wave,” Shyu said. “Hongfei’s selection as one of the top 12 finalists demonstrates the necessary skill sets we expect our students to have to be competitive worldwide.”

Cao entered the competition and advanced through the online first round, which eliminated the bottom 50 percent of participants. The second round eliminated all but the top 12.

The first part of the competition was made up of multiple-choice questions. The contestants were given terabytes' worth of real business data and were asked to design their own tasks and questions for the data, come up with their own mathematical models for the data and do calculations.

The second part of the competition involved coding: the finalists were given a real data set and asked to write programs, using the web service Amazon Elastic MapReduce (EMR) or their own big data platforms, to answer the questions.

“I’m really interested in doing research related to big data because I think big data is the next big thing,” said Cao.

Before the competition, Cao spent his time preparing models, working through theoretical questions and generally reviewing his knowledge, as well as getting familiar with Amazon Cloud Computing, the competition’s cloud service platform.

“I think that with proper training in theoretic machine learning and data mining as well as hands-on experience with big data programming, everybody can do this,” he said. “I think this would be a good competition for all students to join.”

The top two finalists from an industry background and the top student were chosen as the winners of the final round of the competition.