Researchers use software to predict success of D.C. restaurants
For the study, the researchers identified slightly more than 2,000 Washington, D.C., restaurants that were open as of December 2013. From various sources, they then identified roughly 450 that had closed from 2005 to 2014. To identify linguistic patterns that foretold closure, they paired restaurants according to such factors as price and cuisine type, and looked at how the descriptions varied.
It doesn't take a Ph.D. to know that there's a connection between a restaurant's Yelp rating and whether it will survive. But what Jorge Mejia, the Smith doctoral student, Shawn Mankad, an assistant professor, and Anandasivam Gopal, an associate professor, have created is more powerful: Their computer-assisted text analysis was more accurate at predicting restaurants' demise than ratings alone (although it is most powerful when used in combination with numerical ratings).
"The whole idea is that we are surrounded by all of this free, unstructured data," Mankad says—hundreds of thousands of words that would require armies of employees to read, let alone interpret. "We should be using that data."
The influence of online reviews is indisputable: More than 60 percent of Americans say that such reviews have a high or medium level of influence over their buying decisions.
Other scholars have sought to take the emotional "temperature" of online reviews, by analyzing the proportion of positive versus negative words. This new approach goes deeper, examining constellations of words that were associated with restaurants' beating the long odds of their industry and remaining open.
For instance, restaurants for which reviewers used the words "food," "good," "place," "like," "order," "friend," "time," "great," "nice" and "service" tended to survive at unusually high rates. The Smith School professors called the variable linked to those words "Quality_Overall," and it seemed to be the most potent signifier of general quality. "Constructing the variables, putting it into a predictive model: this is something that has never been done before," says Mankad.
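To make the idea of a word-based variable concrete, here is a minimal sketch in Python. The word list comes from the article, but the scoring rule (the share of a review's words that fall in the list) and the function name `quality_overall_score` are assumptions made for illustration; the article does not describe how the researchers actually computed "Quality_Overall."

```python
import re

# Words the article lists as linked to the "Quality_Overall" variable.
QUALITY_OVERALL_WORDS = {
    "food", "good", "place", "like", "order",
    "friend", "time", "great", "nice", "service",
}


def quality_overall_score(review_text: str) -> float:
    """Return the share of a review's words that appear in the word list.

    This scoring rule is a hypothetical stand-in, not the researchers' method.
    """
    # Lowercase the text and keep runs of letters (and apostrophes) as tokens.
    tokens = re.findall(r"[a-z']+", review_text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in QUALITY_OVERALL_WORDS)
    return hits / len(tokens)


print(quality_overall_score("Great food, nice service"))  # 1.0
print(quality_overall_score("The food was cold"))         # 0.25
```

A score like this could be averaged over all of a restaurant's reviews and then fed into a predictive model alongside its numerical rating.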
They used one subset of data to uncover the relevant linguistic patterns and another subset to test the predictive power of their model. In that second group, the variables did predict, to a statistically significant degree, whether a restaurant closed.
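The two-subset procedure can be sketched as follows. This is an illustration under stated assumptions, not the study's model: each record is a hypothetical `(score, closed)` pair, the "model" is a single score cutoff chosen on the first subset, and the helper names are invented for this example. It also reports plain accuracy rather than the statistical significance test the researchers describe.

```python
import random


def split_train_test(records, test_fraction=0.3, seed=0):
    """Shuffle the records and split them into (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(records, threshold):
    """Share of records where 'score below threshold' matches 'closed'."""
    if not records:
        return 0.0
    correct = sum((score < threshold) == closed for score, closed in records)
    return correct / len(records)


def best_threshold(train):
    """Pick the cutoff, among observed scores, that fits the training set best."""
    candidates = sorted({score for score, _ in train})
    return max(candidates, key=lambda cutoff: accuracy(train, cutoff))


# Made-up (score, closed) pairs, purely for illustration.
toy_records = [
    (0.10, True), (0.15, True), (0.20, True), (0.30, False), (0.35, True),
    (0.50, False), (0.60, False), (0.70, False), (0.80, False), (0.90, False),
]

train, test = split_train_test(toy_records)
cutoff = best_threshold(train)                     # learned on the first subset only
print("held-out accuracy:", accuracy(test, cutoff))  # checked on the second subset
```

The point of the split is that the cutoff never sees the held-out restaurants, so the accuracy on that second group is an honest estimate of how the rule would perform on new data.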
Although their predictive powers haven't been tested in the real world, the algorithms and models used could be of great use to restaurant operators, the authors said.