Keeping it Fresh: Predict Restaurant Inspections

Flag public health risks at restaurants by combining Yelp reviews with open city data on past inspections. An algorithmic approach discovers more violations with the same number of inspections. #civic

$5,000 in prizes
Aug 2015
604 joined

The City of Boston is home to thousands of restaurants and just a handful of health inspectors. Can the inspectors use the Yelp reviews that citizens generate to get a better view of active risks to public health?

Why

The City of Boston inspects every restaurant to monitor and improve food safety and public health. Health inspections are usually random, which means inspectors can spend much of their time at clean restaurants that have been following the rules carefully and miss opportunities to improve health and hygiene at places with food safety issues.

The Solution

The winning algorithm uses data from social media to narrow the search for health code violations in Boston. Competitors had access to two data sources: historical hygiene violation records from the City of Boston and consumer reviews from Yelp. The algorithms detect words, phrases, ratings, and patterns that predict violations, helping public health inspectors do their job better.
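To make the idea of words that predict violations concrete, here is a minimal sketch in Python. It is not the winning model: the training reviews are invented for illustration, and the helper names (`tokenize`, `fit_log_odds`, `risk_score`) are hypothetical. It assumes a simple approach in which each word gets a smoothed log-odds weight based on how often it appears in reviews of restaurants that later had a violation versus those that did not.

```python
import math
import re
from collections import Counter

# Invented toy data (an assumption, not competition data):
# (review text, 1 if a later inspection found a violation, else 0).
TRAIN = [
    ("dirty tables and a roach near the kitchen", 1),
    ("food was cold and the bathroom was filthy", 1),
    ("got sick after eating here, dirty plates", 1),
    ("friendly staff and clean dining room", 0),
    ("great pasta, spotless kitchen, fast service", 0),
    ("lovely brunch and clean tables", 0),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fit_log_odds(examples, alpha=1.0):
    """Return a per-word weight: log P(word | violation) - log P(word | no violation).

    `alpha` is add-one style smoothing so unseen words do not produce log(0).
    """
    counts = {0: Counter(), 1: Counter()}
    for text, label in examples:
        counts[label].update(tokenize(text))
    vocab = set(counts[0]) | set(counts[1])
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in (0, 1)}
    return {
        word: math.log((counts[1][word] + alpha) / totals[1])
        - math.log((counts[0][word] + alpha) / totals[0])
        for word in vocab
    }


def risk_score(text, weights):
    """Sum the weights of known words; words outside the vocabulary count as 0."""
    return sum(weights.get(token, 0.0) for token in tokenize(text))


WEIGHTS = fit_log_odds(TRAIN)
```

With these toy reviews, `risk_score("dirty kitchen with a roach", WEIGHTS)` comes out positive and `risk_score("clean and friendly", WEIGHTS)` comes out negative. A real entry would also draw on ratings and inspection history, and would be evaluated on held-out inspections.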

The Results

The competition results were studied by Harvard researcher Mike Luca and covered in the Washington Post in 2015: "Using the winning algorithm, Luca says, Boston could catch the same number of health violations with 40 percent fewer inspections, simply by better targeting city resources at what appear to be dirty-kitchen hotspots. The city of Boston is now considering ways to use such a model."
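The "same violations, fewer inspections" claim can be illustrated with a small sketch. Everything here is an assumption made for illustration: the scores and outcomes are invented, the function names are hypothetical, and the numbers do not reproduce the study's figure. The sketch compares how many visits it takes to find a target number of violations when inspectors go in order of predicted risk with the expected number of visits when the order is random.

```python
def inspections_needed(outcomes_in_visit_order, target):
    """Count visits until `target` violations are found; None if never reached."""
    found = 0
    for visits, has_violation in enumerate(outcomes_in_visit_order, start=1):
        found += has_violation
        if found >= target:
            return visits
    return None


def expected_random_visits(n_restaurants, n_with_violations, target):
    """Expected visits to find `target` violations in a uniformly random order.

    Uses the standard result that the k-th of m marked items among n items
    in random order sits at expected position k * (n + 1) / (m + 1).
    """
    return target * (n_restaurants + 1) / (n_with_violations + 1)


# Invented toy data: (model risk score, 1 if an inspection would find a violation).
RESTAURANTS = [
    (0.90, 1), (0.80, 1), (0.70, 0), (0.60, 1), (0.50, 0),
    (0.40, 0), (0.30, 1), (0.20, 0), (0.10, 0), (0.05, 0),
]

# Visit the highest-scoring restaurants first.
ranked_outcomes = [outcome for _, outcome in sorted(RESTAURANTS, reverse=True)]

targeted = inspections_needed(ranked_outcomes, target=3)
random_baseline = expected_random_visits(
    n_restaurants=len(RESTAURANTS),
    n_with_violations=sum(outcome for _, outcome in RESTAURANTS),
    target=3,
)
```

The better the scores separate risky restaurants from clean ones, the larger the gap between `targeted` and `random_baseline`. With a model no better than chance, the two converge.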

And the city has since done so. As of 2017, the city of Boston had used the top algorithms from this project and, in practice, found 25% more health violations, while also surfacing around 60% of critical violations earlier than before. By taking advantage of past data and combining it with new sources of information, the city can catch public health risks sooner and get a smarter view of how to dedicate scarce public resources.


RESULTS ANNOUNCEMENT + MEET THE WINNERS

WINNING MODELS ON GITHUB