Big Data Helps Solve Foodborne Illnesses

Friday, August 19th, 2016

Using big data can cut foodborne illness outbreak investigations to just a few hours, compared to the weeks and months that traditional investigations take. With as few as 10 medical-examination reports of foodborne illness, researchers at IBM Almaden were able to narrow down the investigation to 12 suspected food products.

Timing is crucial to limiting the economic and health impact of foodborne illness outbreaks, such as those caused by salmonella, E. coli and norovirus. Rapidly identifying the contaminated food source is vital to minimising illness, loss and impact on society.

Researchers at IBM Almaden created a data-analytics methodology to review spatio-temporal data, including geographic location and possible time of consumption, for hundreds of grocery product categories. Researchers also analysed each product for its shelf life, geographic location of consumption and likelihood of harbouring a particular pathogen – then mapped the information to the known location of illness outbreaks. The system then ranked all grocery products by likelihood of contamination in a list from which public health officials could test the top 12 suspected foods for contamination and alert the public accordingly.
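To make the ranking step more concrete, here is a minimal sketch of how products might be scored against the locations of reported cases. It is not the IBM method, which the article does not describe in detail. It assumes a simple log-likelihood score based on each product's share of sales per region, and the function name, the smoothing parameter, the product names and all figures are hypothetical.

```python
import math


def rank_products(sales_by_region, cases_by_region, smoothing=1e-6):
    """Rank products by how well their sales pattern explains the case locations.

    Assumption: the chance that a case occurs in a region is proportional to
    the share of the product's sales in that region.
    """
    scores = {}
    for product, sales in sales_by_region.items():
        total = sum(sales.values())
        if total <= 0:
            continue  # no sales data, so the product cannot be scored
        log_likelihood = 0.0
        for region, n_cases in cases_by_region.items():
            share = sales.get(region, 0.0) / total
            # Smoothing avoids log(0) when a product is not sold in a region.
            log_likelihood += n_cases * math.log(share + smoothing)
        scores[product] = log_likelihood
    # Highest log-likelihood first: the most plausible suspects lead the list.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical scanner data: units sold per region for three products.
sales = {
    "sprouts": {"north": 80, "south": 10, "east": 10},
    "sausage": {"north": 30, "south": 40, "east": 30},
    "lettuce": {"north": 5, "south": 90, "east": 5},
}
# Hypothetical confirmed cases per region.
cases = {"north": 8, "south": 1, "east": 1}

ranking = rank_products(sales, cases)
print(ranking)
```

In this toy example most cases fall in the north, so the product sold mainly in the north ranks first. A real system would also need to account for shelf life, time of consumption and the likelihood of a product harbouring a given pathogen, as described above.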

Traditional investigations take longer because they rely on interviews and questionnaires to trace the contamination source. In 2011, it took more than 60 days to identify the source of an E. coli outbreak in Europe: imported fenugreek seeds. More than 50 people died and nearly 4,000 people fell sick across 16 countries before public health officials could pinpoint the source, according to the European Food Safety Authority.

“While traditional methods like interviews and surveys are still necessary, analysing big data from retail grocery scanners can significantly narrow down the list of contaminants in hours for further lab testing. Our study shows that Big Data and analytics can profoundly reduce investigation time and human error and have a huge impact on public health,” said Kun Hu, public health research scientist, IBM Research – Almaden in San Jose, California.

The methodology employed in this study has already been applied to an actual E. coli outbreak in Norway. With just 17 confirmed cases of infection, public health officials were able to create a shortlist of 10 possible contaminants from 2,600 possible food products. Further lab analysis pinpointed the source of contamination down to the batch and lot numbers of a specific sausage product.