An ensemble model has been developed to predict flu activity more accurately by combining data from multiple sources.
Apart from the Centers for Disease Control and Prevention (CDC), many groups have attempted to create models that could predict impending flu activity.
Google Flu Trends (GFT) was launched for this purpose in 2008 but decommissioned in 2015.
A research team led by computational epidemiologists at Boston Children's Hospital wanted to know whether combining existing models, each of which predicts flu activity on its own, could provide robust, real-time estimates to guide hospitals and health systems in allocating resources for flu care.
The team started with four separate non-traditional sources of "now-casting" (real-time) models of flu-like illness activity.
The sources used included:
- Clinical data from electronic health record (EHR) manager athenahealth
- Crowd-sourced flu data from Flu Near You, a participatory surveillance system developed by HealthMap
In an approach similar to that used by weather forecasters to predict hurricane tracks, they used machine-learning techniques to combine the data and generate a set of "ensemble" models that incorporated the results produced by the four single-source models.
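As a rough illustration of how single-source estimates can be combined into one ensemble estimate, here is a minimal sketch. The article does not say which machine-learning technique the team used, so the sketch swaps in a simple stand-in: each source model is weighted by the inverse of its mean squared error on past weeks. The model names, the two placeholder sources and all numbers are made up for illustration.

```python
from statistics import fmean


def inverse_mse_weights(history, truth):
    """Weight each source model by 1 / (its mean squared error on past weeks)."""
    inverse_errors = {}
    for name, past_estimates in history.items():
        mse = fmean((p - t) ** 2 for p, t in zip(past_estimates, truth))
        inverse_errors[name] = 1.0 / (mse + 1e-9)  # small constant avoids dividing by zero
    total = sum(inverse_errors.values())
    # Normalise so the weights sum to 1.
    return {name: value / total for name, value in inverse_errors.items()}


def ensemble_estimate(weights, current):
    """Combine this week's single-source estimates into one weighted average."""
    return sum(weights[name] * current[name] for name in weights)


# Hypothetical data: percentage of doctor visits for flu-like illness.
truth = [1.8, 2.4, 3.1, 3.9]  # past official figures (made up)
history = {
    "ehr": [1.9, 2.3, 3.3, 3.8],       # EHR-based model (made up)
    "crowd": [1.5, 2.9, 2.6, 4.6],     # crowd-sourced model (made up)
    "source_c": [2.0, 2.6, 3.0, 4.2],  # placeholder third source
    "source_d": [1.6, 2.2, 3.4, 3.6],  # placeholder fourth source
}
current = {"ehr": 4.4, "crowd": 5.1, "source_c": 4.6, "source_d": 4.1}

weights = inverse_mse_weights(history, truth)
estimate = ensemble_estimate(weights, current)
print(weights)
print(round(estimate, 2))
```

Because the weights are non-negative and sum to 1, the combined estimate always lies between the lowest and highest single-source estimate, and sources that tracked past figures more closely count for more.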
Predictions accurately compare with existing models
To determine accuracy and robustness, they compared their results with those of each of the four real-time source models, with the CDC's historical flu-like illness reports and with GFT now-casts from the 2013-14 and 2014-15 flu seasons.
The new models produced robust, real-time estimates that outperformed their four real-time source models and also generated better forecasts than the CDC in terms of timing and magnitude of flu-like illness activity at each time horizon measured, up to 3 weeks in advance.
The ensemble predictions also accurately tracked CDC reports of actual flu activity, with real-time estimates that were 99% accurate, and only slightly less so at the 2-week time horizon.
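The article does not say how the 99% figure was calculated. As a hedged sketch, two common ways to measure how closely a set of estimates tracks reported values are the Pearson correlation and the root mean squared error (RMSE). The function names and all numbers below are assumptions for illustration, not figures from the study.

```python
from math import sqrt
from statistics import fmean


def pearson_r(xs, ys):
    """Pearson correlation: 1.0 means the two series rise and fall together perfectly."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root mean squared error: the typical size of the gap between the two series."""
    return sqrt(fmean((x - y) ** 2 for x, y in zip(xs, ys)))


# Hypothetical weekly values (percentage of visits for flu-like illness).
reported = [1.8, 2.4, 3.1, 3.9, 3.2]  # later official reports (made up)
nowcast = [1.9, 2.3, 3.2, 3.8, 3.3]   # real-time estimates (made up)

print(pearson_r(nowcast, reported))
print(rmse(nowcast, reported))
```

Correlation shows whether the estimates follow the shape of the season, while RMSE shows how far off they are in absolute terms, so the two are usually read together.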
- Flu season in the US begins in October and can last until May
- The season normally peaks between December and February
- Vaccination is recommended for everyone aged 6 months and older
The researchers conclude that combining multiple data sources gives a more robust and more accurate prediction of flu activity than any single source.
Senior author John Brownstein, PhD, Boston Children's chief innovation officer and co-founder of the disease tracking site HealthMap, explains that individual data sources have long been used to track a range of diseases.
Combining data in a new way, so that the whole is more valuable than the sum of its parts, is the next logical step, he says.
The CDC closely monitors seasonal flu-like activity across the US, but the reports it generates and distributes to clinicians and public health authorities are typically 1-2 weeks out of date.
The researchers believe that one of the keys to the model's success is the inclusion of social media and EHR data, which added more value than purely historical patterns.
"Weather forecasting is an established discipline, and has become engrained in society. We think the time is ripe for the same to happen with disease forecasting."
While the model can currently predict flu activity on a national scale, the researchers hope to extend its geographical reach beyond US borders, to predict disease activity in an international context, and to track other diseases for which multiple data sources are available.
Ultimately, the predictions could be made publicly available, to benefit as many people as possible.
Medical News Today recently reported that a new flu vaccine has been developed that can protect against multiple strains of the disease.