- Published on 20 October 2015
Public health agencies could capitalise on streams of patient-related data from the internet, but only once interpretation methods have been validated.
Data is ubiquitous. In the area of health, there are growing data streams initiated directly by patients through their activities on the internet and on social networks, alongside related sources such as electronic medical records and pharmacy sales data. These so-called Novel Data Streams (NDS) are very appealing to public health surveillance officials due to their ease of collection. A new paper published in EPJ Data Science evaluates the currently available NDS surveillance papers before outlining a conceptual framework for integrating such data into current public health surveillance systems. The authors, who hail from public health agencies, academia, and the private sector, highlight the need for future rigorous evaluation and validation of standards before NDS can effectively reinforce existing public health surveillance systems.
NDS encompass a broad set of sources, from internet search data to social media posts to Wikipedia access logs, even restaurant reservations and reviews and news sources, according to co-lead author Benjamin Althouse from the Santa Fe Institute, New Mexico, USA.
A well-known example of a health surveillance system based on NDS is Google Flu Trends, a web application developed in 2008. It estimates the number of individuals with influenza-like symptoms who visit their doctor on the basis of Google searches. Despite its initial success, the system was later criticised for failing to deliver accurate predictions across different influenza seasons.
Nevertheless, thanks to NDS, surveillance systems could soon be nearly instantaneous and deliver on very fine geographic scales, according to Samuel Scarpino, the other co-lead author from the Santa Fe Institute.
NDS could also extend surveillance to places with no existing systems and improve the dissemination of relevant data. They could also potentially measure unanticipated events, such as syndromes associated with new pathogens not currently under surveillance. However, NDS-based approaches can only be adequately vetted through collaborations between academic researchers, private industry, and public health officials.
B. M. Althouse, S. V. Scarpino, L. A. Meyers, J. Ayers, M. Bargsten, J. Baumbach, J. S. Brownstein, L. Castro, H. Clapham, D. A. T. Cummings, S. Del Valle, S. Eubank, G. Fairchild, L. Finelli, N. Generous, D. George, D. R. Harper, L. Hébert-Dufresne, M. A. Johansson, K. Konty, M. Lipsitch, G. Milinovich, J. D. Miller, E. O. Nsoesie, D. R. Olson, M. Paul, P. M. Polgreen, R. Priedhorsky, J. M. Read, I. Rodríguez-Barraquer, D. Smith, C. Stefansen, D. L. Swerdlow, D. Thompson, A. Vespignani, and A. Wesolowski (2015), Enhancing Disease Surveillance with Novel Data Streams: Challenges and Opportunities, EPJ Data Science, 4:17, DOI: 10.1140/epjds/s13688-015-0054-0