Broadening our perspective outward from the individual and the organization to the population, we see that machine learning is becoming an increasingly valuable tool for preventing disease and improving population health at the societal scale. For the last decade, my lab has focused on advancing public health by developing new machine learning methods for anomalous pattern detection. We can use these methods to identify emerging health and disease patterns, which may be visible at the population level but obscured when we consider data from each individual in isolation. For example, a single individual entering a hospital emergency room with cough and fever symptoms may not be sufficiently unusual to indicate the presence of a disease outbreak. However, when we see multiple individuals with similar symptoms, or a group of people buying over-the-counter medications, these data considered collectively may enable us to detect the outbreak much earlier and save many lives.
We have recently developed new methods for detecting emerging clusters of disease cases in space and time, such as this outbreak of gastrointestinal illness that we detected near the city of Columbus. We've now deployed these methods to assist public health by monitoring emergency department visits and over-the-counter medication sales for early outbreak detection.
Recently, we've seen an explosion in the size, diversity, and complexity of data available for monitoring and improving population health. New data sources, such as social media, provide a powerful supplement to existing public health data, enabling earlier and more accurate detection. For example, we are currently using Twitter to identify emerging outbreaks of the rare and serious disease hantavirus in Chile. We do this by detecting anomalous clusters of activity throughout the entire Twitter network, such as a set of keywords, hashtags, tweets, locations, and users that are collectively anomalous.
Now this is a challenging task because of the scale and complexity of the Twitter network. However, we've been able to accurately pinpoint the affected towns and cities in the very early stages of an outbreak, such as this one we detected in the city of Temuco and the popular tourist destination Villarrica.
In the next decade, I believe that the ubiquity of cell phone data, along with the development of new machine learning methods that can effectively use all of that data, will enable vastly improved real-time outbreak detection and response. Your cell phone will be able to form probabilistic inferences about whether you are healthy or ill by combining a variety of data sources, such as face recognition, environmental sensors, and changes in movement patterns. These noisy, individual-level data can be integrated to form a much more precise picture of who is affected by a disease outbreak and where, using cell phone location and proximity data to track the spread of a contagious illness. Your cell phone will be able to see which other phones are in close proximity to it, and will be able to forward messages suggesting appropriate public health interventions, even before the nearby individuals are actually affected. Privacy concerns are potentially serious, but they can be mitigated by storing and analyzing data individually on each phone, passing only aggregate signals between phones, and allowing each user to access only his or her own data.
Additional public health benefits can be gained by using data about when people call for emergency help, or when they call for other city services, such as to complain that their garbage has not been picked up. With this data, we can detect emerging clusters of health-related citizen complaints. More than that, we can actually predict that such patterns are going to occur, enabling proactive responses by local governments.
In our current work on rodent prevention in the cities of Pittsburgh, Chicago, and Baltimore, we are actually able to predict that a cluster of rodent complaints will occur and use this data to very precisely target proactive rodent-baiting efforts. We're currently doing a controlled experiment in Chicago in order to show that by predicting rodent complaints more accurately, we can actually reduce the overall level of rodent infestation.
In summary, the power of machine learning to automatically detect patterns in the huge quantity of individual-level data that we have is both exciting and scary. So my question for you is how we can best maximize the societal benefits of these new technologies and data sources while limiting the risks that they impose on individuals and communities. Thank you.