Hello, everyone. I'm Harvey Miller. I'm a professor of geography and director of the Center for Urban and Regional Analysis at the Ohio State University, and I'm also chair of the Mapping Science Committee. On behalf of the Mapping Science Committee, welcome to today's webinar on geospatial needs for a pandemic-resilient world. As a standing committee of the National Academies of Sciences, Engineering, and Medicine, the Mapping Science Committee organizes and oversees studies that provide independent advice to society and to government at all levels on geospatial science, technology, and policy. The MSC also addresses aspects of geographic information science that deal with the acquisition, integration, storage, distribution, and use of spatial data. Through its studies, the committee promotes the informed and responsible development and use of spatial data for the benefit of society. Today's webinar addresses the geospatial needs to understand, respond to, and plan for epidemics and pandemics, such as the COVID-19 outbreak. We can see that the COVID-19 pandemic has clear geographic dimensions ranging from the personal to the global. These include the spread of the disease, co-morbidity factors that vary geographically, the uneven burden of the disease, the distributed and heterogeneous nature of healthcare systems, and the highly variable response to interventions from political authorities and the public at large. The declines and shifts in human activities also affect broader social, economic, and environmental systems. Today's webinar will address the role that geospatial data, mapping, modeling, and analysis can play in crafting effective government and societal responses at the operational, tactical, and strategic levels. We will discuss three major topics today with breaks in between. The first is modeling the spatial spread of disease and its local burden. The second is geospatial needs for rapid response.
And then the third topic will be spatial indicators of resilience and recovery. Under each of these topics, we will have short presentations and allow plenty of time for discussion. Please submit any questions you have via the Q&A feature of Zoom, and these questions will be answered live by the moderators. Before we get started, I just want to acknowledge and thank the members of the Mapping Science Committee for developing this webinar. In particular, I want to thank Daniel Brown from the University of Washington, Kathleen Stewart from the University of Maryland, and Mark Reichardt from the Open Geospatial Consortium. And as we noted in the opening slides, this webinar will be recorded and will be available to the public in a few weeks. So we'll get started with our first session on modeling the spatial spread of disease and its local burden, which will be moderated by Daniel Brown from the University of Washington. Daniel.
Hi, thanks Harvey. Dan Brown, University of Washington School of Environmental and Forest Sciences. I'm really excited to be here to help introduce three really interesting speakers talking about modeling in the context of disease and understanding disease dynamics. The COVID-19 pandemic has brought to public awareness the importance of models in understanding and managing socio-environmental risks. Models help us predict population-level outcomes, however imperfectly, as we've learned. They help us design and test interventions. And importantly, spatial models incorporate spatial heterogeneity at multiple scales, from individuals to national scales, and they incorporate spatial interaction, which in the case of a pandemic can drive disease dynamics.
Spatial models come in many forms, ranging from those that rely heavily on fitting and extrapolating spatiotemporal patterns to those that focus on describing and encoding processes of movement, diffusion, and change, and most practical models use some combination of data fitting and process representation. In the context of this workshop and the Mapping Science Committee's goals, we're interested in the spatial data and the role of spatial data in these models. Spatial data inform these models in a variety of ways, providing the foundational patterns on which models are trained and tested, data about spatial interactions, and data that can help us estimate spatially varying parameters like R0. Ultimately, the efficacy and accuracy of models is affected by this interaction between data at different spatial and temporal resolutions, collected with different sampling protocols, and how those interact with the process dynamics. So as we dig in, we have three experts here in disease modeling to help us understand spatial modeling efforts and epidemiology related to the COVID pandemic. Our first speaker is Joshua Epstein, who is a professor of epidemiology in the New York University School of Global Public Health and founding director of the NYU Agent-Based Modeling Laboratory. Prior to joining NYU, he was a professor of emergency medicine at Johns Hopkins and director of the Center for Advanced Modeling in the Social, Behavioral and Health Sciences, with joint appointments in economics, applied mathematics, international health, and biostatistics. Before that, he was a senior fellow in economic studies at the Brookings Institution and director of the Center on Social and Economic Dynamics. For the transformative innovations that he's been working on, he was awarded the NIH Director's Pioneer Award in 2008 and an honorary doctorate of science from Amherst College in 2010, and was elected to the Society of Sigma Xi in 2018.
He is the author of many books, including Generative Social Science: Studies in Agent-Based Computational Modeling and Agent_Zero: Toward Neurocognitive Foundations for Generative Social Science. Please join me in welcoming Joshua Epstein.
Let's see if I can do screen sharing and we'll see if we can get underway. Okay, here we go. Start my video. I want to optimize something that I'm not seeing here. Does that look optimal to someone, Eric, maybe?
Yeah, so Josh, if you see the green bar for Zoom, you can go to where it says More, and then you'll see one of the check boxes will say optimize for video clip.
Optimize. Yes, perfect, perfect. Okay, good. All right. So yes, thank you, Dan, for those kind remarks. I wanted to go quickly through a bunch of modeling of epidemics and related geospatial issues, from the scale of the playground all the way to our literally planetary-scale agent-based model. So the title is Epidemiology from Playground to Planet. But I wanted to begin with the point that really, really simple toy models can reveal core principles of epidemiology with which some of you may not be familiar. So I thought I'd demonstrate a couple of those quickly at the playground level in an agent-based model, for those who haven't seen those. This is just a little playground of artificial kids. Blue kids are healthy. This red kid is the index case. He's sick. They're going to just move around at random. When a red bumps into a blue, he sneezes on him and gives him the bug. That turns the blue kid red. After being red for a while, you're removed from play. And depending on your mood, you can interpret that as red kids all die, or they leave the playground for the infirmary, but they leave the area. I prefer to just assume they died. So here's how it goes in a very simple example where there's just some probability that you give the bug to someone you contact, and there's some probability per period that you're taken out.
So it begins slowly, like most epidemics, and then starts to spread much more quickly, as the mathematics can illustrate. And after a while in this very morbid, simple, agent-based playground run, everybody gets the disease and they all die. All right. Very sad, but very simple story. Everybody gets it and they all die. Okay. So of course, what we're interested in is preventing that, and there are several means of doing so. We try to intervene to prevent that and protect as many healthy susceptible kids as possible. The big tools, certainly for pandemic influenza, COVID-19, these sorts of pandemic challenges, are vaccination, which in the case of COVID is really the only long-term solution, and social distancing, which is an immediate-term solution. So let's talk a little bit about vaccination just in principle. Go back to the playground. There were 100 kids, everybody got the bug, and they all died. So imagine a perfect vaccine, and we vaccinate 60 kids up front. Okay. So 60 kids survive, right? I mean, if I were just the person on the street and someone came over and said, okay, I've got 100 kids, I'm going to vaccinate 60 of them with a perfect vaccine, I'd say 60 of them will survive. Is that what happens? Let's color the vaccinees. We vaccinate them all up front. Yellow kids are the vaccinees. Blue kids are the susceptibles. And the red kid is our index case. Now, if vaccination protected only the vaccinees, only the yellow kids, then they'd be the only kids alive at the end of the run. So here's what actually happens. Spread certainly does occur. Okay. But at the end of the run, there are blue kids in addition to the yellow kids. So the vaccinees confer secondary protection on susceptible kids. That's called herd immunity. I'm sure you've all heard of this, but that's a nice illustration of it: more than 60 survive. So I don't have to immunize everybody to crush the epidemic. I just vaccinate enough so that it fizzles out.
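A minimal sketch of the playground model described above may help readers who have not seen an agent-based model. It is not the speaker's code: the grid size, the wrap-around movement rule, the function name `run_playground`, and the parameter values are all assumptions chosen only to illustrate the mechanics (random movement, transmission on contact, a per-period removal probability, and optional perfect vaccination).

```python
import random

def run_playground(n_kids=100, n_vaccinated=0, size=10,
                   p_transmit=1.0, p_remove=0.05, seed=0):
    """Toy playground epidemic (illustrative assumptions throughout).

    Returns (susceptible_survivors, vaccinated, removed) at the end of the run.
    """
    rng = random.Random(seed)
    # "S" = susceptible (blue), "I" = infected (red),
    # "V" = vaccinated (yellow), "R" = removed from play.
    states = ["I"] + ["V"] * n_vaccinated + ["S"] * (n_kids - 1 - n_vaccinated)
    pos = [(rng.randrange(size), rng.randrange(size)) for _ in range(n_kids)]

    while "I" in states:
        # Every kid still in play takes one random step on a wrap-around grid.
        for k, state in enumerate(states):
            if state != "R":
                x, y = pos[k]
                dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
                pos[k] = ((x + dx) % size, (y + dy) % size)

        # A susceptible kid sharing a cell with an infected kid may catch the bug;
        # each infected kid is removed with a fixed probability per period.
        infected_cells = {pos[k] for k, s in enumerate(states) if s == "I"}
        new_states = list(states)
        for k, state in enumerate(states):
            if state == "S" and pos[k] in infected_cells and rng.random() < p_transmit:
                new_states[k] = "I"
            elif state == "I" and rng.random() < p_remove:
                new_states[k] = "R"
        states = new_states

    return states.count("S"), states.count("V"), states.count("R")

if __name__ == "__main__":
    print("no vaccination:", run_playground(n_vaccinated=0))
    print("60 vaccinated: ", run_playground(n_vaccinated=60))
```

Comparing the two runs across several seeds shows how many unvaccinated kids are still susceptible when the outbreak ends, which is the secondary protection the talk calls herd immunity. The exact counts depend on the assumed parameters.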
So what fraction V of the population has to be vaccinated to induce that die-out? Well, under heroic assumptions about mixing and perfect vaccines and the rest, the vaccination level has to be at least one minus one over something. What's the something? What's the one parameter that everybody's heard about? It's the R0. And a very cute, highly idealized, very crude, but useful formula is that the vaccination level must be at least one minus one over R0. We can derive this with fancy mathematics and define R0 in a nice general manner. But for the moment, just from a practical standpoint, if R0 is two, you have to pre-vaccinate one minus one over R0, which is one minus one over two, which equals one half of the population. For the 1918 pandemic, R0 was about two, so vaccination of half would have sufficed. Peak Ebola was also around two. And COVID, as Dan Brown pointed out, varies spatially, but it has reached this level in various places at various times. So as a handy indicator of roughly how much vaccination you need to do, as a conservative lower bound or something like that, this is a reasonable little formula. Getting the vaccine is one problem. Vaccine refusal is another problem. And just quickly, you know, swine flu was a declared pandemic and 50% of Americans declined the vaccine. If you use the little formula and assume a COVID R0 of two, the same level of refusal would put us right at the tipping point for a huge second wave of disease. So I think even if we get the vaccine by 2021, we have a real challenge getting people to take it, and I'm very concerned about that. Between now and then, it's social distancing, as we're seeing all over the country. China, of course, imposed draconian isolation, but they did so too late, and many cats had left the bag, and things move very quickly around the planet these days.
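The formula quoted above can be written out in a few lines. Under the same idealized assumptions (homogeneous mixing, a perfect vaccine), each case infects on average R0 × (1 − V) others once a fraction V is immune, and the epidemic dies out when that product drops below one, which rearranges to V ≥ 1 − 1/R0. The function name below is hypothetical and the sketch is only a restatement of that arithmetic.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Minimum vaccinated fraction V such that R0 * (1 - V) <= 1.

    Assumes homogeneous mixing and a perfectly effective vaccine,
    as in the idealized formula V >= 1 - 1/R0.
    """
    if r0 <= 1.0:
        return 0.0  # an outbreak with R0 <= 1 fizzles without vaccination
    return 1.0 - 1.0 / r0

# R0 = 2 gives a threshold of one half, so 50% refusal leaves
# coverage exactly at the tipping point described in the talk.
print(herd_immunity_threshold(2.0))
```

Because the threshold rises with R0, a place where R0 is locally higher than two would need more than half of its population vaccinated under the same assumptions.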
So here is a planetary-scale agent-based model with about six and a half billion agents on a global map. This was featured in Nature and published in TOMACS, which is a technical machine learning journal. But here's the idea: black pixels are healthy, red pixels are sick, and blue are died or recovered. So we're going to start this in Asia. This was work we did for the NIH MIDAS network on swine flu. And you can see that once it gets to these high population density places (space is crucially important in all of this), it spreads very quickly. Here it goes. It rips through China in a big hurry and gets around the world quickly by global airlines. I just want it to go a little bit faster. All right, so that's about a 250-day unmitigated epidemic, base case, business as usual, but it gets around the planet in a huge hurry. SARS was on, I think, four or five continents in 24 hours. So you want to know, what about travel restrictions? What's the optimal scale of restrictions, the optimal pacing of travel restrictions, or the optimal distribution of vaccines and antivirals worldwide? And then there's some global-scale representation of transmission that you can use to study interventions at this scale. All right, and I think it would be a great advance to embed infectious disease models of that type in large-scale geosimulation systems, geospatial simulation systems, and study the coupled dynamics of disease and other forces at that scale. There are many, but here are four geospatial things. One is deforestation and changes in animal reservoirs; SARS, COVID, these are all from bats. Another is climate change and the northward migration of vector ranges. Then there is the seasonality of diseases, and the entire process of urbanization and rising contact densities. Just for example, Ebola. You know, the big outbreak was 2014, but it had been around a long time. Here is a graph of Ebola deaths in Africa since 1970.
So why a big spike in 2014, and why was it predominantly urban? You could say maybe it's genetic variation or something, but that doesn't hold up. More compelling is changes in land use: deforestation. Deforestation increases the vector densities in a nice, nonlinear way. I mean, most of these people contract the disease in forests, harvesting bats that they then sell, or through other animal reservoir contact. But if you take the forest cover from 10 by 10 to nine by nine, it cuts the area by almost 20% and it hikes the human-host contact probabilities. Then there are roads connecting rural to urban. And a big role for geography and remote sensing in all of this is asking what's happening to land cover and how it affects contact between humans and animal reservoirs. And you know there's vast deforestation right in West Africa where we had people. Dan knows this much better than I; you can do much better than this. This is a representation of deforestation in Africa, and it's a big deal in terms of zoonotic diseases. Deforestation also contributes to climate change, and rising temperatures may in fact drive mosquito ranges north of where they've been. And we're seeing that even, I think, in Florida: malaria, West Nile virus. Zika was a big deal. Here's our Zika model, which is also heavily spatial. We built an artificial New York City for Zika that has every census tract, all the people, and even several million mosquitoes, and using mosquito trap counts from New York City at the US census tract level across New York, we were able to develop a heat map of mosquito densities. And then as you march people through their daily itineraries, you can track whether they are transiting high mosquito density areas. If so, we adjust their probability of getting bitten accordingly, and we can give a reasonable account of transmission of mosquito-borne diseases in large urban areas, New York City
specifically. One of the interesting things about Zika is, of course, that it can be transmitted from mosquito to human but also from human to human, and sexual transmission can occur in the mosquito off-season. Mosquitoes are dormant in the fall, so vector transmission stops, but sexual transmission can go on anyway. And the problem is, if you continue to transmit while the mosquitoes are dormant, you're spreading Zika-infected blood widely, and then when the mosquitoes come back, it's a huge second wave. So you can't relax just because it's off-season for the vectors. And again, seasonality is another one of these areas that really invites geospatial analysis. Ours is a truly global model with very toy assumptions about seasonality. And again, I think geospatial detail would be a huge advance. I think it would be thrilling to combine these large-scale infectious disease models with serious, high-fidelity geospatial simulation systems and study the coupled dynamics of these large systems. We don't understand seasonality very well; we don't really quite understand why flu is seasonal. It would be a big, big advance to do that. Another climate-related geospatial dynamic is just urbanization, which increases risks. It increases density and promotes disease transmission, obviously in a place like New York. This was a huge factor in the initial spike of COVID. But it also increases vulnerability to all sorts of environmental shocks. Another area we've been working on is coupling atmospheric fluid dynamics with agent-based modeling. Here's an example of an artificial Los Angeles with traffic color-coded for velocity. We'll release an airborne chemical contaminant in the harbor of Los Angeles, and the diffusion of the chemical is computational fluid dynamics, but the agents can decide what to do. They can decide to shelter in place, or try to evacuate, or use traffic-aware routing, or a whole variety of things.
The main thing is, if everybody pours out into the street, they simply increase their exposure through congestion, and it worsens the outcome. So what's the optimal mix between shelter in place and evacuation? Is evacuation even feasible in a city like Los Angeles? Here's another view. But again, it would be fascinating to couple this computational fluid dynamics with real urban simulations. Here are several views of the same analysis. The upper left is what you saw, the top right is the top-down view of the dispersal, and the lower left is another view. But of course what you care about is how many people are exposed and what their level of exposure is, in parts per million per second. We can track that in the lower right, and then assess different interventions and mixes and, you know, change the permeability of those things, all sorts of things. Designing for resilience to this, we can do that using these models. All right, and I think a big initiative is to develop that kind of model for Los Angeles, lower Manhattan, Washington, and build up what I'm calling petabyte playbooks for all the global megacities, and think about disaster resilience in the post-evacuation age. Those could all be linked in a US national model. Here's our model of that. This is 300 million Americans, every zip code, and again we tried to encode their zip code to zip code contact dynamics in a travel matrix, but there are 30,000 zip codes. This is a 900-million-element contact matrix from zip code to zip code. So quite an object. Same color code: black is healthy, red is sick, blue had the disease and recovered. This is an H1N1 run. Again, it starts in LA. I'm a New Yorker, so everything bad starts in LA in all my runs. Again, do you want to close schools? Is it too late to close schools? Who has priority in vaccinations? Where do you ship things? When is it possible to lift the quarantine?
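The size of the zip-code contact matrix mentioned above follows directly from squaring the number of zip codes. The short sketch below works through that arithmetic and, as an assumption of this note rather than anything stated in the talk, estimates what a dense 8-byte-per-entry array would occupy and shows one hypothetical way to store only the non-zero flows.

```python
N_ZIP_CODES = 30_000                      # approximate count quoted in the talk

n_entries = N_ZIP_CODES ** 2              # one entry per origin-destination pair
dense_gigabytes = n_entries * 8 / 1e9     # assuming 8-byte floats, dense storage

print(f"{n_entries:,} entries, about {dense_gigabytes:.1f} GB if stored densely")

# Hypothetical sparse alternative: keep only origin-destination pairs with
# non-zero travel. The zip codes and weights below are made-up placeholders.
sparse_flows = {
    ("90001", "90002"): 0.12,
    ("90001", "10001"): 0.003,
}

def contact_weight(origin: str, destination: str) -> float:
    """Look up a travel weight, treating missing pairs as zero."""
    return sparse_flows.get((origin, destination), 0.0)
```

Whether a sparse layout pays off depends on how many zip-code pairs actually exchange travelers, which the talk does not say.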
You just can't do this kind of work without serious spatial epidemic dynamic modeling. And I think that has really come into its own, but I think joining forces with real geosimulation would be a watershed. So I'm very excited about discussing that. So, the conclusion is really that the integration of geospatial and disease transmission models, I think, is a crucial step. And it should be done at all these scales: local, urban, national, planetary. And implicated in all of it is land use, climate change, seasonality, urbanization, and of course human behavior, which I'm also very interested in. Vaccine refusal, premature distancing, all of these things matter immensely, but they're coupled, and they affect transmission in complex ways. And it calls for a large-scale interdisciplinary modeling effort that I am excited to pursue, and I'm happy to discuss. Thank you so much.
Thanks so much. I think you've raised some really great paths forward for integration of geospatial information and agent-based modeling at large scales to understand interventions and plan for future pandemic spread and possibilities. I'm going to save questions until the end of the session, and then we'll have an opportunity for discussion of all the presentations. So I'm going to move ahead to our next speaker, who is Simon Hay. Simon is a professor in the Department of Health Metrics Sciences and director of the Local Burden of Disease group at the Institute for Health Metrics and Evaluation at the University of Washington, here in Seattle where I am. He obtained his doctorate from the University of Oxford, where he remains a member of the congregation. He was elected to the board of trustees of the Royal Society of Tropical Medicine and Hygiene in 2012 and served as its 52nd president from 2013 to 2015.
He has received numerous awards, notably the Back Award of the Royal Geographical Society in 2012 for research contributing to public health policy, and the Bailey K. Ashford Medal in 2013 and the Chalmers Medal in 2015, from the American Society and Royal Society of Tropical Medicine and Hygiene respectively, for distinguished work in tropical medicine. He was elected to the fellowship of the Academy of Medical Sciences in 2015. In 2019 he received the tent and innovator prize from Malaria No More for innovation helping to make the end of malaria possible in our lifetimes. Please join me in welcoming Simon Hay.
Good morning or good afternoon everybody, depending on where you are listening from. I'm talking from Seattle today, so it's morning for me. First I'd like to start with a thank you to the Mapping Science Committee for inviting me to talk today, and then declare some conflicts of interest. I feel a bit of an imposter in this bunch of luminaries for two reasons. The first is that I'm not an infectious disease modeler. I call myself a geospatial scientist. So I am going to talk a little bit about the models that we've been doing at the Institute for Health Metrics and Evaluation. But the second reason I feel an imposter in this process is that it's a massive team doing this work. Essentially, about a third of our institute since February has been moved over on a voluntary basis to look at how we work with the COVID response. So I'm representing the work of a vast number of people here, and I'd like you to bear that in mind. In fact, so many that I can't call them out individually; it's essentially a whole institute. And what I'm going to talk about today is how we should sample the world for a pandemic response. Next slide, please. So, as I mentioned, I'm going to spend a little bit of time on the COVID-19 model and present some information and background on how it works and where you can get the vis tools for that.
So a little bit of information on how frequently it's going to be updated and the future directions of that. And that really sets up the rest of the talk for something that I have been much more closely involved with and am much more confident to talk about, which is how we sample the world for the next pandemic. So how do we improve our surveillance across the world so that we're better prepared for the next pandemic disease? It might seem odd to think about that now, while we're in the midst of one, but this exercise has definitely brought to my attention many of the limitations in the information that we have available to make inferences, not necessarily in the US, but more widely across the world. So then I'll spend the majority of the talk talking about how to stratify the world globally. And obviously, in surveilling the world, there are probably an infinite number of ways in which we can do it. I'm going to try and walk through a very simple one, talking about global stratification, how we weight for population, travel time, and catchments of health facilities, show an example for East Africa, and then leave you with some scoping considerations about how we might think about that in future. But it's all tied back to what we need to drive these particular models at the moment and what we've learned from the data discovery part of the COVID-19 modeling done at the Institute for Health Metrics and Evaluation. Next slide please. So we have been making model forecasts and scenarios now for four months, and forgive the slightly unkempt appearance, forgive the work from home. You will all be familiar with the craziness of trying to respond to information. At our institute, essentially, it started as a demand-driven forecast for the Washington Department of Health, who were interested in trying to predict the demand in the first wave of the epidemic for hospitalizations, ICU admissions, and ventilator need. And that's where this started.
And as I will explain, it has expanded very, very significantly since that time. In the first instance, we used a very, very simple curve-fit model. And that proved to be useful for the intended task, which was to predict when the peak of hospitalizations would be, state by state across the US. But as time moved on, and we started to look at more detailed parts of the disease transmission process, as we just heard about in the last talk, we had to switch to a classic SEIR deterministic modeling framework, and that's what we use at the moment. I won't go into that in a great deal of detail, and I'm going to give you a website link in a moment so you can zoom into that, and the visualizations, in as much detail as you like. But it's the standard R0 equations with different compartments for susceptible, exposed, infected, and recovered. People move between them at different rates, parameterized by information that we get on infections, cases, and antibody prevalence, and by systematic reviews of various things like death rates per capita, mobility, the effect of social distancing mandates, the efficacy of masks, et cetera, et cetera. We also add a seasonal component, and being from IHME, where we have the Global Burden of Disease study, we have lots of information with which we can parameterize those covariates. So we also link those into the models as well. The thing that I'll talk about today isn't the scenarios that we're spending a lot of time looking at at the moment, so that we can try and work out a range of policy options that might be available to decision makers in the coming months to deal with where the epidemic is going to go into the fall and towards the end of the year. We'll talk about the reference forecast, which is our best, and I'm going to use the word guess, our best prediction about what we think will happen over the next few months. So if you could have the next slide please. This is what you see if you look on the website today, at the URL on the bottom left of the slide.
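For readers unfamiliar with the SEIR framework named above, here is a minimal deterministic sketch. It is not the IHME model: it has no covariates, mandates, seasonality, or uncertainty, and the function name, parameter values, and simple Euler time-stepping are all assumptions made only to show how people move between the susceptible, exposed, infected, and recovered compartments.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, rec0, days, dt=0.1):
    """Integrate a basic SEIR model with fixed-step Euler updates.

    beta  : transmission rate per day (assumed constant here)
    sigma : rate of leaving the exposed compartment (1 / incubation period)
    gamma : rate of leaving the infected compartment (1 / infectious period)
    Returns a list of (S, E, I, R) tuples, one per day including day 0.
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(rec0)
    n = s + e + i + r
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt      # S -> E
            new_infected = sigma * e * dt            # E -> I
            new_recovered = gamma * i * dt           # I -> R
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history

if __name__ == "__main__":
    # Illustrative values only: beta / gamma = 2 corresponds to R0 = 2.
    run = simulate_seir(beta=0.5, sigma=0.2, gamma=0.25,
                        s0=999, e0=0, i0=1, rec0=0, days=200)
    peak_day = max(range(len(run)), key=lambda d: run[d][2])
    print("day of peak infections:", peak_day)
```

In a fitted model, the rates would be estimated from case, death, and antibody data rather than fixed by hand, and the transmission rate would typically vary over time with mobility and mandates.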
These are our predictions for COVID-19 deaths, with total deaths on the Y axis and the date or timeline on the X axis. You can see that the epidemic in the US, as everybody knows, started around the end of March, beginning of April, and the deaths started accumulating from then. You can see from this site as well that we've got total deaths, daily deaths, infections, hospital use, and social distancing mandates. We can look at those at, in this case, the state level in the US. If we can move on to the next slide please. And this reminds me to say that this is where we started, essentially just doing this for Washington State. You can see that it's the same graph as on the previous slide, zoomed in to, or specific to, Washington. And you can see that the observed is on the left of the slide as I look at it. So March, April, May, June, that's not a model prediction. That's smoothed data that we've seen in the state. And the prediction is the dotted line with the uncertainty around it, going out at this point in time to October the 1st. All of the details about how that's done, all of the data that goes into it, and the code that's run to do it are available on the website and updated pretty much daily, with blogs that accompany that. So I'm going to leave that there as a description of what we do, and then talk about some of the bits that I've been much, much more involved in, in terms of the collection of data and the problems that this pandemic has revealed in our capacity for surveillance. So if I could have the next slide please. So in our SEIR models for the US, we get data on cases and deaths by state, et cetera, et cetera. If we need to move those down to hospital demand,
obviously we need to start knowing about the healthcare infrastructure. In a given hospital, how many staff are there, what number of beds, how many intensive care units are available? Do they have the WHO list of essential medicines? Are there isolation wards? Is there PPE? Are there ventilators, et cetera? There's a long list of information that we need in the microsimulation part of this hospital demand model, which, to bring you to the main conclusion of this talk, just doesn't exist at the moment in any nice global repository where you can download it. So a huge amount of time at IHME in the last few months, so that we can parameterize these models in space and not just do them for the US, has been spent trying to get this information globally. I'm going to talk about the limitations of that data search process and how we could improve that for the future, so that we're more prepared for the next pandemic. My day job, although in 2019 and 2020 it's been quite different, has been the Global Burden of Disease study, essentially auditing disease and death around the world for a range of causes and conditions, for as many geographies as we can and for as many different causes and risk factors as possible, and my specific job within the institute is to take those down to a fine spatial resolution. So I'll show you a map, as we're talking about geospatial, and one of the first things we have to think about when doing surveillance of the world is how to chop the world up. There are many ways to do that, from ecosystems to habitats. This is the classic view of the world that we have, the GBD super-regions, which bring the world together in regions of epidemiological coherence.
Again, I can point you to papers that support and talk about that methodology, but it's not really the object of the talk. We just use that as a first cut to stratify the world and think about how we might sample that. And the next slide please. Obviously, across that distribution, population is one of the things that varies enormously, and all this histogram shows is that those big super-regions represent very different percentages of the global population. So the first thing that we thought about, if you go to the next slide, is to categorize those regions that I've just shown you in the map and give them the percentage of the total population. I'm going to focus a little bit more on sub-Saharan Africa, as that's one of the areas where we've been trying to collect lots of information and finding it very, very difficult, and to try and think about how we could make that job easier. So we set ourselves the challenge: if we could play back to the start of the year, how would we set up a surveillance system that would give us the information we needed globally to drive these models in an efficient and timely way, so that we could respond very quickly to countries and international organizations when they want to know what the impact of this is? So, looking back, and I'll go into this when we talk about the considerations, the problem set we started with is how we distribute 2,000 surveillance sites across the world. So we first stratified the world and distributed that total of 2,000, population weighted, across those particular regions. And I do that just to show, if we look at the South Asia, East Asia and Oceania region, that's simply the population density of the world there.
And so obviously we want to sample for that, because it's the humans that we're worried about in relation to pandemics, at least in hospital usage predictions. And we can see that there's a massive variance in population, from the oceanic islands, which we sample with three sites, to countries like India and China. So one of the regions is 167. Next slide, please. So once we've distributed those 2,000 surveillance sites across the world, we have an approximate number of sites that we need to distribute within each one of those groups. We do that with a population accessibility cost-distance surface, published by Weiss et al. in Nature in 2018, and use that with a data set of about 5,000 health facility locations from WHO. We define the catchments for those based on minimum travel time, as Thiessen polygons around each of those facilities. We then choose among those within a country: of the sample allocated out of the 2,000, we place 50% in the highest-population catchments, again to try to be representative of urban populations. And then we distribute the balance, the other 50%, to be as representative as possible of the entire country, and by definition those tend to be in the more rural populations. I'm going to show you how that works for East Africa in the next set of slides, so can I have the next slide, please? So this is an East African subset. For those of you who know your East African geography very well, it will be easy to match the table on the left to the map on the right. If you don't, I'll beam you into one country: look at the most populous one in that region. That gives a balance of 25 sentinel sites that we'd like to distribute in that particular country. Go to the next slide, please. This is where all the hospitals are, according to this database.
Here's one of the first big problems: even information as simple as the location of all the hospitals in the world is not known. That's quite a bold statement. It is known locally, but there is not one central resource we can go to and say, this is all of the health facilities across the world and where they're located. It's highly variable. Even that in itself is something that needs to be looked at and done much more comprehensively across the world. So for Ethiopia, it says we've got 161 hospitals; I suspect there are many more, especially in the secondary and tertiary parts of the health system. But using those, if you go to the next slide, you can distribute those 25 sites between them with those decision rules, and that's what the pink dots are. If you go to the next slide, that repeats the same process for all of sub-Saharan Africa. And you can see that we can do that to come up with the global sample of sites we need. So in the next slide, as we're at the end of the talk now, I just wanted to bring up for discussion: if we had this surveillance system, what are the kinds of things that we would need to look at? The first is, is 2,000 too ambitious or too conservative? I imagine all of us would have different views on which number we should choose. We don't even know the locations of all of these facilities in the world, and there are feasibility and cost issues to weigh up in deciding how many samples. Should it be daily, weekly or monthly? What's the minimum set of information we need for each of those? How do we correctly balance agility, that is, the ability to collect new information, against mission creep, with everybody wanting different information from these different facilities? How do we push or pull this information? Who pays to establish and then to maintain it, and what do we call the baseline? I'm going to finish with that. Next slide, please.
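The decision rules described in the talk (split a fixed total of sites across regions in proportion to population, then within a country put half of the sites in the most populous catchments and spread the rest) can be sketched roughly as follows. This is only an illustration, not the speaker's implementation: the function names, the largest-remainder rounding, and the even spacing through the population ranking used for the second half are all assumptions made for the example.

```python
def allocate_by_population(populations, total_sites):
    """Split total_sites across regions in proportion to population.

    Uses largest-remainder rounding (an assumption) so the counts sum
    exactly to total_sites.
    """
    total_pop = sum(populations.values())
    quotas = {r: total_sites * p / total_pop for r, p in populations.items()}
    alloc = {r: int(q) for r, q in quotas.items()}
    leftover = total_sites - sum(alloc.values())
    # Hand the remaining sites to the regions with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda r: quotas[r] - alloc[r], reverse=True)
    for r in by_remainder[:leftover]:
        alloc[r] += 1
    return alloc


def pick_sites(catchment_pops, n_sites):
    """Choose catchments for one country.

    Half of the sites (rounded up) go to the most populous catchments.
    The balance is spread evenly through the remaining catchments ranked
    by population, a hypothetical stand-in for "as representative as
    possible of the entire country".
    """
    ranked = sorted(catchment_pops, key=catchment_pops.get, reverse=True)
    n_sites = min(n_sites, len(ranked))
    n_top = (n_sites + 1) // 2
    top, rest = ranked[:n_top], ranked[n_top:]
    n_rest = n_sites - n_top
    if n_rest == 0:
        return top
    step = len(rest) / n_rest  # always >= 1, so picks never repeat
    spread = [rest[int(i * step)] for i in range(n_rest)]
    return top + spread


# Hypothetical catchments h0..h9 with populations 100, 90, ..., 10.
catchments = {f"h{i}": 100 - 10 * i for i in range(10)}
sites = pick_sites(catchments, 4)
regions = allocate_by_population({"a": 50, "b": 30, "c": 20}, 10)
```

With real inputs, `catchment_pops` would come from the travel-time catchments around each facility, and `n_sites` from the country's share of the 2,000.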
And I'm just essentially trying to set this talk up as a discussion. Just to recap, if you want to look into the detailed SCR models, they're available in the web resources that I pointed you to before, and the predictions are updated daily. At the moment they're available for the US, Europe and high-income Latin America, and we're starting to make those models available for other parts of the world, such as Africa and India, where we're very concerned about how the COVID pandemic will have an impact. But as I mentioned in the talk, we've got data problems in each of those locations that we need to look at. And this is one potential solution for how we might sample the world to be better prepared in the future. Thank you very much.
Thank you, Simon. Wow, I think this is a really great bookend of talks in a way: on one hand looking at the mechanisms and simulations of the processes of disease dynamics, and on the other hand, the challenge of adding geographic realism into those models through spatial data, which is ultimately the detail that both of you have identified. The committee is very much interested in how spatial data can inform our understanding of these processes and our interventions to reduce the burden of disease. You've both identified, in a way, the need for these data to come together, and Simon, you've identified some really important challenges for the databases and data needs, so I very much appreciate that. We're going to hold off on questions until after the final speaker in this session, who is Sean Ahearn, a professor of geography and director of the Center for Advanced Research of Spatial Information at Hunter College at the City University of New York, and the founder and director of CARSI.
Dr. Ahearn played a central role in managing the design, development and implementation of the digital geographic base map for the city of New York, called NYCMap, in the 90s and early 2000s. NYCMap was instrumental in enabling the city of New York to respond to the 9/11 crisis. Dr. Ahearn's role was highlighted in the History Channel's The Twin Towers: Rise and Fall of an American Icon. He pioneered the first statewide model for West Nile virus hotspot detection in California in 2005, modeling every quarter square mile every day. He directed the solar New York State portal software development effort in a project funded by the US Department of Energy, and is past president of the University Consortium for Geographic Information Science. He was appointed by the United States Secretary of the Interior to the National Geospatial Advisory Committee as a charter member in 2008. Professor Ahearn received the prestigious 2013 IBM Quality Award. Please welcome Sean.
Thanks very much, Dan. I'll be talking about an embedded recursive SIR model for county-level analysis. I actually first started thinking about epidemiological models about 20 years ago, when West Nile virus came to the shores of New York City. The city was in a panic. They didn't know what it was. They actually sprayed the whole city before they specified where the hotspots were. So they hired our team to look at where the hotspots were, and we used a spatio-temporal model informed by the ecology and the epidemiology. We implemented that model not only in New York but also in Chicago and for the whole state of California, where we ran it for three years. That gave me some context for thinking about this. Next slide, please. So, what's the best approach? There are different ways of thinking about how to approach this problem. The first one, next, could be a data-driven approach. The problem is the data was poor: as everybody knows, the case data was extremely spotty, and testing wasn't really there.
And using a data-driven approach with poor data is obviously a problem. Also, with a data-driven approach there's no constraint on what's possible, and this can lead to gross overestimations of death counts, and it did for some of the data-driven models. Next. So we see that everybody's gravitating towards the classic epidemiological models. Even with poor data, at least you're contextualizing it in a framework: epidemiology captures the process of viral spread and attenuation, and the model provides constraints on what's possible. Next, please. We have sort of the elephant in the room, which is the SIR model. We have susceptible individuals, we have infected individuals, and we have recovered individuals and those that do not recover. And those people have an origin where they live, and there's transportation involved. Next, please. So this is kind of our conceptual model of both the nature of the disease, the epidemiology, and the geographic components of it. And this looks like California, because I only see cars. It's definitely not New York. Next, please. Parameters to this model: one, which Josh talked about, is R naught, which is, if I'm sick, how many people do I make sick? And the other I call diffusion. In epidemiology it's also called the contact ratio, so if you want to replace one with the other, you can. Going forward I call it diffusion, just because it feels to me like diffusion. Next, please. This looks more like New York, with subways and buses and people walking (there's a plane in there, that's another issue of course), and the nature of their interaction. Next. And if we look at those key parameters for New York, we see a rho of 2.6 and a diffusion of 0.107, or 10.7% of the population. The interesting thing here is that LA and New York have very similar populations, you know, 8 million plus or minus, but very, very different rho and diffusion factors for the same virus. Next, please.
So our goal is to understand and reduce transmission and predict the trajectory of the virus, and these are the two key parameters. Next. So in our logical model, we're going to use a modification of the SIR model. Next. And we're going to model deaths, not cases, just due to the lack of testing. I actually did both, but deaths are more reliable. I think going forward, perhaps case data will be, if we can actually get our testing up there. So we're going to model every county separately and then create a summation of all those counties. Thanks. The parameters that we're using, these are assumptions. The death rate is 0.023; I'm using the death rate from Korea just because they had the most rigorous testing regime, so their numbers for death rate are probably reasonable. Obviously this is an assumption, and we can discuss that. The initial rate of infection, I0, is 0.00125, which is the nugget of those infected individuals. The duration of the disease, 21 days, seems to be what's out there, and it's actually a personal experience, because I had the virus. Believe it or not, I was miserably sick for 21 days and then I finally got better. So this is an example of one. Next. The calculated parameters, the two that we're going to be estimating, are going to be rho and the diffusion parameter. Next, please. Next slide. Oh, I'm sorry, go back one. I want to show this. So this is actually the generating equation. That's P prime. And there are three parameters here. One is beta, and beta is equal to rho times nu, which is that 21 days, only as a daily rate. So that's beta, and then we have time, and we have the initial condition. If you take this equation and multiply it by the population and the diffusion rate, you're going to get prevalence. So from prevalence, you can then generate susceptibility, infection and replacement. So this one equation is really the workhorse for this entire model. Next, please. Now, we had policies that affected these two parameters. We had a pre-lockdown.
And you could think of rho as a function of place and how infectious the disease is; it's sort of the combination of those two things. One can think of that as some function of movement. During lockdown in the model, what we did is we took the date of lockdown for each of the states, and we applied a decay function to rho, and diffusion basically stalled, generally. And then we looked at the opening-up dates, and rho would then increase as some percentage of pre-lockdown, and the diffusion rate would also increase as some percentage of pre-lockdown, and that's kind of a key question: what should those percentages be? Next, please. So this is the curve modeling process, and this is New York City. The reason I call it embedded recursive is because it's actually a recursive call within a recursive call, because you've got two parameters you're trying to fit. So blue is the data and red is the model. And you can think of this as like a ping pong match, where you're fitting one parameter and you get a little closer to a solution, and the other parameter gets knocked off, so you push it in. It's sort of like two steps forward and one step back for each parameter until you converge at the net. And that's what you're seeing right here, where it captures the curve. It really models the data very nicely. This is done for every single county in the United States, and we'll see that there's spatial variation in both of these parameters, as you can imagine. Next, please. So this is what the classic SIR looks like in terms of susceptible individuals. Next, we have the infected individuals and the recovering individuals, and the passed individuals in purple. And here you've got a rho of 2.68, as I mentioned before, and 10.7% diffusion. And this is what it would look like if that diffusion stalled and we didn't really infect any additional people.
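To make the "ping pong" fitting idea concrete, here is a minimal sketch that alternates a one-dimensional search over each of the two parameters against a cumulative death curve. It is not the speaker's generating equation or code: it assumes a textbook discrete-time SIR run on the share of the population the outbreak reaches (population times diffusion), takes the death rate, initial infection rate and 21-day duration quoted in the talk as defaults, and uses hypothetical grids and function names. The lockdown decay and reopening percentages are left out to keep it short.

```python
def simulate_deaths(rho, diffusion, population, days,
                    i0=0.00125, duration=21.0, death_rate=0.023):
    """Cumulative deaths from a simple discrete-time SIR (assumed form)."""
    nu = 1.0 / duration           # daily removal rate
    beta = rho * nu               # daily transmission rate
    n = population * diffusion    # people the outbreak can reach
    i = n * i0                    # initially infected
    s = n - i
    r = 0.0
    deaths = []
    for _ in range(days):
        new_inf = min(beta * s * i / n, s)
        new_rem = nu * i
        s -= new_inf
        i += new_inf - new_rem
        r += new_rem
        deaths.append(death_rate * r)
    return deaths


def sse(model, data):
    """Sum of squared errors between two equal-length series."""
    return sum((m - d) ** 2 for m, d in zip(model, data))


def fit_alternating(data, population, rounds=20):
    """Fit rho, then diffusion, then rho again, and so on."""
    rho_grid = [k / 20 for k in range(20, 81)]    # 1.00 to 4.00
    dif_grid = [k / 200 for k in range(1, 61)]    # 0.005 to 0.300
    rho, dif = 2.0, 0.05                          # arbitrary starting guess
    days = len(data)
    for _ in range(rounds):
        rho = min(rho_grid, key=lambda x: sse(
            simulate_deaths(x, dif, population, days), data))
        dif = min(dif_grid, key=lambda x: sse(
            simulate_deaths(rho, x, population, days), data))
    return rho, dif


# Synthetic, noise-free "observations" generated from known parameters.
POP = 8_000_000
observed = simulate_deaths(2.6, 0.1, POP, 120)
fitted_rho, fitted_dif = fit_alternating(observed, POP)
```

Because the starting guess sits on both grids, each half-step can only keep or lower the error, which is the "two steps forward" behaviour described in the talk; whether the search lands exactly on the generating values depends on the grid and the number of rounds.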
And what we applied, just one more click and we'll get the rate, is what it looks like when we take the pre-lockdown rate and apply some percentage of it to the opening-up period. In this case it's 25% of that previous rate, and we're going to change that diffusion rate. We're also going to be adjusting rho. This is the result you get for New York City, so we're going up to 30,000 deaths by August 26. And we're doing this for every single county in the United States, and a key question is, what should that percentage be? Of course it's going to vary geographically. New Yorkers are very serious about their masks. I was out in Denver recently and they are not. So how each region reacts to these different prescriptions for social distancing, etc., is going to affect these parameters. Next, please. So this is the map, and there's a whole bunch of interesting things going on here. This is modeling every single county for the two parameters, rho and diffusion; I'm just showing rho here. Look at where the international borders and the entry points are. If you look up in Washington state, we've got very high rho values there, and in the middle of Montana and northern Minnesota, and look at Maine. There's nothing up there, but there is a border and there are a lot of people passing through, and I think if you look at the Michigan area and the New York area you'll find the same. So there's a whole bunch of spatial things going on here. So let's cruise around the country and look at different cities and fits. Next, please. So this is Denver, Colorado: we have a rho of 1.89 and a diffusion of 0.038. Next, we have Columbus, Ohio: 1.83 and 0.022.
So keep in mind, we're only looking at 2.2% of the population, so if we think this is over, it's not. Okay, this is very interesting: New Orleans, one of the highest rho values, 3.063, with a diffusion of 10.4%. Models should intuitively make sense, and anybody who's been to New Orleans knows how gregarious the population is and how tightly packed things are. So this is a good logic test. Yeah, New Orleans does have a high rho value. Totally understand that. Next. So this is Atlanta, Georgia, and Atlanta's split between two counties. We have a 1.8 and a 0.013. Next, please. Fairfield County is an interesting one: a pretty high rho value, 2.3, and 10%. Now, Fairfield has a lot of Fortune 500s, but it's also a bedroom community for New York. So obviously there's a relationship there, and we've got Metro-North, which brings probably hundreds of thousands of people a day to New York City. So you can understand why we had a high rho value there. Anytime you're in a confined space, like a train or bus, the rho value is going to kick up. I think a couple more. Los Angeles, we've seen that one. And one more is, I believe, Chicago, Cook County: 1.56 and 5%. And you can see how well the model fits the data for each one of these. You can sample any of the counties and that's generally quite true. Next, please. So the bottom-line prediction for August 7, 2020: if we use 25% of the pre-shutdown rate going forward, we're going to end up with about 172,000 deaths by August 7. If we were to use 0.15, we would end up with 154,000. And actually, when I was looking quickly at your chart, Simon, it looked to me like that's in the ballpark of what that model predicted. Next, please. Okay, so the real issue is how do we come up with estimates of rho and diffusion going forward? Next. New sources of movement data could provide the solution.
Movement data of individuals from apps has been consolidated to capture movement from origin to destination: diffusion. Next, please. And proximity of devices can be used to calculate rho. Next. These data can be used to calibrate both diffusion and rho by relating them to the model output. And in the next slide I'll show how a metric might work. In this case, diffusion: how many people from a zip code are going to other zip codes? Next. How many from other zip codes are coming to my zip code? So take that, sum it and normalize it by population: that's a possible first metric. Rho is another one. Next, please. Here we're looking at close contacts, and a lot of the vendors will give you a five-minute interval for close contact. For individuals in the zip, how many people are they connecting with when they go out? For people coming into the zip, how many did they connect with? And how many did they connect with when they're outside of the zip? So some summation of this, normalized by the population, could give us an estimate of rho. So keep in mind, we're going to relate that to the model parameters and the estimates that the models have come up with for both of these variables during shutdown, calibrate it, and then use that relationship to determine those two metrics going forward. Next, please. Of course, I'm interested in the geographic components of this, and these are a subset, probably, of the different types of geographic information that we're interested in. Next. The question is, what's the best level of granularity for these analyses? And what's the real relationship between these geographic correlates and these two parameters? Next. Okay, conclusion. A modified recursive SIR model was used to predict the COVID-19 trajectory into the future. New mobile data sources are now available to calibrate the two key parameters of the model, rho and diffusion.
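As a rough illustration of the first movement metric described above (trips leaving a zip code plus trips arriving in it, normalized by the zip code's population), a sketch might look like this. The input layout, the function name and the choice to ignore trips that start and end in the same zip code are assumptions for the example, not details given in the talk.

```python
from collections import defaultdict


def movement_metric(flows, population):
    """(outbound trips + inbound trips) / population, per zip code.

    flows: {(origin_zip, destination_zip): trip_count}  (hypothetical layout)
    population: {zip: residents}
    """
    out_trips = defaultdict(float)
    in_trips = defaultdict(float)
    for (origin, dest), trips in flows.items():
        if origin == dest:
            continue  # assumption: movement inside a zip code is not counted
        out_trips[origin] += trips
        in_trips[dest] += trips
    return {z: (out_trips[z] + in_trips[z]) / population[z] for z in population}


# Made-up numbers purely to show the shape of the calculation.
flows = {("A", "B"): 30, ("B", "A"): 10, ("A", "A"): 50}
population = {"A": 100, "B": 200}
metric = movement_metric(flows, population)
```

A contact-based metric for the other parameter could follow the same pattern, summing close-contact counts instead of trips before dividing by population.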
And process-based models have the advantage of constraining predictions to realistic scenarios of what is likely and what is possible. Next. I want to thank the New York Times for the data, and thank Harvey Miller and Anne Linn and the rest of the Mapping Science Committee for organizing this workshop. Thank you very much for your attention.
Thank you, Sean. Really nice bringing together of the data questions and the modeling questions. I'm going to take a moderator's prerogative and ask a clarifying question on your presentation, Sean, if you don't mind. I love the county-level estimation of R zero as an interesting indication of process and how it varies across space. My question has to do with the meaning of R zero in that situation, if we interpret it as the number of people an infected individual infects. We interpreted the counties at the borders as having higher R zeros presumably because there are people coming in and out of those counties, which maybe isn't the same as internal county spread; it's importation of cases that seemingly elevates the R zero, but it's not really R zero that's being elevated, right?
Right, in some ways it's the diffusion that's being affected by that. The two aren't independent. You know, one of my students, for a first class project, overlaid the subway lines onto the map of prevalence. And sure enough, there was a bunch of zip codes which had very high prevalence but no subway stops. So what's going on there? Well, what's going on there is that those people are taking buses. That's a more confined space than a subway car. So perhaps the R zero is pushed up because they're in a more concentrated space than the folks in the subway; just a hypothesis. But these are the kinds of things that we can start to think about once we start to spatialize these parameters and try to understand what is driving them.
Very interesting.
I have one question here from Budhu Bhaduri for all three speakers, and it's a general question. How does geospatial epidemiology inform geospatial science in ways that geospatial scientists or geographers did not already know? So are we learning something here that is generalizable to the science of geospatial technology? And he offers this alternative: otherwise, is geospatial epidemiology an oxymoron? Anybody want to take a whack at that?
Well, I guess the way I see this is, I always say never underestimate the value of a strong conceptual model. In this case we're marrying geospatial principles and analysis techniques and data science to a process model, in this case the epidemiological model. And I think that's what GI scientists do. They oftentimes work with other disciplines to take their toolkit and apply it to a process model, which informs the way in which they use those tools.
Yeah, as an epidemiologist I would say that I have the feeling that geospatial sciences can do more for epidemiology than epidemiology can do for geospatial sciences, but I'm not a geospatial scientist, so I really don't know. But I agree with Sean that the main point is that the coupling of these disciplines is a very powerful new tool to study public health, but also economic dynamics, urbanization, I mean, a million other things. So I think that modeling contagion dynamics of all sorts, together with geospatial mapping, is a very fertile union scientifically.
May I add to those comments?
Yes, please.
I would just like to support those and add perhaps a third reason. Behind the presentation I gave, there are essentially lots of people working on this in the organization, and they are essentially siloed in infectious disease modeling work, doing that in locations that basically operate independently, so they wouldn't fit my definition of being geospatial. That doesn't mean that they couldn't and shouldn't.
And I think we've had great talks showing how those things can be combined. Where, in the iterative process that we have started, we make some of those locational variances more realistic, if you like, is in the covariates that we put together. I didn't talk a lot about that, but we have obviously distributed samples from Facebook of the numbers of people wearing masks; we have distributed information on mandates, when they're applied, and the uncertainty around those. There's also seasonality, so when is pneumonia worse, and we use that as a covariate in our model. Many of those things are inherently spatial, and we use classic geospatial techniques to build those covariates to inform a model. So, maybe just a bit of clarification there. Using those in a much more intelligent way, coupling disease process models with features that you can get from the landscape and from geospatial models, is obviously a wonderful place to be and a lovely aspirational goal. It has some fairly big computing challenges for us as well. I'm not sure if the other panelists would echo this, but running those things at higher and higher spatial resolution with more and more dimensions becomes challenging. Thank you.
Thank you. We have several other questions. I'm going to just take them as they came in. Alexander so called asks: Josh, what tools are you using to build these models?
Oh, good question. It depends entirely on the scale of the model. For a playground model like that, we use NetLogo, which is a very easy-to-learn programming language specifically designed for agent-based modeling. For the larger models, it's Python, or, you know, C++ or Java; I mean, industrial-strength programming languages for the big models. But for the prototypes, the toy models to gain insight and teach principles, we really do use NetLogo, which is lovely for this.
Typically we do the differential equations model, and then we do an agent-based model and see why they differ and how space becomes interesting. For the mathematical model I use Mathematica or MATLAB. So on the math side, MATLAB, Mathematica, something like this, and on the computing side it depends on the scale, from NetLogo for toy models to, you know, Java or Python or C++ for the industrial-strength models.
Do either of you want to offer a quick response on that question, Sean or Simon, on the software environment?
Yeah, sure. So the model's written all in Python.
Same here, a combination of Python for the production environment, and often people use R in there. Yeah, of course. And we run it on the Azure cloud to get all the simulations running in a semi-timely fashion.
Okay, another question comes from Manish Verma, and this is probably mainly for Sean but maybe for the others as well: how do we interpret R0 and diffusion in terms of what that might mean for policy? And there's another question that came in that's somewhat related to this. A lot of this work is in urban areas. What does this work tell us about less densely populated areas?
Can I make one comment about R0? I mean, R0's subscript is zero because what it's usually understood to mean is: if I take a completely virgin bowl of susceptible individuals and drop a single infected individual into the bowl, how many susceptibles will that person infect directly, with no secondary transmission or anything else? So, of course, as you run out of susceptibles, the reproductive number falls, because, you know, you can't keep doubling rabbits; you run out of rabbits, right? The point is that R0 is the first reproductive number, but the reproductive number falls as the epidemic expands because you run out of gas, you run out of susceptibles.
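The point about running out of susceptibles is often written, under the standard homogeneous-mixing assumption of the basic SIR model (an assumption added here, not something stated in the discussion), as an effective reproduction number that shrinks with the susceptible share of the population:

```latex
R_t = R_0 \cdot \frac{S(t)}{N}
```

Here $S(t)$ is the number of people still susceptible at time $t$ and $N$ is the total population. As a purely illustrative calculation, with $R_0 = 2.6$ and half the population still susceptible, $R_t = 2.6 \times 0.5 = 1.3$; growth stops once $S(t)/N$ falls below $1/R_0$.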
So, using R0 as if it's a constant, like an interest rate that just doubles and doubles and doubles, overstates the actual scale of the disease. So, you know, it varies by space and in a variety of other ways, but I think this is one kind of scary misuse of R0, as if it were just a compound interest rate. It really isn't, because you run out of susceptibles.
Yeah, I totally agree with Josh on that. I think part of it seems to be, and I think there's a lot of research on this, how confined the space is and how the duration of the exposure impacts R0. And that's why these demonstrations are very scary, just because people are outside but they're very tightly coupled for long periods of time.
Absolutely. I mean, properly speaking, R0 is not a property of the pathogen. It's a consequence of the pathogen and the contact dynamics, and so in any scientific studies you'll see, you know, smallpox R0 ranges from A to B, depending on, is it Bangladesh, is it, you know, all of this stuff.
If I think about how I would translate some of that thinking into policy, it's obviously across the geospatial landscape, if we were to think about the US at this point in time. There are places, as we've seen from these simulations and the work of the other panelists, that will have different intrinsic values, so will support epidemics at higher or slower rates. That starts to imply that thinking about the US as a contiguous whole for policy responses is perhaps not our most sensible option. And one of the things that we've seen that was really surprising to me is, in terms of putting mandates on, if you look at the timing of those, I would have expected states to adjust those according to when the big surges in cases were.
But actually, if you look at them, there was a big coherence in the timing, and everybody put them on within a one-week window, because of herd behavior, I think, just because if that state's done it, we can't be seen to be behind. Whereas these types of dynamics would lead you to think that there should be much more state- and county-specific timing of some of these things. And if you're thinking about it just from policy perspectives, if you're wanting to control migration and movement of people, there are some areas where you'd be much more worried about seeding infections to others. I'm not saying that that shouldn't be a policy, but that would be one of the realms that I would start to think about.
Yeah, if you noticed, some of the East Coast curves, like for New York City, for Fairfield County, for New Jersey (I didn't show New Jersey, but Fairfield), and even New Orleans, were starting to curve down, whereas all the other ones I showed are still on the upward trajectory. So this whole policy of opening up may have made some sense in New York and some of these other places where you had a 10% rate of exposure of the population. But for some of these other cities where they had 2%, I'm not so sure.
Harvey, how are we doing on time? We're a little over.
We're a little over. This has been a great discussion, but I think we want to give people a chance to take a little break and go away from computer screens for 10 minutes, so I think we'll have to call it here.
Okay, thanks everyone, terrific. I just want to thank the speakers again.
Thank you, thank you. Pleasure. Thank you.
Okay, thanks Dan, and thanks, speakers. That was a great session; we're off to a good start. So we are going to take a short break; we'll be back at 2:30 with the second topic, geospatial needs for rapid response. See you in 10.
Hello everyone. Welcome back from our break.
The next topic in our workshop today will be geospatial needs for rapid response, and I'll be moderating the session. Our first of four talks will be from the USGS; we have a joint presentation from Marie Peppler and Chris Cretini. Marie Peppler serves as the USGS emergency management coordinator. Her scope of work includes all hazards and missions. Her primary responsibilities include unifying USGS response teams to support the sharing of resources and skills, ensuring that science is used in the decision-making process, coordinating safe access for USGS scientists and technicians into hazard zones, updating continuity and preparedness plans, and training USGS staff to interact with the emergency management community. Marie previously worked as the deputy director of the Integrated Information Dissemination Division in the Water Mission Area, as the federal agency liaison for the Office of Surface Water, and as the national USGS flood inundation mapping coordinator. She started her career at the Wisconsin Water Science Center as a fluvial geomorphologist and project coordinator for the web informatics and mapping project. Chris Cretini serves in the USGS National Geospatial Program and is the National Map liaison to Arkansas, Florida, Louisiana, Puerto Rico and the US Virgin Islands. He works to develop data acquisition and stewardship agreements to ensure the availability of common base data across a broad range of users and applications. Chris is a member of the USGS geospatial information response team, which ensures that timely geospatial data are available for use by emergency responders, land and resource managers, and for scientific analysis. Chris, Marie, thank you very much for your time today, and I'll let you start your presentation.
The mute button moved there when we switched around, so thanks.
And thanks, everyone; thanks to the National Academies and the Mapping Science Committee for having us here. USGS has been involved with this committee for a while. This is not my first time here, but my background is very unusual. In this particular response, I have really been managing the USGS's response to COVID-19, and for us that has looked very, very different than the USGS's normal job of data delivery during hazards. And I really just thought that with this presentation I would speak from a different perspective and get us back to basics about the criticality of authoritative data sources. It's interesting, because in the first panel I think all three presenters really outlined the need for authoritative data. We're turning to these models and tweaking models with the best we have, but I'm pretty sure I heard each one admit that the data they're working with is not ideal for their situation. So if you could go to the next slide, please. So, speaking of getting back to basics, I wanted to point out the USGS mission. I'm not going to read this, but we work in a really wide range of disasters, natural, biological, you name it, and our role is very much to provide that reliable scientific information. We are the bottom of the data pyramid in a lot of the different sciences and resources, not all of them, and we do a lot of research and modeling around that. But I think it's important to note that we have a different role in these products. So I have two examples of that, if you go to the next slide. And it's important to note about this emergency operations center that much of the science that gets used in an emergency operations center gets ingested through the geospatial unit in those multi-agency, multidisciplinary responses.
These maps are an example from a few years ago, the 2018 Kilauea eruption. It was our mappers, modelers, and geoscientists that were producing these products and feeding them into the emergency operations center, and those products drove the evacuations, so they were very much at the cornerstone of that science-informed decision making. Next slide. I really hope that most people in earthquake zones are familiar with the USGS's central role as the global source of information for earthquakes. We produce these PAGER alerts, a derivative product from our global seismic network that correlates the ShakeMap with fatality and economic loss estimates. So it's really turning that data into actionable information. I was personally involved in the Puerto Rico response this last January, and it's really clear that good data informed good decisions, and I think that was really the crux of the modeling discussions in the first session. So if you go to the next slide, I'll put my hat on as the responder now instead of the data provider. And this is where I'd like to note that pandemic data is complex and incredibly local, which I think was really one of Sean's points. And for the USGS it's a shift, in that the data is not our own. We struggled, and I think this is typical of many groups that are trying to make operational decisions, with which data to use: the New York Times, Johns Hopkins, the CDC COVID Data Tracker. I included one of my emails here because very early on, as an emergency manager, I got an email from the HHS Secretary's response and I was like, oh gosh, this data is great, because it's got to be better, right? So we spun our wheels for like two days thinking that this was the best data. It turns out it's exactly the same as the publicly available CDC data. I just got it like 10 hours earlier.
And there's this general confusion around which data is the best, which model is going to be using the best data, and which data I can use to make operational decisions for my staff and employees. And I think that state, local, and federal agencies, and every single individual that's trying to do this, are trying to sort through all of this information. And there's no right or wrong answer. I put the GeoHEALTH platform from HHS on here because this turned out to be a pretty critical resource, but it is available by password only. You have to request an account in order to get at some of the information, so it's only available to certain people. And I want to note that one of the questions that was asked but not answered in the first panel was about data that's unavailable or available in uneven ways. I think that equity of data access is really crucial in trying to make these decisions. There are a lot of reasons not to make all the data public. But I think there are a lot of reasons to centralize around a common authoritative data source so that we can all be working from the same piece of paper. And this is really difficult, because these things are hard to count for any number of reasons. But in the same way that we've centralized the authoritative data sources on a lot of other things, I think a lot of us have realized that epidemiology could benefit from this type of information. And this information, while complex, is inherently spatial: we all live in one place, we're all germ vectors in one place, and we touch other people. So it's part of the process. So, if you go to the next slide, I wanted to touch on what we did, what the USGS did, and very much what the Department of the Interior did. I'll speak up here, because we do not have public health officials, although the Park Service does, given their interactions with the public.
So we had a lot of advice from public health officials, but the question is how do we make these operational decisions fast: do we close everything, do we all telework, how do we function in this new environment? And I'm about to pass it off to Chris to go through what we actually used to help make our operational decisions. But the bottom line is, no matter how many data sources there were, whether we felt they were best or better, or what is good enough for my use, those are really difficult questions that I'm not even going to attempt to answer. Honestly, the one that won was the one with the most stable geoservices, until something better came along. We were swapping out data sources, and it was very much just a matter of what's publicly available and what we can consume easily. And I think that theme carries through a lot of our data products: we find that some of the USGS's most-used products are the ones that we provide the best, most stable access to. So this is a call to action to the geospatial community on the phone here: this is a really key point in getting actionable data to those decision makers, whether or not it's the authoritative source, the best source, or the best source for you. There is a mechanics at play in a lot of the data distribution. So I'm going to let Chris, if you flip to the next slide, talk for a second about what the USGS does and how we deliver data for our decisions. Sure. Thanks, Marie. And hello everyone; I appreciate the opportunity to be part of the workshop today. So I'm in the National Geospatial Program at USGS, and I also support our Geospatial Information Response Team. We focus on the geospatial aspect within the overall context of the USGS response. We do that for situational awareness, to have timely data, and to provide some visualization during a USGS response.
So what you're looking at here is a landing page that points to some of the events we've supported over the last couple of years. You can see that quite a bit of that has been for inland floods and our coastal storm response team. The audience is primarily internal. It's our hazards executive committee, but we do support the Department of the Interior, and occasionally we have some external information sharing as well. So if we go to the next slide, please. This is the implementation of our coronavirus event support map. This is a screenshot from last week, and this is the operational view that we're using today. We've been using it for about the last two and a half months. The map there only consists of a handful of layers. But those were layers that helped start answering the questions that our management asked very early on, in the second week of March or so. And those are: what are the case counts by state, what are those counts by county, and where are our facilities in relation to those affected areas? So some basic spatial analysis and situational awareness. They seemed like straightforward questions, but we had quite a time in the data mining portion to get the right layers on the map. For example, our own facilities layer that we used was primarily for reference for other events, whether a flood or a hurricane. This was different, as Marie said, because every science center and every employee is now part of the event, and management is deciding when to close centers and now when to reopen those centers. So we needed a more robust layer. We found it by looking around, and it turns out it was there. It was at headquarters, and it was updated every month. And in addition to a lat/long for the building footprint, it also had the employee count, the USGS region, the condition of the building, and other types of information that are very helpful for filtering the data for those who have to answer those questions.
So it's a bit of a lesson learned in this response. Another layer that we dealt with was the COVID data itself. We found the Johns Hopkins data by state pretty early on. And along with that, we saw they were posting county-level data. We said, that's great, that's just what we need. But a couple of days later, the county data disappeared because it had been aggregated back to the state level. So we went on a search again and ended up with the University of Virginia county-level data. And for probably one to two weeks we pulled that in. We used some Python scripting to pull in the comma-delimited data. And that was our source until we noticed that Johns Hopkins had gone back to providing state and county data as a service, which really helped us out. And that's what we're using at this point. So if we go to the next slide, you'll see some of the specific pop-up information that goes along with the map itself. We ended up with centroid statistics for cases, and we merged those with the census boundary layer. That just allows us to shade the map as we wanted, shading polygons and not just using county centroids. Depending on who's using the data, you might be interested in looking at the national view, you can filter by USGS region, or you can look at an individual science center and see what the staff is at this particular location and what's happening in the surrounding community as far as the case count. Now, since this came online a couple of months ago, there are certainly a lot more sources. As Marie mentioned, GeoHEALTH is one of the big ones for us. And in that case, we point directly to them. For example, a hot spot dashboard that they produce was requested, so we point directly from our application to theirs rather than trying to reinvent it ourselves. So this is just a look into some of the things we're doing, and now the focus has turned back towards when centers reopen. We're dealing with the data and decisions that go into that.
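Editorial sketch: the step described above, merging per-county case counts with a census boundary layer so that polygons can be shaded, amounts to an attribute join on the county FIPS code. The following is a minimal illustration under stated assumptions, not the USGS script: the column names `fips` and `confirmed`, the use of a `GEOID` property on the boundary features, and all sample values are hypothetical and chosen only for illustration.

```python
import csv
import io


def load_case_counts(csv_text):
    """Parse comma-delimited rows into {5-digit county FIPS: confirmed cases}."""
    counts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Restore leading zeros that are often lost when FIPS codes are exported as numbers.
        fips = row["fips"].strip().zfill(5)
        counts[fips] = int(row["confirmed"])
    return counts


def join_cases_to_boundaries(counts, features):
    """Return copies of GeoJSON-style county features with a 'confirmed' property."""
    joined = []
    for feature in features:
        props = dict(feature["properties"])
        # Counties without a reported row get None so they can be styled as "no data".
        props["confirmed"] = counts.get(props["GEOID"])
        joined.append({**feature, "properties": props})
    return joined


# Illustrative inputs only (made-up counts, geometry omitted).
SAMPLE_CSV = "fips,county,confirmed\n5119,Pulaski,1200\n22071,Orleans,7500\n"
SAMPLE_FEATURES = [
    {"type": "Feature", "properties": {"GEOID": "05119"}, "geometry": None},
    {"type": "Feature", "properties": {"GEOID": "22071"}, "geometry": None},
    {"type": "Feature", "properties": {"GEOID": "12086"}, "geometry": None},
]

shaded = join_cases_to_boundaries(load_case_counts(SAMPLE_CSV), SAMPLE_FEATURES)
for feature in shaded:
    print(feature["properties"])
```

Joining on a zero-padded string key rather than an integer is a deliberate choice here, since state codes below 10 otherwise fail to match the boundary layer.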
And I'll turn it back over to Marie, because she is in the midst of dealing with all that data and how to make those decisions. So, Marie. Yeah, thanks Chris. And if you could go to the next slide, Eric. Just real briefly, how do we move forward? I think some of the earlier panelists touched upon this too. As part of the White House reopening plan they have these gating criteria, and I listed one here on the slide verbatim: a downward trajectory of positive tests as a percent of total tests within a 14-day period, with a flat or increasing volume of tests. Where do I find that data? And that is just a lot to unpack. Decisions are supposed to be made on these types of sentences, but the data that it takes to get to that is very complex, and we've had our data scientists, both at USGS and the department, putting this information together to try to make informed decisions, and it is incredibly challenging. It's more challenging than what we've done in other types of hazard events. For example, the graphs here on the left are part of a COVID catchment analysis that the business integration office put together for the Park Service, looking at how a park that's located, say, in Colorado, but is going to have visitors from all the surrounding states, tries to digest this type of information. And the answer is, it is incredible. There is no answer. I mean, the three graphs there are supposed to pertain to this particular border state trend, etc. They have an up and a down. So it makes it very difficult to make these types of decisions, and I think that this is the exact spot that we're all struggling with.
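Editorial sketch: to show why the gating criterion quoted above takes some unpacking, here is one possible way to turn it into a check on daily testing data. This is an assumption-laden illustration, not the official definition: "downward trajectory" is interpreted here as a negative least-squares slope of percent positive over the last 14 days, and "flat or increasing volume" as a non-negative slope of total tests. The function names are hypothetical.

```python
def slope(values):
    """Ordinary least-squares slope of values against day index 0..n-1."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def meets_positivity_gate(daily_positive, daily_total, window=14):
    """Check one reading of the criterion over the most recent `window` days.

    Assumption: a negative trend in percent positive AND a non-negative trend
    in test volume counts as passing. Other readings are equally defensible.
    """
    if len(daily_positive) != len(daily_total):
        raise ValueError("series must have the same length")
    if window < 2 or len(daily_total) < window:
        raise ValueError("need at least `window` (>= 2) days of data")
    positives = daily_positive[-window:]
    totals = daily_total[-window:]
    if any(t <= 0 for t in totals):
        raise ValueError("every day needs a positive number of tests")
    percent_positive = [100.0 * p / t for p, t in zip(positives, totals)]
    return slope(percent_positive) < 0 and slope(totals) >= 0


# Illustrative series only: steady test volume, falling positives.
print(meets_positivity_gate([100 - 5 * i for i in range(14)], [1000] * 14))
```

Even this simple version hides choices (smoothing, reporting lags, how to treat a noisy but roughly flat series), which is consistent with the speaker's point that the sentence is easy to state and hard to evaluate.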
So I think the lesson and message that I want to take out of this is that these authoritative data sets are critical, and that when we put out plans or processes for emergency managers or any sort of decision maker to make decisions, geospatial or otherwise, we should couple publicly available data to that decision. I think that this is a lesson learned for all of us. We make our earthquake data available. And it's something that I think we're all struggling with in the pandemic world. And I think a part of the reality is that this is going to go on long enough that we may have some time to fix some of these larger problems. In this case we are a user of the data, so I don't have any answers in particular. With that, I think I'll end there, and I think we're holding questions to the end, correct, Harvey? Thanks. Yes, that is correct, we'll hold questions to the end, unless there's anything particularly burning; let me just check real quick. Okay, we'll wait for the end of the session. Okay, thank you very, very much. Our next speaker is Chaowei Phil Yang. He is Professor of Geographical Information Science at George Mason University. His research focuses on identifying and utilizing spatiotemporal principles and patterns to optimize computing facilities to facilitate science, scientific discovery, and engineering development. His research is further consolidated through his $40 million plus in research funding, 300 plus scientific publications, and 15 plus faculty placed.
He is the founding director of the NSF Spatiotemporal Innovation Center, a collaboration among George Mason University, Harvard, and the University of California, Santa Barbara, to build national and international spatiotemporal infrastructure to advance human intelligence through spatiotemporal thinking, computer software and tools through spatiotemporal computing, and the human capability of responding to deep scientific questions and grand engineering challenges through spatiotemporal applications. Phil, please, the floor is yours. Okay, great. Thank you. So as Harvey introduced, I will report some of our research conducted in the NSF Spatiotemporal Innovation Center with a big team here. It's mainly on spatiotemporal patterns and simulations for fighting against COVID-19. There's a link here, COVID-19.STCenter.net. All the data sets that I'm using in this presentation are available over there, with links to GitHub or the Harvard Dataverse. Those data sets have been collected by the team over the past five months or so. They have very detailed case numbers for all the countries, from many different sources. We went to the authoritative sources from each country, like Brazil, India, and African countries, to collect those data sets. There are also policy stringency index data sets, some of which we built ourselves, for example for every state in the United States, and also environmental data sets. I will not talk in detail about those data sets. Let's see what we could do using them. We know that COVID-19 is a dilemma and has a lot of challenges. It's moving very fast. One day we think we understand it well; tomorrow, it's changing. So there are a lot of questions to be answered. I'll just name a few of them. For example, how is the pandemic spreading, and could the climate in fact control the infection speed, so that it would slow down during the summer time frame?
Or is the pandemic biased across different races, age groups, and genders? We also have a lot of policies in place from different states, countries, and counties; are those working? And after a certain period, we have a lot of demand for medical resources, but do we have enough? And how is this evolving in both geographic and temporal terms? And also, as we are in this game now, are we ready to reopen the economy? Or, more of a focus for us in academia, could we have an in-person fall semester? These are all questions we want to have an answer for. And at the end, I will introduce my thoughts about the geospatial needs for a solution for resilience to the COVID-19 pandemic. So first, the dilemma. This started early this year or at the end of last year in Wuhan. At that time, we went to calculate the R0 values. Almost everyone mentions this number. At the beginning, we thought we had a good understanding: it was about 2 to 2.5, based on the early confirmed cases in China. But after more data sets came in, around the April to May time frame, the CDC had a publication recalculating the R0 values. The median they found by that time was about 5.7. That's roughly two to three times the original 2.0 to 2.5 R0 we thought it should be. And there is more research, in fact discussions saying that R0 sometimes gets to 10. That's really bad. And when we look at this from the beginning, people thought, oh, it's just a flu, we don't have to worry about it. Using the England and Wales data sets (this is from the BBC), they found that by the end of March COVID-19 was much more deadly than the flu and pneumonia in the 2020 flu season, as you see from this figure. And as of now, we know more about the pandemic. The death rate is also very different from country to country. The worst are, for example, in Europe: in Belgium, it's about a 16% death rate. The UK has about 14 to 15%, and France has about 15%.
In the US, we were at 6%; as I checked yesterday, it's like 5.5%. That's similar to what China has. So it's really very dynamic. There's no absolute answer that we can have. It's a dilemma. And we have to answer many questions that impact our lives. So when we come back to see how the pandemic spread, we know that in early January it was only happening in China, and then it spread out into other countries through flights and other transportation. By the end of February, many countries, almost half or one third of the countries around the world, had COVID-19. And when we get to the April time frame, Europe and the United States became the epicenter. Now we still have a lot, and we have more joining: Brazil, India, the United States, and also Russia have become the epicenters with the most confirmed cases. These data sets, again, you can access from here. And I will share my slides, as I signed the form. So when we look more closely at the United States, how did the pandemic spread here at home? At the early stage, we had only a few cases in Seattle. But by the end of February, there were many cases in California, so California by that time started to put some policies in place. And then when we get to the March time frame, this gets really bad, and the White House took it seriously at the end of March. When we get to the April time frame, or now the May and June time frame, it's getting really worse, and we're going to see that from a different angle. So there are a lot of questions we want to ask. With the pandemic going on and all the policies in place, how has the environment changed, for example, the air quality? We took a look at several counties in California. Unfortunately, we don't have the data sets for every county. We have some of the data sets.
So if you look at carbon monoxide, nitrogen dioxide, and sulfur dioxide, these three figures, the orange ones are the six-year average and the 2020 data are the blue ones. You can see that we have a big drop in these different counties. On the other side, for the PM10 data sets, we have a big increase, especially in the May time frame. That's because CO, NO2, and SO2 are mainly contributed by cars and the transportation system and by industrial use of coal and other materials, whereas PM10 has a lot of contribution from dust, wildfire, and other sources. So the pandemic is changing the environment, and the air quality has also changed. Another question we want to ask, and we know that a few panelists in the previous session have already talked about it, is whether the climate could help control the spread. If we look at the early data sets, these are the January and February data sets, from different cities in China plus several others such as Hong Kong, Singapore, Taiwan, and Thailand, you see that the absolute humidity and the temperature do have some kind of influence on the R0 values. As we see here, it's not very strong, but we can see that a higher temperature tends to go with a lower R0. For example, here, here, and here are the coldest places in China, and their R0 is like 3.0, whereas in Thailand, which is a hot place in our winter time, the temperature is high and the R0 value at that time is quite low. And when we get more data into the May and June time frame, look at how the temperature plays here. Just look at these different countries: Brazil, India, Russia. Russia's temperature is definitely increasing when we get into the June time frame. And if you look at the line here, the daily confirmed cases are increasing and then flattening out.
And for the India case, their temperature is also increasing, but their temperature is already high by itself. So if you look at the orange, the yellow color, it's increasing. So that means the temperature probably didn't play a big role here in controlling the spread of the virus. So another question we want to ask: is the pandemic biased? We took a look at the data sets published by some states. Again, we don't have the data sets from every state, that is, the death numbers of every state broken down by racial background. If we look at this, we can see that Hispanics and African-Americans were hit the most here. But what's the reason behind it? There's a lot of research ongoing. For example, in a further study we found that low income and the living environment had a large impact in this case. Another question we want to ask: with so many policies and administrative measures in place, is it safe to reopen now? This is a stringency index, where a higher stringency index means stricter restrictions on the movement of people. In China, from the end of January they applied very strict rules so they could push the numbers down. In fact, there is a lag of about 14 days for that. But after that, their reported numbers were very low. Now they have had a resurgence in Beijing in the past few days. But even after they got the numbers down, their stringency index has stayed tight, as we see here. Versus that, if we look at other countries, for example India, they applied very strict rules by the end of March and began to loosen them by the end of April. You see that by this time we would expect exponential growth, but it was flattened here. And after they loosened the restrictions, the cases are now climbing a lot. So you can see the correlation here. The policy did work to some degree, but is it safe to reopen? Not yet.
And if we take a close look in the United States at different states, for example Florida and Texas: Florida started to put policies in place at the end of March, started to loosen them by the mid-April time frame, and loosened them further at this point. And you can see that this is probably the reason why the confirmed case numbers are not controlled. In fact, they're on an increasing trajectory, and as of yesterday they had their historically highest daily new confirmed cases. And if you look at Texas, it's about the same time they put the policy in place. Then they started to relax the restrictions by the early May time frame and relaxed them even further by the end of May. That probably explains why Texas is also on an increasing trajectory. So these states, as we see from here, in fact reopened too early. That's not working well. On the other side, if you look at the case of New York, they have always had strict policies in place and they have a phased reopening. And there the daily reported new cases are well controlled. Another question we want to answer is whether we have enough medical resources. We collect these data sets on a daily basis, and they're available. The dashboard is based on ArcGIS, the data sets, and some of our own development. The idea is that we look at how many people need hospital beds, how many beds we have for dealing with contagious disease, and the critical staff for supporting them. The bigger the proportional circle, the more resources we need, that is, the more deficiency we have over there. In the April time frame, we have some here. When we get to the May time frame, we have several places that jumped up and need more medical resources. And when we get to the June time frame, now, we have more places that jumped up. With the summer going on, if we have more cases or a second wave of the outbreak.
It's for sure that medical resources will be in big deficiency. Another question, more important to us, is whether we could have an in-person fall semester. I know this morning both Josh and others, like Sean, introduced ABM. This is work in which we adopted an ABM model and built some of the criteria into it, dropping in one or a few infected cases among about 1,000 people and seeing how long it takes for everybody to be infected. If we don't apply any constraints on campus, this is going to be the case: very quickly a lot of people become infected. If we keep the six-foot social distance, it will be much slower. And if we keep the six feet and also require 36 square feet per person, for example in our classrooms (a classroom that used to hold 100 people we now reduce to 15 people, so everyone gets about 36 square feet of space), this will be much better controlled. So for these three lines: if we don't have any control and just let the campus open, it will be like this. If we have some restrictions, it will be like this one. And if we apply stricter ones, it's going to be like this. We're still working on this, trying to add more criteria to see how we could open and what's the best way to open, but as long as we open, we have risks. And if we really want to control this, we need to trace who comes to campus with the virus and who he or she has been in contact with. So as a summary, I want to report what kind of geospatial cyberinfrastructure is needed. That's what Harvey and the committee asked me to answer. The first is transparency, which is something we need to look at from the geography side: if we want to control this, we need access to personal or community-level data sets and mechanisms, but we know that there are privacy issues, so how to balance transparency and privacy is the key. Another one is data quality.
And as we heard from the previous presentation, a reliable, globally distributed data collection and validation system is key, both physical and social. That means we need a lot of people to help us validate the data sets. For example, when we collected the data sets yesterday, we saw that a few countries had changed their data for the past week, so we had to update those. As for decision makers, we hope that everyone could have geographical thinking or spatiotemporal thinking, so that when they do the reasoning, the decisions become science-based and fact-based instead of just coming from their minds. It's also cross-domain. As you see, every figure I showed here had a team of people with different backgrounds working together to put it in place. We also need to deal with diversity. For example, to deal with the global situation we need people in almost every time zone, and of every gender and different races and cultural backgrounds, to help us. And most important of all, we really need a collaborative spirit across domains, states, and countries. There are some references here which detail what I reported today. This is supported by NSF, in collaboration with Harvard and the members of the center. Thank you. Okay, thank you very much for that presentation. Again, we'll hold the questions to the end just to make sure we keep on schedule. So, our next speaker is Elizabeth Root. She is a professor with a joint appointment in the Department of Geography and the Division of Epidemiology at the Ohio State University. She is also a faculty affiliate of the Translational Data Analytics Institute and serves on the leadership team for the Institute for Population Research. Dr. Root is a health geographer whose research focuses on evaluating place-based health interventions using geospatial analysis, geographic information systems, and large administrative data sources.
Dr. Root actively engages with local and state government, including the Ohio Departments of Health, Medicaid, and Mental Health and Addiction Services, to explore ways in which state data resources can most effectively be used to inform policy and target health programs. Most recently she began working with InnovateOhio to build a multi-agency platform which integrates administrative data sources from health, education, public safety, and the judicial system. This big data resource is assisting the state in monitoring and surveillance of the COVID-19 pandemic, and is being used to address other ongoing public health crises such as the opioid epidemic and racial disparities. This effort evolved from the HEALing Communities Study, an approximately 70 million dollar research effort funded by the National Institute on Drug Abuse to reduce opioid deaths by 40% over three years in 18 counties in Ohio. Elizabeth, the floor is yours. Thank you, Harvey. So what I'm going to talk about today is contact tracing, and how we trace COVID cases through the population to hopefully put a stop to the epidemic as it grows and spreads. I've actually been embedded with the Ohio Department of Health for the last three months. I sit as part of the data team. And I've been privy to a lot of discussions as the state grapples with a lot of the conversations that we've already had in this panel. What data is out there, which one do we use, what do we choose, what in the world do we make of the White House gating criteria? And one of the most intractable problems the state's been grappling with is actually around contact tracing. So, for those... sorry, next slide. I forgot I'm not controlling my own slides.
So for those folks who aren't familiar with the basics: we've heard a lot about contact tracing in the news, but essentially it's a process where you trace and monitor the contacts of people who've been infected with COVID in order to control its spread. What happens in the real world is that a person is identified through testing as COVID positive, and that has been a challenge in many states: actually identifying COVID-positive patients. We know we're missing a lot. So first you need to identify an infected individual, and then typically you use some form of social network analysis, whereby you ask that individual about their social networks and construct a disease network for an investigation. Contact tracers will then contact the contacts of that infected individual, ask a series of questions, and tell them to either quarantine, stay at home, or go get tested. In that way you box in the disease, right? You basically cut off these networks quickly and efficiently so that it doesn't spread any further. Clearly this is very challenging, because you have to do a very good social network interview, which is a pretty challenging process. Next slide. The interesting thing about contact tracing is that much of what we know and what we've developed about it comes from the HIV epidemic. Contact tracing is actually a fairly new structure in public health science, and many of the diseases that it was initially created to look at, or to stop, were spread through things like sexual networks, so there was a physical contact and a social relationship that existed. So it was easy to do that form of contact tracing that really relies on social networks. And so, as we face the COVID crisis. 
The question is, what if a disease is not transmitted through personal contacts? What if it is spread through airborne transmission or contaminated inanimate objects like doorknobs? How do you then do contact tracing, and how do we restructure the way we do it in light of a very different mode of transmission? I'm sure some folks on the seminar have seen the latest research that came out about two days ago, which more or less confirms that a lot of transmission is likely airborne. We thought that was true, but there was a really good piece of science just published on that. What this means is that we need to know where a person went, not just who they interacted with. So there's a really deep need to incorporate geographical contacts into social network analysis and into contact tracing in order to adequately trace something that's spread through the air and through contaminated objects. Next slide. Another issue, of course, is that mobility plays a key role in disease transmission, as many of the other speakers have shown. This is a study that looked at transmission between Wuhan, China, and Singapore, and at what occurs when an individual who is ill introduces the disease into a new population. This particular transmission chain that was mapped was linked to an imported case from Wuhan. One of those people visited a church, then a person from that church visited a family gathering, and then somebody from that family gathering went to another church. 
And so it spread through the social networks, right? One individual had a social network that was church related, and another had a social network related to the family gathering. But there's also a mobility and diffusion process that occurs between social networks, and some of that is related to a location like a church or a family gathering. So the spread of the disease, especially with something like COVID-19, is through co-location in space, not just through a social network or some other form of network contact. There's not much research on these sort of socio-spatial or social-geographical networks, partly because it's quite hard to do. But there was a really interesting study on the first SARS, not SARS-CoV-2 but the original one in the early 2000s, where they looked at both personal contacts and geographic contacts. When they found SARS-positive individuals, they asked: what did your social network look like, and who did you interact with? And on the flip side they also asked: could you name all of the places that you visited prior to us identifying that you were sick? When they added those geographical contacts to the social contacts, it increased the networks by which the disease could spread something like 100-fold. One of their socio-spatial network analyses is shown in the graph on the right-hand side, where you can see that there are areas or hospitals where transmission occurred, and then there are people who were in those locations, who also interacted and had their own social networks. 
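[Editorial illustration] The idea of adding places to a contact network can be sketched in a few lines of Python. This is only a minimal sketch, not the method of the SARS study mentioned above: the people, places, and interview answers are all hypothetical, and co-location is simplified to "visited the same place" with no notion of time.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical interview answers: named social contacts and places visited.
social_contacts = {
    "case_1": {"spouse_1"},
    "case_2": {"friend_2"},
}
places_visited = {
    "case_1": {"church_A"},
    "case_2": {"church_A", "family_gathering"},
    "cousin_3": {"family_gathering"},
}


def social_edges(contacts):
    """Undirected person-person links from named social contacts."""
    return {frozenset((p, q)) for p, named in contacts.items() for q in named}


def colocation_edges(visits):
    """Undirected person-person links implied by having visited the same place."""
    people_at = defaultdict(set)
    for person, places in visits.items():
        for place in places:
            people_at[place].add(person)
    edges = set()
    for people in people_at.values():
        for p, q in combinations(sorted(people), 2):
            edges.add(frozenset((p, q)))
    return edges


def reachable(start, edges):
    """Everyone connected to `start` through any chain of links."""
    neighbors = defaultdict(set)
    for edge in edges:
        p, q = tuple(edge)
        neighbors[p].add(q)
        neighbors[q].add(p)
    seen, stack = {start}, [start]
    while stack:
        for n in neighbors[stack.pop()]:
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return seen - {start}


social_only = social_edges(social_contacts)
with_places = social_only | colocation_edges(places_visited)

social_reach = sorted(reachable("case_1", social_only))
spatial_reach = sorted(reachable("case_1", with_places))
print(social_reach)   # contacts found from social links alone
print(spatial_reach)  # contacts found once shared places are added
```

In this toy data the shared church is the bridge: it links two otherwise unrelated social networks, so the set of people who may need tracing grows once places are included.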
So what this study confirms is that including geographical locations as nodes on a social network allows you to do a couple of things as you're trying to understand the transmission of the disease, or as you're trying to figure out how to do effective contact tracing. It allows you to visualize the role that location plays in disease transmission. It allows you to reveal potential bridges among geographical locations or among different social networks. And it shows you how the disease jumps into new social networks, so that you can understand how it may be spreading from one unrelated social network to another. Next slide. The issue with contact tracing, as we look at a graph like that, is that manual contact tracing, which is what many of our states and local governments are doing, is extraordinarily labor intensive. For highly contagious diseases, there's an estimate that we need about one contact tracer per 1,000 population, which would mean that in the United States we need somewhere on the order of 300,000 contact tracers in order to effectively trace all of the individuals in the networks who are known to the patients. Manual contact tracing is also less effective with a disease that has a very high asymptomatic rate or a long pre-symptomatic period. And unfortunately, COVID-19 seems to have both of those characteristics. There's a wide range of estimates of the asymptomatic rate, but most folks have settled somewhere on the order of 20 to 30% of all cases being asymptomatic. And it does have a fairly long pre-symptomatic period, potentially up to two weeks, although the median seems to be around five to six days. 
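[Editorial illustration] The staffing estimate above is simple arithmetic, sketched here in Python. The one-tracer-per-1,000 ratio is the figure quoted in the talk; the population figures are rough assumptions added for illustration and are not from the talk.

```python
PEOPLE_PER_TRACER = 1_000  # ratio quoted in the talk

# Assumed, approximate 2020 populations (not from the talk).
ASSUMED_POPULATION = {
    "United States": 330_000_000,
    "Ohio": 11_700_000,
}


def tracers_needed(population, people_per_tracer=PEOPLE_PER_TRACER):
    """Number of tracers at the given ratio, rounded up to a whole person."""
    return -(-population // people_per_tracer)  # ceiling division


estimates = {place: tracers_needed(pop) for place, pop in ASSUMED_POPULATION.items()}
for place, count in estimates.items():
    print(f"{place}: about {count:,} contact tracers")
```

Under these assumed populations the national figure lands at the same order of magnitude as the roughly 300,000 tracers mentioned in the talk.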
So if you haven't identified that an individual is sick, then clearly the contact tracing, where you go out and ask those social networks, you know, are you sick and all of that, that part doesn't happen. The asymptomatic spread does not stop. The other thing about manual contact tracing is that effectively tracing geographic or place-based contacts is very, very challenging. If you think that contact tracing involves a person on the phone calling you and asking you a series of questions, it's quite often easier to say, well, these are the people that I've interacted with heavily over the last two or three days. I can say I've interacted with my husband, and I've seen this person, and I've seen my children. But what I can't necessarily do is name all of the places that I've been, and not just where I've been but the places that I've passed through too, especially for people who are more mobile during their day. And so the question is, how do contact tracers doing that interview figure out who else was in a location when you were there? You may be able to say, hey, I was at this office building for two hours. And then the contact tracer should ask you a follow-up question: well, do you know who else was there? But that process of then contacting those people is very, very challenging. And you can see why it might take so many individuals to do effective contact tracing, especially if you need to add in this geographic component that we know is important with COVID-19. So one of the proposals that's been on the table is that we should invest in digital contact tracing as a way of automating the collection of this contact data using some form of smartphone application. Next slide. What this involves, and there are a lot of different models out there, is basically that a phone typically sends out regular Bluetooth pings to nearby devices. 
It's just part of the functionality, unless you turn it off on your phone. What these contact tracing applications do is, if another phone stays within a six-foot radius for 15 minutes (you can change that, but about 15 minutes), the phones exchange a code, and that code gets stored somewhere in the cloud. Then if a smartphone owner tests positive, they can put into their app that they've tested positive, and the app will send an alert to all the other devices that it exchanged codes with in the last two weeks. And it could end there, right? You've empowered the individual to say, oh goodness, I've been potentially exposed, I'm going to go get tested. But, if allowed, health providers could actually access that contact log information and build it into the contact tracing network that they have. So this is clearly an extraordinarily good use of technology. There are a couple of challenges to it, though. While it is more effective for tracing these geographic contacts, because it's literally sending out a ping if you're in geographic contact with somebody else, there are many players in the game right now. At the Ohio Department of Health that I'm working with, we've had no fewer than five separate entities contact us, offering some form of contact tracing software that we as a state could roll out and build into our contact tracing. They all have strengths and weaknesses, and there's not a lot of transparency about how they're being built or what they're doing. The other big one, of course, is that there's a tremendous amount of concern about privacy. I was in the room with some of my colleagues, and they said, this is way too Big Brother for most Ohioans. They did not believe that this technology would be widely accepted by the population of this state. 
The third concern with this technology is that, especially in rural communities, and I speak for Ohio here, where we have a large Appalachian rural community in the south and southeast of our state, cell phone connectivity is very spotty. It pops in, it pops out. There are huge swaths of territory where there's really no signal at all. People tend to turn off their Bluetooth in those regions. So just the geographic accessibility of these tools would be a challenge in many states as well. So while there are a lot of inherent benefits, and this technology could track geographic contacts extremely well if employed properly, it's a really challenging technology to roll out, especially when you think about rolling it out as a whole. So next slide. This was a Science article published last month that looked at how this technology would work. They were studying how much more quickly cases could be isolated, and therefore how much faster we would see a decline in the epidemic. So, say subject A has a COVID-19 infection. They don't have any symptoms, they don't know that they're infected, and they have their phone at home. At home they have maybe a spouse or a partner, person B, and that person potentially gets infected. They then take the train to work the next day and they sit right next to persons C and D. That train ride is half an hour, so C and D are coded in the app as having been in geographic proximity. In their workplace, they have a pod of people that they work with, persons E, F, and G, and those people are all exposed because they're in close proximity in that workplace. But H and I work on a different floor. They're still in the workplace, but on a different floor, so the app does not necessarily include them in the contact tracing. And then that person goes home again, and this might be a typical day. 
That person wakes with a fever, they report their symptoms, they get tested, and they're found to be positive. As soon as they tell their phone that they're positive, it sends an instant signal to persons B, C, D, E, F, and G that says, hey, you've been exposed to somebody who's now COVID positive. And then you could code it to say whatever you want: self-isolate for 14 days, please contact your local public health department, you should go to a testing facility, and hey, here's a map that shows you where the closest testing facility is. And then H and I may be advised that they might have been exposed, but that they're at much lower risk, so there isn't as much of a concern. So it's clear from this diagram how much an app like this would benefit our understanding of contact tracing in the community and our ability to do it effectively. Next slide. The real challenge is that most polls in the United States, and now abroad as well, have shown that there's fairly limited support for digital contact tracing in a large portion of the population. This is a recent poll that was done by the Brookings Institute, and they asked people, what's the likelihood that you would download and use a contact tracing app if it was provided to you, either by your state or some other entity? What you can see here is that among all respondents, more than half said extremely unlikely or somewhat unlikely. That's a big proportion of the population. Of course there's variation by race, by age, and by sex. But the take-home message is that there would be a limited group of people that would likely use a contact tracing app. And there are a couple of studies out there that show between 50 and 60% of the population would have to actively use the app on a daily basis in order for it to really provide the data and information necessary for this to work at a population level. 
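[Editorial illustration] The subject A scenario and the adoption problem can both be sketched in Python. This is a simplified sketch under stated assumptions, not any real app's protocol: real apps exchange anonymous codes rather than names, the `Phone` class and its methods are hypothetical, and every duration except the 30-minute train ride is invented for illustration. The adoption function assumes people install the app independently of one another, in which case a contact is only logged when both parties have it.

```python
from dataclasses import dataclass, field

CLOSE_CONTACT_MINUTES = 15  # duration threshold quoted in the talk; configurable


@dataclass
class Phone:
    """Hypothetical phone that logs minutes spent within about six feet of others."""
    owner: str
    minutes_near: dict = field(default_factory=dict)

    def record_proximity(self, other, minutes):
        # Both phones log the encounter, mirroring the two-way code exchange.
        for a, b in ((self, other), (other, self)):
            a.minutes_near[b.owner] = a.minutes_near.get(b.owner, 0) + minutes


def contacts_to_alert(positive_phone, threshold=CLOSE_CONTACT_MINUTES):
    """Owners who spent at least `threshold` minutes near the positive case."""
    return sorted(
        owner for owner, minutes in positive_phone.minutes_near.items()
        if minutes >= threshold
    )


def share_of_contacts_captured(adoption):
    """Share of contacts where both people run the app, assuming independent uptake."""
    return adoption ** 2


phones = {name: Phone(name) for name in "ABCDEFGHI"}
a = phones["A"]
a.record_proximity(phones["B"], 600)      # evening at home (assumed duration)
for name in "CD":
    a.record_proximity(phones[name], 30)  # half-hour train ride from the talk
for name in "EFG":
    a.record_proximity(phones[name], 480) # shared work pod (assumed duration)
a.record_proximity(phones["H"], 5)        # hypothetical brief encounter; I is never nearby

alerted = contacts_to_alert(a)
print(alerted)

for adoption in (0.5, 0.6):
    print(adoption, round(share_of_contacts_captured(adoption), 2))
```

Under the independence assumption, even 50 to 60% uptake captures only about a quarter to a third of contacts, which is one way to see why limited willingness to install the app matters so much.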
So next slide. I think it's clear that understanding geographic mixing and geographic proximity for a disease like COVID-19, because it has an airborne transmission mechanism, is really quite important. And our traditional contact tracing methods, which were developed for HIV and other STDs, are not a super effective way of pulling that geographic perspective into contact tracing, and that is a problem. We need to reimagine the way that we do contact tracing to take into account mobility and geographic location. But the problem is that the promise of the digital app has not become reality, and I'm not sure that it will at this point. There have been widespread problems with adoption, effectiveness, and data privacy. These are a couple of headlines I pulled out of the news. Utah's multimillion-dollar contact tracing app still can't track person-to-person contacts. So that's a functionality issue. Utah decided to go this route and have a go at it as a state, creating a contact tracing app. Coronavirus in Texas: there have been delays and privacy concerns that have slowed the effort to use digital contact tracing. And some of our European neighbors did this too. The British health service just got one up and running, and Norway was one of the earlier adopters. Several of them started the app and then suspended it because there are a lot of concerns around privacy, how the data are used, and who gets to see the data. And if health departments can't use and see the data, is it useful for stopping the spread of COVID? The WHO director actually made this statement. He said digital tools do not replace the human capacity needed to do contact tracing. 
And I think that if there had been an effort led by the CDC or some federal entity to develop a solid contact tracing app that could have been rolled out across the United States, we might be in a different boat now. But because of the wide variety of different products that were offered, and the pluses and minuses of those, it's been a really challenging process. And I'm not sure I personally see the path forward yet. So I think I'll leave it there, and I look forward to discussion or questions at the end of this portion of the panel. Okay, thank you, Elizabeth. Terrific. We'll move on to our final speaker before we have questions at the end. The final speaker is David Blazes. He joined the surveillance and epidemiological program at the Bill & Melinda Gates Foundation in May 2016 after serving as a physician epidemiologist in the Navy. In his 21-year Navy career, he served as director of infectious disease research at Bethesda Naval Hospital, department head of infectious disease on the hospital ship USNS Comfort, and director of the emerging infections department at the Naval Research Unit 6 in Peru. His work there involved developing disease surveillance systems in resource-limited settings, working with local officials responding to outbreaks, and characterizing several novel pathogens. After returning to Washington, DC, he directed the Department of Defense's global disease surveillance effort and contributed to the department's global health security work. He served as chief advisor to the Navy Surgeon General on infectious disease, as well as on several National Security Staff interagency working groups dealing with global health security and health diplomacy. His academic work has appeared in Science, Nature, and the Lancet, and he has recently edited a textbook on technological innovations in disease surveillance. 
He has previously served on the Institute of Medicine's Forum on Microbial Threats and the Infectious Diseases Society of America's Global Public Health Committee and Pandemic Influenza Task Force, and he currently serves on the National Academy of Sciences Board on Health Sciences Policy. At the foundation, the Bill & Melinda Gates Foundation, David serves as a relationship manager for the Institute for Health Metrics and Evaluation, which we heard from previously, and manages a portfolio of grants around burden of disease modeling, geospatial mapping, and next-generation genetic sequencing of pathogens with epidemic potential. David, please, the floor is yours. Hi, thank you very much for the invitation to join your group and speak. As you mentioned, I work at the Gates Foundation. And in this unprecedented time, we certainly are eager to apply some of the geospatial technologies that have already been discussed. Just for brief background, we've been able to commit $255 million to the COVID response. A lot of these funds are directed toward diagnostics and therapeutics, as you might expect. But we definitely invest in epidemiology and modeling, especially as it can inform our future investments, and hopefully those data and information will also be used by others to inform their own policies. One disclaimer up front: I am not a geographer, like many of you on the call, but I'm an active user of maps for public health decision making, and what I'd like to talk about is how we can improve the precision of our health decision making. I think there's been a lot of progress in this space, and I think the full potential has not been realized yet. So in many ways, this is still aspirational. But I think we are getting there, and I think this pandemic is really demonstrating the potential power, as many of the previous speakers have already talked about. So go to the next slide. 
Yeah, so as you know, this is an unprecedented situation and something that none of us alive have really faced. And so many of us are learning as we go. I think that a lot of the public health actions that we know, that have been tried and true in the past, are pretty blunt, and I think we've seen them work in various settings. But as I mentioned, these are instruments that apply across a population, and there are obviously bystander effects that are social, economic, on health systems, or all of the above. So I think as we go forward it's important to apply precision in this space, and I think geospatial mapping allows for that potential to improve precision. The value here is that it can make our public health actions more efficient and more cost effective, and hopefully both, in delivering health improvements. There are many ways to do this, some of which have already been mentioned, including contact tracing, isolation, and quarantine. I think the key is to pair information at as precise a level as possible with accurate testing and access to control measures that can be precisely delivered. This can be things such as a hybrid response, which can include all of these things. Some of the things we're looking at at the foundation are, how do we target clinical trials? How do we know what the likely burden of disease will be in a country, whether it's the United States or elsewhere, where a clinical trial can be done on a vaccine or a therapeutic? I think we've all just seen some of the information out of Oxford recently about dexamethasone. How can we precisely decide where to deliver that type of intervention? If you go to the next slide. A lot of what I'll present in the next few slides is based on the work of my grantees, some of whom are on this call. 
Here are a couple of maps generated by Alex Vespignani's group up at Northeastern University. He has an agent-based model that essentially tries to predict the projected cumulative deaths due to COVID in both unmitigated and mitigated circumstances. It's a great model, and I think it has a lot of potential. My only point here is that currently the model predicts at a countrywide level, and I think it would be better to eventually project at a subnational or even higher-resolution level for public health action. So go to the next slide, please. I think this precision type of model has been pioneered by Simon Hay, who you've already heard from today, and his team at IHME. You can see that the national level certainly gives some resolution, but if we can get to much more precise levels, we can make decisions about how to deploy vaccines, therapeutics, or even diagnostics. And so I think that paradigm is what we are aiming for in terms of precision public health. Go to the next slide. There are a number of different data layers that can be added to these maps to improve their accuracy, and this can include travel time, which you've also heard about today. Many different epidemiologic factors can be added. I think you heard about temperature and climate. PM2.5 and various other environmental risk factors can be added too. And this can be done either to predict where outbreaks may occur or, during outbreaks, to trace the spread and potential impact on various related geographies. Go to the next slide. Next slide, please. Yeah. Past it. Go back one. Yeah, here. I think we've all realized recently that there is an age distribution of deaths and severe disease in COVID. And so I think recognizing how populations are distributed around a country is important. This is a map from Andy Tatem at WorldPop, and you can see the population pyramid in the upper right corner. 
And then the color of the map, by subregion or subnational geography, shows the percent of people over 80. Obviously those populations would be at higher risk, so you might target those areas for quicker intervention. Go to the next slide. Another thing we are looking at, at the foundation and elsewhere, is not only COVID deaths but also the impact on health systems and other illnesses. I think we all know there are many other diseases that have potential impact. Malaria is one of those, and certainly having an underlying understanding of where malaria occurs, and where the highest rates occur, can lead to understanding where we might prioritize interventions going forward. Obviously with COVID, many of the ACT and diagnostic efforts have been hampered, so prioritizing where those might have the most benefit would be valuable. Next slide. I think there are also a number of potential responses that have already been implemented. One of those is access to water, sanitation, and hygiene. You can see here that knowing where those challenges are highest makes it possible to target the places where interventions should be employed. So in this case, targeting places that have less access to WASH would be important for control of COVID and other diseases. Next slide. I think that's coming up. A number of people have mentioned various interventions. One that is consistently becoming more recognized as valuable is wearing a mask, and Facebook has done a number of global surveys. I think this is another important layer to include in any mapping. Here we see that in the United States, and in places where there are much more granular data, you can get to a subnational level, but elsewhere we have just national-level data. And I think improving that would certainly make the implications more robust. Go ahead to the next slide. 
I think another important component of response is access to a health facility. This is certainly of concern in many places in our key geographies in Africa and South Asia. Certainly if you do have access to a facility, you can get diagnosed and you can get treatment where it's available, whether that's intensive care or hospitalization, oxygen, or even dexamethasone, given the recent data. So understanding exactly where a facility is, and where you might need to supplement the facilities that exist, is important. Next slide. I think testing is another challenge. As we've seen in the United States and elsewhere, access to testing is not universal, and understanding where testing occurs is really important. I think each country has some understanding of where they have testing in place, but that data is not available on a global basis yet. I think they're starting to pull this together in many places. IHME has done this for their models, and many other groups are doing this as well, the goal being to increase testing in geographies such that contact tracing can occur. Next slide. I won't go into this too much. I think mobility data is really important for a number of reasons, and obviously this data may be biased based on smartphone access. But I won't go into this because it's been covered extensively already. So next slide. IHME has looked at mobility from a number of sources: Facebook data, Google data, Descartes data. There are a number of different providers that have access to these types of data. And again, I think getting more and more granular would be more valuable for building these maps and then making public health decisions based on them. And then next slide. I put this up just out of interest. I think this pandemic has demonstrated that there are a lot of new actors in the field. 
Instagram appears to be one of those. They are calculating their own R effective, and I put this up in a mapping meeting because they are doing it by geography, by state. I think there are some methodological issues that are not ideal in the model itself, but it is important to note that they are categorizing it by geography, and I think eventually we could get to an even more granular display of key epidemiologic factors like R effective. So I'll stop there, and hopefully this was interesting in showing how we're using data such as these to inform our policies. Okay, thank you, David. It was very interesting and very useful to see how these data are being used. We're actually at break time, but we'll take maybe five minutes for a couple of questions, and I also want to encourage the other panelists to look at the Q&A panel and directly answer some of the questions. There are lots of questions and we don't have enough time to get to all of them. I want to ask one that was originally directed to Maria, but I think it's a more general question. The questioner points out that the ESIP Federation Disaster Lifecycle Cluster, the All Hazards Consortium, and the utility industry have worked together to improve the trust level of geospatial and other data sets, and they're doing this by assigning something called an operational readiness level to those data sets. So, is the USGS taking similar steps to apply trust levels to geospatial data sets? And I want to open it up after Maria to the other panelists. We've heard a lot about the data coming in and these measures based on data. How do we trust, and how do we measure the trust of, these data? Because obviously it's very important. So we'll start with Maria. Yeah, thanks, Harvey. I'll say that I actually reached back to several people in my organization to get a rough answer to this question, because I didn't know for all data sets. 
So the short answer is no, we are not publishing this information with each data set, although I'm very interested in following up with Dave, so if you could reach out to me afterward, I think that'd be terrific, to sort out how to connect these two groups together. But we do work with several groups, especially across the interagency federal groups, like disasters.geoplatform.gov and the data.gov platforms, which do take some sort of general view about who's authoritative here or there. But this type of trust in your data is really, really critical, especially with something that's relatively new for a lot of us data consumers, like pandemic information. Information about floods or the weather, or a lot of things that occur pretty regularly, is a lot easier for us to tackle, and to suss out authoritative data sets. In this pandemic, I think there's a lot of noise, and we've got to find the signal in there. Would other panelists like to comment on that? How do you know you can trust your data? Yeah, Elizabeth, how about you? Well, for us in Ohio, it's actually been a process of working with our data for the last three months. I think we're finally coming to some understanding of what our data are actually telling us, and what the strengths and limitations are in the way they're collected, or the changes that occur every time our testing strategies change in the state. So, unfortunately, there is some aspect of just having to work with the data, observe it, and understand what happens when policy changes, or when the ways in which the data are procured change. There's no magic bullet, unfortunately. It's been a long process, but at least in Ohio, after three months I think we're finally comfortable with the data that we have, and we understand it a little better. Okay, Phil, do you trust your data? Yeah, I just want to add to that. 
In fact, yeah, for today I may say I trust my data; tomorrow, when looking back: oh, wow. Absolutely, I agree with that. So yeah, an error could be introduced by a volunteer who's typing, you know, an extra zero or two zeros. Or it could be that, for example, even governments from different countries, their health departments, when they announce the data sets, after a few weeks say, oh, we need to revise that based on the new data coming in. So we have some confidence, but I would not say 100% trust in that. The way to do that is that we do go back to look at the historic data sets, so we have people checking that. And also, we check our data sets against others who are presenting data sets. For example, if the curve I use today is different from the one from Johns Hopkins, maybe there's something there that we need to look into. So it's very dynamic. Yeah. So David, do you trust the data of your funders? People receiving your funding? What can you say about that? Yeah, absolutely. So I think it is nice when two distinct data sets agree or conclusions can be drawn that are similar. So I think it is important to triangulate across data sources. I think that, you know, for instance, with contact tracing, it's nice to look at genetic sequence data from the pathogens and see if it corresponds to what the contact tracing allows. Okay, I think we'll have to, oh, I'm sorry. Go ahead, Maria. Harvey, actually, can I answer the second question, because I think it actually pertains to this? The second question, if you can repeat that question for the audience. Yeah, it's: what are the good reasons for not making the data public? And I actually think that this has to do with data trust, because in the USGS, we pretty much have two main reasons that we wouldn't make data public. We are a public service agency. We push everything possible out to the public. One would be national security. We collect data that we straight up can't release. That one's kind of a given.
And two, the reason that we would delay releasing data is QA/QC. And I think that that quality assurance step is something that we are all sort of rushing past with this pandemic. And I think that the press and many others are rushing to publish things before they're peer reviewed. And I think that we commonly will hold data just to check it for a little bit before we're confident enough in it to release it. And I think that a really critical step is to publish your QA/QC information so then we can gain confidence and trust in our data sets. They're very intrinsically linked. Okay, thank you. I think we have to call it there because the next session is at four and I want to give everyone a chance to take a Zoom break. So we'll end it there. Thank you, panelists, for a very interesting session. Also, I encourage the panelists to respond directly in the Q&A window to the questions that have been raised. So we'll quit now and we will gather again at four o'clock for the session on spatial indicators of resilience and recovery. See you at four. Okay, everyone, welcome back. And we have our third session and third topic, which will be on spatial indicators of resilience and recovery. And Kathleen Stewart will be moderating that session. Kathleen. Great. Thanks, Harvey. Hi, everyone. I'm Kathleen Stewart. I'm a professor in geographical sciences and director of the Center for Geospatial Information Science at the University of Maryland. And I'm a member of the Mapping Science Committee. Thanks so much for joining us today for our workshop and for being with us for our third session, which is, as Harvey mentioned, spatial indicators of resilience and recovery. So our first speaker in this session is Ed Parsons. And Ed is the geospatial technologist for Google with responsibility for evangelizing Google's mission to organize the world's information using geography.
In this role, he maintains links with governments, universities, research and standards organizations which are involved in the development of geospatial technology. Ed is a member of the board of directors of the Open Geospatial Consortium, and he was co-chair of the W3C/OGC Spatial Data on the Web Working Group. He also represents Google at the EMTEL committee of ETSI, developing geospatial solutions for emergency telecommunications. He's a visiting professor at University College London and has been an industry advisor to a number of international universities. Ed is based in Google's London office and anywhere else he can plug in his laptop. Ed was the first chief technology officer in the 200-year history of the Ordnance Survey and was instrumental in moving the focus of this organization from mapping to geographical information. He came to the Ordnance Survey from Autodesk, where he was an EMEA applications manager for the Geographical Information Systems Division. And he is a fellow of the Royal Geographical Society, an associate fellow of the Royal Institute of Navigation and a professional member of the British Computer Society. So Ed, we are ready for your talk. Thanks. Kathleen, thank you very much, and actually good evening here from London. It's the end of a rather cloudy, damp day, as perhaps you would expect; all the cliches are often very true. My presentation is probably going to be slightly different from those that you've seen earlier on today. I'm looking at the use of geospatial technology as we move out of this current outbreak, potentially looking forward to other outbreaks, but also looking at the impact of geospatial technology in restarting economic activities. And doing that from the perspective of looking very carefully at the ethical considerations that arise from this use of geospatial technology, which is going perhaps in a slightly different direction than previously.
Before I start, and this is something you'll be familiar with from a Google point of view, if you can sort of conceptually agree to this Terms and Conditions dialogue box: the comments that I'm going to make during this presentation are my own; they're not those of Google or any other organization that I might represent. I'm not going to say anything particularly controversial, I think, but nevertheless they are my own personal opinions. I'm sure we've all thought it's very strange. The last few months, we've been living in what felt like a science fiction film. This is from the science fiction film 28 Days Later that was popular a few years ago, a British film that covered the impact of an outbreak of a very violent and fatal infectious disease in London. And often it felt like we were living in this science fiction movie. Of course, being technologists and geographers, we all wanted to use data to try and improve the handling of this outbreak from an operational point of view as well as a strategic one. What do we do to help? And I think most of us have probably been frustrated over the last but more importantly to allow national and local policymakers to make decisions. There was a recognition, I think, clearly, that the commercial sector technology platform providers like Google and Microsoft and Apple and Facebook and Twitter had some access to data sets that potentially could be useful. And of course, we across the rest of the industry worked on that problem very, very quickly and produced the information that we thought would be initially useful. I'd just like to talk about some of the thinking behind creating some of these data sets and making them publicly accessible, because I think they then give us some indication of the direction of travel, where we might be over the next months and years, now that we've suddenly made access to this type of data and technology possible. The first is the creation of what we term Community Mobility Reports.
This is an idea of how people are traveling around the world and how to represent that information in a way that is both beneficial and useful to policymakers, but also privacy protecting. It's based on an inherent capability that we have with the mobile devices that we carry around with us today in our pockets. And this is something I like to call ambient location. It's that ability to always know where you are and the ability to share that information with service providers or with other applications completely seamlessly, with no particular effort on your behalf. So you are in effect collecting this information as you live your day-to-day lives. You collect this information often because it gives you a direct benefit. This is the local railway station to me in Southwest London; it actually looks pretty much like this now, pretty much deserted, with far fewer people using the trains than normal. But on a normal day, I would be able to pull out my phone and see represented on my phone exactly how busy the station was at any particular point in time, compared to what I would expect on average. That data we have been displaying on people's mobile phones for information about all sorts of different locations, largely public locations such as shopping malls or railway stations or airports, and it's been useful but not particularly prominent. Of course, with the outbreak and the need for government agencies to have this sort of mobility data, we worked on, okay, well, how can we make this information more available and accessible to people while maintaining individuals' location privacy? And over the years we've come up with a methodology that uses differential privacy: in simple terms, the application of random noise to people's spatiotemporal location to make sure that no individual can be located within the data set, but nevertheless it provides a statistically meaningful measure of how busy a particular location is.
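As a rough illustration of the "random noise" idea just described, here is a minimal sketch of adding noise to an aggregate visit count. The talk does not say which mechanism or parameters are used; the Laplace mechanism, the `epsilon` privacy parameter, the `sensitivity` bound, and the function names below are all assumptions chosen for illustration, not a description of the actual product.

```python
import random


def noisy_visit_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return true_count plus Laplace noise with scale sensitivity / epsilon.

    Assumption: each person contributes at most `sensitivity` visits to the
    count, so one person's presence changes the count by at most that much.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    rate = epsilon / sensitivity
    # The difference of two independent exponential draws with the same rate
    # is Laplace-distributed with mean 0 and scale 1 / rate.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


def percent_change(noisy_count, baseline_count):
    """Relative change versus a baseline period, in percent."""
    return 100.0 * (noisy_count - baseline_count) / baseline_count


if __name__ == "__main__":
    rng = random.Random(42)
    today = noisy_visit_count(380, epsilon=0.5, rng=rng)
    change = percent_change(today, 1000)
    print(f"noisy count: {today:.1f}, change vs baseline: {change:.1f}%")
```

A smaller `epsilon` means more noise and stronger protection for any one individual, while the average over many places or days stays close to the true value, which is why the aggregate remains statistically meaningful.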
We could abstract that out in terms of space and time to provide graphs like this, as you may well have seen, very much aggregated but telling you how these different categories of land use have changed over time from a baseline that was in January of this year. You can see here in Greater London big reductions, particularly in terms of people using much less public transport. So the key concerns behind doing this from a privacy point of view were these. This was based on a function in Google Maps called Location History that people needed to opt into. Was it appropriate to use this data even though potentially people were unaware of how this might be used for this particular outbreak? How could we make sure that we avoided misleading insights? And I think a very prominent, useful point was made earlier: sometimes it's not appropriate to share data. For example, how busy a hospital is is not necessarily the sort of information you want to share. What was the appropriate level of granularity, both in terms of space and time? Who should have access to the information? Actually, the mindset was that we should make this available as broadly as possible. And how long should this data set be available? It was created specifically to deal with the impact of this COVID epidemic for public policymakers. It's not something that we think should be there forever, and when the outbreak finally does disappear, we will retire this product. But that same data has also been exposed now more prominently for people returning back to work. The new normal is going to be an environment where we're much less happy to be in contact with other people. So here are two examples of technology that we've just introduced in the last few weeks, one crowdsourcing how crowded bus routes are.
So as a passenger you can say this particular bus is, you know, slightly crowded or more crowded than it usually is, and that information is then shared with other potential passengers. The other is very focused on transit stations and on public subway systems, where, you know, there's going to be a huge requirement to increase people's level of confidence in terms of using these services, much of that driven by information. Now, Elizabeth did a really good job earlier on describing contact tracing, or exposure notification as we prefer to call it, and there's a long history, well, in COVID-19 terms a long history, of doing this. At the very beginning of the year, Singapore was perhaps one of the first countries to introduce an app for doing the digital contact tracing that you've heard about, again using that Bluetooth LE technology, and of course we worked very closely with our friends at Apple to develop a framework to do this and to try to reach as broad a population of potential users as possible. A key point, raised again I think by my colleagues earlier on, is that this needs to be out there and being used by people to have a degree of efficacy. So the key points from the perspective of the technology and from a privacy point of view in the joint Google-Apple approach are that this needs to be opt-in. It should be something that you can choose to use or not. No personally identifiable information should be shared. So it makes use of anonymous IDs generated by the Bluetooth subsystem on your mobile phone. And really importantly, perhaps a little bit surprisingly for this audience, no location data should be shared. For many applications of this specific part of contact tracing, the ability to be notified that you've been close to someone who could have a potential infection, you don't need to know where that actually occurred. So here's basically how the system works. Elizabeth covered this very, very well.
Phones both generate and receive Bluetooth identifier strings; they regenerate every 20 minutes or so, randomly. The key part of the Google-Apple solution is that the matching of different keys to infected and potential carriers is actually done on the phone itself. It's not carried out in the cloud anywhere, minimizing the risk of data being shared more broadly. It's very important to focus on that privacy-preserving part: as I said, there has to be explicit user consent. No location data is shared. Positive results are not shared with Google or Apple. They only go to the public health agency that's been identified, and Google and Apple made a conscious decision only to work with specific health agencies identified on a national or local basis. You have to be whitelisted to make use of this functionality to build your app. And here is an example of what those apps might look like. Okay, that's kind of covering the territory of what we've done up until this point in time. I now want to just kind of conclude by looking at where this might go forward from the point of view of the ethics of using this sort of information. Clearly, we have been living in an emergency situation. This is something that, you know, none of us really expected. And we've had to develop capabilities and functionalities that perhaps have surprised all of us; we've done things that perhaps we wouldn't normally have imagined doing. No crisis is about choosing good solutions; you're choosing the least worst solution. But in cases of emergency, what you do is you come to your emergency and you break the glass. When you break the glass, you do things that perhaps you wouldn't normally do. And that's an appropriate circumstance. Having a protocol that you could then adopt when you're in an emergency situation makes a lot of sense. You do end up doing things you wouldn't normally do. But when you break the glass, it stays broken.
It's very easy for us to adopt processes or techniques that perhaps we might not want to see persisting. We may do things in the stress of dealing with a pandemic that actually create the foundations of an environment that we might not want moving forward. And let me just give you two examples. You know, I'm not going to be explicitly criticizing these approaches. People take different approaches to dealing with public health issues, but it's interesting to think about that. This is the contact tracing app in India. And it's something that you actually have to install, by law, on your mobile device. There's no choice. There's no opt-in; the Indian government has made the choice that, no, this is something that you should install. In Hong Kong, there's a very sophisticated contact tracing approach that makes use of lots of different types of hardware. It's not so completely reliant on a mobile phone. You also use just simple wristbands, or wristbands with additional RFID or Bluetooth sensors. This is of particular use when you have been quarantined. If you've been quarantined, you'll get a message from your health agency saying that you have been quarantined. You're expected to stay at home. And as part of the process of being quarantined, you need to walk around your home so that your device, your mobile phone or one of these wristbands, will pick up all the RF signals in your particular house and will build a geofence of your particular property. It knows the radio footprint of the property that you live within. If a change in that footprint is sensed because you've left your property, you can then expect a rather nasty message telling you that, hey, we know you've left home. You may well be prosecuted, or you'll at least have a warrant acted against you because you have broken your quarantine. So this is a use of geofencing technology to restrict people to a particular part of the planet.
Now, you might argue that under these circumstances that's appropriate, but it's a technology that, once it's rolled out in such a sense, is there. So I think we have to be very cautious. Fortunately, I think we're beginning to have much broader conversations around these topics at this point in time. There's a number of activities afoot. I'd particularly point out the EthicalGEO organization, which is a part of the Association of American Geographers or the American Geographical Society, that is working on this. I originally blogged on this topic as well. I think it's really important for us as an industry and as a technology to start to think about the ethical implications of the technologies that potentially may be rolled out over the next few months with the best intentions and with the best of society in policymakers' eyes, but we need to be wary of what the long-term implications of that technology might be. So thank you very much for listening. I'm going to be hanging on and hopefully answering some questions at the end, and please feel free to reach out to me via email or via Twitter. Thank you very much. Super interesting. Great. Thank you so much, Ed. I appreciated those thoughts and information very much. Okay, well, let's save our questions to the end and we'll move on to our second speaker, who is someone I know from the University of Maryland and am very happy to introduce: Professor Lei Zhang. Lei Zhang is the Herbert Rabin Distinguished Professor of civil engineering and is director of the Maryland Transportation Institute at the University of Maryland. He is a professor at the College Park campus. Dr. Zhang's research focuses on innovative mobility solutions, travel behavior, smart cities and decision support tools, driven by data, AI and cloud computing. He leads several federally funded initiatives on leveraging mobile device and other emerging data sources for improved understanding of spatial behavior and modeling. Thank you for your talk.
Thank you, Kathleen. And if I could have my slides up. Good afternoon, everyone, and I want to thank Harvey, Kathleen and the other organizers for this great virtual workshop. It's my pleasure to be here to talk about some of the work at the Maryland Transportation Institute and the University of Maryland on how we measure mobility, social distancing and economic impact with anonymized mobile device data. I'll put our website up there; in case you have not visited it, I would encourage you to go to data.covid.umd.edu, which is the place where we give the general public access to the measures we compute on these specific factors. Next slide please. And this is what the platform looks like. What we wanted to do is to really compute a variety of different measures for decision support using mobile device data and other data sources all across the US. And honestly, we also caught a lot of media attention. I have to admit that I never thought I would catch such media attention for discovering something that's fundamentally so basic that everybody knows it, which is that people really do not like to stay at home for way too long. You know, that's an interesting behavior discovery that we found early on: after people had been staying at home for more than a month, they just decided that's too much and they decided to go out. Next slide please. And there were questions, next slide please. There were questions early on about how mobility and social distancing and these factors could be computed from different kinds of data sources, and we certainly do that as part of our platform, like many others. So, you know, we track our social distancing index, which is based on the anonymized mobility behavior data, and we look at that at a state and county level on a daily basis. So, you know, if you go to the website, you can see the holistic view of how social distancing has been evolving between January 1 this year and the most recent day for which we have data.
The clear thing is that the level of mobility and the level of social distancing really vary a lot from state to state, from county to county, from community to community. Next slide please. And we also understand it's important for both the general public and decision makers to be aware not only of the mobility trends, but also of other COVID public-health-related factors, as well as economic indicators. So we also leverage our mobility data to compute several different economic indicators; the percentage of people actually working from home, the number of workers actually working from home, is one of them. But we also look at changes in consumption based on the kind of data that Ed just showed you, in terms of visits to different kinds of consumption sites. We also actually estimate job gains and job losses by economic sector at a county level, based on this kind of anonymized behavior data. Next please. Finally, you know, I consider ourselves a leader in transportation and mobility data analysis. We didn't start doing this because of the pandemic. Our CATT Lab, which is the data center affiliated with the Maryland Transportation Institute, has been working with mobility data and transportation data from passively collected sources for the past 20 years. In the beginning we only got data from GPS devices embedded in vehicles, or other person-wearable GPS devices, or cell carrier data. More recently, starting about six or seven years ago, we started working with mobile phone location-based data from apps from different SDK units. And our platform right now serves more than 12,000 public and private sector data users all across the US as well as in Europe. Next slide please.
And the specific data I want to actually dig a little bit deeper into on the next slide is the primary data source we're using, and Ed did a great job talking about ethics and privacy protection for this kind of data. And I want to dig a little bit deeper into what this kind of data looks like. We all know this kind of mobile device location data is biased. Number one, because smartphone penetration is not representative of the general population, it's naturally biased against the lower income and also older age groups. And even if we see a device in our sample, it doesn't mean we see the device every single day, like in a longitudinal survey. Even if we see a device on a particular day, we may not see the device in every single hour of that day. So there are a lot of geospatial, temporal and socioeconomic biases in the data. I do want to spend probably four or five minutes talking about the kind of work we've done to help set national standards for the quality of this kind of data, so we have more confidence in the results that we obtain from it. And by the way, we can also impute all kinds of information from the data: not only where people go, with privacy protection, but also how they go there. We impute travel modes, purposes of the visits, as well as sociodemographics, in a way that also protects privacy. Next slide please. There might be a delay, so can we go to the next slide? Yeah, thank you. And so one thing we've been working on, in collaboration with the US Department of Transportation, you know, this work started about three years ago, is to set raw data standards for mobile device data, because of all the biases and all the issues with this kind of data.
How can we be sure? And when researchers publish a paper using this kind of data, they cannot share the actual data set they use, and that also creates a problem: how can we replicate the results if we cannot share the raw data because of privacy protection? What we see as a way to deal with this is to come up with standard quality metrics, and this is the place where I think geographers can help a lot. We did look into the typical data quality standards, like the devices we see on a monthly basis, MAU, monthly active users, and daily active users, but then we also developed some additional measures to look at the geographical representativeness of the data. Do our data really cover all the different urban and rural areas equally or not, and to what extent do they cover different geographic areas in a way that makes the data representative? And there are measures in terms of temporal consistency. And, you know, if you look at the table, we cannot share the names of the actual original data providers. But as you know, there are quite a few different data providers out there. If you measure these things for different data sets, and this is all for the US, you will see that they vary a lot. But at least this kind of information is something researchers and everybody can share as we share the results. So we know the kind of data that went into producing these results. And going further, what we do is we don't rely on the data from just one data provider. We actually merge the data from multiple different data providers to create a raw data panel. And we set data standards as follows; this is a standard for data fusion and for merging data from different sources. The criterion we use centers on the concept that the merged data set should be superior to any individual data set based on all data quality metrics.
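To make the quality metrics and the merging criterion above more concrete, here is a minimal sketch. The talk names the metrics (active users, geographic representativeness, temporal consistency) but gives no formulas, so every formula, the record layout, and the function names `quality_metrics` and `merged_is_superior` are assumptions made up for illustration. It also assumes device identifiers are comparable across providers, which is a simplification.

```python
from collections import defaultdict


def quality_metrics(sightings, county_population):
    """Summarize one data set.

    sightings: iterable of (device_id, day, county) tuples for one month.
    county_population: dict mapping county -> resident population.
    """
    devices = set()
    by_day = defaultdict(set)
    by_county = defaultdict(set)
    for device_id, day, county in sightings:
        devices.add(device_id)
        by_day[day].add(device_id)
        by_county[county].add(device_id)

    daily_counts = [len(ids) for ids in by_day.values()]
    total_devices = sum(len(ids) for ids in by_county.values())
    total_pop = sum(county_population.values())

    # Hypothetical representativeness score: 1 minus half the L1 distance
    # between device shares and population shares (1.0 = identical shares).
    gap = 0.0
    for county in set(by_county) | set(county_population):
        device_share = len(by_county.get(county, ())) / total_devices
        pop_share = county_population.get(county, 0) / total_pop
        gap += abs(device_share - pop_share)

    return {
        "monthly_active_users": len(devices),
        "avg_daily_active_users": sum(daily_counts) / len(daily_counts),
        "geographic_representativeness": 1.0 - 0.5 * gap,
        # Hypothetical consistency score: quietest day relative to busiest day.
        "temporal_consistency": min(daily_counts) / max(daily_counts),
    }


def merged_is_superior(merged, individual_metrics):
    """True if the merged set is at least as good on every metric."""
    return all(
        merged[name] >= metrics[name]
        for metrics in individual_metrics
        for name in merged
    )
```

Under these assumed formulas, a merged panel can gain devices and geographic coverage and still fail the check if one source is steadier from day to day than the merged result, which is why the criterion is applied across all metrics rather than to sample size alone.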
This is a standard we set, which we believe could be a good practice for data merging as researchers, practitioners and decision makers start using this kind of data more and more, as we see as a result of this pandemic. Next slide please. The next issue is that even if we have a high-quality original data set to work with, we still need to address a lot of issues with this kind of data for geospatial analysis, for economic analysis and for all kinds of decision support and research. We need to identify trips and tours. And there are issues here: how do you define a trip? We look at the ways commercial data providers do that. There are a lot of issues. You know, we often see that a single trip, because of the tour and activity tour, gets broken into multiple trips. If we just take the number of trips from these commercial data providers, then we will either significantly overestimate or in some cases underestimate the number of trips, and so we have to use a tour-based approach. And there is a lot of information that's not in the data that we need to impute. So we need to look at travel modes, purpose and distance, and just measure the distance of these trips, because, you know, for financial sector applications there is a lot of interest in how far people travel, and just measuring that distance is actually not a trivial task. We also need to integrate mobility data with point-of-interest data and with other economic data, and all of that takes a lot of algorithms. And finally, last but not least, we need to weight the data in ways that address all of the different kinds of biases: geospatial, temporal, socioeconomic and penetration-wise biases. So we actually have developed a multi-level weighting to address all these different biases when we look at this kind of data.
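The weighting step mentioned above can be illustrated with a much simpler, single-level version. This is a minimal sketch of ordinary post-stratification, assuming one grouping variable; it is not the multi-level weighting scheme referred to in the talk, whose details are not given. The stratum labels and function names are hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_counts):
    """Weight per stratum = population share / sample share.

    sample_strata: one stratum label per sampled device.
    population_counts: dict mapping stratum label -> population size.
    """
    sample_counts = Counter(sample_strata)
    n = sum(sample_counts.values())
    total_pop = sum(population_counts.values())
    weights = {}
    for stratum, pop in population_counts.items():
        if sample_counts[stratum] == 0:
            raise ValueError(f"no sampled devices in stratum {stratum!r}")
        weights[stratum] = (pop / total_pop) / (sample_counts[stratum] / n)
    return weights


def weighted_mean(records, weights):
    """records: iterable of (stratum, value) pairs, e.g. trips per day."""
    records = list(records)
    numerator = sum(weights[stratum] * value for stratum, value in records)
    denominator = sum(weights[stratum] for stratum, _ in records)
    return numerator / denominator
```

For example, if a group makes up half the population but only a tenth of the sampled devices, each of its devices counts five times as much, so a statistic such as average trips per day moves toward what a representative sample would show.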
And lastly, we also need to be able to calibrate and validate the results, which, you know, is often missing when we see some of these results. How do we actually validate the results we get from this kind of large but biased mobile device data? We don't have time to go into details on each individual algorithm, but what we hope is to find a way to work with the entire community to be able to use our collective wisdom to improve the accuracy, reliability and privacy protection of all these different algorithms, and to be able to compare these algorithms, developed by different sets of researchers and companies, together. I think we really need something similar to ImageNet for the AI community, in terms of the geospatial and transportation community. As we use this kind of data, we need standard algorithms, and we need a standard data set that everybody can access securely without violating privacy protection, so we can really collectively enhance these algorithms and enhance the science behind them. Next slide. All right, so now I want to describe some interesting use cases, and since some probably have not been touched upon by earlier presenters, I want to focus on these cases. At the federal level, you know, several federal agencies are using our data. Some of it is data we publish on the platform. Some is data we customize and provide to them through direct collaboration with them, including Transportation, CDC and Veterans Affairs. On our platform, we also combined health data, economic impact data, mobility data and vulnerable population data to provide a society and economy reopening assessment, a tool called SERA. Veterans Affairs is actually using that to make their own reopening decisions. There is a lot of interest from the financial sector and the Department of the Treasury in using this kind of data to look at long-term economic projections and different kinds of financial investment portfolio analysis as well. Next slide.
And as a transportation engineer by training, you know, we certainly look at different kinds of travel behavior changes, and this graph just shows the changes in the number of very short distance trips and very long distance trips all across the nation in different months, and so you see the trends for all these different trips. One thing that was interesting was that at the beginning of the pandemic we actually saw a huge increase in long distance trips, and these are people who were in a hurry to go home or to, you know, flee from their home, which might have been a hot spot at the time. Next slide. Something else that might be of interest to geographers and others is people's activity participation and time use changes; we see a lot of these as well. Here I offer two examples on how people's arrival time, you know, people's preference on when to go shopping, changed because of the pandemic, as well as the amount of time they spend in the stores when they do go there. They actually do spend a longer time at the store, and now you see more people going to shopping sites earlier during the day rather than in the evening or late afternoon. You can look at these for all kinds of activities and time use patterns as well. Next please. Next slide please, yes. And, as others already mentioned, from the raw trip observations we can aggregate things up to the origin-destination level and be able to look at external trips, and when we look at external trips, this would really allow epidemiologists to trace the transmission of the virus as people travel across the nation.
So in a way, based on this kind of mobile device data and algorithms for expanding the sample data to the population, we could actually create a digital twin of the entire country in terms of individual level, but anonymized, movements, to be able to trace out how mobility and virus spread really occur across the nation and be able to publish the results at an aggregate level. The next slide actually shows one interesting county level example in Maryland. So, in Maryland there was a mystery: one of our counties, which is where the University of Maryland is located, is doing a great job on social distancing, based on data, based on observations. And somehow Prince George's County is just getting a lot more new cases than the rest of the state. And then we overlaid the data on imported cases; these are our estimated numbers of people who traveled to this particular county from the rest of the nation, from New York, from all the other places in the nation. And we found a very high correlation with the number of imported cases in Prince George's County; other counties got fewer. So that explained, in a way, why they're still having a lot of new cases, even though they're doing a good job in terms of preventing community transmission of the virus within that particular county. Next slide. And I want to also talk a bit about contact tracing, and also about new outbreak prediction based on the visit data that I'm showing you here as well. You can monitor all the visits to all the different locations all across the country, and maybe internationally as well, as long as we eliminate the points of interest that are getting too few visits, which would get into privacy protection concerns. You know, I see Google does that, but also sometimes they would just say we don't have enough data; we do the same thing. 
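The idea of dropping points of interest with too few visits before publishing can be sketched as a simple small-count suppression step. This is a hedged illustration only: the threshold value, the dictionary layout, and the name `suppress_small_counts` are assumptions, and the talk does not state what cutoff is actually used.

```python
def suppress_small_counts(visits_by_poi, min_visits=10):
    """Keep only points of interest with at least `min_visits` visits.

    `min_visits=10` is a hypothetical cutoff chosen for illustration;
    the actual threshold used by any platform is not given in the talk.
    """
    return {
        poi: count
        for poi, count in visits_by_poi.items()
        if count >= min_visits
    }

# Toy counts: POI names are placeholders.
daily_visits = {"grocery_1": 240, "clinic_7": 3, "park_2": 10}
publishable = suppress_small_counts(daily_visits)
```

Suppressed locations are simply reported as having insufficient data, which matches the behavior the speaker describes.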
But the point is that on the next slide you will see that we're able to look at the level of social distancing and the level of crowdedness at the point of interest level, so we're actually providing that, just as a pure service, to different agencies, in this particular case to Baltimore County in Maryland, which also suffers from a lot of cases and new outbreaks. We're helping them, on a daily basis, monitor the level of crowdedness at more than 6,000 vulnerable locations. These are the locations they worry about every single day, to actually predict the risk of a new outbreak at every single location based on how many people go there, the approximate location of where they come from, and the infection rates or number of active cases at the origin, so all of that plays into this risk estimation at the individual point of interest level. And that helped them understand, okay, today maybe we should go to these places to do more disinfection, and allocate their limited resources for COVID response. I probably don't have time to go through too many applications, maybe just one more on the next slide, on what we call community level or aggregate level contact tracing. If we could go to the next slide please. And so this idea comes from an observation from our data; you can see that on the platform as well. Actually, slide number 16, the previous slide please. I mean, there's a delay, I think, between what I see and what the operator sees. Other than two states in the US, no other state really has enough contact tracing workers who can use a boots-on-the-ground method to do individual level contact tracing. But by integrating this kind of mobility data and AI algorithms, what we can do, if a new outbreak happens at a supermarket, a church, or a school, is respond within seconds by leveraging cloud computing. 
We can give an area to local authorities and tell them that, based on the aggregate level mobility patterns, in order to contain this local outbreak, we suggest that you notify people who are in this localized area to encourage them to self-quarantine, or at least let them know they might be at risk, many, many days before they can complete contact tracing using the traditional method. And we see this as a good example of leveraging this kind of mobility data, AI, and computing, while protecting individual level privacy, to really help local and state agencies better contain a local outbreak, a new outbreak. Next slide. And we have also been doing work on estimating the economic impact, by looking at how many people went to work previously and stopped going to work, which means they either are working from home or they lost their jobs. And we have developed a method to actually distinguish these two kinds, which would allow us to look at how many people are working from home, how many people have lost jobs, and what kind of jobs they had before. And now, for the people who lost their jobs, you know, at an aggregate level, we're not looking at individuals but at the county and state level, how they're adjusting: are they getting new jobs as the economy recovers? And so we can provide that kind of estimate at the aggregate county and state level, based on mobility data, much, much faster than federal statistical agencies can provide based on their typical measurements. You know, their data is really three months, maybe even six months, nine months late, but we could actually provide these estimates on a weekly basis. And with that, I'm going to my last slide. I see Kathleen is also, you know, already showing her face; I understand what that means. My last slide is just some research questions for, you know, the community to ponder. 
The first one is how we can best define mobility and spatial behavior indicators. I look at the way we define social distancing, and the way others define it, and I think more work can be done to really measure close contacts, to measure social distancing, which epidemiologists can then use for their models. And the second question has to do with travel models, which in the past were done for long-term planning. We were looking at the normal day 20 years from now, and now we have day-to-day observations of mobility data. How can we really advance travel modeling using this kind of day-to-day observation from a larger sample of the population? And the third one has to do with how we can work together to leverage the ability to measure person level spatial behavior, almost continuously, for a very large sample of anonymized individuals. Now, every single day I worry, because when my students work on this kind of data, I worry whether they really fully understand privacy protection. So we took some very rigorous measures on protecting privacy. The data always sit in a secure server in the cloud. There is no data on zip drives, on USB drives, on local computers. But I'm not sure if everybody, you know, does that, so it's really about the ethics of protecting privacy, because otherwise the entire community may lose access to this kind of data. If privacy protection, you know, really is not observed by certain groups, then we're all losing access to the data. So the balance between privacy protection and being able to extract research value and private value out of this kind of data is very important. So I really look forward to, you know, engaging and working with the community on how we can ensure both privacy protection and responsible data use while we seek scientific discoveries together. And thank you again for your attention, and I look forward to seeing what questions and comments you may have today. Great. Very nice. Good. Thanks, Lei, for that. 
We are going to move to our final speaker in our session today, Jonathan Chef. Jonathan is a statistician at Transit App, a mobility-as-a-service smartphone app. Jonathan trained in statistics within the fields of education and neuroscience, receiving his M.Ed. from Harvard University and MS from Vanderbilt. Within neuroscience, he used longitudinal MRI data from children to find neurobiological indicators of dyslexia in pre-readers. Since moving to Montreal with his wife, an epidemiology doctoral student at McGill, he has shifted towards data sets in public transportation. He uses Transit App data, as well as transit agency data, to seek behavioral patterns of users and insights into equity and development. Jonathan is a mountaineer and a relative newcomer to geotemporal mapping. Hey everyone, thank you for attending, and Kathleen and Harvey, thank you for hosting, and to the rest of the Mapping Science Committee. I'm really honored to be here. Simon, you said earlier that you're an imposter here. I feel like I'm an imposter squared, maybe. My wife, the epidemiologist, should maybe be doing this, but I'm excited to present to you. As the background just given said, I do come to transportation and mapping from neuroscience. There are a lot of processes that we're building, and one thing I'm excited about from this workshop is the discussion afterwards, hearing from you and the other panelists about what else is possible. We can go to the next slide, by the way, and look at the outline. So I'll talk about the data sets that I'm using at Transit App, and I'll try to talk generally about what data sets we use in public transportation analysis. Also, one of those capabilities is user surveys, which I think haven't been touched on much, so I'll just mention those as a part of the data we're analyzing. So I'll start with the kind of location data we have and what we have done with it, but especially where we're going with it. 
Because I think overall public transit data has been, and can be, a very strong indicator of many components of the flow through a pandemic. So next slide. I'll just start by giving you an overview of the kinds of data that I'm working with. So from Transit App, I included a screenshot on the right just so you can see what kind of behaviors we're looking at. It's a trip planning app; it gives you real time ETAs for buses and trains. There are multi-modal trips, like bike share and ride share, so it's measuring basically how people get from A to B, and so any of the behaviors that you're looking at in the app are things that we measure. So we do have location and timestamp when people use the app. We don't do background location tracking like some apps. I was really excited to hear about Lei's ideas on privacy. Those really mirrored a lot of our own; user privacy is a core value at Transit. So we don't do background tracking, but when the app is open, unless it's disabled, we have location data to plan trips. We have data on trips that are planned, or just if you interact with routes: if you tap on, say, route E, the blue line there, you can see a close up of the buses and, where available, you'll have crowding data and things like that. And then the other kind of data we have are preference data, like accessibility choices and modes that are turned on and off, things like that. Next slide. So back in February, when the spread of COVID was really becoming a worldwide phenomenon, we built a model looking at transit demand, well, all over the world, and so we're using geography and time to look at essentially the use of public transit. But this is based on data from our app. We don't claim to have ridership. We have usage of our app. It turns out that, where available, ridership and our statistics correlated really well. So that was a nice validation. 
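The validation mentioned here, checking that app usage tracks agency ridership where both are available, can be sketched as a Pearson correlation between two aligned series. This is only an illustration under assumptions: the talk does not say which statistic was used, and the function name and the toy numbers below are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)

# Made-up monthly values for illustration only, not real data.
app_sessions = [100, 95, 40, 20, 30]
agency_ridership = [5000, 4800, 2100, 900, 1400]
r = pearson(app_sessions, agency_ridership)
```

A value of `r` near 1 would support using app usage as a faster-arriving proxy for ridership, though it would not by itself show that the app's users are representative of all riders.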
And I think the other thing that worked out really well is that ridership data aren't frequently available on a daily or an hourly basis. They're frequently published monthly and sometimes only quarterly. So it was nice to have this real time measure of public transit demand from our data that spanned many different regions. So having the same source for multiple regions was what I found really valuable about this data set. And oh yeah, this isn't just about Transit App; you know, in public transit in general, agencies are collecting similar kinds of data. They'll have ridership data from automated counters on vehicles, or station data, ticketing data, things like that. And again, there can often be a delay on this kind of data. Next slide. Great. So here we're looking at the hourly breakdown over the course of a day. I chose a day before COVID, the 13th of January. That's the top, like, green line. I chose a day at, sort of, the peak drop in COVID. No, that's a bad way to say that. When people were riding public transportation the least, in the middle of the COVID pandemic, the 15th of April. And then now, which is the middle dark green line. And it was interesting to see a lot of the temporal patterns shift during COVID. So again, the top line was essentially a normal Monday. That has the rush hour peaks. During COVID, that bottom line, the peaks really disappeared in many cities across the country, or if they didn't disappear, they flattened and shifted. And that was a really interesting way to observe a shift in the behavior of people using public transportation. 
And all of this ties to policy. It was a way to look at how people react to policies in their regions, so when stay-at-home orders were put in place, we could see these behavioral shifts occur, and how much they occurred. So, you know, getting back to the geographical data, we saw that there was a slower response in a lot of southern cities and Midwest cities than on the West Coast and in the Northeast. Next slide. Okay, so I thought I'd talk a little bit about user survey capabilities, because it's an interesting part of our data. Next slide. In April, we ran a survey, I mean internationally, but in the US after cleaning it was 15,000 respondents, and 10,000 in Canada, and this is very much a phase one kind of survey, where we were curious who was riding. We asked about transit usage and frequency and some basic demographics. So here you can see some of the primary regions of the responses from the survey, and I'll talk about some of the results there just so you get a sense of our data. So next slide. So, for example, we saw during COVID a shift towards the user base, and presumably the rider base then, being more disproportionately female than before. So these top pie charts are in Pittsburgh; on the left you see before COVID, where equal numbers of people identified as male and female. On the right, the April survey, where almost twice as many people identified as female as male. There's a similar pattern really across the country; you can see Las Vegas below. It is a little less stark in Las Vegas, but a similar pattern. Next slide. 
Another disparity that was very clear, across the United States especially, was a race disparity, where people who identify as African American were disproportionately represented during COVID and people who identify as Caucasian were underrepresented during COVID. So in gray you can see the before COVID breakdown, and in green you can see the during COVID, and again this is Pittsburgh, which I chose in part because we had recent demographic work before COVID, so it was a good basis of comparison. Next slide. And lastly, we also asked a lot about employment, and that was really useful to get confirmation of what we all know, which is, you know, essential workers were the people still riding: food service, health care, sales, things like that. So I brought these up because in the geographic mapping it's very useful to be able to have these kinds of variables, and our survey work has been a valuable part of building maps and understanding discrepancies between usage and distribution. Next slide. Oh, this is the rush hour point. I mean, as before, we can really see a flattening of the curve and a shift in the patterns, and the nice thing about our survey work is that we can then ask people directly about these behaviors. So we asked people how their departure time shifted and how their return time shifted, and there's a clear shift earlier for the departure time in the morning and a shift in both directions for the afternoon. So that was an interesting corroboration. Next slide please. Great, and so connecting this to policy again. One of the questions that we got a lot was, when cities and agencies suspended fare collection due to COVID, was there a spike in, essentially, joy riders? Which hopefully sounds silly to you, that people are joy riding public transportation during a pandemic, but it was really neat to see in the data that we weren't seeing any evidence for something like that. So what you're seeing in front of you is New York. 
The dotted line is when the MTA suspended bus fare collection, and we saw really no correlation in New York or in other cities where fares were suspended. So the picture I'm trying to paint is public transportation data as a tool towards measuring and understanding people's reactions to the pandemic itself, to public policy, to COVID rates in their region, and things like that. Great. So for us at Transit, we collect some aggregate data on locations and trips. And on the next slide, you'll see Dayton, Ohio. Here I took activity in the app from just some of the morning rush hour period, when our users are predominantly commuters, and if you go to the next slide you'll see the same period in 2020. You can see it's much less dense, but beyond just seeing what we saw in the counts, I was curious whether, in Dayton, the shift in usage was geographically associated with anything, you know, such as demographics from the census data, something like that. So if you go to the next slide, you'll see I was comparing the 2019 to the 2020 period where there was enough data; the blue is where the 2019 activity was greater and the red is where it was greater in 2020. And I didn't find, I mean, visually these are hard to parse, but also mathematically I didn't find a significant relationship here with some of the underlying demographics. But this is where I depart into, sort of, the area of work that we're still developing, and building the ETL pipelines to analyze. It's, I think, an interesting direction for public transit work in general, that we can look at the overall movement of people in aggregate and try to get a sense of who is being underserved, who are the people who have to take public transportation, and that hopefully can lead to different policies and planning decisions that can serve them better. Next slide. 
This is an example of something I was wondering would correlate with the previous data. Dayton, unfortunately, like many cities in the United States, does have some strong racial segregation. On the left you'll see, from the ACS 2018 data, the darker colors are for higher percentages of people who identify as Black or African American. On the right, the darker colors are for higher percentages of people who identify as Caucasian or white. You can see they're inverses, and I was wondering if we'd see a difference in the pre-COVID to during-COVID distribution of behaviors and trips, maybe associated with some of these demographics. In the case of Dayton I haven't yet, but this is again an ongoing project that I'll keep working on. Next slide. Also, we can use our demographic work to look at how it correlates to geography. So, to continue talking about Dayton, you know, one thing that was mathematically significant was that people who identify as white were taking much longer trips, and for that I have hypotheses, but let's move to the next slide. So, overall, I'd like to talk about public transit as a COVID-19 indicator. Next slide. I've talked about how we have these basic counts of overall demand and how they alone have been powerful for really measuring people's response to social distancing orders, for measuring, sort of, the workforce dynamics during the crisis and, as we recover from the crisis, how those dynamics might have changed, as well as an indicator of economic activity. But then when we add in the geotemporal distributions, what I find particularly interesting is our ability to really look at discrepancies in access. We can do, you know, route level; the map you're seeing on the right, by the way, is a Boston map we made of route level demand. And that's the kind of analysis where we do route level demand. 
Looking at that over time, before COVID, during COVID, and during recovery, really has the power to address questions of equity and discrepancies in access. And next slide should be. Thank you. Thank you very much for having me at this talk. I think I learned much more than I distributed, but it's really been an honor being here. Thank you. Great. Thank you very much. Great talks from all of our session speakers today. There are a couple of questions. Harvey, is it okay if I take two minutes for questions? Yes, just a couple of moments for questions, and then we're going to move into a wrap up. Okay. So, I've got a question here about, well, let's see, this was for our session on contact tracing. The contact tracing discussion here sounds quite useful, especially considering no IDs are required. This question might be for you. However, they often assume two individuals are physically present close to each other at a given time. How does it handle the case of interaction in a closed space at different times, e.g., person A used a seat in the classroom, and later person B got into the room after person A left and used the seat? If person A was diagnosed positive later, how could such a method capture cases like this? So really detailed, fine spatial resolution. Yeah, I mean, there's a natural limitation, unfortunately. The technology is relying on the proximity of those two devices at that time, so there's a spatiotemporal element to this. I think, as we've heard earlier today, it appears that most of the spread of the disease happens through airborne particles and not so much from surfaces, so it's a lesser risk, but you wouldn't be able to pick up the fact that someone had been in the room 10 minutes before you entered it, because your phones would no longer be in proximity to each other. Yeah, I think it's interesting. 
And from our earlier discussions on contact tracing too, I think it would be interesting to know something from feedback. In other words, if you got a message saying you had been in someone's contact list, one of the questions you'd probably start to ask yourself is where could that have happened, and you'd want to know, potentially, where did that happen. I think that's very much the case. I think that there is a recognition from the technology providers that the technological component, the contact tracing via devices, is just the starting point. You still need that human follow up to go through, okay, where were you, what were you doing, and that human aspect I think is really important, not only from the point of view that you're capturing data, but also from, you know, a patient point of view, of actually dealing with the fact that you're telling someone that they've potentially been infected. Absolutely. Yeah. Okay. Well, in that case, Harvey, why don't I turn things back to you for the last session. All right, thanks Kathleen, and thanks to the panelists in the last session. Again, this has been a really rich day, full of lots of concepts and ideas and ways forward, and I hope that everyone's mind is spinning like mine is right now. I asked Mark Reichardt from the Mapping Science Committee to step up here in the last session and to give us a summary statement, to wrap this all up for us. So, Mark is the president and CEO of the Open Geospatial Consortium, in addition to being a member of the Mapping Science Committee. And he has the perhaps unenviable, you know, not welcome, task of trying to wrap all this up in one quick statement. But if anyone can do it, he can. Mark, please. Oh, thanks Harvey. I'm the past president and CEO of OGC; we have a new boss in place, but I'm still associated with the organization. 
I really appreciate all the speakers today, all the panelists and their perspectives. Some things reached out to me that crossed all the boundaries of the presentations. Of course, the diversity and granularity of data types. This is a real opportunity, I think, for a follow on in terms of, as one of the speakers mentioned, that collaboration to pull together an understanding of the cumulative data sets that are required, and the accuracy information related to uncertainty and the bias that are present in those data types. I think that's a critical area of research. We talked about the use of traditional geospatial data to feed the models, and how that data affects the outcomes that are being predicted but also tunes the R-naught and differential numbers. And I think that came out very, very clear. But I think one of the things that was really interesting across most of the speakers was the use of mobility data, how that's a rather emergent and new capability. It's showing a lot of promise, but it carries with it a lot of issues related to privacy, and methodologies to anonymize the data but keep it balanced enough so it's useful for decision making in the modeling process. The use is not only in contact tracing, but also in modeling behavior, which becomes a very strong input to understanding the policy decisions that need to be made and the types of public health situations and preparedness that need to occur. I really liked Ed's talk about breaking the glass. Everybody had a sense for that. We talked a lot about preparedness and understanding the escalation of an infection like COVID and the repercussions. But what is the playbook necessary amongst all of us to engage with less uncertainty and more collaboration as we go forward? And it was Josh that mentioned initially that petabyte playbooks coupled to interdisciplinary geospatial modeling capability are needed to respond to future events, but also to design for resilience. 
That to me was a major statement that ties all these discussions together, everything from the predictive side to the resilience side, and there are so many commonalities here. I just can't say enough about the experimentation and the use of mobility data. I think that's going to be a huge game changer, and it needs to be a strong area of research. So I have a lot of other things to mention, but I know we're getting close on, we're over on time here, but I think this really demonstrated, across all the presentations, a number of seriously common threads that we can move forward with. So I'll turn it back over to you, Harvey. Thank you. Thank you very much, Mark. Good job. We do have about five minutes left if anyone has any final questions. You can submit them in the Q&A. Use the Q&A feature, and I'd be happy to respond and send it off to any of the panelists from throughout the entire day. Just a comment. Thanks to all the speakers. Wonderful event. Thanks for hosting. Yeah, thank you for that comment. I do think this has been really a terrific day. We had a great collection of speakers. There are some commonalities that I think those of us in the Mapping Science Committee will be processing for quite some time. We do have a committee meeting tomorrow in which we'll discuss some of the threads and ideas that came out of this discussion and talk about ways forward that the Mapping Science Committee and others could continue to develop. This is an unprecedented event in human history, and it does require unprecedented responses. And I do believe that geospatial data, mobility data, temporal data is really one of the first lines in defense and mitigation and response to a pandemic like this. And I don't want to end on a down note, but I do want to point out that this is not the last COVID or last pandemic that will hit us. 
This may be the way of the future, so we do need to plan for our societies, our cities, and our economies to be able to be resilient against these types of events. And just more generally, there are other shocks coming, of course. Some of them are social and human; some of them are related to extreme weather and climate change. So I do think this is a general call for us as a society, and for us as geospatial professionals, to think about what role these new technologies, new data, these new capabilities we have can play, which really are quite amazing, as you saw in some of the presentations today. What can we do to not just react when something like this happens, but to plan for a society and economy, and cities and neighborhoods, that can respond gracefully to these shocks and recover, and in fact, ultimately make our cities stronger as we become more resilient in our societies? So I think on that note, I'll wrap up maybe three or four minutes early. Well, before that, let me just check the questions here. No, I'm just getting compliments about how great this event was. Thank you. We appreciate that. Someone's asking if they can get in contact with several of the panelists; there's at least one question about that, but I think that's a more general question. So this video will be available publicly within a few weeks, and I'm sure that we can share the contact information of all the presenters if people would like to get in contact with them. Okay, I think we'll wrap it up then. So thank you, everyone. On behalf of the Mapping Science Committee and also on behalf of the National Academies of Sciences, Engineering, and Medicine, we thank you for your participation today, and we hope you learned something, and we hope that you go forward and make the world a better place with this knowledge. Thank you very much. Have a good evening.