Hello. My name is Joseph Rickert. Welcome to the COVID-19 Data Forum, a series of public webinars and discussions focused on the role of data in responding to the pandemic. This forum is a collaboration between the Stanford Data Science Institute and the R Consortium. Today's discussion will explore the ways in which data journalists are contributing to combating the pandemic. These include uncovering and vetting difficult-to-find data, helping researchers explain the relevant science, and producing data-generated charts and visualizations to help ordinary people come to grips with the shape and seriousness of the pandemic. Today we will hear from three speakers: Mark Hansen, David and Helen Gurley Brown Professor of Journalism and Innovation at Columbia University; Anna Carolina Moreno, Senior Data Journalist at TV Globo in Sao Paulo, Brazil; and Meghan Hoyer, Director of Data Reporting at the Washington Post. Our moderator today will be Dr. Arina Wong, a data reporter with ProPublica. Dr. Wong holds a PhD in electrical engineering as well as a master's degree in data journalism. Among her work is the May 5th, 2020 article "A Comparison of Four Major COVID-19 Data Sources," which was featured by Big Local News. This article was an early resource for journalists, describing how to obtain data from Johns Hopkins University, the COVID Tracking Project, USA Facts, and the New York Times. We will be taking questions as we go along, and the moderator will decide when the questions will be posed. Please put your questions in the Q&A at the bottom right of your screen. So now please welcome Dr. Wong. Thanks so much, Joseph. It's great to be here. It's great to visit my alma mater virtually, and I'm so pleased to introduce the guests today. We're going to kick things off with Mark Hansen. Mark is the director of the Brown Institute for Media Innovation and a professor at the Columbia Journalism School, where he teaches data and computational journalism.
Before joining Columbia University, he taught at UCLA and was a member of Bell Labs Research. And sorry if this wasn't clear, but we're going to be doing a few presentations first, so we'll kick things off with Mark, and then we'll be hearing from Carol and then Meghan. So Mark, the stage is yours. Great. Thank you so much. Let me just queue up my screen here. Sorry, I should have been prepared. Thank you for inviting me to be part of this panel. I just had too many screens open at once. All right, we're set. Thank you, Arina, for the introduction, and thank you to the organizers for allowing me to speak today. Maybe "inviting" me to speak; "allowing" makes it seem like you don't let me out of my cage very often. I'm supposed to be teeing things off, telling a technical crowd a little bit about what data journalism is all about, and then working toward some of the work that data journalists have done specifically around code. The other two speakers will be reporting more from the trenches; I am a professor at Columbia and will give a perspective from having to educate a new generation of data journalists. As Arina pointed out, I am director of the Brown Institute at the Columbia Journalism School. We're in fact a bicoastal organization; the other half of the Institute is housed at Stanford, even though I'm sitting on Cape Cod and everybody's everywhere, but technically we're supposed to be in Stanford. So the Brown Institute is half Columbia, half Stanford: Columbia journalism, Stanford engineering. Our mission is to explore the ways in which technology and journalism might influence one another, changing the research priorities for engineering and perhaps the practices of journalism. So, just to fess up, I am a statistician by training; my doctorate is in statistics. I spent 10 years at Bell Labs and 10 years at UCLA.
I've now been 10 years in the journalism school, more a spy in the house of Pulitzer than anything else, but I have learned a lot over the last 10 years. From my corner of the data world, and this is overly wordy and the only slide that will have this many words on it, I promise, but I thought I needed to get it right: data technologies have new social relevance, and almost every aspect of our lives can be rendered in data. These data often tell us stories about who we are and how we live, but because they are products of human attention, innovation and memory, their perspective isn't neutral, and the stories they tell are often incomplete, open-ended and biased. Trends toward reproducible research and transparency have meant that data are plentiful; some would argue that data journalism, or each incarnation of it to date, has flourished precisely because of the prevalence of data about people, businesses and governments. But in all cases we also have to mind the gaps. Journalism is called on to combat the misuse or invention of data and analysis, a subject under intense scrutiny by the profession at the moment. Data, and the networks they circulate across, can either reinforce or challenge systems of power, and so in a very real sense our democracy depends on the public being able to think critically about these technologies, telling good stories from bad. And journalists, as truth tellers, sense makers and the explainers of last resort, need to understand the workings of data and society and the ways in which data, code and algorithms function. Journalists are becoming more and more bold in producing their own data collection and analysis, borrowing widely from different disciplines. We've assembled a powerful toolkit for finding and telling stories with data. These acts of journalism are also contributing substantially to data literacy among the general public. But data journalism is not in any way new.
Pulitzer himself, in his 1904 essay "The College of Journalism," mapped out what journalists need to be taught. This, by the way, is the school that I teach in. He mapped out law and ethics and history, which we all still teach today. But he also mapped out statistics. He wrote: everybody says that statistics should be taught, but how? Statistics are not simply figures. It is said that nothing lies like figures except facts. If you want statistics to tell you the truth, you can find truth there if you know how to get at it, and romance, human interest, humor and fascinating revelations as well. I love this quote from him because in this moment Pulitzer sees tremendous narrative possibilities in data. He sees romance, human interest, humor, fascinating revelations; of course there's his adherence to the truth, his need for the journalist to put truth first. And the idea that data contain rich stories about, as I said before, who we are and how we live is quite shocking, especially in 1904. If you think about 1904, Fisher was like 14 years old at the time, so statistics wasn't even statistics yet. In the way we're thinking of things today, Pulitzer might be referring more to data and data science as opposed to formal statistical inference, but you get the idea. At Columbia, just to give you a sense of the terrain of data journalism for those who don't know much about it, we offer a Master of Science degree. We also have a dual degree with computer science, so in two years you can get a degree in computer science and a degree in journalism. There's something called computational journalism in the J school, often as part of a six-university consortium, where we bring engineers and journalists and designers together, in total six different departments from six different schools across Manhattan, I guess also Brooklyn.
There are lots of things we teach. And I apologize that the next couple of slides are Python and not R. The motivation for the teaching comes from a basic observation that I've made over the last 10 or so years of teaching and refining my teaching at Columbia: data journalism or computational journalism is not just more data analysis. It's a hybrid practice that's steeped in a reporting tradition. For those of you not familiar with what reporting might mean, I have a quote here from Philip, from a piece in the Columbia Journalism Review, where he's talking about the act of reporting and in particular the reporter's notebook; the piece is called "Ode to the Reporter's Notebook." He says the reporter's notebook is a low-tech device that I use to capture the sights, smells, sounds, feelings, tastes and other impressions of the world. The piece of this quote that I love is: to report is to be alert and alive in a particular time and place. What this is asking of us, then, when we're thinking about bringing computation to journalism: we're taking that basic curiosity that we're cultivating in our students' minds, a kind of restless questioning spirit that asks why things look the way they do. And what we're doing with computational journalism or data journalism is adding computational lines of inquiry to that habit of mind, that questioning of why things look the way they do, and are they fair, and are things working the way they should. If you take one thing from this presentation, it's that reporting practice, as a call to be alert and alive in a particular time and place, means for me in my teaching that computation simply extends that alertness and curiosity to the virtual world of data, code and algorithms.
Other teaching I do is in either R or Python, and I happen to have some Python notebooks here: basic introductions to Python, how the web works, some machine learning, maybe using some APIs, maybe learning a little bit about bots. We're not the only ones doing it, of course. Cheryl Phillips, who I see is in the audience, and Charles Berret, who was a graduate student at Columbia at the time, wrote a lovely report about teaching data and computational journalism and the state of that teaching in the United States. If you want to learn something about computational journalism, you just missed the NICAR meeting. You can go to the National Institute for Computer-Assisted Reporting, where you'll find topics like visualizing data, using data to report on climate change, and finding needles in haystacks with fuzzy matching. You get a number of topics that will resonate with the technical people in the audience, but then thinking about how to apply them in a journalistic context is interesting. At the Brown Institute we also have an educational mission, where we put on events and try to bring computational ideas to journalists and have them think about computation differently. This year we sponsored a program called Changing Course, where we had Catherine D'Ignazio and Lauren Klein talk about their new book on data feminism, and Rediet Abebe, who is just joining Berkeley, spoke on computing and social change. We had events to introduce students to public data, where we had, say, the chief statistician of the United States, the chief demographer of New York City, and then a panel of data journalists, and the day kind of went: "We are here to give you data," say the federal government and the state authorities, and then the data journalists are like, man, not so much. We also sponsored various kinds of partnerships between different groups.
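To give the technical audience a feel for the "finding needles in haystacks with fuzzy matching" topic mentioned above, here is a minimal, illustrative sketch of the kind of task it covers: reconciling inconsistently spelled names across datasets. This is not code from the talk; it uses only Python's standard-library `difflib`, and the company names and the `match_company` helper are made up for illustration.

```python
# Illustrative fuzzy matching: resolve a misspelled name from one dataset
# against a canonical list from another, using difflib's similarity ratio.
from difflib import get_close_matches

# Hypothetical canonical roster (e.g., from an official registry).
official_roster = ["Smithfield Foods", "Tyson Foods", "JBS USA"]

def match_company(raw_name, candidates, cutoff=0.6):
    """Return the closest fuzzy match for raw_name, or None if nothing
    clears the similarity cutoff (0.0-1.0, based on SequenceMatcher)."""
    hits = get_close_matches(raw_name, candidates, n=1, cutoff=cutoff)
    return hits[0] if hits else None

# A misspelled entry from a hypothetical inspection log still resolves:
print(match_company("Smithfeild Foods", official_roster))  # → Smithfield Foods
# An unrelated string does not match anything:
print(match_company("Acme Rockets", official_roster))      # → None
```

In practice, the `cutoff` threshold is the judgment call: too low and you join records that shouldn't be joined, too high and the needles stay in the haystack.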
In this case we put out a call for partnering official statistics organizations and journalists to better inform the public. More COVID-related: the Institute has spent time trying to figure out how to respond through the kind of work that we do. One of the things we put on during the beginning of the pandemic was something called At Home with the Brown Institute, where we basically led computational journalism courses: here's two hours on bots, here's two hours on Unix tools, here's two hours on web mapping, here's two hours on data visualization. We were trying to bring these skills mostly to the alumni of the J school, to whet their appetites to learn more about computation and bring it to their reporting practices. These are just screenshots of all the different pieces that we brought together. And then we had a series of micro-grants, where we looked at supporting people to better inform the public about the pandemic and what it means to their local communities. They were $5,000 awards each, but we had over 325 proposals come in from around the world. The winners were everything from stories about, say, what would happen if during the pandemic we had major flooding along the Mississippi River, to battling misinformation in rural Southeast Alaska. We also held very early talks about bringing together epidemiologists and journalists, to see how we can meld our skills to tell stories and help the public better understand where we are. In terms of the actual reporting that was going on, you'll see this probably more from the next two speakers, but these are two New York Times front pages from March of last year, March 20 and March 27, I think.
And you see the kinds of graphics we were starting to look at and the kinds of things the data journalists were struggling to pull together. So we have a series of maps; at that time bubble charts were a thing. You also see journalists struggling with how to express scale: this is job loss in the US as of the end of March of last year, trying to reflect scale, and then trying to show what might happen depending on different conditions. As I mentioned before, journalists also spend time looking where there isn't data, places where no one's incentivized to collect data, and trying to collect it and add to it. The COVID Tracking Project, which came from the Atlantic, is one example. This is a project called Documenting COVID-19. It's an open website where the team has been putting together Freedom of Information Act requests to state and local health departments and governments, trying to collect emails among elected officials or government officials that mentioned COVID in some way. This trove of documents has been responsible for over 60 stories in various outlets, from front-page New York Times articles, in this case about the meatpacking industry and who knew what when, to just last weekend a piece in the Chicago Sun-Times reflecting on the crisis a year into it, using very local data from the freedom of information requests that the Documenting COVID-19 project pulled together. Sometimes we provide these stories to big organizations; sometimes it's smaller outlets, like the Kansas City Star, when the coronavirus outbreaks at meatpacking plants were made public. And when our time here is finished, you can jump over and look at how data journalists are busy fighting for public information and how Freedom of Information Act requests work.
I feel like I must be at 15 minutes, so I'm going to stop now. Perfect timing, Mark. Thank you. That was an incredibly comprehensive overview of data journalism, especially as practiced at Columbia. And, having come into the world of journalism by way of a technical field like you, I really love the philosophical quotes that you brought up, which help remind me why we're doing what we're doing, especially during a pandemic. So, thank you again for that overview. I think next we have Carol. And let's see, Carol, there we go. Cool, we can see you now. Carol is a senior data journalist at TV Globo in Brazil. I'm hesitating because, Carol, there we go, it looks like we can see her share, which is also fine. But in case, okay great, I can see you now. Carol currently produces data-driven news stories for the television news programs and has reported extensively on the spread and mortality of COVID-19 in Sao Paulo. As I understand it, Carol, you know, we watch the numbers here in the US, I mean I'm watching US numbers, but also in Brazil. I guess, first off, how are you doing? I should start with that. I'm hanging in there, I guess that's the answer. It's been really hard seeing the numbers rise and seeing what was going to happen; now we're seeing it happen. I'm in the state that had the most hospitals and the most ICU beds at the beginning of the pandemic. I mean, I feel like saying I'm sorry doesn't really do enough, but you know, we're so glad that you could make time to be here with us today. And please take it away, we'd love to hear your presentation. Okay, let me share my screen. Thank you. Thank you very much for having me; delighted to be able to share a little bit of our experience in journalism covering COVID here in Brazil. I am not from the technical field. I'm not a statistician.
I was horrible in school, but I ended up in data journalism because I covered education, and a lot of education data, for the TV Globo news website G1 for eight years. That's how I got into data journalism. In January of last year I transferred teams, so I came to the television news programming, the local news, specifically focused on data journalism, and now that's pretty much COVID data. So that's how I got into this. Oh, sorry, one second. It sounds like the audience is having a little trouble hearing you. Would you mind moving the mic a little closer to your mouth? Can you hear me now? Oh, that's beautiful. I'll try to hold it. Yeah, I always have to do that too. Thank you. All right, sorry about the interruption. Okay, thank you. Let's see if I can fit all the information into 15 minutes. Just to give you some background, because you're not from Brazil: in Brazil we have a universal public health system that works integrated across all levels of government, so we have the municipal level, the state level and also the federal level, which is the Ministry of Health, the main authority. They all work integrated, and each of them has their own responsibilities. That's where we get the official COVID data here in Brazil. But during the pandemic, and also before it, we've had different initiatives, as you can see on the right in green, from researchers at Brazilian universities and universities outside Brazil, who have joined together to build information and knowledge from those official data sources to help track the pandemic. And we also had, for example, a platform that existed before the pandemic that tracked severe acute respiratory syndrome cases in Brazil.
So this is, for example, the first platform where we could actually see the COVID data, the pandemic arriving in Brazil last year. We have very important data sources, and they're actually very abundant in Brazil, but I'm going to show you afterwards some of the problems that we've had accessing all that data. We have a lot of information systems in Brazil in the health department; some of them already existed and some were created for the pandemic. I'm just going to give a little overview of the systems that we have for COVID right now. For cases, we have a system called SIVEP-Gripe and also a system called e-SUS Notifica. SIVEP-Gripe is pretty much the main system that we have for data if we want to look at COVID in Brazil, because it also gives us the information about deaths by COVID and about hospitalizations. For testing, we have a system called GAL, which already existed before the pandemic; the public labs use it for tracking the different tests they process. For vaccination we also already had a pre-existing system. Since we have a public universal health system, we have tremendous capillarity, so we have a very strong vaccination program: we have crews that can reach all the communities in the country, so that's our strong side. We only struggle with getting vaccine doses, so we can use the whole system that we have right now. But just a quick overview of SIVEP-Gripe, which is the main system, the main database that we have. It was created in 2009 after the H1N1 pandemic. Since then, it's been mandatory for every single hospital, be it private or public, to notify the municipal surveillance authorities within 24 hours every time a person with symptoms of severe acute respiratory illness shows up at their hospital.
So that was a system that was already in place, and it gives us details of each patient. Obviously we don't have access to the whole database, only the anonymized database, so we can't see details that identify each patient. But we do have information on how old they are, their gender, their race, the dates when they ended up in the hospital, whether they went to an ICU bed, whether they were under a ventilator, what happened to them, whether they were discharged or died, the tests that were performed and the results of those tests. So that's very useful information for journalism and for research purposes as well. And in the pandemic, because we needed a place to tally the deaths by COVID, this system was adapted to get all that information, so the surveillance teams have to notify this system for every single confirmed death by COVID, even if it didn't happen in the hospital. And e-SUS Notifica is another important system, because it was created in the pandemic to tally all the new cases that we have, even if the person didn't need a hospital. So in theory we even have information on suspected cases, and if their tests came back negative, then we would have discarded cases as well. But we have many, many obstacles. I'm going to try to sum them up; I'm not going to get into details because we don't have time, but if you have any questions you can ask and we can go over them in more detail. Okay, so we do have all this data, but it's not really available to us in a very effective way. For example, all levels of government, not just the federal level, have access to a lot of this data, but they won't just share the open raw data so we can use it in detail. So we have some details, other details we don't have, which gives us very limited analysis opportunities. For example, this is the main one.
Since it's an older system, we don't have a public API for it, and we only get the raw data once a week: there's only one update a week, every Wednesday at 7pm. So, for example, in my day-to-day work, if I'm thinking about a story that I want to produce and it's Tuesday or Wednesday, I say, I'm going to wait until tomorrow, because tonight I'm going to get a better version of the database; the current version is already one week old, and we have the notification lag on top of that. So it's very hard for us to do our job like this. We have to organize ourselves to get the information as fast as possible, with the best available information out there. And e-SUS Notifica, the new system to carry all the cases, is also very inconsistent, because the authorities are overwhelmed, so not all states are able to notify all the suspected cases or all the discarded cases, for example. We also have what I think is a very lacking testing strategy. Testing is very limited in Brazil, so we don't trace the disease; we're pretty much navigating a storm without a compass. We actually saw this new rise in patients in the hospitals without being able to see it coming in rising cases beforehand: the first rise we saw was in hospitalizations. That was very hard, and it makes it very hard for authorities to take measures to prevent the hospital system from collapsing, for example. We also have problems with testing data: we don't have open raw data about the PCR tests that were performed. And we don't only confirm cases from PCR tests, we also confirm cases from antibody tests, so the new cases aren't exactly active cases. So it's very confusing for us to use most of the data, and we focus mostly on the hospital data, because I think it's the most reliable.
We also have some problems with coordination between levels of government right now, since things are very unstable, and that reflects on the transparency of those governments, their policies, and the way we can get and use the data. And finally, I think it's worth noting that in the beginning of the pandemic, around May or June, the federal government was the only source of new cases and new deaths for the whole country. They were fed by information from the states, but they were the only ones who gathered the information from the whole country. And they decided to change the way they were going to show the new deaths: they were only going to show the deaths that took place and were confirmed within the last 24 hours. And as you know, because we have a delay between a death occurring and the death being confirmed in the systems, that would leave out most of the deaths, so we wouldn't be able to see what was going on. So there was actually a never-before-seen initiative, what we call the press consortium: different competing media outlets joined efforts, with journalists talking every single day with every single state authority to get the numbers of cases and deaths, and now of vaccine doses. So we have a parallel tally alongside the official tally every day, and that parallel tally became our official tally, the one that we use, because we can guarantee that there's no risk of it being changed over time or limited so we can't see what's going on. I think that's very important. In the second part, I would just like to show you a little bit of our day-to-day here, working in television journalism. Every day we have 10.4 hours of news programming, and four of those hours are local news, here in the team that I work on.
So we have three news shows every day, and we're focused on the state of Sao Paulo, which is the biggest one in the country, and mostly on the metropolitan area, which is the largest one in the country and was the first hit by the pandemic as well. This is pretty much our team: we have 17 news producers in the local news team, and I am the one focused on the data-driven stories, so we work together to build up stories, tell the stories, and use data to help share that information with the public. We also have a team of designers, a technology team that helps with automation, and the editors and the reporters that help get the story on air. And the main thing we need is getting the data fast. We do have sources of data, but we needed to automate a lot of things in order to help the editors, the reporters and the producers get the information, qualified, with all the analysis that we have to do. So we automated all of that to give them some independence. The rolling average and the trend analysis is something that we started at our local news crew on June 1, and we've been doing it pretty much every single day. As you can see, yesterday we had a rolling average of 421 per day in the state of Sao Paulo, and it's been rising alarmingly. This is a project we automated: we just input the total numbers, and it calculates everything and brings this screen to the public every single day, so they can see what's going on and what's been happening in the past few weeks. We also have indicators for the reopening plan: the state is divided into 17 different regions, and each region has its own indicators. This is something we've worked really hard to make happen. Basically, we get the official data and all those rules, and we built code to calculate the indicators every single day, for every single region.
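The daily automation described above, inputting cumulative totals and having the rolling average and trend computed, can be sketched roughly as follows. This is not TV Globo's actual code: the window of 7 days, the 14-day comparison, and the 15% thresholds are illustrative assumptions, not the newsroom's or the state's official rules.

```python
# Sketch: from cumulative daily totals, compute daily increments,
# a 7-day rolling average, and a simple trend label.
def rolling_average(cumulative_totals, window=7):
    """7-day rolling averages of the daily increments of a cumulative series."""
    daily = [b - a for a, b in zip(cumulative_totals, cumulative_totals[1:])]
    return [sum(daily[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(daily))]

def trend(averages, lag=14, threshold=0.15):
    """Label the trend by comparing today's average with `lag` days earlier."""
    change = averages[-1] / averages[-1 - lag] - 1
    if change > threshold:
        return "rising"
    if change < -threshold:
        return "falling"
    return "stable"

# Hypothetical cumulative totals: a constant 10 new cases per day...
flat = rolling_average([10 * day for day in range(30)])
print(trend(flat))   # → stable
# ...versus an accelerating outbreak (quadratic cumulative growth).
accel = rolling_average([day * day for day in range(30)])
print(trend(accel))  # → rising
```

Publishing the label rather than the raw daily count is the point: the 7-day average smooths reporting artifacts like weekend dips, which is why newsrooms worldwide adopted it for daily COVID coverage.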
Because the government has the rules, and the rules are public, but they're very complicated, and the government doesn't show us the indicators every single day for every region. So we have to do it ourselves. Our strategy has been, on a daily basis, to create dashboards so every single editor, reporter or producer can just go on the dashboard and get the most recent data, and also look into the history to see what's been going on, whether it's a record number, which regions are doing better and which are doing worse. That's been very helpful on a day-to-day basis, and it leaves us time to do more special analysis, for special stories on the trends that are going on. These are just a few examples; this is my last slide, okay, I'm almost done. This one here on the bottom: in January, we saw Manaus collapsing, the lack of oxygen, a lot of people dying waiting for an ICU bed. And what we saw in the hospitalization database was a very high number of patients who were hospitalized in Sao Paulo but were residents of Manaus. So we were able to catch this migration in the database and write a story about it, and we actually found a few patients who illustrated the data, proving that the data was real. And here on the top: the city of Sao Paulo has a very big, strong and quick mortality information system, a model system for the country. So we were able to show that in 2020, one in every five deaths from natural causes was caused by COVID or was a suspected case of COVID-19 that was never confirmed. And we were also able to split the city into different districts to show that the risk of dying is a lot higher in some districts. That's a problem we have here in Brazil, social inequality, so we can show that COVID doesn't affect the rich and the poor in the same fashion.
One analysis that we've been doing since January is tracking the age profile of hospitalizations, because we've been hearing that COVID is growing among younger people, and we've been trying to track when the data actually shows this new trend, and what it means. So far it hasn't shown it; it's still worse among the elderly, but we've been tracking it every single week to see when we can see that change, so we can inform the public as well. In my presentation I left a few slides with links to these stories if you want to see them, but that's pretty much it. I think I took 16 minutes, sorry. That was perfect, Carol, you're right on time. And thank you for that presentation. I have to say that even though I've been very attuned to data reporting stories about COVID here in the US, I must admit I haven't been thinking a whole lot about, or had access to, data journalism stories in Brazil, so thank you for that perspective. It was really interesting to hear about the challenges that you faced in Brazil. Some of them were all too familiar, you know, when it comes to newsrooms having to bootstrap data visualization dashboards and really galvanize data collection efforts, which is why I am particularly excited to introduce Meghan Hoyer, whom I met last summer when I interned on the data team at the AP. Meghan, who was recently recruited by the Washington Post to lead its new data journalism department, is an experienced data journalist, I can attest to that, having worked at the Associated Press, USA Today and other publications. Last year Meghan was part of a team that received the AP Chairman's Prize for their success in improving the distribution of datasets to member newsrooms and bureaus. I saw her work firsthand when I interned there. So Meghan, please, I'm excited to hear about your work again, and I think our audience will be as well.
Thanks so much for having me, I really appreciate this. I don't have a formal presentation, but I do want to walk through a little bit of what the last year has been like for us data journalists. I think in the US, honestly, our experiences have been very similar to those in Brazil, so a lot of what Carol said rang true in terms of the types of problems we have. Honestly, one of the chief problems we've had in the last year as data journalists has been trying to use data to tell this story as it happens. Most often as data journalists we're taking data that's been collected by a private entity, a researcher, or, most often honestly, the government. And that data is usually shaped and processed and takes a while to collect, so there's oftentimes a lag: we're using data in our stories that's a year old, two years old. Obviously that's not going to work for us in COVID times, as we're trying to get a handle on what's happening as it happens. So there are a number of techniques and obstacles we had to work through in the past year, and I'm going to walk you through a number of those. The first thing I want to talk about, hold on as I share my screen, is modeling. Very early on in the COVID crisis, what we saw, in the absence of knowledge about how this was going to go, was just a huge number of models being generated to try to predict what was going to happen next: how contagious COVID was, what caseloads would look like, what burdens would look like. It was what everyone wanted to know, but the problem was, as data journalists, we were being besieged by models. Honestly, almost every group, organization, think tank, you know, private citizen was coming up with their own COVID model.
This is from the CDC, which has ended up compiling these models, and you can see, back in April of 2020, what we were looking at was models with massively and wildly varying outcomes, depending on which model you chose to listen to. It was a challenge for us, because as data journalists we're often pairing with reporters and editors across the newsroom, in large news organizations like the AP, which covers the entire world and has reporters embedded in most countries, or places like the Post, where there are lots of journalists across the world as well. What we were getting was reporters in Kansas choosing one model because that's what their state was looking at, and reporters in Washington State choosing another model. Those models didn't actually have a whole lot of relation to each other; both sometimes predicted dire scenarios, but completely different dire scenarios. And it was really hard to vet these models given the lack of true information we all had about how COVID was going to behave. We were operating a little bit in an information vacuum last year about, you know, how it was transmitted, how contagious it was, who would be affected; back in April and March we really didn't know the answers to these questions. And so models were inherently problematic, and yet they were being forced upon us as a society by all of these different means, and in many cases, honestly, governments were touting them as well. As data journalists, a lot of our job is to vet data sources and think about the data that we're using and adhering to in stories. So early on, at the Associated Press, where I was at the time, we decided that we were not going to use modeling and not going to base stories on modeling predictions and things like that.
And it was a data decision that honestly was made just because, at the time, the true science behind it was lacking. A lot of these models, honestly, were being pushed forward by people who weren't epidemiologists, who didn't have a full grasp of even the disease science, and honestly it was just too early to know. So in the absence of a lot of case data, which we weren't getting, and in the absence of being able to really rely on models, what did we have? It was a taxing year because data journalists had to be uniquely, really, really creative in terms of what we looked at last year to be able to tell this story in real time. So here are some of the highlights and things I thought were interesting last year. One of the things we saw first, when COVID first hit the US, was New York City being hit particularly hard. And one of the things we were seeing was lots of reports that Manhattan was emptying out, that parts of Manhattan were empty, things like that. I thought this was an interesting look, because you get a story idea and then the question is, how do we quantify that? I thought this story did a very good job at trying to quantify what was happening in New York City with the population changes as COVID struck the city so hard. As a proxy for people's movement, they looked at garbage pickups: where was household waste dropping, as a sign of people not living there anymore, having moved out of the city to avoid the virus, on the map they made very early on in the pandemic. And I thought it was an interesting way to get at some of the other issues the city was seeing. Where were people staying? You know, household waste is going up in a lot of places because people are at home, hunkering down.
And honestly, what it turned out was that a lot of those dark green areas were some of the places in the city that were hardest hit by COVID, and in those dark green areas, people did depart the city for other places that might have had a lower caseload. Early on, too, we were hearing anecdotally that there were disparities as to who the virus was affecting, in terms of who was showing up at the hospital, who was dying from this. Data was particularly poor on this subject, especially in the beginning, but even now that problem persists: roughly a quarter to a third of COVID case data in the US doesn't have race attached to it. So it's very, very difficult to say with any degree of certainty who's really coming down with the virus. Part of that is just the bureaucratic overload of keeping these records; certain hospital systems, pharmacy systems, and testing locations haven't kept this data. So, very early on, we wanted to look at disparities, but we couldn't find them in the caseload data because of all the missing values. We learned that death data was less likely to be missing racial and ethnic data, although it still is missing at some level. The other thing is, this data wasn't centralized in any place, so we had to hand collect it, going from state to state and city to city, to find out what the racial toll of deaths was. So this was a kind of mid-April look at the percentage of deaths: 75% of deaths in the District of Columbia were black people, versus 45% of the population of Washington DC being black, so you can start to see that gap emerging even though we don't have full data.
Because that data hadn't been released by many states, in this case these were the only states and cities that were providing or releasing that data, or even collecting it, at that time. Since then, all the states have started collecting it, and the CDC has started collecting it as well. But at the time this data was very hard to come by and had to be manually collected from a number of different dashboards, state websites, and state reports, and by calling and FOIAing different states and cities. As we get into disparities like these, you also have to keep some basic demographics in mind: the black population and the Hispanic population in the United States are not the same age as the white population. So then you have to move into more advanced statistical issues, such as age-adjusting your data. Connecticut was one of the first states that started age-adjusting its data, and when you look at the deaths, you see that the age adjustment makes a huge difference: the rate of deaths for Hispanics goes way up, the rate of deaths for blacks increases even higher than what it originally was, and the white death rate, because the white population of states is typically significantly older, drops significantly. So we have to keep those kinds of things in mind. A death is not just a death: a bunch of 45-year-old Hispanic people dying might take more of a toll on a community than several 80-year-old white people. So that was something, and that work is honestly still going on, to age-adjust the death data to get a better sense of what each community's loss was.
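The age adjustment described here is direct standardization: weight each group's age-specific death rates by a fixed standard population's age distribution, so that a younger population and an older one can be compared fairly. A minimal sketch in Python; the age bands, rates, and weights below are all invented for illustration (real work would use something like the 2000 US standard population weights):

```python
# Direct age standardization: weight each age band's crude death rate
# by a standard population's age distribution, so groups with different
# age structures can be compared fairly. All numbers are illustrative.

# Hypothetical crude death rates per 100,000, by age band
rates = {"0-44": 5.0, "45-64": 60.0, "65+": 900.0}

# Share of a standard population in each age band (must sum to 1)
std_weights = {"0-44": 0.60, "45-64": 0.25, "65+": 0.15}

def age_adjusted_rate(rates, weights):
    """Return the direct age-standardized rate per 100,000."""
    return sum(rates[band] * weights[band] for band in rates)

adjusted = age_adjusted_rate(rates, std_weights)
print(adjusted)  # 5*0.6 + 60*0.25 + 900*0.15 = 153.0
```

The key point Megan makes falls out of the weights: a group whose deaths are concentrated in younger bands ends up with a higher standardized rate than its crude rate suggests, and vice versa.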
Another way into this, as we were going further into disparities and what the toll of COVID looked like on communities across the United States: how do you get into that story with a lack of data? One of the things that a number of news organizations worked on last year, and a smart way to walk around the problem a little bit, is looking at the issue of excess deaths: taking basically a baseline of deaths across the US or across each state, say over the previous three or five years, and then measuring 2020's deaths against that to see what the discrepancies are. And basically what you find is that the axis on this is basically the normal death rate, the peaks are above the normal death rate, and you can see the large disparities in black, Hispanic, Asian, and Native American populations versus white populations. So it was another way to get at what we were seeing on the ground as it was happening, without having a full data picture. Unfortunately, in a lot of cases COVID wasn't being marked as the cause of death, so that was really hard to get at; in a lot of cases we weren't able to see death certificates specifically, or the death data was lagging by months for COVID deaths. So this was another way into what we were seeing on the ground as it happened. A lot of people took different slices of the excess-deaths question last year, to look at places where they weren't reporting COVID deaths, basically, but also just to track things like this kind of disparity among populations in different states. This was a project we did at the AP, alongside the Marshall Project, and we broke it down by state and by each racial group.
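The excess-deaths technique reduces to simple arithmetic: average the deaths from the same period in a few prior baseline years, then subtract that baseline from the observed count. A minimal sketch with invented weekly counts:

```python
# Excess deaths: compare observed 2020 deaths against a baseline built
# from the average of the same period in prior years. Counts invented.

baseline_years = {        # deaths for the same week, 2015-2019
    2015: 1000, 2016: 1020, 2017: 990, 2018: 1010, 2019: 1030,
}
observed_2020 = 1450

baseline = sum(baseline_years.values()) / len(baseline_years)
excess = observed_2020 - baseline
pct_above_normal = 100 * excess / baseline

print(baseline)                     # 1010.0
print(excess)                       # 440.0
print(round(pct_above_normal, 1))   # 43.6
```

Because it needs only all-cause death counts, not cause-of-death coding, this measure sidesteps the problem of COVID deaths never being marked as COVID on the certificate.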
Another technique I think a lot of news organizations turned to last year was hand collecting data: with the lack of data about a certain issue, can we create a survey methodology, or hand collect data from enough places, that we can tell stories around it? This was something we did around schools. As schools reopened, as the school season started in the fall of 2020, what did it look like in terms of who was reopening and who was going back to school? We really wanted to see that data in real time. The National Center for Education Statistics will have this data, but they'll have it in about a year and a half, and what we wanted to do was try to capture who might be affected by loss of learning, who might be affected by not having technology, right away, as it was going on. So this was our method to do that: we basically binned the school districts (they're actually binned in NCES data) into four categories, urban, suburban, town, and rural. We used those categories, picked the five largest schools in each category in each state, sent them a survey, called them, and followed up with them. We ended up getting roughly 700 schools answering our survey and were able to then say that districts serving mostly students of color were more likely to start online. It was true across every single type of school; it was as true in rural areas as it was in urban areas. It was a trend nationwide. We were also able to follow things like whether students had access to the internet and whether they were given computers or tablets to use, and that ended up in other stories as well. So, you know, collecting some of this data ourselves, I think a lot of organizations ended up doing that.
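The selection step in that methodology, bin by locale category within each state and keep the few largest, can be sketched in a few lines. The district records, names, and enrollments below are invented; only the stratified-selection logic mirrors what's described:

```python
# Stratified selection sketch: within each (state, locale) bin,
# keep the largest districts by enrollment. All records invented.
from collections import defaultdict

districts = [
    # (state, locale, name, enrollment)
    ("VA", "urban",    "District A", 180_000),
    ("VA", "urban",    "District B", 90_000),
    ("VA", "suburban", "District C", 60_000),
    ("VA", "rural",    "District D", 4_000),
    ("MD", "town",     "District E", 12_000),
]

def pick_largest(districts, per_bin=5):
    """Group districts by (state, locale) and keep the largest per bin."""
    bins = defaultdict(list)
    for state, locale, name, enrollment in districts:
        bins[(state, locale)].append((enrollment, name))
    return {
        key: [name for _, name in sorted(rows, reverse=True)[:per_bin]]
        for key, rows in bins.items()
    }

sample = pick_largest(districts)
print(sample[("VA", "urban")])  # ['District A', 'District B']
```

Picking the largest districts per bin doesn't yield a random sample, but it maximizes the number of students covered per survey sent, which is the trade-off described.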
That just helped to fill in the gaps where we didn't have comprehensive, even state-level or national, data. The best example of that is the COVID Tracking Project, a large organizational project that was run out of The Atlantic. They were collecting test positivity rates, and at some point in the US, test positivity rates became a huge metric of how states and localities were doing in controlling the spread of COVID. Test positivity became a huge policy issue: states were starting to decide whether they were going to reopen certain things based on test positivity rates, and the CDC was giving guidelines around what your positivity rate should be. At the end of the day, what the COVID Tracking Project found was that test positivity was being measured radically differently from state to state and from locality to locality. There was one positivity formula set by the CDC, but in reality there were three totally different ways of counting how many tests had been given, which is the denominator in any kind of positivity rate equation. What that created was a totally unequal process where you couldn't measure one state's positivity rate or one locality's positivity rate against another. Hugely problematic, and honestly it was never solved, and it really speaks to one of the biggest problems with the entirety of COVID data across the last year, which is the lack of truly equal systems for measuring some of these things. I'm just going to go on for two more minutes. I wanted to talk a little bit about how, as cases went up, there was a statistical need to do what I call flipping the numbers. We were always measuring things per 100,000; so in Alexandria, where I live in Virginia, we have 14 cases per 100,000 people.
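The denominator problem is easy to see with numbers. The counts below are invented, but they show how the same reporting period yields three different positivity rates depending on whether a state counts test specimens, testing encounters, or unique people tested:

```python
# Test positivity depends heavily on the denominator. The same
# underlying activity produces different rates depending on what a
# state counts as "a test". All numbers are invented.

positives = 900          # positive results reported
specimens = 12_000       # every swab processed (some people swabbed twice)
encounters = 10_000      # distinct testing visits
unique_people = 8_000    # distinct individuals ever tested

def positivity(positives, denominator):
    """Positivity as a percentage for a given denominator choice."""
    return 100 * positives / denominator

print(round(positivity(positives, specimens), 2))      # 7.5
print(round(positivity(positives, encounters), 2))     # 9.0
print(round(positivity(positives, unique_people), 2))  # 11.25
```

A state reporting 7.5% and one reporting 11.25% could be describing identical outbreaks, which is exactly why cross-state comparisons broke down.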
That's a really hard number for people to understand: how many does that mean in my community, what does that really equate to? Flipping the number made it much more graspable for a lot of people. This example is from the New York Times, and we ended up doing that in a number of stories that we did as well. When you flip it to, okay, it's one in 15 people, one in 15 is immediately understandable to folks. We did this with prison data that we were collecting by hand with the Marshall Project last year, and found that one in five prisoners had had COVID across that year, which was significantly higher at the time than the US average. So thinking about the numbers and how you can make them more understandable to folks is particularly important when you're dealing with these kinds of ongoing big problems that just keep getting bigger. Finally, I want to talk about vaccination tracking, because we're seeing all the same COVID data problems again: yet another data set that is problematic and being reported differently. This is our COVID vaccination tracker; the Post is doing a heck of a job trying to track vaccinations across the US and where they're going. This is a labor of many, many people and has involved writing 50 state scrapers, many times over, because the states changed radically how they were reporting the data many times, and then finally switching to the CDC, and now the CDC has changed how it measures things several times. So we do the best we can, but honestly, a lot of this COVID data, as it's coming out, is changing literally by the day. There are oftentimes new fields in the data each day from the CDC that mean different things. How do they track J&J single-dose vaccines versus double-dose vaccines? What does that mean for people fully vaccinated or having received one dose?
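The "flip" itself is just division, converting a per-100,000 rate into a "one in N" figure. A minimal sketch (the 14-per-100,000 figure comes from the Alexandria example above; the second input is invented to produce a round "1 in 15"):

```python
# "Flipping" a rate: 14 cases per 100,000 is hard to picture, but
# "about 1 in N people" is immediately graspable.

def one_in_n(rate_per_100k):
    """Convert a per-100,000 rate to a rounded 'one in N' figure."""
    return round(100_000 / rate_per_100k)

print(one_in_n(14))      # 7143 -> "about 1 in 7,100 people"
print(one_in_n(6667))    # 15   -> "about 1 in 15 people"
```

In published copy the result would usually be rounded further ("about 1 in 7,000") so the precision doesn't overstate what the underlying data supports.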
It's been a very difficult process, and continues to be a very difficult thing to track all along. So I will stop now so we can talk. Thank you, Megan. That was a really good overview, I think, of how data journalists have been figuring out not just the scope of stories but also finding untold stories and statistics during this pandemic and contextualizing them. In particular, I had noticed, I think in the back of my head, the flipping of statistics to make them more understandable, and it's good to hear from you how that was a conscious decision, how much thought was put into it at newsrooms. So again, Mark, Carol, Megan, thank you for your presentations. We're now going to move into a more free-form panel discussion. I thought I would kick it off with a question about data presentation, and how reporting during the pandemic has really brought out this intersection of science communication and news-you-can-use for the general public. I think in some ways the COVID pandemic has been a real crucible for the field. Mark, I'm curious, from the academic perspective, have you seen unique forms of storytelling or ways of presenting information from newsrooms that you're excited to integrate into your lesson plans? But also, conversely, have you seen some real teaching moments, where maybe something was confusing or reporting oversimplified important scientific details? Yeah, I think so. I'm teaching a data visualization class now, and one of the things we did was to go back over the year and just look at how the graphical forms have changed. At the beginning, it was, as Megan pointed out, lots of exponential curves that fanned out in a variety of ways.
And, as I put in my presentation, various kinds of attempts at mapping, trying to just communicate scale. A number of news organizations did some work around when the US hit the 500,000-death number, trying to make that number relatable somehow. The Washington Post in particular had a kind of theatricalized piece that put people on buses, and as you scrolled, buses were going by, and as you tried to count and count, you thought, well, this is impossible, this is never going to get us to the total. Each bus had something like 51 people on it, and after a minute of scrolling you end up with one day's worth of casualties, and then they flip over from buses to what it would look like if all the buses were lined up on a stretch of highway between two familiar places, like New York and Philadelphia. But the ways in which we shifted, and the different strategies that came along to try to figure things out, even just getting the public accustomed to a log scale, seemed like something that had to be talked about and thought about. So yeah, there were a lot of teachable moments in thinking out how we worked through and presented not only what we knew but what we didn't know. Right, like where the gaps were, and sometimes, many times, the uncertainties that were present were not statistical and couldn't be thought of in statistical terms; you just had to find some way to express what you didn't know. Yeah, I'll leave it there. Thanks. So, zooming out into the journalism industry as a whole: Carol mentioned the need for journalism organizations to collaborate because of data issues. You know, Carol and also Megan, I'd like to hear from you.
Do you think that the competition-versus-cooperation tension will change the way media outlets work, in competition with each other but also together? And of course, Carol, let's start with you, because I think in the US we've got a very US-centric focus, and I'm curious to see how this might play out in other countries. I think this collaboration is something that we've never seen before. And I think there's a good side to competition, because we actually improve everybody's coverage: we know that somebody else is putting in an effort to do something special, so we want to do that too. And also because we have so many stories out there that need to be told, and if everybody keeps telling the same stories, then there are a lot of stories that never get told. So the more variety, the better. But we have seen a big and growing collaboration, especially in data journalism, because doing data is very hard, programming, learning languages, so we do help each other. We have different WhatsApp groups with journalists who are covering the same thing. Obviously we're not sharing our story ideas, but if somebody needs help using a database that I know how to use, they can just ask me which field, which variable they need, and I'll let them know. I think this collaboration only helps, especially because the more people pushing for transparency, the better data we're going to have, and that's going to be better for everybody. Thanks, Carol. Megan, how about you? I know that the AP collaborated with the Marshall Project, which you brought up, and also with Chalkbeat. Do you think the COVID pandemic will change the way collaboration is done in media in the US? In the biggest sense of the term, it really needs to move to a conversation of:
How do we all work together to verify and establish these basic data sets? We all needed a lot of the same things last year, and we all spent a significant amount of time trying to get or build or maintain those basic things. Vaccine tracking is taking up a huge chunk of a lot of newsrooms' time and effort right now. So how do we best organize and make this a universal and singular effort? I think that still needs to be worked out a little bit more, but we certainly moved much closer to sharing resources and working together. At the AP, we released a bunch of data sets for newsrooms to use; some of them were just for our members, and some were made completely publicly available, just to try to lower that bar to entry for doing the basic things. But there's a lot more effort that needs to be made in this area; I think there could have been even more collaboration than there was in the last year. As always, right, there's always more room for collaboration versus competition. Let's see, it seems like the audience is also interested in not just how this experience might shape journalism as a field, but also in how it might be pushing innovation in data journalism. We've seen that some of the most effective cases for data journalism in the pandemic have been conveying and interpreting case counts, but what are some of the other big-picture opportunities for data journalism as we move forward from the pandemic? Let's see, who should we start with? Megan, do you have thoughts on this? I actually thought what Carol said was pretty interesting. I mean, I think in many countries we kind of had to be big advocates for public and open data.
I don't know how much those arguments were heard, but what we saw was a lot of governments doing dashboards where there was no raw data behind the dashboards, or the raw data was inaccessible behind them: Tableau sites where you couldn't actually download the raw data, a variety of different sites. Every state had their own dashboard, each completely different, including different metrics, so there was very little standardization around all of that. And yeah, the question does become, where can data journalism become an advocate for standardization and organization of some of these things? I'm not quite sure where we fit in, but I would be interested in hearing both Carol's and Mark's thoughts on how we push for better, more comprehensive, and more standardized data. Thanks. Yeah, Carol, what do you think are the opportunities for data journalism in Brazil? We have a lot of opportunity because we have a lot of room to improve. I don't know how easy it's going to be to improve, because it's really, really hard. For example, the testing data: there is a database, we could have the open raw data, we could see how many tests are being done every day and everything, but there is no data, and I tried, in a lot of ways, for about two months to get it. And I asked in the press conferences, when they were actually doing press conferences at the ministry, why don't we have that data? And they say, oh, we can't put people's personal data up online. And I was like, I didn't ask for personal data, I asked for the anonymized database; you have it, you could do it, but you won't do it. So we've had a forum of associations asking for more transparency. We have the Open Knowledge Foundation in Brazil; they actually had a ranking with different indicators of transparency.
That actually helped a lot of local and state-level governments improve their transparency because of these indicators, but it also had a backlash: there were other states or cities that rolled back their transparency because they looked bad on some ranking and said, oh, I'm just not going to do anything for them. So we've seen both things happen. Sean actually had a question here in the Q&A that I think matches this. I think that as we are able to automate the data we have, and find ways to get that data quickly turned into information we can use in journalism, we start advancing naturally, and then we can ask, why don't we have that data in this format, so we can do the same with that sort of data as well. And what we can do as well, I'm going to use this word, but I don't know if it's right: we can arm the reporters and the editors and the people who are interviewing the authorities with this data, fast, so they can try to fact-check in real time when an authority says something. If I hear someone say something, and I know the data, and I say, no, that's wrong, then I can quickly tell the person who's there in the press conference that that's not correct, and they can question them in real time. So this is a way we can use data as, I'm going to use the word weapon because I can't think of any other word, but I don't think it's a war, I think society is only going to improve with this. It's a weapon to tell the authorities that they can't push certain narratives just the way they want to, because the data will help us not allow that to happen, or at least we can show both sides and keep the public informed. And I think public pressure is also what helps get authorities to actually comply with laws, because we have transparency laws that people are abusing or ignoring.
Wow. I know someone had also asked in the Q&A about impact, and I can't think of a more direct form of impact than being able to question authorities in real time about the information they're presenting to the public. So, okay, thank you, Megan and Carol, for your perspectives on how newsrooms and the field are changing. Mark, this also extends into academia, because the Brown Institute, for instance, supports media innovation; you mentioned those grants specifically geared toward COVID projects. How has the pandemic changed the way academia supports media innovation, and especially data journalism innovation? So, you know, we had a lot of programming set up for 2020, because it was going to be the year of data, right? It was the 50th anniversary of Earth Day, so you had climate change to talk about, the Olympics were going to happen, and the year was just bristling with data opportunities to bring in new stories. And then the pandemic happened, and everything focused on that. I think one of the things at the core is, so, I get asked to teach statistics in the J-school from time to time, and reporting classes or whatever, so using COVID as a backdrop is an interesting example. And the issue is that if you want to touch any epidemiology, you're almost immediately hit with a partial differential equation, and the students go...
And so, see, you don't really want to go there, but what it suggests, what we started to think about, is that maybe we need a tighter bond with experts, whether they be epidemiologists or other people in technical fields, and not just have them appear in the publication on the opinion pages, which is where a lot of people present their research, and not just have them quoted as sources, but have them as collaborators in building up stories, creating a kind of shared language. A second thing we've been looking at at the Brown Institute is ways to try to bring more undergraduate STEM students into journalism, so that you have a basic kind of data and modeling awareness coming into the journalistic practice; something really interesting happens when those two work together. One of the projects I mentioned, Documenting COVID-19, is pulling all of these emails from local and state health departments and governments to try to see who knew what when, right, and to take all of that to create timelines, to think about what data sources are implicit in there that we can start to tease apart to tell a story. The team has been really successful in bringing on not just investigative journalists but also students from our dual degree, who are both journalists and computer scientists, who have the capacity to ask questions of the data on their own, but then also answer them on their own, and then work in a team. So I think this goes back to your collaboration question a bit: collaborations outside of journalism, and also helping to advance the field by bringing in students who have more traditional STEM training. Thanks. Great to get perspectives from both the newsroom and academia. Let's see, so we've got...
Okay, I think we've got time for one more question, with input from everyone. It's perhaps the broadest question I've seen all day about data journalism, especially in the context of the pandemic: what would you all say have been the biggest technical challenges to data journalism, but also the biggest social challenges? It's been a pretty roller-coaster year all around the world, in terms of not just the pandemic but also politics and social conflict. Arina, we can't hear you; if you're still present, you can type the question in. While Arina comes back from technical difficulties: Megan, would you take a shot at the social aspect of the question Arina just asked? I think there's been a challenge, as Mark said, in training people to understand some of the visualizations, terms, and concepts we were throwing at them this year: things like a log scale, things like excess-death calculations. These are complicated ideas that the average person isn't familiar with, and yet they're literally some of our only ways of measuring what's happening. So we saw a lot of really good, very high-level step-backs that walked people through some of these types of visualizations and models. Honestly, I thought some of the best and most successful data journalism of the year stuck with an extremely simple idea and took a very high-level look at what was happening, because the daily shuffle, the daily increases, where the caseloads were, was an incredibly hard thing to keep a thumb on. Where we did well was in saying: slow down, let's explain this really carefully. El País, for instance, had a great explainer in the fall about how the coronavirus moves through the air.
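The excess-death calculation Megan mentions is, at its core, simple arithmetic: observed deaths minus an expected baseline, often the average for the same week across prior years. A toy sketch, with all counts invented for illustration:

```python
def excess_deaths(observed, baseline_years):
    # Expected deaths per week: average of the same week across baseline years.
    expected = [sum(week) / len(week) for week in zip(*baseline_years)]
    return [obs - exp for obs, exp in zip(observed, expected)]

# Invented weekly all-cause death counts for three pre-pandemic years:
baseline = [
    [1000, 1020, 990, 1010],
    [980, 1000, 1005, 995],
    [1020, 980, 1005, 995],
]
observed_2020 = [1005, 1100, 1400, 1250]

excess = excess_deaths(observed_2020, baseline)
# -> [5.0, 100.0, 400.0, 250.0], i.e. 755 more deaths than expected
```

Published analyses use more careful baselines that adjust for trend and seasonality, but the subtraction at the heart of the idea really is this simple.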
It was a fantastic step-through with visualizations. It was very data-driven in the sense that they had clearly done the math behind it, but it didn't present itself as a data story; it was literally: here are three scenarios, you walk into a bar, you walk into a friend's home, and here's roughly what your chances are, because of issues like ventilation. So when we talk about how we express things, I think it's incumbent on all of us as data journalists to take a really slow look: can we step back, can we simplify this, can we explain it to people in a logical, coherent way at a high level, so that they understand not only this graphic or this story, but also stories going forward on this subject? Thank you. That idea of making things tangible through simulations is an aspect that has thrived in particular parts of the reporting, but it's not prevalent everywhere yet, so I think that's really important. Carol, do you have anything to add there? I agree with the simplicity factor. Two months before the pandemic, I switched from online news to television news, and it's very different, and the audience is also very different. In online news, someone clicks on a story and can read all of it or only part of it; they can spend two minutes reading, or five, or 30 seconds. But when a graph goes on television, a person is going to look at it for 10 seconds, and in that time they have to understand what's going on.
So it's a very difficult process to simplify the information in a way the audience can understand that quickly. It's a challenge, and I learned a lot doing television with the editors, because they have a very hard job translating all the data and all the information into something you can take in with your eyes and be informed by, and they often have only three hours to put the story together, so it's very demanding. But the audience is also a lot broader. It's the people who have had to keep going out and taking the bus to work, because they have no other way of supporting their families, so they have to put themselves at risk; people living six to a household, across three generations, who have to protect the elderly who live with them. These are the people most impacted by the pandemic, so talking directly to them is pretty much a privilege. I don't get frustrated thinking, I would love to do this beautiful graphic, a flow chart or a box plot, and I can't, because I can only use bars or lines; I know that that's the way I can communicate the information. That, I think, is the most important part of our job, so I don't get so frustrated about it. Well, thank you. So Mark, maybe we'll give you the last word. We have science here, and we have communication and journalism; how does that all come together in communicating the emotional impact of what's happening, and what's the challenge for the scientists? Right, so emotional impact is an interesting one.
I mentioned, for example, the Washington Post piece with the buses, and you should have a look at that work: first the buses, and then stretches showing what it would look like if you lined all those buses up, to try to give you a sense of what 500,000 people means, what the number 500,000 means. I'm not sure that emotion for emotion's sake is what we're after. What data journalism does provide, though, is an opportunity, if it's there, to think through which data source you're going to use to capture a particular situation. Megan already mentioned the emptying out of Manhattan and how we might pick a data set to portray that. In the same way that a good journalist picks their sources, a data journalist picks theirs; there's almost an, I don't want to say aesthetic, but a quality there that will speak to people. So it's not just how the data are presented, although that's important, but also what data you have chosen to help portray that particular situation. And I would say that at the end of the day, especially with COVID, our responsibility is to help people make better decisions about their actions and about the health and well-being of their families. The note I wanted to end on is a statistical one: those stories have uncertainties embedded within them, and, as I mentioned before, being able to communicate honestly what you do and don't know is what lets people make appropriate, better-informed choices. Because again, as you've heard, the data come in various formats, and some of the numbers that were quoted, or that we keep using, don't make much sense on their own.
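One such number is test positivity: it is just positives divided by total tests, so identical rates can describe very different situations depending on how many people are being tested. A toy illustration with invented figures:

```python
def positivity(positive, total_tests):
    # Share of tests that come back positive.
    return positive / total_tests

# Two invented days with the same positivity but very different testing volume:
day_low_testing = positivity(50, 500)     # 10% of 500 tests
day_high_testing = positivity(500, 5000)  # 10% of 5,000 tests
# Same 10% rate, but the second day found ten times as many cases; without
# the testing denominator (and who is getting tested), the rate alone
# cannot tell you which situation you are in.
```

This is the trade-off Mark describes: a single quoted number hides the context, such as testing volume and who shows up to be tested, that a journalist needs to report alongside it.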
The proportion of tests that are positive, for example, doesn't really tell you much on its own, because you don't know what's driving it from day to day, or who's showing up to get tested. If that number is high and the other numbers are high too, then okay, maybe it says something bad is happening, but we're constantly making these sorts of trade-offs, and I think the sources of data are one thing a journalist can look to, and be creative about, to help make it real for people. Thank you. So Arina, you're back, and you do in fact get the last word; do you have any wrap-up you'd like to leave us with? Thanks, Joseph. Yep, I'm on my phone; thank goodness for cell towers, even when the electricity or internet goes out. I just want to thank Megan, Carol, and Mark again so much for taking the time to chat, and thank the audience for taking the time to listen; hopefully you'll share this with others who might be interested. And with that, back over to you for the announcements. All right, well, thank you. That's it for today. We appreciate the time and the expertise of our speakers, and we thank the audience for your willingness to be here and for putting up with our minor technical glitches. Please watch the COVID-19 Data Forum website, covid19-data-forum.org, for information about when the video for this conference will be available and for upcoming events. So, thank you all; please stay safe and take care. Bye.