Thank you for joining, everybody, and welcome to the talk, "Openly Available Air Quality Data: Not Just Blue Sky Thinking." My name is Shruti Modakrithi. I'm the platform lead at OpenAQ. I know the format for this talk is a little different than people are used to, so I'd really encourage you to reach out with questions and comments. My email's at the bottom of every slide, so it's easy. I also just posted a bunch of links in the chat, which are some of the links I'll be going through in the talk, so feel free to explore as I'm talking about them. All right. So air pollution causes one out of every eight deaths on the planet. It is one of the biggest public health and human rights issues of our time. According to the World Health Organization, over 90% of the world breathes unhealthy air. And what does it mean to breathe unhealthy air? Public health experts have determined that particulate matter is the most dangerous to human health because of its size. Particulate matter that is 2.5 microns or less in diameter, commonly known as PM2.5, is small enough to penetrate our lungs and even enter the bloodstream, causing serious cardiovascular and respiratory ailments. You can see the size comparison here in this slide. But all hope is not lost. We've seen how a small technology change can have a snowball effect and spur action. Back in 2008, thanks to the Olympics, everyone knew Beijing was polluted, but we didn't know how much. That year, the US Embassy in Beijing installed a rooftop air quality monitor that started tweeting out PM2.5 data every hour. It was one of the first times that real-time PM2.5 data was available to scientists, allowing them to study air pollution trends and make predictions. Engineers started scraping the data from Twitter and developing third-party apps that told citizens when they should stay inside and when it was okay to go outside, basically informing people when the air was unhealthy.
It took some time, but the overwhelming international response to this data being tweeted out forced China to set up monitors all over the country and invest billions of dollars to clean up their air. The left side image is from 2008 and the right is from 2014, and you can see the huge difference that it has made. So the lesson here is that communities need basic access to data to effect policy change. And it was this story that inspired the creation of OpenAQ. They started out with the simple question: what if all of the world's air quality data were open and accessible? OpenAQ is a nonprofit dedicated to fighting air pollution with open data and community. In this talk, I'll give an overview of the platform, look at some impact examples, talk about who makes up the OpenAQ community, go over the global air quality data assessment that we did, and talk about what's next. So the challenge, as I mentioned, is a lack of access to air quality data, especially in some of the most polluted places. And even where air quality data does exist, it's often in an inconsistent or temporary format, making it difficult for anybody who wants to look at global air quality to get a good picture of what's happening. These two problems together prevent civil society from taking adequate action to improve air quality in their local communities. And this is where OpenAQ comes in. We take all of those disparate sources, standardize them, and make them openly accessible, basically building data infrastructure so that journalists, people in government, and researchers can focus on what they're good at and not spend hours and hours transcribing PDFs into spreadsheets. Our strategy is to combine this very tangible platform and data with a diverse global community to create a healthier, more efficient, and more connected data sharing ecosystem, and overall make everybody better positioned to fight air inequality across the globe.
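To make the standardization step a bit more concrete, here is a minimal sketch in Python of what taking disparate sources and putting them into one common shape could look like. Everything in it is an assumption for illustration: the two source formats, the field names, and the common record layout are hypothetical, not the actual OpenAQ schema or adapter code.

```python
from datetime import datetime, timezone


def adapt_source_a(row):
    """Hypothetical source A: PM2.5 in µg/m³, ISO 8601 timestamp with a UTC offset."""
    return {
        "location": row["station"],
        "parameter": "pm25",
        "value": float(row["pm25_ugm3"]),
        "unit": "µg/m³",
        # Convert the local timestamp to UTC so all sources share one time base.
        "utc": datetime.fromisoformat(row["time"]).astimezone(timezone.utc).isoformat(),
    }


def adapt_source_b(row):
    """Hypothetical source B: PM2.5 in mg/m³, timestamp as Unix epoch seconds."""
    return {
        "location": row["site_name"],
        "parameter": "pm25",
        # 1 mg/m³ = 1000 µg/m³; round to avoid floating-point noise.
        "value": round(float(row["PM2.5 (mg/m3)"]) * 1000, 3),
        "unit": "µg/m³",
        "utc": datetime.fromtimestamp(row["epoch_seconds"], tz=timezone.utc).isoformat(),
    }


# Two made-up raw rows describing the same hour in different formats.
raw_a = {"station": "Rooftop A", "pm25_ugm3": "35.0", "time": "2020-03-25T18:00:00+08:00"}
raw_b = {"site_name": "Site B", "PM2.5 (mg/m3)": "0.035", "epoch_seconds": 1585130400}

records = [adapt_source_a(raw_a), adapt_source_b(raw_b)]
for record in records:
    print(record)
```

The point of the sketch is only that each source gets its own small adapter, and everything downstream sees one unit, one time base, and one set of field names.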
So the OpenAQ platform basically contains real-time and historical government-grade data from all over the world. We have over 500 million air quality measurements at this point from 133 data sources in 87 countries. You can check it out at openaq.org. The platform is entirely open source, which allows everyone around the world to contribute to make it better and make it their own. It's written in Node, running on AWS. How it works is that a unique adapter is written for each government data source, a fetch process runs every 10 minutes to grab the data, and then it's stored in S3 and also in a database. There are several different ways that you can access the data. The most popular and easiest way is through the API, and you can also query the entire database that we have using AWS Athena. You can also download the raw data files from S3 using the Explorer tool in the bottom right corner, but I would not recommend that. It's easier to use Athena or the API, and there are links on how to do that in the chat, like I mentioned. So we have a variety of options for accessing the data, depending on your skills and purpose. And the community has actually built tools on top of the platform, like R and Python wrappers for the API, so that it's easier for people working in those languages. In terms of the API, we've gotten 200 million data requests per year. Already in March, and March isn't even over yet, we've had 30 million requests, and that number only keeps increasing. It's been accessed in 162 countries. So the OpenAQ community is a dynamic group of global collaborators full of researchers, engineers, journalists, students, and government employees. I checked this morning and our Slack has exactly 700 members, and I hope that number will increase after today. Here you can see some great examples of community collaboration. On the left you have Dolgen from Mongolia. He heard that people in Bosnia were actually hand recording data.
So he wrote an adapter for the OpenAQ platform so that Bosnians had programmatic access to their data. In the middle, you see an example of government employees adding their own data to the platform. And on the right is Smokey, the air quality chatbot, which takes the raw concentrations available on the platform and translates them into a much easier to understand metric, so that people know when it's healthy and unhealthy to go outside and exercise and so on. We cultivate and engage this community through in-person and online workshops. We've done a variety of international workshops. The one on the left is from Delhi in India, and the one on the right is in Sarajevo. The workshops convene people from different sectors, show them how they can use the data and tools on the OpenAQ platform, and help them create an action plan for instigating change in their local communities. And here are examples of impact from different sectors. In terms of research, the NASA GEOS research team used the real-time data available on the platform to conduct a research study and develop air quality forecasting models. They were actually able to use that real-time data to double-check their models and make sure they're working correctly. In addition, they were able to make data comparisons between different countries and also identify data gaps in their key areas. The media has also been able to use it. Here you see the Bloomberg Green Data Dash, which overlays OpenAQ data with population data and helps inform the public by making it easier to visualize the magnitude and severity of the problem. They were able to use the API to pull that in very easily. This graphic also helps identify gaps in data, especially in countries with large populations, ultimately making it easier to push for environmental policy change. This is one of my favorite stories.
We held a workshop in Ghana last year. It sparked civil society to take action and bring air quality data to their community. They ended up releasing a community statement demanding increased coverage and frequency of air quality monitoring in Ghana. They had a publication come out. Columbia University heard about this, reached out, and donated air quality monitors that they've been setting up across Ghana over the past couple of months. And they've set out to save 17,000 lives each year from air pollution related deaths. This is a great story of people leveraging the global community to improve outcomes for their local community. Another great story, and this happened recently: we worked with the environmental agency of Iceland to add their data to the platform via their new API. Because we had so many government sources, they actually reached out to us and said they were developing an API, and we worked with them on it to make the data easier to ingest onto the platform. And now the air quality data is openly accessible for anybody to use. The tweet on the right shows the environmental agency talking, in Icelandic, about how they worked with us, which is pretty cool. So building this platform is an iterative process, and it's shaped very much by the needs of the community. To better gauge this, we conduct a community survey, and we try to do this every year. It's basically to help people share feedback about what challenges they face in their work, what tools we should build, and how they're doing. We got results from over 100 people, and I'll share some of those results with you. So where are they based? The majority are in the United States. We know that it's skewed towards the Western world, and we'd love for more people in other parts of the world to join us. We do have a good chunk of people in India and Mongolia.
In terms of what countries their work impacts, again, it's mostly the US, but you can also see parts of Asia, like Mongolia and China, and parts of Africa, like Ghana, Kenya, and Rwanda. And definitely Europe, lots of people in Europe. So what sector are they from? Blue just highlights the most popular responses. A lot of people are in academia or the education sector, a lot of people are in software development and technology, and a good chunk of people are in physical and environmental sciences. We have good representation in government and public health as well. We definitely could do better in media. What are some of the biggest challenges that these people still face in their work? The biggest one is lack of coordinated management. Next is lack of public awareness. These are two things that we try to help tackle with our workshops. The other big challenge that people still face is non-existent air quality data, which brings me to the global air quality data assessment that we did, where we looked at all the countries in the world. Which countries have data? Is it open? How openly accessible is it? We felt it was an important contribution for us to provide a landscape of global governmental air quality data to help everybody identify gaps, limitations, and work that still needs to be done. And this helps identify low-hanging fruit, right? Where can we open up data to affect a lot of people? Where can we add data where it doesn't already exist? And if you look at the data that's already on the OpenAQ platform, you actually see that some of the countries with the fewest stations have the greatest pollution, and vice versa. So we're definitely seeing a correlation between how many stations countries have and how well they're able to change the air pollution in their country.
So out of the 212 countries that we looked at, about 51% didn't have any air quality data, and that represents a total population of 1.4 billion people. If we look at which countries are sharing real-time government data, that number drops to a third. The two thirds that don't have real-time government data represent a population of 2.1 billion people. In terms of real-time data, we evaluated the openness of that data, and we used four criteria to evaluate whether the data is fully open or not. One is physical data: are they sharing raw concentrations of the pollutants and not just an air quality index? Next, we were making sure that it was transparent where the data was coming from, and that it was reported at a station level along with the coordinates of the station, so that people knew exactly where the data is coming from. We also wanted to make sure that the data available is fine-scale temporally, so at least daily or sub-daily levels in real time, with the time of collection, like date-time stamps. And then we also wanted to look at whether it is available programmatically. Is it both the data and the metadata? Is it in a format that allows machine-to-machine interaction, through an API, FTP, direct download, or something like that? When we did that, we found that at least 30 governments generate real-time data but don't yet share it in a fully open manner. Making this existing data fully open would affect 4.4 billion people, and the top four countries in this list of 30 are listed here: China, India, Indonesia, and Brazil. If we look at the other end of the spectrum, where countries don't have data, there are 13 countries where just adding in the data would affect a billion people. You can see the list of the 13 countries on the left. And if you map that out, you can see that Africa and Central Asia have the biggest gaps. So what will we be doing to address this?
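As a rough illustration of the four openness criteria just described, here is a minimal Python sketch that checks a data source against them. The class, field names, and function are hypothetical and were written for this example; they are not the assessment's actual tooling, and a real evaluation would involve judgment that four booleans can't capture.

```python
from dataclasses import dataclass, asdict


@dataclass
class SourceProfile:
    """One government data source, scored on the four criteria (hypothetical names)."""
    physical_units: bool                  # raw concentrations, not only an index
    station_level_with_coordinates: bool  # per-station data plus station coordinates
    daily_or_finer_with_timestamps: bool  # daily or sub-daily, with date-time stamps
    programmatic_access: bool             # API, FTP, direct download, or similar


def assess_openness(profile):
    """Return whether a source meets all four criteria and which ones it misses."""
    unmet = [name for name, met in asdict(profile).items() if not met]
    return {"fully_open": not unmet, "unmet": unmet}


# A made-up source that publishes only an air quality index on a web page.
index_only = SourceProfile(
    physical_units=False,
    station_level_with_coordinates=True,
    daily_or_finer_with_timestamps=True,
    programmatic_access=False,
)

result = assess_openness(index_only)
print(result)
```

A source counts as fully open in this sketch only when the list of unmet criteria is empty, which mirrors the all-four-or-not framing above.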
Coming up this year, we will be piloting a low-cost sensor platform. In the past, we focused on government-grade data, but we've seen an increase in consumer air quality devices and in citizen scientists setting up sensor networks, especially in countries that don't already have governmental air quality data. We see value in aggregating all of that data in a platform similar to what we have done with the governmental data. So we want to focus on lower-cost or even medium-cost sensors, which cost thousands of dollars instead of the tens of thousands of dollars that government-grade stations cost. We'll be piloting that platform later in the year, so stay tuned for that. But more immediately, we will also be launching an averaging tool that'll make it much easier to calculate averages. The current methods can take weeks, and using the platform, you can find the average for every country in a matter of seconds. This will be available through the API, and you'll be able to specify temporal and spatial resolutions. So definitely stay tuned. We'll be launching this in the coming weeks. And the best way to know about it is to keep in touch. We definitely want to hear from you. What started out as an experiment has grown to millions of data points and hundreds of people around the world. We have a dynamic and diverse international community, and we would love, love, love for you to be a part of it. So definitely reach out via email, Slack, or Twitter, and I'll definitely be around afterwards for questions. And yeah, thank you so much for joining.
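For anyone curious what averaging with a chosen temporal and spatial resolution might look like, here is a minimal Python sketch. It is not the averaging tool mentioned in the talk or its API. The function name, its parameters, the record fields, and the sample values are all assumptions made up for illustration, and it assumes timestamps are already in UTC.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Map a temporal resolution name to the timestamp prefix that identifies a period.
PERIOD_FORMATS = {"day": "%Y-%m-%d", "month": "%Y-%m", "year": "%Y"}


def average_by(measurements, spatial="country", temporal="day"):
    """Average values grouped by a spatial field and a temporal period."""
    fmt = PERIOD_FORMATS[temporal]
    groups = defaultdict(list)
    for m in measurements:
        period = datetime.fromisoformat(m["utc"]).strftime(fmt)
        groups[(m[spatial], period)].append(m["value"])
    return {key: mean(values) for key, values in groups.items()}


# Made-up PM2.5 readings in µg/m³, purely for demonstration.
sample = [
    {"country": "GH", "utc": "2020-03-01T08:00:00+00:00", "value": 10.0},
    {"country": "GH", "utc": "2020-03-01T20:00:00+00:00", "value": 20.0},
    {"country": "GH", "utc": "2020-03-02T08:00:00+00:00", "value": 30.0},
    {"country": "KE", "utc": "2020-03-01T08:00:00+00:00", "value": 40.0},
]

daily = average_by(sample, spatial="country", temporal="day")
monthly = average_by(sample, spatial="country", temporal="month")
print(daily)
print(monthly)
```

Switching `spatial` to another field present on each record, such as a city or station name, would change the spatial resolution in the same way that `temporal` changes the time window.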