Cool, now it's going. Hi, everybody. We are well into the roller-bag sessions at the end of the conference. I know energy is way down here, but this is something that's really important to me, and I'm going to try to keep my energy up for it. So hopefully it'll be interesting to you, too.

My name is Evan Prodromou. I am probably best known in the open-source community for my work in wikis: I started a wiki called Wikitravel. I'm also known for doing social networking software. I built a classic social network called Identi.ca that did open-source social networking, and I've also done a lot of standards work at the W3C. My current work is Director of Open Technology at an organization called OEF. OEF is the Open Earth Foundation. We're a US nonprofit with a worldwide team: we have people in Asia, in Africa, and in South America. Our mission is to build open source for a thriving planet.

We've done a lot of different research-style projects where we've productionized work coming out of universities. But I think the most important project that we've done, and the one that I'm going to be talking about today, is OpenClimate. OpenClimate is digital infrastructure for doing nested climate accounting in order to support an independent global stocktake of emissions data. I know there's a lot in there, and I am going to unpack it as we go along.

The context. I hate to be the one to give you this news, if this is the first time you're hearing it, but we are in a climate change catastrophe, an emergency. World temperatures have gone up by 0.7 degrees C over the last 30 years and continue to rise, and we are rapidly running out of the carbon budget necessary to continue with our current way of life.
So we are in a very severe situation in terms of managing our emissions of carbon dioxide and other greenhouse gases. Fortunately, in 2015 the countries of the world came together in Paris and agreed to a non-binding framework to reduce emissions. Awesome. So every country in the world is working, as best it can, on policies, efforts, and targets to drive down emissions of carbon dioxide and other gases.

Because it's a voluntary process, there is a kind of report card on it: a global stocktake that happens this year, 2023, the first one ever since 2015, where all of those countries come together, turn in their homework, and say, this is what we've been doing, this is how we've done so far.

So great, that seems pretty reasonable, except countries aren't the only actors that are involved in climate change, right? Countries are really important. They set high-level policies, they are a great set of actors for doing measurement, and they have a lot of the resources. But states, provinces, and regions also have policy activity. Cities are extremely important; they're very on the ground. We also have corporate actors who have climate policies and climate efforts, all the way down to individual emissions sites like farms or mines or factories. So we've got this spectrum of different types of actors that are making emissions, setting emissions targets, and taking actions.

All of these emissions aren't happening in a vacuum. There are relationships between these different actors. They can have part-whole relationships: Vancouver is part of British Columbia. But they can also have ownership relationships: an American corporation can own a mine in Bangladesh, and there is a level of responsibility that crosses there that is not necessarily reflected in the geographical relationship. These different actors have to be coordinating their activity. Like I said, it's not happening in a vacuum.
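
As an editorial illustration of the two relationship types just described, here is a minimal Python sketch. It is not OpenClimate's actual schema: the `Actor` class, its field names, the identifiers, and the example company and mine are all hypothetical, chosen only to mirror the Vancouver and Bangladesh examples from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Actor:
    """Hypothetical actor record; field names are assumptions, not the real schema."""
    actor_id: str
    name: str
    kind: str                          # e.g. "country", "region", "city", "company", "site"
    is_part_of: Optional[str] = None   # geographic (part-whole) parent, if any
    is_owned_by: Optional[str] = None  # owning actor, if any


# Illustrative data only; the company and the mine are made up.
ACTORS: Dict[str, Actor] = {
    a.actor_id: a
    for a in [
        Actor("CA", "Canada", "country"),
        Actor("CA-BC", "British Columbia", "region", is_part_of="CA"),
        Actor("vancouver", "Vancouver", "city", is_part_of="CA-BC"),
        Actor("US", "United States", "country"),
        Actor("BD", "Bangladesh", "country"),
        Actor("example-corp", "Example Corp", "company", is_part_of="US"),
        Actor("example-mine", "Example Mine", "site",
              is_part_of="BD", is_owned_by="example-corp"),
    ]
}


def geographic_chain(actor_id: str, actors: Dict[str, Actor] = ACTORS) -> List[str]:
    """Walk the part-whole links upward and return the ancestor IDs, nearest first."""
    chain: List[str] = []
    current = actors[actor_id].is_part_of
    while current is not None:
        chain.append(current)
        current = actors[current].is_part_of
    return chain


print(geographic_chain("vancouver"))       # part-whole: city -> region -> country
print(ACTORS["example-mine"].is_owned_by)  # ownership crosses the geographic tree
```

The point of keeping the two links separate is that the mine sits in Bangladesh geographically while its owner sits in the United States, so neither link can be derived from the other.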
That coordination has to happen both across that part-whole level and at the international level: setting climate targets that are related to each other at that part-whole level, and also taking climate action and setting up climate plans that are complementary, in accord, harmonized.

One of the big problems with this kind of stocktake is that a lot of the data is distributed across lots of different sites. It's in different places, and it's not really surfaced into public open data systems, so there is a lot of data around the world that's hard to get to. Another problem is that very little of it is actually turned into queryable data that we can use for mixing and matching, for generating different kinds of insights or information. A lot of the data on climate emissions either lives in Excel spreadsheets on somebody's file server or, even more commonly, in PDFs that need to be extracted. So it's a very, very 1995 world for data out there, and it's very difficult to get to it. And when stuff actually is in a queryable data set that you can use, you'll see lots of different formats and lots of different methodologies for doing measurements. So it's really hard to compare apples to apples.

Which we have to do, right? Everyone who is trying to make reductions to their emissions, trying to set climate action plans, needs to decide who they're going to do business with, who they're going to relate to, who to reward, who to encourage, who to nudge along the path. That only happens when we have well-surfaced information that actors can work on.

To deal with these challenges of disparate, not very available data, the OEF is partnering with other organizations, some academic, some in government, to provide a digitally enabled independent global stocktake, or DIGS. I'm not crazy about the name. It is very long and complicated. But basically
what we're talking about is getting the digital infrastructure ready so that we can actually make these comparisons, and do it in a simple way. That means not only data harmonization, so bringing data into common formats, but also providing the kind of infrastructure that makes queries easy and, to some extent, having policies. These are policies about open data and data sharing. I'm not going to go into the policy right now because that's not my area. I just do the software.

We have a really nice system that you can find there, at openclimate.network, that shows off what we have built to provide this infrastructure.

Here is my giant data schema. The biggest challenge of building this system was coming up with a schema to harmonize towards: a schema that covered the important parts of the climate effort and that left out a lot of the things that weren't relevant for setting policy, achieving targets, and taking actions.

Probably the part of this schema that was most fun, or most interesting, to do was focusing on actors. We treat all actors as if they're the same. Whether you're a nation state, a farm, a city, or a town of 50 people, you are represented the same way within our schema, and we have mechanisms for resolving different identifiers and names and for relating these actors together: like I said, both part-whole and owner-property relationships. We also support our actors with a lot of context within our schema, like population, geographical data, and economic data, whether it's GDP for a country or annual revenue for a company.

We cover emissions, of course, on an annual level. We have lots of different sources for emissions, because there are the official sources that the country or city may have published, but there are also independent measurements, estimates, and machine learning systems that might show a different story. These governments may be tweaking the numbers a little
bit to put themselves in a good light, and so we try to show multiple sources and give a better estimate based on those different sources. We also go into breakdowns by gas, scope, and sector, which are ways of thinking about climate emissions, and we cover different kinds of methodologies.

Targets are an important thing: once you know what your emissions are, you need to know what you need to get to. We have representations for targets in different styles, and we also have representations for action plans. What is the actor actually going to do to reduce their emissions down to their target level? We break that down by what types of actions they are going to take and how hard they're going to push on them, and we do estimates of effectiveness: how much carbon is this actually going to take off the annual emissions load?

We track all of our data pretty extensively with metadata, because we need to be able to prove the source of the information and track it back upstream. So every row that we have, every bit of data in our system, is tagged with data sources and publishers. It's linkable, and we can track it back to the dates.

There are things that we didn't cover within OpenClimate. Because this is a policy and action system, we're not focusing on the effects of climate change, mitigation, weather events, etc. We also don't go down to the level of individuals, because that's really just too hard to track.

We bring in data from the UNFCCC, which is the UN's climate change organization. For that we actually have to do transformations from those different file formats. There are a number of international organizations like networks of cities.
There's the Global Covenant of Mayors, the C40, and a few other networks of cities where they've joined together to collect data. We're also using other organizations like the World Bank. A lot of the regulatory agencies, like the EPA, or ECCC here in Canada, also provide open data that goes down to the individual site level. We have a lot of rich information coming out of those regulators, and because it's being published, we're able to map it in. We're taking that site-level data and connecting it to the companies and to the place where it's actually happening, so we're providing more context for that information at the site level.

Academic information has also been really helpful, because there are a number of academic papers that are producing data based on sensors or estimates for the cities involved. This especially happens in the Global South, where there is not as much on-the-ground estimation happening.

We provide a harmonization system, which lets us transform data into our data schema and load it into our databases. And then finally we have a data explorer that lets you search around, look at different kinds of actors within the system, go up and down in a geographical hierarchy, and understand the different contexts that matter for those organizations.

The system is not designed around the web interface; the web interface is designed to show off the infrastructure. We have a RESTful API that is open for anyone to use. It's not a private API. We are able to get important information about each actor and about their relationships with other actors, so you can navigate up and down that geographical hierarchy. We have a pretty extensive search system that supports not only free-text search but also focusing down on structured identifiers or names in different languages.
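
To illustrate the idea of looking an actor up by its name in more than one language, here is a small editorial sketch in Python. It is an assumption-laden toy, not the OpenClimate search implementation or API: the `NAMES` table, the actor identifiers, and the function names are hypothetical, and the matching is plain accent- and case-insensitive comparison.

```python
import unicodedata
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Set, Tuple


def normalize(name: str) -> str:
    """Strip accents and case so that 'Montréal' and 'montreal' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_accents.casefold().strip()


# (actor_id, language, name) rows; identifiers here are illustrative placeholders.
NAMES: List[Tuple[str, str, str]] = [
    ("region-quebec", "fr", "Québec"),
    ("region-quebec", "en", "Quebec"),
    ("city-montreal", "fr", "Montréal"),
    ("city-montreal", "en", "Montreal"),
]


def build_index(rows: Iterable[Tuple[str, str, str]]) -> DefaultDict[str, Set[str]]:
    """Map each normalized name to the set of actor IDs that carry it."""
    index: DefaultDict[str, Set[str]] = defaultdict(set)
    for actor_id, _language, name in rows:
        index[normalize(name)].add(actor_id)
    return index


def find_actors(query: str, index: DefaultDict[str, Set[str]]) -> List[str]:
    """Return the actor IDs whose name matches the query in any language."""
    return sorted(index.get(normalize(query), set()))


index = build_index(NAMES)
print(find_actors("montréal", index))
print(find_actors("QUEBEC", index))
```

A set of IDs per name is used because the same name can legitimately belong to more than one actor, for example a city and the region that shares its name.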
So we can track down the ways to look at, say, Montreal or Quebec in French and English.

One of the big things that our users have asked for, the first thing that data scientists want, is a big data download. They're like, okay, great, the API looks great, but we want a big data download. So we have mechanisms for downloading selected parts of the data based on the actors that are involved. That brings all the information we have on, say, British Columbia down into a single CSV or JSON file that you can then do additional work on.

One of the things that we really like about the system for exposing it to other researchers is that we're participating in the Data Commons from OS-Climate. OpenClimate is one of the nodes within that federated data system. We're a little bit of a funky node there, because we are doing a lot of aggregation, but we are participating there and accessible through OS-Climate. So it's a great way for someone who's, say, trying to measure risk against climate plans to bring that together across the federated systems.

We have a really nice Python library, so we're really supportive of those users. Oh my goodness, that URL is terrible. But yeah, you can just kind of scroll past the end. The Python library is great for doing evaluations and works great in Jupyter notebooks, etc.

And of course, since we're here, we are doing this work all out in the open. This is all open source software. The data is primarily open.
There is some that is license-restricted, and it's well noted, but it is available through the API. We have a GitHub repo where we do most of our work. The system that I just showed you is based on a very simple stack: we use the PERN stack, which is Postgres, Express, React, and Node.js. That provides the API and the explorer, as well as the back-end data store. All the software that we have is ASL 2.0, so it's a very liberal license for those who want to participate.

Boy, that was a lot, right? I went through a lot. So we've got a big database, we've got an API, we've got an explorer, and we're connected to a lot of different mechanisms. What has that got us? What have we done with that information?

Well, first of all, we've been really successful in collecting data. Some of it has been done by my data team, some by researchers we partner with, and some by folks who are participating in our network through our open source mechanism. Altogether we've had 63 different data sources imported. There are 146,000 actors represented in our system; those are, again, cities, countries, and corporates around the world. We're tracking 360,000 annual emissions records across all different kinds of actors, and about 4,000 targets. You can see we have two orders of magnitude fewer targets than emissions records. I really think of emissions as data points and targets as part of the curve, so it does break down a little bit there, but it's still quite a lot of targets to be looking at.

Results-wise, this all comes out of the demos from the Python client, so I just screenshotted those. We're able to get really interesting information about emissions and then actually temper it with contextual information.
So up here you see the annual emissions of a number of different countries, like China and the United States, and you can see them modified on the right-hand side by population. So one story there, the meteoric rise of China's emissions, is somewhat tempered on the other side when you realize how much that is tracked by population in China. Different colors. I know it's not great. I'm aware of that. I was considering, should I change that? And I was like, I don't have time.

Another thing that we can do, because we're including targets and emissions, and emissions from different areas, is look at how well different actors are actually achieving what they say they're going to achieve. What we see here is Great Britain's target of getting to, what is that, 250 megatons CO2 equivalent by 2030, and we can see their movement there. They're not on track to hit it; you can see the dotted line. And that other one is the UK's 2050 net zero target. They're not on track to hit that one either. We can do that for actors of all different kinds, again cities and countries and corporates, using this data.

One analysis that I really like is for my home country, here in Canada. We have a national goal of getting down to 445 megatons of carbon by 2030, so that's a 40% drop in emissions by 2030 based on a 2005 baseline, which is an ambitious goal. Unfortunately, that ambition is not shared by the provinces in Canada, which have set much more conservative goals. The sum of their ambition is, what, about 140 megatons more than what Canada is shooting for, which means, again, until those are resolved and harmonized, it's unlikely that we're going to hit that target, right?
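
As an editorial aside, the Canadian figures quoted above can be worked through in a few lines of Python. This is a back-of-the-envelope sketch that uses only the three numbers from the talk (445 megatons, a 40% drop from 2005, and roughly 140 megatons of excess in the provincial targets); the 2005 baseline and the combined provincial reduction are derived from those numbers, not taken from any data set, and the function names are hypothetical.

```python
def implied_baseline(target_mt: float, reduction_fraction: float) -> float:
    """Baseline emissions implied by a target stated as a fractional cut."""
    return target_mt / (1.0 - reduction_fraction)


def implied_reduction(total_mt: float, baseline_mt: float) -> float:
    """Fractional cut relative to the baseline that a given total represents."""
    return 1.0 - total_mt / baseline_mt


# Figures as quoted in the talk.
NATIONAL_TARGET_MT = 445.0    # national 2030 goal
NATIONAL_REDUCTION = 0.40     # stated drop versus the 2005 baseline
PROVINCIAL_EXCESS_MT = 140.0  # approximate excess of summed provincial targets

# Derived values (assumptions of this sketch, not reported data).
baseline_2005_mt = implied_baseline(NATIONAL_TARGET_MT, NATIONAL_REDUCTION)
provincial_total_mt = NATIONAL_TARGET_MT + PROVINCIAL_EXCESS_MT
provincial_reduction = implied_reduction(provincial_total_mt, baseline_2005_mt)

print(f"Implied 2005 baseline: {baseline_2005_mt:.0f} Mt")
print(f"Sum of provincial targets: {provincial_total_mt:.0f} Mt")
print(f"Combined provincial ambition: {provincial_reduction:.1%} below 2005")
```

Under these assumptions the provinces together are aiming at roughly a 21% cut from the implied 2005 level, about half of the 40% national ambition, which is the gap the speaker is pointing at.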
So the responsibility here is distributed across those different parts, and we can do the same kind of analysis for a state and its cities, or even a city and the corporates that are in that city.

One of the big things that we are trying to do is map actions and plans to actual emissions and actual targets. Are your action plans actually sufficient to reach the targets that you've set? It is difficult because, as bad as emissions and targets data are in terms of data structures and being able to extract them, these actions and plans are almost entirely in PDFs. They almost all have lots of green leaves on them and happy families walking through a field, and once you actually get to the part where they're going to reduce traffic by 25% by 2035, it's very hard to get that information out.

Another thing that's important to us is easier contributions. Right now we use a GitHub flow for getting data in. We want to make this process more like a Wikipedia for climate data: making our API easier for folks to submit to as well as read from, and supporting file uploads in our system, so being able to just say, I've got a file, here it is.

Finally, we are doing some piloting of verifiable credentials and DIDs for bringing in trusted data from the internet. It is still an ongoing process, but we're hoping to work with others in this area on making standards around this.

One of the things that we're pretty excited about: if you have been staring at 200-page PDFs full of jargon, realizing that there is a magic machine that can read that entire report and tell you what is actually interesting in it is pretty astounding.
So we've been really interested in large language models for exactly that, extracting data from PDFs. The UNFCCC, for example, has over 3,000 PDFs collected for that global stocktake in 2023, all of which are hundreds of pages, almost impossible for anyone to summarize altogether. We think that LLMs are going to make it possible for us to extract the actual data out of those long documents, make it comparable and measurable, and get trends out. So we're working with a couple of different LLMs, and also looking at training our own specifically for this problem.

Speaking of LLMs, they're also very interesting for data harmonization: being able to do automated ETL, with automated transformation of data, matching fields, aggregating, and disaggregating. We're seeing some good performance for this. However, it still requires a human in the loop, and we'd like to see if we could get it to an automated process. So again, the previous one was for unstructured data in PDFs.
This is for structured data in weird formats.

One of the things when you provide a RESTful API is that there's always one more API requirement that people have for doing the kind of queries that they need. We are planning a GraphQL interface to make it easier for folks to do the kind of structured ad hoc queries that get you the information you need. So if you want to find out what the emissions of farms in British Columbia between 2010 and 2020 were, we can provide that.

Lastly, part of what we've learned from this work on OpenClimate is how the distribution of data coverage really changes as you move up and down this geographical hierarchy. Obviously, all the countries in our system have emissions information, and most of them have targets information. Regions and provinces are okay: not bad, but not great, for worldwide regions and provinces. It's when we get down to cities that things start to look really bad. Only about 5% of the cities that we know of have emissions data that they're tracking, where they have published emissions reports.

We started looking into that, because that seems like a really low number. It turns out that it's really difficult for these cities, without the force of a federal government behind them, to get the necessary upstream data to actually make the calculations. There are great tools for corporate actors, commercial software for building their greenhouse gas inventories. There aren't a lot of good tools for building them for cities. The quote that we got from one city was that they put about eight months of work for a five-person team into doing their one and only inventory in 2018, and they were like, never again.
That was far too much. So we are putting a lot of expectation on cities to provide not only policy but data for climate action, but we're not giving them the tools to actually complete that measurement. That's why we are looking next at open-source solutions for cities to help them do those inventories. OpenClimate for Cities is a next project for us, where we are hoping to provide carbon accounting software: being able to put together those inventories, pull them from data sources, maybe international data sources, maybe local data sources, and bring that 40 person-months down to, say, one or two person-months. We also think that there is a real role for AI to play in making that effort easier. We are still in the early prototype stage for this cities work, but we are really excited about the opportunity to apply open-source software to this really critical part of the climate crisis.

That was a lot. I whipped through it. openclimate.network is our URL. Don't fall for any of the imitators; you can get a lot of open climates if you search on the web. I'm evan@openearth.org, and if you want to follow me on Mastodon, you can get me at evan@mastodon.openearth.dev. There we go.

Actually, I'm not sure how I am on time. Sorry, I don't mean to cut the questions, but I think I have until 12:30? So, yeah, 12:35. Yeah, I've got five. Great. Cool. Let's do it.

Super good question. Yeah, so what we do is we tag all of our data sets with the stated methodology of that original data set. So if it comes from a machine learning model, we note that. If we are comparing, say, all greenhouse gases or only carbon dioxide, we tag it that way. So for everything that we bring in, we are tagging methodologies, and that way we can say, oh, these two don't compare, because this one excludes methane.
This one has all greenhouse gases. Yeah. That makes sense. Yeah, so it's not perfect, right, because we can get down to levels of difference that are too fine for us to tag. But we kind of leave that up to the client to say, okay, I trust PRIMAP and I trust UNFCCC, and I can see that they're close, but I understand why they have differences. Good. Cool. Good question.

Any others? Yeah. Yeah, that's a really good question. So there are a couple of ways that's handled in climate data tracking. One is that the different actors split up responsibility. Sometimes it's double-booked, so you'll see the same emissions booked at one port and at the other port. There are also mechanisms to say, these are emissions that my city or my company is responsible for, and these are ones that our suppliers are responsible for, and we put those in a different part of our accounting process. It's called scope 1, scope 2, and scope 3 for emissions accounting. Yeah. So we're just delivering the data. We're not going all the way down to the... Are you talking about cities, or overall? Overall. Yeah, so we are not providing enough information for, say, one company to calculate its scope 3 based on what its suppliers have, because then you actually have to break it down based on your relative use: how much of their emissions are you responsible for, etc. Since we don't have that internal information, we can't actually do that.

Yeah. Yeah, so that's UGIH. I think that's the one you're thinking about. The UGIH is the capacity-building hub for cities. Yeah, that's us. Different talk, interesting project.

I think I'm at time. Are there any other questions? Yeah. No, but I'm going to make a note right now, because it sounds interesting. What does it do? Cool. Duly noted. Sounds great.
So there's one thing I want to say before I go. I don't know how many people in the room feel the same way I do, but I used to feel really panicked about climate change, and one of the most satisfying things for me about doing the work that I do now is that I know what my part is in this emergency. I know what I have to do: what I have to do is write this code and bring this data together. And for anyone who's worried about climate change, concerned about climate change, and is wondering, what do I need to do: if you have open-source experience, software development skills, project management skills, community skills, there are ways to make a contribution right now, for us, for OS-Climate, for the Green Software Foundation. There are dozens of open-source organizations that are making efforts on climate change, and it really helps you sleep at night, if it's the kind of thing that keeps you up. All right. Thank you very much.