Hello everybody. I hope you all can hear me. This is Shweta and I am dialing in from India, so it's quite early here. I'm here from Semantic Climate to talk about how we've been building a climate knowledge explorer using Wikimedia tools. A little bit about me: I am a volunteer developer and program manager at Semantic Climate, and I've been working with Semantic Climate for close to three years now. What we've been doing at Semantic Climate is building a climate knowledge explorer, liberating climate-related facts from scientific literature. In the next 15 minutes I want to talk about what we do at Semantic Climate, how we integrate Wikimedia tools like Wikidata and Wikibase into our pipeline, and how we work as an open science project. I'd also like to hint a little bit at our future events. With that, let's get started.

The information to save the world from the climate crisis is already available in these amazing reports that the UN puts out. We are all aware of the Intergovernmental Panel on Climate Change, the IPCC, and they have several sets of reports. For example, they've got Working Group 1, which covers the physical effects of climate change on Earth; Working Group 2, which covers the effects of climate change on natural and human systems; and Working Group 3, which is on mitigation. Similarly, there are three special reports: one on climate change and land, one on global warming, and finally a special report on the ocean and cryosphere. Very recently, the IPCC put out what's called the synthesis report, which basically synthesizes all the key findings from the three working group reports, WG1, WG2 and WG3, and the special reports. I think the UN Secretary-General called the synthesis report the survival guide for humanity.

But the problem is that all of this is a lot of information. These reports are really huge: there are seven reports, each with multiple chapters, running to thousands of pages, full of technical terms, and written in a jargon-filled language that you and I may not be able to immediately understand. So climate information is actually rendered inaccessible to most of us. And worse, it's only available as PDFs for the most part. So how do we get the information out of these locked-up PDFs and make these climate facts available to everybody? That's where Semantic Climate comes into the picture. With the help of Wikimedia, Semantic Climate is making climate information more accessible.

In the next set of slides, I'm going to describe the pipeline of our project. Here it goes. Like I said, we have different sets of climate reports, usually available as PDFs. Our tool called pyami converts these PDFs into HTML. HTML is much easier to work with, at least for the software and also for humans. Once we have that HTML, and it takes a lot of effort to get there, we can do a lot of interesting things. So I, along with some of my team members and Daniel Mietchen, I don't know if he's here... oh, he's here, hello! So we have developed a tool called docanalysis that extracts textual information with the help of what we call dictionaries. I'll talk about dictionaries in the subsequent slides, but dictionaries are basically sets of terms that we think are important. It could be all the abbreviations mentioned in these climate reports, or it could be all the climate-related terms from the glossary that the IPCC gives us.
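Just to make the dictionary idea concrete, here is a minimal, purely illustrative Python sketch, not the actual docanalysis code; the terms and QIDs are placeholders that you would look up on Wikidata.

```python
# Purely illustrative sketch -- not the actual docanalysis implementation.
# A "dictionary" here is just a set of terms, each linked to a Wikidata QID.
CLIMATE_DICTIONARY = {
    "greenhouse gas": "Q000001",               # placeholder QID
    "carbon capture and storage": "Q000002",   # placeholder QID
}

def annotate(paragraph: str, dictionary: dict) -> dict:
    """Return the dictionary terms that occur in a paragraph, with their QIDs."""
    text = paragraph.lower()
    return {term: qid for term, qid in dictionary.items() if term in text}

para = ("Carbon capture and storage is one option for limiting "
        "greenhouse gas concentrations in the atmosphere.")

for term, qid in annotate(para, CLIMATE_DICTIONARY).items():
    # In the real pipeline the matched term in the HTML would be hyperlinked to its item.
    print(f"{term} -> https://www.wikidata.org/wiki/{qid}")
```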
So we can use that to search the literature, or to search the paragraphs in these IPCC reports, to index them and also to get occurrences. For example, if you want to find out which paragraphs mention a geographical area that you're interested in together with a specific climate term, our tools can easily do that for you. So, initially, through this pipeline, what we are doing is going from the dumb PDF to climate facts.

Before I go ahead and talk about what we do with these climate facts, I just want to take a quick detour to explain dictionaries. Dictionaries, like I said, are sets of terms with unique Wikidata identifiers. Let's take greenhouse gas, for example: it's got a QID. I'm sure we all know what Wikidata is, so I'm not going to talk more about it. More importantly, Wikidata gives us information about a specific term in different languages, and our dictionary, since it's linked to Wikidata, also carries that multilingual information. We use these data to annotate the paragraphs, and therefore, when we do the linking, we can actually display what a specific jargon term means in a language that the reader is comfortable with.

This is an example of the annotated paragraphs. As you can see, the term CCS here is hyperlinked, and if you click on it, it will take you to Wikidata. So if you do not know what CCS is, you can immediately see that it means carbon capture and storage, the process of capturing and storing waste carbon dioxide. So this is one of the use cases: linking into Wikidata and semantifying the paragraphs from the IPCC reports. The next thing, which I also briefly touched upon when I spoke about the pipeline, is that once we have this HTML, we can use both supervised and unsupervised methods to extract key phrases, correlations and co-occurrences, like I described in the previous slide.

So with all this useful information, what are we doing? Like I said, Semantic Climate is semantifying these climate facts. So what do we do with them? That's where Wikibase comes into the picture. Back in April this year, Egon and Lars Willighagen, along with Peter Murray-Rust, who is one of the co-founders of Semantic Climate, sat together to set up a Wikibase instance for us. I'm sure you all have had a chance to learn more about Wikibase in yesterday's talks. What we do is take these paragraphs, which we call statements or climate facts, and put them in our Wikibase instance. For example, take paragraph 2.2.2c: it's got its own QID, and in its statements we say, okay, this is an instance of a section, it's part of this bigger subsection, it mentions these keywords, it talks about this specific geographical location, and so on. And these climate facts, thanks to the IPCC, come with confidence levels that we can make use of and document on the item page as well.

With all this information curated in this open database, we could be doing a lot of interesting things: you could explore the climate facts in whatever way you want. Essentially, this becomes your climate knowledge explorer. Let's say, for example, you're interested in knowing which paragraphs from the IPCC reports mention a specific geographical area. Once we have all the information in our Wikibase, you can write a quick SPARQL query to do that.
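A rough sketch of what such a query could look like is below; the endpoint URL and the property ID are placeholders, since every Wikibase instance assigns its own P-numbers, so you would check our actual instance before running it.

```python
# Rough sketch only: the endpoint URL and the property ID are placeholders,
# because each Wikibase instance assigns its own identifiers.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://semanticclimate.example.org/query/sparql"  # placeholder endpoint

QUERY = """
SELECT ?paragraph ?paragraphLabel WHERE {
  ?paragraph wdt:P10 ?area .            # P10 = "mentions geographical area" (placeholder)
  ?area rdfs:label "Ladakh"@en .        # the area you are interested in
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for row in results["results"]["bindings"]:
    print(row["paragraphLabel"]["value"])
```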
And second, let's say you're interested in building a knowledge graph based on which paragraph references which other paragraph in the IPCC reports: you can do that, again, by writing a SPARQL query. In other words, what we are doing is transforming the way anybody would read the IPCC reports, and this can help them build newer connections that otherwise wouldn't have been possible with just plain PDF reports.

So this is what we've been doing with Semantic Climate. Now I would like to talk a bit more about how we work. We have been running a lot of hackathons, driven by all these amazing volunteers; our project is mainly run by volunteers. In the past, we've been part of hackathon PRJ, which we won; this happened at the University of Geneva. After this, we ran our own hackathon in Delhi; this was fully online, in September. And very recently, we ran our in-person hackathon in Delhi with about 30 people trying out our Wikibase instance, trying out our tools, and so on. We have more pictures in the subsequent slides. Finally, we were also part of FSCI recently, where we had about 10 people trying out our tools and testing them.

This was during hackathon PRJ, where we had people run our Google Colab notebooks that took them through the entire pipeline. Here we also had people who tried this out and translated the documentation into different languages; this was truly learning by doing. This was our Delhi event in September. And finally, this was our climate knowledge hunt in May 2023, where we had different breakout rooms and people tried running our Wikibase, writing SPARQL queries, and using our tools: docanalysis, pyami, pygetpapers, and so on.

These hackathons also produced a lot of volunteers for us. For example, Shivani and Yasin, who were part of our climate knowledge hunt in May, are now working with us on specific chapters. Shivani, for example, works closely in Ladakh and is interested in high mountain areas. Similarly, Yasin works on food security. They have been making YouTube videos to explain what their chapters mean, and they have been able to understand more about these chapters through both manual reading and with the help of our tools. So do check them out; I've put the link in the slide, and I'll share the slides soon as well.

So what's next? We would love to advance our technology and continue building the community. We are part of many other events in the future as well: we are part of a workshop that CODATA is organizing in September 2023, and similarly, we're part of Barcamp Open Science in Berlin, which I think is being hosted by Wikimedia Deutschland. And finally, I'll be at UN Data in November this year; this is happening in Geneva, and we would love to continue building our climate knowledge explorer with the help of the people we meet there as well. And maybe we could be running more hackathons with Wikimedia in the future. So yeah, please do check out our website. We are very active there; you can check out more about our events, test our tools, and try out our Wikibase instance.
And yeah, so please do come join us in building this climate knowledge explorer to make climate knowledge accessible to anybody. You can explore the content, for example by becoming a chapter champion, or you could improve our tools. We follow what's called an open notebook philosophy, so everything we do is put online as soon as it's done. Or you could be part of our community process: engage with us through the various events we are part of, or just test our tools, improve documentation, whatever, and we would love to have you on board.

This is the larger team. I would like to thank everybody whose pictures are put up here; we have had over 50 people work with us over the past three years, and I have not been able to include all of them. I would specifically like to thank Peter Murray-Rust and Gitanjali Yadav, who are the co-founders of Semantic Climate. And with that, I'd like to thank Wikimania for giving me this opportunity. Here are some useful links, and I'll share the slides.

Thank you so much for your time, Shweta. We actually have a question for you from the audience. The question is: do you have any ideas about how to enrich Wikidata based on the information you've presented today?

Thank you. Yes, we've done this before; let me go back. With the tool called docanalysis, what we were able to do is use semi-supervised searching to extract information from the scientific literature. We also work with scientific literature that's available as XML, and what we did with docanalysis, together with the work that I did with Daniel Mietchen, is that we were able to get information about the ethics approvals mentioned in the scientific literature, using the Wikidata-driven dictionary that we had created for organizations. So for all the papers we were able to analyze, we got the organization that had given ethics consent for that specific study. And, as you know, Wikidata has many scientific papers indexed as items as well, so what we were able to do is then add information about the ethics approvals to these papers. We could do the same with, let's say, all the chemicals mentioned in a paper: we could extract those from our pipeline and put them back into Wikidata. Another example: we also have a tool that gets the abbreviations from the reports automatically, there's a Python package that lets us do that, and we could be thinking about putting all the abbreviations we extract from these IPCC reports into Wikidata if they don't already exist. So yes, we have already been able to enrich Wikidata, and we would love to continue. I hope that answers the question.

I believe it did. Yeah, thank you so much. And with that, we're going to head to the break. Thank you for your presentation. Thank you so much.
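For anyone curious what the enrichment workflow described in the answer above could look like in practice, here is a minimal sketch of one possible route, assuming the QuickStatements format is used; the property ID and all QIDs are placeholders that a human would fill in and review before anything is uploaded.

```python
# Minimal sketch of one possible enrichment route: generate QuickStatements
# (tab-separated) lines for human review before upload. The property ID and
# all QIDs below are placeholders, not real identifiers.
extracted_facts = [
    # (paper item, organisation item that granted ethics approval) -- placeholders
    ("Q00000001", "Q00000101"),
    ("Q00000002", "Q00000102"),
]

ETHICS_PROPERTY = "P0000"  # placeholder: the agreed-upon Wikidata property

with open("wikidata_enrichment.tsv", "w", encoding="utf-8") as out:
    for paper_qid, org_qid in extracted_facts:
        out.write(f"{paper_qid}\t{ETHICS_PROPERTY}\t{org_qid}\n")

# The resulting file can be pasted into the QuickStatements tool after review.
```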