I'm Lea, here's Mohammed, and we're going to introduce you to Wikidata today. Yes, hi everyone. During the talk, if you have a question, feel free to ask it in the chat, and we'll try to answer it at the end. So let's dive straight in.
What is Wikidata? Wikidata is a free knowledge base, built on facts and references, that anyone can edit and reuse. It is part of the Wikimedia projects, and like all of its sister projects, Wikidata is multilingual and has no language barriers. Data in Wikidata is released under the CC0 license. That means Wikidata's data is in the public domain and no exclusive intellectual property rights apply to it. Wikidata is not a primary source of information. It only aggregates structured data that is already available, some of which is linked to other databases. So it is not meant to be a place for original research. Wikidata is made for humans and machines and is available for everyone to use, whether on other Wikimedia projects or outside of them. Next slide.
So what is in Wikidata? Wikidata was launched some eight years ago. It was originally created to solve the problem of the unstructured plain-text format that information in Wikipedia is rendered in, and also to provide a central storage location where all of the different language Wikipedias can connect and talk to each other. Since then, Wikidata has outgrown its intended purpose and has become so big and successful that it is not only the most edited Wikimedia project, but its data is now used more outside of the Wikimedia projects than within them. There are more than 25,000 active editors, meaning people who make at least one edit every month. Wikidata is used across 800-plus Wikimedia projects in more than 300 languages. And it's interesting to note that the largest category of Wikidata's items is scholarly items, comprising about 30% of the whole.
Next slide. So far, people and bots have made more than 1.3 billion edits to Wikidata and created more than 91 million items. This map you see here is a visual impression of the geolocated items currently existing on Wikidata. The bright areas are items that have a coordinate location property added as a statement. Next slide.
So Wikidata has a vision. And what is this vision? Wikidata's vision is to give more people more access to more knowledge. Wikidata gives access to information regardless of the language that people speak. Because Wikidata is multilingual, it accepts translations of the so-called Q numbers into different languages. In so doing, Wikidata helps us support the smaller Wikimedia projects with data, by helping them benefit from all of the work that the bigger projects are doing. Applications and projects outside of Wikimedia are also able to benefit from the rich data sets in Wikidata. So in a nutshell, Wikidata can be thought of as an online repository of structured data that anyone can edit and reuse. Next slide.
Okay. Now, how is Wikidata connected to Wikipedia and to the other Wikimedia projects? Among other things, Wikidata can assist the projects with more easily maintainable infoboxes. The table at the right corner of this article on Wikipedia is called an infobox, which I'm sure you've seen before. Wikipedia is able to retrieve content from Wikidata into those infoboxes. And smaller-language Wikipedias, like, you know, the [unclear] Wikipedia or the Welsh Wikipedia, really leverage Wikidata to seed their content. This is helpful because it reduces the editing workload for volunteers. Next slide.
So what should you expect to see on a typical Wikidata item? Wikidata expresses relationships in the form of triples that use items, whose IDs start with Q, and properties, whose IDs start with P. An item will typically be made up of at least one statement.
So in this example you see on the screen, we have two statements about an entity called Douglas Adams. The first statement is: Douglas Adams was educated at (P69) St John's College. This statement is qualified by several properties: the academic major, the academic degree, the start time, and the end time. Qualifiers add more meaning to statements. Wikidata records not just statements, but also their sources. As you can see here, this helps us reflect the notion of verifiability on the project. So the statement that Douglas Adams was educated at St John's College has two references that point to the source of that information. The second statement, Douglas Adams (Q42) was educated at (P69) Brentwood School, only has the qualifiers start time and end time, and it has no references. So a single statement consists of a property and a value, with or without references and with or without qualifiers. Next slide.
So a typical Wikidata item looks like this. You can edit it by clicking on the edit button. There is a pen symbol with the word "edit" next to it. As you can see, each item has a unique ID, which is Q followed by some number. In this case, the item Douglas Adams has the ID Q42. And when you look at the top, there's what we call the term box. It contains the label in different languages, and a description of the item, which is a short phrase telling us what the item represents. It says here in English that Douglas Adams is an English writer and humorist. Then there's the alias next to the description, which, aside from the label, tells us what else the item could be known by. Next slide.
So creating a new item is as simple as going to any page on Wikidata and clicking on "Create a new Item". Once you click on it, you get a form that asks for a label, a description, and aliases. Q IDs are assigned automatically. Next slide. Next slide.
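To make the item and statement structure from the Douglas Adams example concrete, here is a minimal Python sketch. It is not Wikidata's actual storage or JSON format: the class names `Item` and `Statement`, their field names, the alias, the qualifier values and the placeholder references are all assumptions chosen for illustration. Only Q42, P69, the two schools, the description and the qualifier names come from the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    """One property-value pair, plus optional qualifiers and references."""
    property_id: str                                        # e.g. "P69" (educated at)
    value: str                                              # kept as a plain label here
    qualifiers: Dict[str, str] = field(default_factory=dict)
    references: List[str] = field(default_factory=list)


@dataclass
class Item:
    """An item: a Q ID, the term box fields, and a list of statements."""
    qid: str
    label: str
    description: str
    aliases: List[str] = field(default_factory=list)
    statements: List[Statement] = field(default_factory=list)

    def unreferenced(self) -> List[Statement]:
        """Return the statements that have no reference attached."""
        return [s for s in self.statements if not s.references]


douglas_adams = Item(
    qid="Q42",
    label="Douglas Adams",
    description="English writer and humorist",
    aliases=["Douglas Noel Adams"],  # illustrative alias
    statements=[
        Statement(
            property_id="P69",
            value="St John's College",
            qualifiers={  # illustrative values, not copied from the slide
                "academic major": "English literature",
                "academic degree": "Bachelor of Arts",
                "start time": "1971",
                "end time": "1974",
            },
            references=["placeholder reference 1", "placeholder reference 2"],
        ),
        Statement(
            property_id="P69",
            value="Brentwood School",
            qualifiers={"start time": "1959", "end time": "1970"},
        ),
    ],
)

# List the statements that still need a source.
for statement in douglas_adams.unreferenced():
    print(statement.property_id, statement.value)
```

Running the sketch prints `P69 Brentwood School`, the one statement in the example that has qualifiers but no references.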
Next slide, please. Right. So there are tools that allow us to edit Wikidata more efficiently and make bulk edits, such as QuickStatements and OpenRefine. Please go to the previous slide. Okay. Yeah. Right. So yeah, QuickStatements and OpenRefine allow us to make automated edits and changes to Wikidata. Other tools allow us to visualize Wikidata's data, and some of them enhance the user interface of Wikidata. These include scripts that editors can install and gadgets that can be enabled in your preferences. Next slide.
So far, Mohammed told you about how we describe concepts in Wikidata, which is what we've been doing since the first years of the project. But in 2018, we also started storing a new type of information in Wikidata: lexicographical data, which is basically information about words and phrases in all kinds of languages. On the left you see the data model, which is a bit complex, so I'm not going to go into much detail now, but we can talk about this later. And you can see an example on the right, where we describe the word Luftballon in German. We indicate the language, the lexical category and all kinds of information that is no longer about the object but about the word itself, such as how it's composed of two words, as we like to do in German. So again, if you want to know more about this, you can have a look at the lexicographical data pages on Wikidata, or we can talk about it together later in the questions.
So Wikidata doesn't come alone. It comes with a bunch of tools, some developed by the Wikidata development team and some by the community themselves, in order to do things more efficiently. That can be, for example, adding data, and some of those tools have already been mentioned by Mohammed.
That can also be matching data with other databases, querying the data, or reusing the data. There are also a bunch of tools for watching the data and its quality, watching what edits have been made recently, and so on. You can find the page called Wikidata:Tools on Wikidata to discover plenty of these tools, and you can, of course, create your own.
So we mentioned that the goal of Wikidata is to be reused by everyone, but you may wonder who's actually reusing the data. Well, the first reusers of Wikidata's data are actually the Wikidata community itself, the Wikidata editors, because all of these items are connected. One item can be linked from another, the content of one item can be reused on another, and so on. Then there are the Wikimedia projects, such as Wikipedia, but not only: Wikimedia Commons, Wikisource, almost all of the Wikimedia projects at this point reuse part of the data that comes from Wikidata. And then we have companies, from the biggest to the smallest: because the data is under CC0, everyone can just reuse the content that they need. We have, of course, public institutions, such as museums, libraries, and so on. We also have journalists, for example data journalists. We have scientists and researchers, and probably many more. The thing is that we don't necessarily know who's reusing the data, because it's out there in the open, and there are probably many uses that we can't even imagine. So if you're reusing Wikidata, or if you would like to use Wikidata's data, let us know, because we're always interested to discover more.
Now, the question is how can one reuse Wikidata? I'm going to present very quickly one of the most popular ways to query the data. I'm not going to get into details right now, because there will be a workshop about the query service at the conference in two days, on day three. Feel free to go there and discover more about how to use it.
The query service is basically a SPARQL endpoint, SPARQL being a query language with which you can ask questions of Wikidata and get lists or visualizations as results. For example, here's a map of the airports of the world named after a person, where the color of the dot represents the gender of the person. Or you can make a list of country flags that include a sun, because if the data is properly modeled in Wikidata, you're able to describe the different elements that compose a country flag. Or you can have this bubble chart with the occupations of accused witches, because why not? That's the kind of data we have in Wikidata.
Now there are other ways, of course, to access the data. I'm not going to get into details right now, but if you want to talk more about this, you can for example join the Wikidata meetups that are going to happen tomorrow. We have dumps of the data, where you can download part or all of the data in a file. We have a bunch of APIs to access the data directly from your program. And on the Wikimedia projects specifically, the community developed a bunch of templates that use Wikidata's data through Lua.
And now for something a bit different: Wikibase. You may have heard of it, and you may even have wondered what the difference is between Wikidata and Wikibase. Well, Wikibase is basically the software powering Wikidata, and more precisely the MediaWiki extension that turns MediaWiki into a database. Wikibase was started to power Wikidata, but it has also started developing on its own. Wikidata is still, for now, the biggest existing Wikibase instance, but people can also install Wikibase directly on their own server and basically create their own little personal or public Wikidata.
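Coming back to the query service described above, the following is a minimal Python sketch of how a program might ask it a question, here "where was Douglas Adams (Q42) educated (P69)?". The helper names `build_url`, `labels` and `run`, the User-Agent string, and the hand-written `SAMPLE` response are assumptions made for illustration. `SAMPLE` is not real output from the service; it is abbreviated and only mimics the general shape of SPARQL JSON results.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Ask for every value of "educated at" (P69) on the item Q42, with English labels.
QUERY = """
SELECT ?school ?schoolLabel WHERE {
  wd:Q42 wdt:P69 ?school .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def build_url(query: str) -> str:
    """Encode the query as a GET URL that requests JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return WDQS_ENDPOINT + "?" + params


def labels(response: dict, variable: str) -> list:
    """Pull the values of one variable out of a SPARQL JSON result."""
    rows = response["results"]["bindings"]
    return [row[variable]["value"] for row in rows if variable in row]


def run(query: str) -> dict:
    """Send the query over the network (not called in this sketch)."""
    request = urllib.request.Request(
        build_url(query),
        headers={"User-Agent": "wikidata-intro-example/0.1 (illustrative)"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Hand-written, abbreviated stand-in for a response, used to show the parsing step.
SAMPLE = {
    "head": {"vars": ["school", "schoolLabel"]},
    "results": {
        "bindings": [
            {"schoolLabel": {"type": "literal", "value": "St John's College"}},
            {"schoolLabel": {"type": "literal", "value": "Brentwood School"}},
        ]
    },
}

print(labels(SAMPLE, "schoolLabel"))
```

Calling `run(QUERY)` would perform the actual request. It is left out of the sketch so that the example runs without network access.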
And the development is still ongoing. There are all kinds of super exciting features coming up soon, for example the ability to better connect Wikidata and your own instance of Wikibase, so that you can reuse data that is already in Wikidata and connect it to the data that you have in your own Wikibase.
So if you're interested in Wikidata and want to know more, there are a bunch of pages that you can find. There is a help portal. The project chat is the main discussion page on the wiki, where you can interact with the other editors, the community. It's super important to get in touch with them if you want to get started with Wikidata. We also have a mailing list. We have a newsletter called the weekly summary, which you can find on the wiki, and if you subscribe to the mailing list you will also receive it. Then we have some accounts on social media: on Twitter, there is a Facebook group, there is a Telegram group that is linked from the project chat, and there is also an IRC channel. So you can basically find people from the Wikidata community everywhere.
So we are approaching the end of this session, but it's not over. We have more Wikidata-related sessions at the C3 in the WikiPaka. For example, tomorrow you're going to get an introduction to Wikidata specifically for journalists, and especially data journalists. Then in the afternoon we're going to have two Wikidata meetups: the first one is going to be in German and the second one in English, so depending on your preferred language you can attend one or the other, or both. And on day 3, as I mentioned before, we're going to have a workshop to learn how to query Wikidata's data with SPARQL. So feel free to have a look and check them out in the main schedule of the WikiPaka.
Thank you very much for attending this session. These are our contact details if you want to contact us. And of course you can now ask questions, as we mentioned, in the chat or with the hashtag, and we will be very happy to answer all of your questions right now.
Thank you for your input and the overview of Wikidata. A few questions have already been answered by Joel in the IRC channel. One was about the big dump of scholarly data, what scholarly data is, and how it came to be in Wikidata. But there is one more question from the chat right now. Till asked: can I add new types of data that are not yet tracked in Wikidata?
I'm wondering what you mean exactly by type of data. Maybe you can give a bit more detail, because that can mean a lot of things. The data model of Wikidata is very flexible and absolutely not set in stone. Every week the community comes up with new ways to describe things. Sometimes we realize that there is an area of the world that we completely forgot to cover, and then we create new properties to describe, for example, a certain type of building or object or philosophical concept that we didn't describe yet. So this is always in motion. When it comes to what we actually call data types, which is for example a string of text or a date or a picture, and we have all kinds of data types like this, it is a bit more complicated. Overall it's quite rare that we add a new data type, and it needs a strong use case before we add it to the software. I hope that answered your question, and if it didn't, feel free to ask again.
Yeah, we've got some feedback. The example Till meant was this: there's an organization or project called Parliament Watch in Germany. There was a talk earlier today where they tried to track, scrape and analyze parliamentary protocols, and one big issue they had was with structural data about all the members of parliament, how they are organized, and so on. And if I remember correctly, there actually was a project that tried to include the structural data of members of parliament in Wikidata, if I'm not mistaken.
Absolutely. It's a WikiProject that is called something like "all politicians"; I don't remember the exact name right now. But indeed some people are already working on members of parliament and political people in general, so it's very likely that there is already a way to structure the data. The best way is to contact the people directly involved in this WikiProject. WikiProjects, by the way, are where people who have a specific topic of interest gather and discuss specific questions about that topic. So have a look at this project about politics and see if anything is missing. Generally, Wikidata definitely welcomes information about politicians, members of parliament, this kind of thing. What we do not store, however, is documents, for example in this case the reports. Those belong elsewhere, maybe on Wikimedia Commons if the license allows it. But on Wikidata we'll be happy to store the metadata about them.
All right, Joel just posted the link to the WikiProject "every politician", so if anybody looks for "every politician" on Wikidata they will find the project. So basically the bottom line is that pretty much anything is possible with Wikidata, right?
Yeah, thank you Jules, and hi. Almost everything. On Wikidata, just like on Wikipedia, we still have some criteria to define what can get into Wikidata and what cannot, because we're aware that this knowledge base needs to stay quite general and cannot contain absolutely everything. For example, the community decided a while ago that they would not create an item for each human living or who used to live on Earth. That's just not possible. So there are some notability criteria that you can find in the help pages, and I would say that the level of how fine-grained the data should be has to be discussed with the community. The good thing about having Wikibase also available separately from Wikidata is that if some people want to work on a topic where they have information that is very, very specific and would maybe not fit the scope of Wikidata, they can create their own Wikibase, and then they can connect the content with what is already in Wikidata. So altogether, in this Wikibase ecosystem, yes, pretty much everything is possible.
Well, the future is certainly here, at least with Wikidata. Thank you again, Lea and Mohammed, for your insightful introduction to Wikidata, and we're looking forward to joining you in your efforts. Thanks for your presentation.
Thank you. See you soon.