Okay, good afternoon to those who are still in the room. Welcome to the Wikidata workshop. Please note that on the program there is an Etherpad link to the items. For this session we have Associate Professor Toby Hudson, otherwise known by his username, 99of9, and Margaret Donald, both talking about aspects of working with Wikidata. So this is as much a workshop as a presentation, and it's very useful. I would commend both Toby and Margaret as very good introducers to the issues that come with working with Wikidata. When I was at the London Wikimania in 2014, there were small workshops. As of a few minutes ago, there were 100,418,054 separate items. So between 2014 and now, there have been a few items added. I would like to welcome Toby to come up and do his presentation. Thank you.

Hello, and welcome to the Wikidata session. I'm super pleased that we're continuing on after a bunch of, well, education talks but also linguistic talks, because I think you're going to see, hopefully, that the linguistic aspect of Wikidata is super important. When I was thinking about the audience we might have today, I realized I don't often present Wikidata to audiences who have so many languages, and I think that's a key and crucial thing that we're going to make use of today. So thank you very much for coming to Australia, and thank you very much for speaking my language, because I'm unfortunately only monolingual, but I would love to connect with your languages, and I hope I'll show you how. My wiki involvement has just turned 18: I've been in the Wikimedia projects for 18 years and have moved around a bit, with Wikipedia and Commons being particular interests, but now also Wikidata. So I'm going to tell you mostly about Wikidata, and also about a browser extension called Entity Explosion, which I hope will take your use of Wikidata up to another level.
But I'll get to that, so let me just start moving through. This presentation is partly based on a GLAM module, which I think Mike was involved with and which Wikimedia Australia has been supporting, so thank you for that. I just wanted to make sure we were all on the same page in terms of what Wikidata is and how it works, just in case some of us are new to it. Okay, so hopefully you can see that on Zoom. All right, so this is the rough outline of the introduction to Wikidata. It's simplified a bit. We're going to talk about languages, linked open data, QIDs and triples, and a bunch of stuff where you might have heard the words but not understood them before. Here's what Wikidata looks like if you jump in on a page and look at an article, or rather an item. The item I've chosen is a genus of insects. I chose it because these insects have a little logo on their wing, and the logo is a little bit like the Entity Explosion one. But the first thing you'll see on this page is a languages section, because Wikidata is fully multilingual. You can see some languages on my page, but you might see different languages on your page, and in fact we could click "All entered languages" and see it in all entered languages. Having all of us here together from across Asia is a crucial opportunity, because there are so many languages represented in the room, but those languages are not well represented on Wikidata. In fact, we looked it up the other day: Spanish, which you would think should be pretty well represented, has only half of the properties translated. So you can see that there's a long way to go in terms of language translation. And when I say properties, I'm actually talking about one of the core ingredients of Wikidata, so I'll show you that in a second. So, given that languages are so important and what shows up for you actually depends on what language you speak,
I'd love you, as a first activity, to set up your user page. If you've not done this on Wikidata, all I need you to do is go to wikidata.org, go to your user page (I know you're pretty familiar with user pages), and then put this kind of string on your page. That's all we need. It's a bit of a complicated template string, but it basically tells Wikidata which languages you want to see, and that's pretty important to what you will see. Do try to copy this onto your page, but I'll just briefly explain what it's doing. It's making a box on the side of your page, telling other people what languages you speak or what languages you want to see. And then you need these language codes, two-letter or three-letter codes (well, actually they can be even longer), that represent the languages you speak. So you'll recognize en for English, and zh is one of the Chinese codes, though there was an interesting discussion about that this morning, so thank you for that. If anyone's having trouble putting this or something like it on your user page, stick your hand up and I think someone will be there to help out. So, put your own languages; don't all just put Chinese. Unfortunately my only spoken language is English, but I put other languages in this list, with a note that I don't know them, to make sure I see them. And you're welcome to put more on than you can see here. So, no one had a problem with that? Hands up who's got this on their user page, or something like this. Who doesn't have this on their user page? Okay, hands up who's not listening. That's fine. But I would love you all to do this, even if you have not edited Wikidata before, because I think you'll see that it's quite important for you. Okay, so this is the next thing you'll see further down a Wikidata page.
You'll see a list of what we call statements. I'll read them out to you just in case it's too small. There's a taxon name, and that's a string. There's a taxon rank, and it says "genus" in blue; because it's blue, that means we've linked this item to another item called genus. You'll see what links mean later. Parent taxon is linked to the biological parent of this particular genus, and so on. We have Commons category, where you can find pictures of this, and main category, where you can find a category of all these animals. That's what you see on the page. What does it do? What's the point of this? Well, we heard cheers yesterday for linked open data, so I understand that many of you are on board with linked open data, but just to make sure we're connected: here's one of our items, the National Portrait Gallery. I think it's the US national one, so I don't want to pick on any particular country, but the point is each of those statements that we showed on the previous page is going somewhere. It's linking off to other items, and we've essentially established a network of concepts: a network of cross-domain concepts across all of the concepts that Wikipedia covers and more, a hundred million as Tom said. You'll see why this network matters. But let me just briefly show you what the Wikidata code for each of those links is. Let's say you wanted to represent the fact that Sydney is the capital of New South Wales. That's a fact in the English language. We'd like to translate it into a fact in Wikidata, and the way we do that is to make a triple. The identifier for Sydney is Q3130. The property for "capital of" is P1376, and you notice it gets a different prefix: the P means a property. It's always the middle one of these triples, and in this case the triple has a value which is another item.
So the triple connects Sydney with the New South Wales item. When we're entering it, we put that triple on the page of Sydney, and so the page of Sydney is now connected to the page of New South Wales. Or you could say Sydney is named after some guy. So we've turned the English way of saying those facts into some numbers, these triples. The properties I mentioned already, and there are about 10,000 of them. If your language does not know what P1376 is, then you won't be able to tell that Sydney is the capital of New South Wales; you'll just know that Sydney is somehow related to New South Wales. So the central part, these properties, are crucial to translate. And that's why the next activity is actually to find out how many of these properties are translated into your language. So if you go to w.wiki slash (it's a short link to get you somewhere else), and after the slash goes five, Z, B, five. This is case sensitive, so you need the Z and the B to be lower case. If you can all just go there. Let me just click on it as well, and I hope it will share. Oh, right. Okay. Yes, thank you. This is good. So it's taken us to a search page on Wikidata. I can't point, but if you look in the search field it says minus haslabel colon tl, that is, -haslabel:tl. So I chose the language Tagalog to do this search in. If you've got a different language, could you change the tl to the short code for your language, so the two-letter or three-letter code, and just type that in. And then you'll see which properties are not translated in your language. So here are some examples. Depending on what field you edit, you may recognize some of these things in English or not.
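As an editorial aside, the triple and the missing-label problem described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the IDs Q3130 and P1376 come from the talk, Q3224 is assumed here to be the New South Wales item, the label table is invented for illustration, and the search URL assumes that properties live in namespace 120 and that the `-haslabel:` keyword behaves as shown on the slide.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Triple:
    subject: str  # Q-id of the item the statement sits on
    prop: str     # P-id of the property (always the middle element)
    value: str    # here, the Q-id of another item


# Illustrative label table, NOT real Wikidata content.
# The property deliberately has no Tagalog ("tl") label.
LABELS = {
    "Q3130": {"en": "Sydney", "tl": "Sydney"},
    "Q3224": {"en": "New South Wales", "tl": "New South Wales"},
    "P1376": {"en": "capital of"},
}


def label(entity_id: str, lang: str) -> str:
    """Return the label in `lang`, or the bare ID when no label exists."""
    return LABELS.get(entity_id, {}).get(lang, entity_id)


def render(triple: Triple, lang: str) -> str:
    """Show a triple the way a reader of `lang` would see it."""
    return " | ".join(label(x, lang) for x in (triple.subject, triple.prop, triple.value))


def missing_label_search_url(lang: str) -> str:
    """Build a search URL for properties lacking a label in `lang`.

    Assumption: namespace 120 is the Property namespace on wikidata.org.
    """
    query = {"search": f"-haslabel:{lang}", "ns120": "1"}
    return "https://www.wikidata.org/w/index.php?" + urlencode(query)


sydney = Triple("Q3130", "P1376", "Q3224")
print(render(sydney, "en"))
print(render(sydney, "tl"))
print(missing_label_search_url("tl"))
```

In this toy version the Tagalog reader sees both place names but only a bare property ID in the middle, which is the situation the activity asks people to fix by adding labels.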
The first one says "series ordinal". That's a weird English term, but the description says "position of an item in its parent series", so are you first or second or third, one or two or three. The value will be one or two or three for this thing; it's asking what order you are in. Since I've chosen Tagalog, if I knew Tagalog and could translate "series ordinal" into my language, then the first thing I'd encourage you to do is to go to that property: click on "series ordinal". Since your user page already has your language on it, you'll see a blank where the label should be, because there is no label yet for this property in your language. If you can put a new label on Wikidata properties, that will be super important for the development of Wikidata in your language. It is possible for a single person to do an entire language, because there are only 10,000; that sounds big, but it's doable. So if you're looking for a mission, that would be a great mission to take up. Does anyone need help with that? Or first of all, who's managed to change a label on a property? No, let me start right at the start: who managed to search for their language and see some missing properties? Try that if you haven't managed to. How do I go back to the slides? Sorry, I'm trying to get the link back. Apologies, I shouldn't have clicked away from the link. I mean, you could type it all out: minus haslabel... but thank you. I didn't even see what you did, but there you go. This is great. So there's the link: w.wiki slash five, Z, B, five. Go there. It'll be a search page. Instead of tl, put your own language code. And then you'll see the list of properties that are missing, which we really need you for.
I'll just dwell here for a minute. If you do need help, stick your hand up; I know there are helpers keen to help with this. Any help or comment? That would be amazing. Yeah, so you're right, English is an exception. It's one of the only languages that has every property translated. Even other major languages, massively spoken across the world, are incomplete: Spanish, which we looked up with Heidi the other day, is only about half done. Okay, I'm going to press on, but if you do get stuck there, do stick your hand up; I'm sure someone will be keen to help. So what's the point of all of this? What do we do with Wikidata? If you're a Wikipedian, you will have seen Wikidata used already. Here are some places you will have seen it. You may not know that it's using Wikidata, but see the language links on the top right. All of those connections to other languages exist because a single Wikidata item has listed all of those Wikipedia articles as the same concept, so they're all joined together. The next one you might not have noticed. This is the population of the suburb we're in, Ultimo. On many Wikipedias, including the English Wikipedia, when the Australian census came out every five years, the editors had to go through tens of thousands of articles and add in the new population. But Wikimedia Australia recently funded a project to automate that if the data is on Wikidata. So this population is now drawn from Wikidata. We can upload it automatically to Wikidata through a certain process, and once it's there we can draw it immediately onto Wikipedia without all of that editing. That was a huge advance. Thank you, Wikimedia Australia. Thank you, Maya Williams, who did the work there. That is a really useful link.
And I know this is just one of many ways that Wikidata can be used in infoboxes, but it's a pretty critical one for Australians, so I wanted to show that off. The other place you'll see Wikidata used is in the taxonbar or the authority control box down the bottom of most Wikipedia articles. If you're an editor, you may have just known to put in double brackets, authority control. But how does it get its content? It goes and gets all of those identifiers from Wikidata. Here's another kind of use which you might not be familiar with. This is a query on the Wikidata Query Service. You can build queries with code or you can use the query builder, but I just want to show you roughly what they can do: you can aggregate information from many items all into one answer. So if you've got a question that depends on data from lots of items, you can put them all together with a bit of a query. This map is a map of all the volcanoes in the world, or at least the ones that Wikidata knows about, and it's a single query; you can literally just click a query and run it. What I like about that is that once you get all of that data together, it shows you something about geology. It shows you lines, and perhaps rings of fire, that you can't get a sense of unless you have all the data together, and so Wikidata is a crucial point of collection. Another type of use of Wikidata, which again you may have seen because it's the kind of thing that gets tweeted, is higher-level subject browsers. The one that I'm going to tell you about is Entity Explosion, but you will have seen these ones as well. Scholia is a great project about academic literature, and all of the contributors to academic literature, and prizes and so on. It focuses on a field, but then makes a browser where you don't have to know anything about Wikidata to use it. The same goes for art.
If you want to browse art, and you don't want to code Wikidata or anything, you just go to this website and click around, because it is drawing everything from Wikidata. Now we're up to the stage where we know what Wikidata can do, kind of broadly. What's in Wikidata? How are we going so far? In particular I want to concentrate on these properties, because I think they're really important. Sorry for the small text; I'll read it to you. This is counting what types of property we have on Wikidata, and I'm going to tell you about the two biggest circles. I'll start with the second biggest circle, which is "wikibase item". These are properties that link to other items, so they make the network that I was talking about. And then I'm going to talk about the big circle, external IDs. So here's a query which shows the power of the network. If you're linking to other items, a movie might link to an actor; it might say the actors in this movie were X and Y and all these people. The movie might have a publication date associated with it, and the actor might have a date of birth, because that's a true thing about that person. What's nice about having a network is that we've got two dates, so we can subtract them. What does that do? Take the publication date minus the birth date, and it tells you the age of the actor when the film was published. Okay, let's aggregate those and figure out how old all the actors in films are. But let's split it according to their gender, because that's also in Wikidata. And the answer you get is this: you see the Hollywood beauty bias. There is the age of women, and there is the age of men, in published films. This is a fairly simple query, but it's a kind of query that can be readily adapted, and it only works because of the power of this cross-domain network. Wikidata covers people and films.
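As an editorial aside, the logic of that query (follow the link from film to cast member, subtract the birth date from the publication date, then aggregate by gender) can be sketched in Python over a tiny in-memory dataset. Everything here is invented for illustration: the people, films, dates and field names are hypothetical, and the real query runs on the Wikidata Query Service rather than on local dictionaries.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical people, keyed by an invented identifier.
PEOPLE = {
    "person_a": {"gender": "female", "born": date(1970, 5, 1)},
    "person_b": {"gender": "male", "born": date(1960, 1, 15)},
}

# Hypothetical films, each linking to its cast members.
FILMS = [
    {"title": "film_x", "published": date(2000, 6, 1), "cast": ["person_a", "person_b"]},
    {"title": "film_y", "published": date(2010, 3, 1), "cast": ["person_a"]},
]


def age_in_years(born: date, on: date) -> int:
    """Whole years between a birth date and a later date."""
    years = on.year - born.year
    if (on.month, on.day) < (born.month, born.day):
        years -= 1  # birthday not yet reached in that year
    return years


def mean_age_by_gender(films, people):
    """Follow film -> cast links, compute age at publication, average per gender."""
    ages = defaultdict(list)
    for film in films:
        for person_id in film["cast"]:
            person = people[person_id]
            ages[person["gender"]].append(age_in_years(person["born"], film["published"]))
    return {gender: mean(values) for gender, values in ages.items()}


print(mean_age_by_gender(FILMS, PEOPLE))
```

The point of the sketch is that neither table alone can answer the question; the answer only appears once the film records and the person records are joined, which is what the item-to-item properties make possible.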
And if we just switch out actor for author and movie for publication, we can rerun the query on scientific articles and perhaps find out whether we've got a beauty bias there. I'm a chemist, so, you know, maybe I'm punished for my lack of beauty, or not. No, I'm not. There is a different kind of bias. The age doesn't matter so much, but there's something going on there as well, and so you can start asking queries about all kinds of biases, both biases within Wikidata and real biases in the real world that we're representing. And sometimes I put up this: this is a colleague of mine who was right at the other end of this and asked me to help with a publication, so I'm pleased to remember him, at 92, publishing away. Okay, so I've talked about the second biggest circle; now let me talk about the biggest circle. That is identifiers. Here is what they look like on the page: you just get a list of numbers. And it seems a bit useless, right? Why would you spend all your time putting in a number? Because those numbers represent places on the internet where that same subject is described. This person is described in her publications. She's also described on Twitter; she has a Twitter account. And imagine if we could go from one place to the other. Elon Musk has real trouble trying to verify people. Wikidata can do that. If you have a Twitter account, and it's connected to a Wikidata item, then we know who that is, and we can connect it up. We can find out who Vena is, and we can find all of her other bits. There's a similar example from chemistry: we can connect up databases about fragment masses with databases about thermal properties, because at their core they are representing molecules that appear in all of those databases. Computers understand this; we know about the linked open data cloud. But we'd like to be able to use it as people.
And so that's the objective of this app, the Entity Explosion app, which is available on Chrome or Firefox. If you go here you can just search for the name Entity Explosion and you'll find it. I'd love you to install it; it's really easy to install. I just want to show you a quick demonstration, because it actually lets you follow those cross-web links. It allows you to discover links and information about whatever topic you care about on all the other sites. So imagine you were on a site called iNaturalist, you contributed a picture, and somebody had identified that picture as a platypus. That's a simple example; you have probably seen a platypus and already know what it is. Imagine you're on the platypus page. Look at the top right of this screen, just to the left of my photo, the small circular photo up the top. Two buttons to the left is the little symbol representing Entity Explosion. I can't point, sorry, but you can see it has six little red circles. When you're browsing around, if that ever goes red, that means we can get something out of it. You click there whenever it lights up red, and what happens is you get a drop-down that drags all the information out of Wikidata about the platypus. And it knew to do that because, as you see in the green box, iNaturalist had a number in its URL that was one of those identifiers. So whenever we put identifiers in, we are aiding this massive connectivity. I'd love you to experiment and play with that. It's fully multilingual. We can do this in Chinese. We can do this in your language: you just click up the top, where that top box is meant to say "language", and you can choose your language. It's actually a little better than it was when I took the snapshot.
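As an editorial aside, the step described above (noticing that the current page's URL contains an identifier, then using it to find the matching item) can be sketched roughly as follows. This is not the extension's actual code. The URL pattern, the property ID `P_EXAMPLE`, the identifier value and the item ID `Q_EXAMPLE` are all hypothetical placeholders; the only idea taken from the talk is that a URL pattern with a slot for the identifier lets you recover that identifier from a page address.

```python
import re

# Hypothetical URL patterns per external-ID property; "$1" marks the identifier slot.
FORMATTERS = {
    "P_EXAMPLE": "https://example.org/taxa/$1",
}

# Hypothetical index from (property, identifier) to an item ID.
INDEX = {
    ("P_EXAMPLE", "12345"): "Q_EXAMPLE",
}


def pattern_to_regex(formatter: str) -> re.Pattern:
    """Turn a '$1'-style URL pattern into a regex capturing the identifier."""
    prefix, suffix = formatter.split("$1")
    return re.compile(re.escape(prefix) + r"([^/?#]+)" + re.escape(suffix))


def match_url(url: str, formatters: dict):
    """Return (property, identifier) for the first pattern matching the URL."""
    for prop, formatter in formatters.items():
        found = pattern_to_regex(formatter).match(url)
        if found:
            return prop, found.group(1)
    return None


def resolve(url: str):
    """Map a page URL to an item ID, or None when nothing is known about it."""
    hit = match_url(url, FORMATTERS)
    return INDEX.get(hit) if hit else None


print(resolve("https://example.org/taxa/12345"))
print(resolve("https://other.example/page"))
```

When `resolve` finds an item, that corresponds to the button lighting up red; when it returns nothing, the button stays dark.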
So this is still in development. If you want to develop, come and join me and tell me. But please try it on whatever you're interested in: places or organizations or people or chemical compounds or organisms. It should work for all of them. It should work for over 6,000 different sites, because we have over 6,000 different identifier properties. And remember, it'll work best if those are translated into your language; otherwise you'll be missing some lines. It works on all of these things, which means it's a great way of telling people about Wikidata without actually needing to explain Wikidata. I've presented this to chemists without ever mentioning Wikidata, because they can use it to do stuff across the web just with a little Entity Explosion button; they don't need to worry about what's behind it. It's free and open source, it's in every language, your privacy is protected, and here are some ways to get involved. I've just gone over my 30 minutes, so I'm going to now hand over to Margaret.

That's fine. Thank you very much. I'm very lucky to be following Toby. He introduced me to Wikidata, along with Andy Mabbett, Pigsonthewing. And those demonstrations of the power of a graph database are just fantastic. Who could not wish to exploit Wikidata in all its power? But for the things that I'm interested in, which are biota, I don't get the right answers, because I don't have all the data. My Wikidata journey has been to put author names to APNI plants, to upload Australian lichens and Australian fungi, and then, being involved in Wiki Loves Earth, somebody asked me how many animals are missing from... Oh, thank you. Okay, yeah, thank you. And I didn't know the answer. Well, I know it's roughly 170,000, because Wikimedia Australia gave me a fellowship last year to help add more identifiers to animals, using the Australian Faunal Directory as a basis.
And there are a number of us who played with it: Toby, obviously, me, Annie or on say, and also Siobhan Leachman, who many of you will know. My aim is to get correct answers. I would like to be able to say that Stebbing, whoever he might be, authored 500 taxa. I can't do it if the data aren't there. So I have been putting up data. The queries that Toby showed you are lovely, because you only need a random selection of movie stars to get that story, or a random selection of the total number of chemists. But if you want to actually find out how many taxa Stebbing authored, whoever he may be, you really do need to have the data in there. And that's been my mission for a good long while. We'll get there eventually. I have a beautiful picture which you can't see; it's a little amphipod. I don't know how deep it was found, but a lot of this comes from the Victorian Museum, and it's a very decorative little number. Can we go to the slides that I uploaded instead, and forget my computer? Which is a real pity, because I wanted to do a live demonstration of using OpenRefine with you and I can't do that. I might be able to talk you through it, perhaps, so who knows. So, I wanted to show you a faster way of putting up the information that I'm interested in. I want to link taxa with their authors and with their publications. This is a relatively easy task if you're looking at plants. It's much, much harder with animals. Remember, animals cover everything from bacteria to whales, and the people who work on fish know nothing of the people who work on bacteria. In fact, when we were uploading the Australian Faunal Directory, we found one genus name that was valid and covered four kinds of animals. Fabulous stuff. Similarly, author names. The zoological author abbreviation is meant to be unique. It's not.
And it varies from database to database. You could see that at the conference, where the lontar palm was put up and the abbreviation was given as Linn., yet as far as I know the standard abbreviation for Linnaeus is L. if you're talking about a plant. So it should have been L., but I'm absolutely certain that whoever put up that name consulted a database. So databases often have author abbreviations incorrect, and they rarely have the full author name. So a major problem is to find the author names. Now, one of the nice things, whenever we get my slides up, is the Australian Faunal Directory, on which I have worked so hard. My plan is to look at an Australian Faunal Directory publication and show you what it gives you. Okay, so this shows the problem in almost every Wikidata taxon item. I pulled this up for one of the publications I wanted to do live with you, but obviously I'm not going to be doing that. Stebbing, the author name for this genus, is in black, and that's how it is in Wikidata for many, many biota items. No one has a clue who Stebbing is. Hands up who knows who Stebbing is. Wonderful. Okay, here's a bloke called Thomas Roscoe Rede Stebbing, and I'm not sure what his dates are. But the point is, how do you find this out? There must be thousands of people called Stebbing. So the trick is, if I can manage to page down... I can't see. This is just showing what we're going to try to do. We're going to try to put in the author as a qualifier on the taxon name. We're going to try to put in the year. And what we're also going to try to put in, in the reference for the name, is "stated in" the publication. Now that's quite complicated. We've got three things: the taxon, the author (if not half a dozen authors), and the publication. So it's quite difficult to do.
This is the valid name, and in this one we put up something quite different. We want to put in the fact that it has an original combination. Its zoological author is still Stebbing, but the name is bracketed to show that the name has been changed and it's a recombination. So there's a whole pile of things we want to do when we get our things into OpenRefine. You can see this is our publication. There are two databases that I know of for animals that give publication IDs, which should go up into Wikidata. So the publication ID... am I doing something strange? Oh, thank you. So the left-hand side has the names as published by Stebbing, and so they are the original names. The right-hand set of names are the names accepted by the Australian Faunal Directory. But notice that names accepted by the Australian Faunal Directory are not necessarily the names accepted by any other database, and so you really do need to reference this stuff. The only ones we're interested in putting up are ones where the species epithet is the same in both names, the final valid name and the original one. So, for example, Ampelisca acinaces doesn't change. The original name is the final name, and those are the ones we'd be interested in putting up if I were going to do my live presentation, which I'm not. But what we can see in that slide is this possibility of an original name and a current accepted name which may not match. I copied and pasted that, because after all these all came from the one publication, so I could put them up really easily by just saying this is the publication, this is the author, this is the year: Stebbing, 1888. Well, you can't actually see, because I don't have my live computer.
All right, okay. So I pulled the CSV file into OpenRefine. Now, OpenRefine is lovely. You can break up fields, and I've obviously broken up the field that had the taxon name: I've removed the "Stebbing, 1888", and kept the name as it would reconcile in Wikidata. These are mainly species, so I've also broken out the parent and reconciled it. You can see that I've reconciled them because the reconciled names are in blue, and you can also see the bar at the top of the column is green. Only when you've reconciled various things can you start to create what's called a schema, which looks exactly like what you see when you go to Wikidata. So this is my schema in OpenRefine, and in this one we're putting up information for the valid name. I'm putting up the taxon author, I'm putting up the year of publication of the taxon name, and I'm putting up a reference. The reference happens to be the Australian Faunal Directory publication list that we had: I have "stated in" the Australian Faunal Directory, and I then put up the Australian Faunal Directory publication ID. I also put a date, which I haven't displayed, because I wanted to display the fact that I put a taxon author citation. I put that in because the abbreviations do not match across databases, so it's a way of recording that this is the zoological author abbreviation for this person in this database; it might not be the same in another one. So it's useful. The other thing to note is that this particular set of items were ones where the valid name differed from the original combination. I've linked them together, and I've said the original combination for this new name, which is a recombination of the old, is whatever we found it to be, so it's the one that was also reconciled.
It was the shortened valid name. And this moves on. OpenRefine is really very nice. I'm actually going to modify another Q item as well, the Q item for the original combination. In this one again I put up the same author names as qualifiers and the same year as a qualifier, but my reference this time says "stated in" (you can't see it, which is a real pity) the publication. I can put the publication in the schema because it's common to the entire set, so I haven't bothered to put it in a column; I just put it in the schema. This is the publication. And it finds it because I had, of course, already put this into Wikidata. I don't know if anyone else had; I'm not sure about this one. And the other thing that I've put up there (can you see it? No.) is "subject has role". Now think about the structure that Toby showed you. Another way of talking about that Q item, property, Q item form of the statement is subject, verb, object, and the Q item that you're modifying is called the subject in a lot of the property descriptions. And sometimes you're describing the object. So here, going back, you'll see that we added a qualifier saying "object has role". The object there is the taxon name that we've given it, which is the valid name according to the Australian Faunal Directory, and we've said this object has the role of being a recombination. So it's recombining the old epithet with a new genus, or a different genus as it might be. So why use the Australian Faunal Directory? There's one reason... Sorry. This is WoRMS, the World Register of Marine Species, which covers the marine species of the entire world.
So you'd think that would be the obvious one to use. As an Australian I do need to use the Australian Faunal Directory, because I know that, while it should be the case that the Australian marine fauna is a subset of the World Register of Marine Species, it's not. And these things overlap in really curious ways. At one point I had a Venn diagram showing the total mismatches, but it was too much to show, and I didn't. But WoRMS, because it covers the whole world, is probably one of the places you should go to. It suffers from the fact that, as we go down it (that's probably too hard to read), we have a source, but we don't have any pages in this list of taxa. We can see that something is an original description, and we can see that it's now accepted as something else. But what we can't see is what the page is in the publication. WoRMS typically does not give pages, whereas the Australian Faunal Directory does. So if I know it's a publication relating to Australian fauna, I'll probably use the Australian Faunal Directory first. I think I might be ready to move on. So this is a different publication, and it's much shorter when we go down. So now that's looking good. This is Stebbing 1910, and we're going to have a look at putting it up. You see, this has just been pulled across as a straight text file. I like using text files because I know what things are at the end and at the beginning. And I'm about to pull that into my OpenRefine, which I have already open, I hope. So I need to choose a file. I think it was "Stebbing 1910 WoRMS". And I'm going to go Next. And you can see that my header is not going in the right spot; if I change the choice there, it goes to the right spot, which is good. And I'm going to just create the project. Now, all I'm going to do is try to break up this line into original.
And whatever else there is. So I'm going to use "Stebbing 1888" as the thing that I'm going to divide it on. I'm going to click on this column, and I'm going to say Edit column, Split into several columns. And what I'll type in there is... I don't want to remove the column. It's not a good idea to remove the column. It's your basic starting data, and it's much easier to leave it there. I'm going to do that. That starts to look good. Now you can see that's quite interesting: you can see which ones have a new name. So Amarillo's Barthescephala has changed its name to Bama Rooka Barthescephala. And that's not one that we're going to put up this time. We're going to just deal with the ones that are blank here. So let's select on that. Let's choose a facet here, a text facet. And blank is what we choose, just 17, great. Now some of those look pretty horrible. What's really nice about having chosen to use "Stebbing 1888" as the break is that we've got some things that don't really belong. I think I'm going to leave anything that didn't break that should have broken. Oh, no, it's 1910, and I did 1888. Silly dill. Okay, that's some undo. Sorry, that wasn't my best effort; I was thinking about the other one. I'm breaking on "Stebbing 1910". Oops, can I just do it? No, I need to do it on all of them, so I need to reset. Oh, it's good when you make mistakes. I'm just going to undo. Yes, 18 rows. Yes, 18 rows is about right. It's not quite. Blast. Yes, that one I should... Horrible, horrible. That's one that I would wish to exclude. So I've got to undo this. Why is it not undoing? Oh, I know: create project. Okay, I want to just create the project. That's good. And now let's go to the split. We're splitting on a different thing. We're editing the column, we're splitting into several columns, and we are using blank, "Stebbing", oops, "1910". That will be better. Oh, it helps if you write the right thing. We need a blank.
That's how it goes. Now let's hope we're going to move that column and go OK. And what have I done? Long one. That's fine. What's happened to... I think you had a comment before. Ah, thank you very much. I know I've got grot on my screen, and I thought it was just the grot. Not good, not good. We'll get there eventually, or not, as the case may be. Choose files. "Stebbing 1910 WoRMS". Next. Don't get yourself in it. And we want to go by text there, and we create. Okay, we'll try again. Let's hope I can spell correctly this time. I'm editing the column and I'm splitting into several columns. And I don't know why there's that stuff there. The reason I'm putting the full thing there, and not just "Stebbing", is that sometimes the database is wrong and it lists... well, it's not wrong. It lists taxa that are also mentioned or treated taxonomically but are not named as original. So it's not an original description for that taxon. I don't want to deal with taxonomic treatment at all today, but I'll keep it in. So I don't want to remove the column, and I go OK. Looking slightly better. As you can see, that first one is the wrong publication, or rather it's not the publication I'm wishing to talk about, the 1910 one. All these others are looking pretty good. So I'm going to click on that, and I'm going to facet on All, and I'm going to facet by flag, and I'm going to just include those. There's probably heaps more. No, those are looking okay, and I don't care. I'm going to rename this: I'm editing the column and renaming it. I'm going to call it "original", if I could spell, and I'm now going to try and reconcile these. Now, reconciling taxa can be really tricky, because lots of people put synonyms in the aliases, and that means you often get a reconciliation when you probably don't want one.
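The split-and-facet step just demonstrated amounts to cutting each raw line at a fixed separator, keeping the raw line alongside the pieces, and then selecting only the rows where nothing follows the separator. A minimal Python sketch of that idea follows. It is not what OpenRefine does internally; the sample lines, the invented names "Aus bus" and so on, and the exact separator text (including its comma) are assumptions made for illustration.

```python
def split_on_separator(rows, separator):
    """Split each raw line at the first occurrence of separator, keeping the raw line."""
    records = []
    for raw in rows:
        before, found, after = raw.partition(separator)
        records.append({
            "raw": raw,                  # the starting data is never thrown away
            "original": before.strip(),  # the name in front of the author string
            "rest": after.strip(),       # whatever else there is
            "split_ok": bool(found),     # False when the separator was absent
        })
    return records


def blank_facet(records):
    """Keep rows that did split and have nothing after the separator."""
    return [r for r in records if r["split_ok"] and r["rest"] == ""]


# Invented sample lines; real exports will look different.
rows = [
    "Aus bus Stebbing, 1910",
    "Aus cus Stebbing, 1910 accepted as Xus cus (Stebbing, 1910)",
    "Aus dus Smith, 1900",
]

# The leading blank in the separator keeps it from matching inside a bracket.
records = split_on_separator(rows, " Stebbing, 1910")
selected = blank_facet(records)
print([r["original"] for r in selected])
```

Using the full author-and-year string as the separator, rather than the surname alone, means that a line from a different publication simply fails to split and is left out of the selection instead of being cut in the wrong place.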
As far as I'm concerned, a synonym requires a reference, and it should be written as a property: the property "synonym of" such and such, with a reference saying who synonymized it. Because after all, with these animals or plants or whatever they are, there's one specimen in that museum, and there's another specimen authored and discussed by a totally different author. Someone had to do the work to say, no, no, sorry, these two are the same. So you need a reference. So I hate... I unclick that, and I start reconciling, and it makes a lot more work. It's really tedious, but it's tricky. So I'm just going to match that cell; that looks like the same name, so I'll match it. That's 100%, so I'll match it. And I might stop there, because I don't really want to go very far. I'm going to just star these, and we may or may not get there; I may unstar them later. I'm going to start building my schema. So I go to Wikidata and Edit Wikibase schema. I've got the original, so that's good. I'm going to add an item, which is actually adding my subject. This is my subject in my Q item, property, and potential Q item. So I want to drag that item there. I'm not going to add a label, I'm not going to add a description, but I am going to add a statement, and the statement is going to be taxon name. Oops. And Tom, you'll have to interrupt me when I'm totally... whatever. For taxon name I drag down... oops, I'm in the schema. Why can't I see? I drag down "original". I add a qualifier. Oh, it's nearly time. It's taxon author. Okay, I'll just... taxon author. Let me just put in these qualifiers. We're not going to get very far. I need the taxon author. I'm going to put it in now. Again, I'm not going to drag an item, because this one is known. You see, we have the choice; it comes up just as it would in Wikidata. And the only other thing I'm going to add now is a year, just the year, because we have the year in that.
Oh, we don't have the year. I didn't grab the year. What we can then do is take a look at our preview, and you can see that we've added the taxon year. So, QuickStatements, which is where I had hoped to be: if I use QuickStatements, this is not added as a separate statement; it adds to the already existing statement. So QuickStatements is my preference for adding this stuff, and I would go at this point to Export to QuickStatements, and that is where I would stop. And I think Tom is telling me I've more than had my time. Not at all. What I'd like to do is thank both Toby and Margaret for taking us into areas of Wikidata where we've probably never been before, and giving us a really good insight. So thank you very much. Thank you, Margaret and Toby, really appreciate it. If anybody wishes to pursue any of the details, it would be well worth checking the Etherpad or contacting either Toby or Margaret, because I'm sure they'd be only too pleased to help. I hope that, for those who are not so familiar with Wikidata, that went deep enough to give you an idea of some of the marvelous things you can do. So thank you, Toby. And I do see value in connecting the internet and connecting all these databases, which are often professional databases, so a tool that does that, I think, is useful for science as well as the world. And I guess I'd like to see Wikidata in the future being a database that is reliable and relied on, and therefore all of us have a role to play in that. So, going further with the IDs: Wikidata is the place that people go to for that. iNaturalist copies Wikipedia articles, but GBIF copies all our identifiers, so I was really thrilled when I first nominated the Northern Territory Flora ID and it suddenly appeared on GBIF.
So it links, and then people can see that there are these identifiers, and you can see that they're in conflict. We bring it together. Other people don't; they're all separate silos, speaking to themselves, and we bring it together. I'm a statistician, and for that reason I value data, and it really irritates me when I can't get the right answer. Now, I know I can't get the right answer, but what I wanted first of all to do was to be able to say "James K." Now, I can't do that if the Wikidata items aren't there. You can't actually do it on any database, because they're just name strings in databases, and so you've got no idea. So I wanted to write an article. For plants there's IPNI, which does actually identify Heidi Mer. But for animals, you don't have any way of getting at them. And so if I'm writing an article about a person like James K. Lowry, I want to know all of the taxa he authored. So I actually pulled down a CSV for James K. Lowry, and I identified... so I'm disambiguating his name. I use an IRMNG file, which is an interim register of marine genera and other bits and pieces. And it has a nice download that downloads by author. Very few... well, no other database that I've encountered so far downloads by author. So I use that, but I want to know what the taxa are. I mean, I think the encyclopedia is failing badly if it can't tell you how many taxa this man authored. And I also use it as a basis for generating future articles that relate to him, so he's not an orphan. So I've written at least four articles for Lowry's taxa. But that's because you want it there. I want it there. Absolutely right. Nice, nice summary. I hope that's answered your question. Was there anybody else with any further questions? Okay, well, I'll say thank you very much once again to Toby and Margaret. And thank you all for coming and listening. Thank you very much.