Hi everyone. My name is Rachit, I work with the Wikimedia Foundation, and I'm also helping out around here. Thank you all for joining us. Our next session is on data use and reuse, and we'll have our speakers on screen soon. We have members of the Open Knowledge Foundation: Carol, Sashi, and Sara, who are going to be presenting. I hope you all can hear them; we can see you, so whenever you're ready, you can go for it, and I'm hoping you can see the audience as well. Thank you.

Thanks very much. Welcome everyone to this Wikimania 2023 session. We're really happy to be here, although sadly only on screen; we wish we could be there with you, but we'll make do with what we have. So welcome to this session, which is going to be about data use and reuse: we're going to tell you a bit about how you can unlock the potential of Frictionless Data for Wikidata. Before diving into the presentation, I just want to give you a quick overview of how the session is structured. Although the program said the session would last one hour and 15 minutes, it will actually be slightly shorter: a one-hour session, starting with a general presentation about the Open Knowledge Foundation and Frictionless Data by myself. Then my colleague Sashi will give you an in-depth look at how you can use Frictionless Data with Wikidata, including a tech demonstration. And we will close the session with a Q&A, using an Etherpad that will be shared in the chat and with the participants; there will be a dedicated slide for it. Since we're not there in person, if you have questions during the session, please note them down and add them at the bottom of the Etherpad, so we can devote the last ten minutes of this session to answering any questions you may have.

Just a quick introduction from myself first, and then I want to give my colleagues Sashi and Carol the opportunity to introduce themselves as well. I'm Sara Petti, I'm based in Bologna, Italy, and I've been at the Foundation for a while now, with different roles and wearing different hats. Today I'm here wearing my hat as Frictionless Data community manager, but I'm also the lead of the international network that I'm going to tell you a little bit about later. Over to Sashi and Carol to introduce themselves. Sashi, do you want to start?

Yes. Hi everyone, I'm Sashi, I'm from Nepal, and I work as a developer at the Open Knowledge Foundation, which I joined in 2022. I work as a maintainer of the open source software under Frictionless Data at the Open Knowledge Foundation. Nice to meet you all.

Hi all, I don't think you can see me, but I'm Carol. I'm Brazilian, based in São Paulo, Brazil. I'm talking to you from very early in the morning here, and I'm very happy to be joining this event. I'm partnerships lead at the Open Knowledge Foundation, and I'll be talking to you at the end about a partnership we're building around this application.

Thanks Carol, thanks Sashi. So now it's time; let's get started. The first thing I want to do is explain a little what the Open Knowledge Foundation is. The Foundation is a global organization that has been around for almost 20 years now.
We're turning 20 in 2024, next year. We are renowned experts when it comes to open data, open government, and open content more generally; little spoiler alert, we are the ones behind the open definition. So if you're interested in that discussion, I would encourage you to stay in this room, because the session right after this one is going to be about exactly that. What we do at the Foundation is develop open source technology, and communities around that open source technology, in the field of open data and open government. We also maintain and support a network of open advocates and activists across the world, in 40 countries, and together we promote open knowledge as a design principle. We are known globally as leaders when it comes to building robust and sustainable open infrastructure for publishing, sharing, and working with data. In a nutshell, the Foundation provides services, tools, and training that enable governments, organizations, and communities across the world to adopt openness as a design principle.

Here you can see some of the tools we have developed. Frictionless Data, of course, which I'm going to tell you about later; but we're also the ones behind CKAN, a knowledge management system that powers open data portals for governments and organizations around the world, including the United Nations. We are also, as I said before, behind the open definition, which is basically a standard that defines the criteria content needs to meet to be considered open; again, if you're interested in that discussion, I definitely encourage you to stay in this room and join us later for a conversation about it. The last thing I want to mention, since this is probably an audience very interested in data, is the Open Data Handbook: a guide with case studies and very useful resources that you can use to make your data open. I definitely encourage you to go and have a look if you don't know it; it's a very useful tool, available in many, many languages. Just look up "Open Data Handbook" online; I'm sure it will be very useful.

On the other side, we also do advocacy work. We are the ones behind well-known global projects like the Global Open Data Index, OpenSpending, Open Budget Data, and the Justice Programme. And we were also very well known, in the 2010s, for the Open Knowledge Festivals, which were big global gatherings of data enthusiasts from around the world. As I mentioned, since we develop open source tools, we are also community builders. We have, of course, the communities that gather around our open source tools, communities of users and contributors, but we have other communities as well: for example, a community of data trainers who get together under the umbrella of the School of Data. We have a global grassroots community of data enthusiasts who gather online once a year for Open Data Day, which is actually now an open week happening at the beginning of March. We also support the Open Knowledge Network, which maintains a curated list of open experts, the Global Directory: a list of people who offer themselves as available peer experts from the open movement. And we have the project repository, a repository of projects from the Open Knowledge Network. Very interesting;
I would definitely encourage you to go and have a look. And since the beginning we have also been supporting csv,conf, the data conference organized by a community of data makers around the world.

Now, time to talk about Frictionless Data; that's what we're all here for. What exactly is Frictionless Data? Frictionless Data is a toolkit whose aim is to remove the friction in data. By friction we mean everything that can prevent you from directly reusing someone else's data: for example, you don't know who created the data, or you don't know what the license on the data is, so you don't know what use you can make of it. Removing the friction means you can move from data to insight faster. Frictionless Data is an open source project and is made of, I would say, two parts. One part is the standards for data and metadata interoperability; the other is a collection of software tools that you can use on top of the standards to perform a number of functions on your data, to make it more useful and open. It is also a range of best practices for data management. We have a big global community nurturing this project, and, very importantly, Frictionless Data is completely platform agnostic, which means it is totally interoperable.

Okay, that's all very well, but how exactly can you use frictionless, and how can it be useful in your daily work with data? Frictionless is very helpful because it can help you make your data open and FAIR. I'm going to repeat this even though I'm sure everybody here knows it: FAIR means findable, accessible, interoperable, and reusable. If you want to know more about it, and you want a good guide, I would invite you to visit a project called The Turing Way, which is very, very helpful in that regard; I can paste the link in the chat after the presentation, or put it in the Etherpad. Just a little reminder here: open data and FAIR data are not synonyms. Data can be open without being FAIR, and it can be FAIR without being open; frictionless helps in both directions. What I'm going to do with you today is run through a series of, I would say, basic functions that frictionless can perform for you, and then Sashi will give you a more in-depth look later at how you can use this with Wikidata.

Frictionless started with a very simple idea, which is not new: the idea of containerizing your data. The thing is, when you share your data, if you want someone else to reuse it directly, they need a certain amount of information about the data you're sharing: who created the data, what the license on the data is, what exactly the column names mean, how the raw data was collected, for example. The answers to these questions are typically contained in what we call the descriptor of the data, which is metadata (so, data about the data) and a schema. The idea behind frictionless is that, to make it very easy for people to directly reuse your data, you share the data and the descriptor packaged together, in what we call a data package. It means that if I take my data, package it into a data package, and send it to Sashi to reuse, she can use it directly without coming back to me and asking questions. This is the core idea of frictionless, but there are a number of other things frictionless can help you with.
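To make this concrete, here is a minimal sketch of what such a descriptor (a datapackage.json file) can look like. All the names and values are invented for illustration; the structure follows the Frictionless Data Package standard:

```json
{
  "name": "demo-dataset",
  "licenses": [
    {"name": "CC-BY-4.0", "title": "Creative Commons Attribution 4.0"}
  ],
  "contributors": [
    {"title": "Jane Doe", "role": "author"}
  ],
  "resources": [
    {
      "name": "observations",
      "path": "observations.csv",
      "schema": {
        "fields": [
          {"name": "id", "type": "integer"},
          {"name": "date", "type": "date", "description": "Date of observation"}
        ]
      }
    }
  ]
}
```

Shipping this descriptor alongside the CSV is what turns a bare file someone sent you into a self-describing data package.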
So, for example, if you use the frictionless software, you can take your dataset and frictionless will help you describe your data. What it does is look into your dataset and provide you with metadata (as I said, data about the data) and also a schema, which is basically a description of the structure of your data. The schema gives you an idea of what is to be expected in a certain cell: say, for example, there is a column for dates; the schema will tell you that in that particular column, all the cells will contain a date. Frictionless extracts this from your dataset and provides it to you in a format that is both human and machine readable.

Once you have that, you can do something very exciting: you can check your data for errors. What frictionless does is check your whole dataset against your schema, and tell you if there are mistakes or errors, if things are happening that are not expected. And it does that in a visual format, giving you a visual report like the one you can see here on the slide, which is very helpful because it tells you directly where the error is and what kind of error it is. For example, here you can see there is a blank header, so frictionless will alert you: be careful, you need to add a label there. Or in a particular column there is an unexpected format, so go and have a look.

So why is it important to validate your data? There are a number of reasons, of course, and they are quite obvious, but to make it even more obvious I want to bring you an example from computational biology. What happens in this field is that a lot of people use Microsoft Excel for their data collection. Now, Microsoft Excel is proprietary software that sometimes autocorrects things, and one of the things it unfortunately does is mistake some gene names for dates. You might say that's not a big deal, but if you have a very big dataset you cannot go and check it every time, and you will not immediately spot the error. And sometimes you publish the data, or use it to perform an analysis, but the analysis is not correct because the data is corrupted: around 30% of published articles in genetics have mistakes because of this Excel autocorrection. If you could do a simple validation like the one I showed you before with frictionless, checking your data against your schema, you would quickly spot the mistake and correct it, preventing all those errors and the corruption of your data. So that's why validation is important.
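As a rough sketch of what describing and validating look like with the frictionless Python framework: the file name here is invented, and the exact report keys vary between frictionless-py versions, but describe and validate are the real entry points:

```python
# pip install frictionless
from frictionless import describe, validate

# Infer metadata and a schema from the raw file
resource = describe("observations.csv")
print(resource.schema)  # inferred field names and types

# Validate the file against the inferred (or hand-edited) schema
report = validate("observations.csv")
print(report.valid)  # False if anything unexpected was found

# List where each error is and what kind it is
# (key names vary by version: "code" in v4, "type" in v5)
print(report.flatten(["rowNumber", "fieldName", "type"]))
```

In practice you would tweak the inferred schema (types, constraints) before validating, so the report reflects what you actually expect from the data.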
Once you go and correct your mistakes, it is also very important to keep track of all the cleaning steps. Here I have an example: this screenshot comes from a pilot that we did with BCO-DMO, which I'm going to tell you more about later on. You can see that what frictionless does is record everything that happened between the raw data and the clean data. Here you can see a date has been corrected; here there has been a conversion to decimal degrees. All of this happened because the frictionless validation spotted a mistake and alerted the author, who then quickly made a correction. But the correction is recorded, so that when I look at the clean dataset, I also know what happened between the raw and the clean version. It's very good for transparency and reproducibility.

So that's, in a nutshell, what frictionless can do for you. It's an excellent tool for data integration: packaging data, transforming data, validating data, and all those kinds of things. It is also very useful for publishing and storing data, because frictionless has plugins for accessing and storing data, for example in an SQL database. The nice thing about frictionless is that it takes a sort of holistic approach to open and FAIR data, so it provides tools for the full data pipeline: from the standards, to the frictionless framework, a powerful Python framework also available in other programming languages, which performs the basic functions I just showed you, like describe, extract, and all of that. We also have a frictionless application for people who are not coders: a UI available online that performs the same functions you saw in the Python framework, through a clear user interface. We also have the frictionless repository, a GitHub Action that validates tabular data on every commit to your repository, and Livemark, a static site generator that extends your Markdown files with charts, tables, and all of that.

So, before leaving you in the very skilled hands of Sashi, who will tell you how you can use frictionless with Wikidata, I just want to give you some real-life examples of people from our community who use frictionless in their daily work with data, so that you can understand a bit better how it could be useful for your particular use cases. I will start with Libraries Hacked, a project from England working with library data. The problem here was that there was a big lack of public data about libraries in England. There was no central guidance on what data needed to be shared, and especially how it needed to be shared; there were very few standards for libraries. So when an initial database started to be created, it was very important to define what data would be most useful and how that data needed to be structured. The thing is, you can easily decide what data you want, but you can just as easily fail to describe it properly. What does that mean? It means that this database could have a column for, say, the closing date or the opening times of a library. In principle you would expect that filling in that column is easy for everyone, but what happens in practice is that you end up with all kinds of different formats. What you need here is a standard. In this particular case, Libraries Hacked is using Table Schema, which is another frictionless standard. Using it meant they not only had a standard, as I said, but could also take advantage of the frictionless validation I showed you before,
so that librarians entering data into this database get instructions whenever they make a mistake, and can quickly correct it.

Another interesting case is BCO-DMO, a pilot that we did with the Biological and Chemical Oceanography Data Management Office. BCO-DMO is a publicly accessible data repository for oceanographic data. They provide data management services, from planning to publication and archiving, and they have a web-based catalog for all this data. At the time, BCO-DMO needed to improve the transfer of data to and from their repository, they needed better support for data reproducibility, and they also wanted more efficiency and consistency around data, because they didn't want to rely solely on the particular skills of one data manager. So we joined forces with the BCO-DMO team and implemented what we call data package pipelines. We developed a web application integrated into the BCO-DMO submission website and data management system, which gave them the main Frictionless Data functionalities. That means that when people upload a dataset to BCO-DMO, the website automatically validates the data behind the scenes, and, as I showed you before, it also records all the cleaning steps, so that the dataset, once uploaded to BCO-DMO, carries all of this very precious information with it.

Another interesting use case comes from Deploy Solutions, also part of our community. Deploy Solutions are Canadian developers who decided to build a climate change software prototype to help with emergency response to climate-related disasters. The prototype is filled with information coming from government officials, but also from people on the ground where the disaster is taking place. So it needs to manage a wide variety of disaster-related datasets, from different kinds of providers, with very varying degrees of reliability and quality. To make sure the information is accurate and actually usable, it was essential that the uploaded data was valid. Integrating a validation check from Frictionless Data helped them make sure there were no incomplete records or invalid information in the prototype.

Another thing I want to bring to you today is an extension of the data package, the core standard of Frictionless Data, for camera trap data. The main reason I want to talk about this today is to show you that frictionless is incredibly extensible and easy to use, and you can really make it yours. The Belgian Research Institute for Nature and Forest (INBO) does a lot of research using camera trap data, which is very useful for observing wild fauna. What they did is extend the frictionless data package to include a series of other things that are not normally included in a data package, simply because they needed them, and they wanted a standard format to share and exchange this data. So this particular data package, the camera trap one, also contains, for example, information about the camera location and time.
It normally contains a media file URL with a timestamp as well, so you know when the photo was taken, plus observations about that file. Sometimes, for example, the camera detects movement and takes a shot, but nothing is actually there, and you want to note that down; and it also contains metadata about the project in general. All of this just to show you that frictionless is very extensible, and you can adapt it to fit your particular use case. So that was a general presentation about frictionless, and this extensibility, I think, is very useful when you want to use Frictionless Data in combination with Wikidata; we really believe there is a powerful combination there. But I'll leave the floor to my colleague Sashi to talk about that. Sashi, the floor is yours.

Thank you. Today I'll be demonstrating how we can use Frictionless Data tools in the Wikidata workflow, and by Wikidata workflow I mean the pre-processing steps that we do before we upload data to Wikidata. I'll show it with examples, so I'll be sharing my screen now. Is it visible? Yes. This is a simple demo, because frictionless has a lot of features, but I'll use a few of them to demonstrate how we can validate and clean data before uploading it to Wikidata. So here is the data I will be using: dummy data that I prepared for this demo. It has the QIDs of Wikidata pages; these are sandbox pages, and I'll be updating three different properties on those sandbox pages. Besides the data, I have also prepared two files: a resource file and a pipeline file. The resource file is the metadata for this table of data. It is machine readable, but humans can also learn about the data by looking at its description, title, and fields, and other software can easily read the data using this file. As Sara said, this metadata provides a description of the data, and it also defines the fields. What fields does this table have? It has three fields: QID, property, and value. All three fields are of type string; the types could be different, like integer or date, but for simplicity I have made them all strings. And I have added an additional constraint: the third field, value, should not be null. That is what this descriptor describes (a sketch of roughly what it looks like follows below). Then I have the other file ready, the pipeline, which cleans and transforms the data; I have already defined the steps to apply to the data, and I'll describe them while running the demo. Now, you can see that this data is not valid, because this value field is empty, but we have said that this field should not be empty; it has to be there.
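For reference, a resource descriptor like the one Sashi describes might look roughly like this. This is a sketch, not her actual file: the field names (QID, property, value) and the required constraint match the demo, but the name, title, and path are invented:

```json
{
  "name": "wikidata-demo",
  "title": "Demo data for Wikidata sandbox pages",
  "path": "data.csv",
  "schema": {
    "fields": [
      {"name": "QID", "type": "string"},
      {"name": "property", "type": "string"},
      {
        "name": "value",
        "type": "string",
        "constraints": {"required": true}
      }
    ]
  }
}
```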
There is another thing that is not valid in this data: the format of the date. To be compatible with the Wikidata format, we have to add a plus or minus sign to specify whether the year is in the Common Era or before it, and that sign is not there. So the data is not valid. Now I'll show you how we can use frictionless tools to identify this, transform the data so that it is compatible with the Wikidata format, and then upload it to Wikidata.

For that, you have to install the frictionless library. The library I'm using is the Python frictionless library; the same library has been ported to other programming languages as well, so you can use it in Java and other languages too. Here I'm installing frictionless, and I'm also installing the Excel plugin, because I'll be using it to convert the data to an Excel file. I've already run this code, so I don't need to install again, and I'll skip that. Next, I import two classes, Resource and Pipeline, and load the data using the resource descriptor file. The Resource class loads the data along with all its metadata. When I print the data using the to_view function, I see the table printed here; the last row has no value, so it is shown as None, and the others are the same as in the file.

Next, let's use the frictionless validate function to validate the data. When I run this — sorry, I have to load the data first, because I had already run that code — the data is loaded, and when I run validate on it, the report finds that the data has errors, which is why the "valid" field is set to false. It also tells you what the error is: it says that the cell at row position 8 and field position 3 does not conform to the constraint "required is true", which, you may remember, we set here. So it says you have no value here; this data is invalid. What we do now, for simplicity, is just remove this row using the frictionless transformation tool. But another question arises: why didn't validation flag the invalid date format here? It always complained about the missing value, but not about the date. That's because I haven't declared the date format as a constraint in the descriptor, so validation doesn't identify that problem. We could do that as well: using a custom check function, we can pass checks to validate, and then it would identify the date format too. For now, validation only catches the missing value, and we handle the date in the pipeline steps.

The next step is cleaning the data, and by cleaning I mean removing the invalid row and adding a plus sign in front of the date, which I have already defined in the pipeline. You can see in the pipeline that one step applies a row filter and removes all rows that don't have a value in the value field, and the next adds a plus sign to every value in the value field whose property is P571. So it will remove this row and add a plus sign here when I transform the data, as sketched below.
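Pieced together, the demo steps might look roughly like this with frictionless-py. The demo itself drives the cleaning from a declarative pipeline file of transform steps (a row filter plus a cell update); here the same logic is hand-rolled in plain Python to keep the sketch self-contained. The file names follow the demo, everything else is an assumption, and API details vary between frictionless versions:

```python
# pip install "frictionless[excel]"
from frictionless import Resource

# Load the data together with its metadata from the descriptor file
resource = Resource("resource.json")

# Validate: the empty "value" cell violates the required constraint
print(resource.validate().valid)  # False

# Clean: drop rows with a missing value, and give P571 (inception) dates
# the leading "+" that Wikidata time values expect (e.g. +2023-01-01T00:00:00Z)
rows = []
for row in resource.read_rows():
    if row["value"] is None:
        continue  # this row failed the required constraint
    value = str(row["value"])
    if row["property"] == "P571" and not value.startswith(("+", "-")):
        value = "+" + value
    rows.append({"QID": row["QID"], "property": row["property"], "value": value})

# Re-validate the cleaned rows and save them as Excel for QuickStatements
clean = Resource(data=rows)
print(clean.validate().valid)  # no errors expected now
clean.write("clean.xlsx")
```

From here the demo copies clean.xlsx into QuickStatements by hand; those last manual steps are what the team says it plans to automate.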
So let's run this. Here you can see that where there were seven rows, now there are only six: the last row has been removed, and a plus sign has been added in front of the date. Now this data is ready to be uploaded to Wikidata. But before that, let's check whether the validation process still finds errors in the data. Now it says the data is valid and there are no errors. The next step is converting this data to Excel format. We have not yet integrated the Frictionless Data tools with the Wikidata API, so we can't publish the data directly to Wikidata from here. So what I will do is save the data in Excel format, copy it, and then upload it to the wiki. Let me delete this file first, because it was already created. When I run the write function, it writes the data to Excel format; I download and open it, and now it is ready to be uploaded to Wikidata. From here it's the usual process: I use QuickStatements to upload. This is the current state of the page, so I will refresh to see whether the new changes have been applied. And here we see the new changes are applied. These last steps are still manual, and we are planning to make them easier; this is the final step, where we upload the data to Wikidata. And this is not the only way you can use Frictionless Data: you can use it in many different ways, and it has a lot of features that I couldn't demonstrate in this demo. I just wanted to show how we can include these tools in the Wikidata workflow. I'd also like to mention that for non-technical users these steps would be complicated, so we are working on a graphical user interface for the same features. We are developing an application, the frictionless application, which does the same thing through a GUI: you can load files, run all the transformations, apply validation, and see all the errors through the graphical user interface. This is the frictionless application we are currently working on. It is not fully complete, which is why I'm not demoing it, but it's about to be completed, so we'll release it soon. With this, I'd like to end my demo and hand over to Sara. Thank you very much.

Thanks, that was excellent. Before we jump into questions, I just want to give Carol an opportunity to say a few words about this powerful combination of frictionless and Wikidata.

Yeah, thank you, Sara. Right now, in partnership with the team from Wikimedia Argentina, we are trying to develop an application for GLAM professionals to do what Sashi just showed in an automated way. This would be a very non-tech-friendly application that helps people upload clean, validated data to Wikidata in a simpler, easier way, even if you are not a developer and don't usually work with spreadsheets, values, and data. For this, I would really like to ask that, at the end, while the link is up here, you join the Etherpad and tell us what your struggles are when using Wikidata and uploading data, or preparing data to upload.
I think it's very important that we listen to the community and address the actual problems and struggles you are having, so we can develop a solution that matches reality. We do have a great team of developers in-house here at the Open Knowledge Foundation, but we constantly look to the community and to real-world problems, so that we are not apart but working together, developing solutions that actually work in the best way, so that you don't feel you are wrestling with an application or a technology, but are just handling your data and taking care of the topics you work on, not the tool itself. So please do go to the Etherpad and leave your comments, questions, and suggestions. We also encourage you to share use cases where you struggled or had difficulties uploading data, or preparing data to upload, to Wikidata.

Thanks so much, Carol. I think at this point we can take some questions from the audience, if there are any. On the Etherpad there is also a section just below where you can add your question in writing, in case there is some kind of infrastructure problem; but otherwise, I don't know exactly how this will work in the room, so just feel free to fire off your questions now, if you have any.

We have a mic. Okay, we have two hands.

Hello. Given that OpenRefine has a good Wikidata integration and works well for its purpose, why would you create a new framework that uses Frictionless Data and suggest it to GLAM institutions? Thank you.

Thanks very much for the question. I'll start answering, but Carol and Sashi, please feel free to jump in if there is anything else you'd like to say. It's very interesting that you mention that; we have actually partnered with OpenRefine in the past. Our view is that OpenRefine is a very valuable, high-quality tool, but sometimes a bit complicated, so we're not here to compete with OpenRefine. We are looking at a completely different audience, one that might be less familiar with programming tools and would feel more comfortable using a user interface. The idea is to integrate everything into the app, so you don't have to use the Frictionless Data software libraries, but just use the functionality through a user interface and an application. So I'm really thinking about non-coder users here. Carol, I don't know if you want to add anything to this.

Yeah, definitely. This is not a situation where we are competing with another tool; we're just trying to offer an alternative that has an interface and is more friendly to non-technical users. The GLAM environment has been working with librarians and archivists for a long time, and sometimes I feel that the easier it is to just provide the data, hit a button, and upload or tweak the data, the better. So we're trying to develop something that is easier to use than OpenRefine, but not competing; of course, anyone can use the tool they choose. Thank you for your question.

Thank you for the answer. Hello, I'm from Bern, Switzerland, also representing an affiliate of the Open Knowledge Foundation. To follow up on the previous question, just for precision: this easier-to-use tool compared to OpenRefine, does it already exist, or are you just planning to develop it? That's my first question. Second question:
What is the difference between Frictionless Data and linked data? Would you elaborate on that, maybe? And the third one: where I see synergies with the Wikimedia community is in the area of tabular data. There have been discussions about establishing tabular data on Wikimedia Commons, to make it interoperable with querying Wikidata. Is that in any way on your radar, or are you working on a completely different task?

Thanks very much, and lovely to see a face from the Open Knowledge Network here. I'll start with your first question, about the application. As Sashi mentioned in the presentation, we have been developing this over the past year. It is now in a beta version, but we are hoping to launch a 1.0 version soon, probably around mid-September. So work is already happening there, and we are building on the experience we had with another frictionless tool we used in the past, called goodtables, which was also a user interface that people without coding expertise could use to validate their data. The idea is to broaden what goodtables was doing, and to do it differently as well, because we had some problems with goodtables: basically, to bring all the frictionless functionality into the non-coder application. That's for your first question. Sashi, Carol, if you want to add something, this is the moment to jump in.

Right now we are in the process of developing this tool; it's not ready yet. If that answers your question: we are at the stage where we are consulting with the community to find out the best way to proceed, including learning more about how people use OpenRefine and whether they would like a new tool or not, so your feedback is really valuable. And hello to the Open Knowledge Network!

So, my second question was about the difference between what you're doing with Frictionless Data and linked data. In my understanding, before actually ingesting your frictionless data into Wikidata, you have to turn it into linked data, going through a whole mapping and matching process: you have to map against what is already in Wikidata. Is that part of your pipeline process?

Thanks very much for this question. I'll let Sashi answer, because she's the most skilled here, but I just wanted to say it's something that comes up a lot in our community; linked open data is a very important part of data pipelines. We don't cover that yet, but we are currently extending the frictionless standards, and we would like this to become part of the Frictionless Data standard. But let Sashi expand on this.

Yeah, on top of what Sara said: we haven't yet decided to work on the linked data part while integrating with Wikidata. For now, we are just trying to integrate Frictionless Data with the Wikidata API, so that we can push the data with one button from the frictionless application after it has been processed there. That's what we are currently working on. We do not fully support linked data, but, as Sara said, the discussion is ongoing and we're working on it. Okay, thank you.
Yeah, I think that answers the question, and I think that's still an area of challenge, because it's not just about automatically pushing; it's about mapping and matching before pushing. And my third question is addressed both to you and maybe also to the audience in the room: time series data, statistical data, population data, voting data are extensively used in the context of certain Wikipedias, but we are actually lacking a proper solution within our movement. There have been proposals to support tabular data more extensively, but I don't think we have made much progress in that area since COVID, I would say.

Oh, we can't hear you any more. Yeah, we lost your mic. The mic was dead for a while. So there was just one question: what is the way forward in this area, tabular data within the wiki community?

Well, I think that's maybe more a question for the audience there. I can say that tabular data is at the core of frictionless; that's pretty much how it started. Sashi, do you want to add anything on this? Or maybe someone in the audience would like to jump in on this question.

What I would like to add is that we could discuss this internally with the technical team, and then see how we could include that. So, what would you want to see from outside?

Hi, I'm Niccolò, a user of Italian Wikipedia, Commons, and other projects. I believe we should start using tabular data on Commons more and more. It's already possible to host it on Commons, but nobody is using it, mainly because it's complicated: you don't even know that you can upload it, and it's complicated to use on Wikipedia, of course. We could start using that namespace on Commons the way we use it for images. Why should we have the same data on Italian Wikipedia, English Wikipedia, and so on? I suppose we should work on that, and of course we need good data too. But technically it's already possible.

Hi, I'm Gergő, I'm a Wikimedia developer. I wanted to add that, as you might have seen, Wikipedia has an extension called Graph, which is used to display, among other things, tabular data, and which has been taken down due to security issues. The Foundation is currently trying to figure out what to do about that. So I think this is a good time to get in touch with the people involved; the product manager of the editing team is the one thinking about this. It might be a good time to talk to him about what you think about the future of tabular data, because the future of the Graph extension is somewhat intertwined with it.

Hi, my name is Ginoy. I was actually updating tabular data on Commons for India's COVID cases for the last two years, until the extension was taken down. Every two weeks I updated the information on Commons, and I used the Graph extension on Wikipedia articles; it was showing on English Wikipedia and Malayalam Wikipedia for the last two years. But because of the security issues, the Graph extension has been taken down, so I've now stopped updating, because there was no use any more. We could add so much information like this, election information for example, on Wikipedia through Commons.

Yeah, so this is a great example of a usage that Peter Pelberg, whom Gergő mentioned, probably isn't aware of, unless he has somehow encountered this work.
So it would be great if you could do a write-up, or if you already have one, some blog post or something that showcases this work: what it looked like when it did work, why it's important, what articles it was displayed in, and so on. If you could write that up, or prepare a three-slide story about it, and send it to Peter Pelberg, Gergő or I can help you reach him. Is he here, Gergő? Is he here at Wikimania? I don't know. That would help, right? Because when the Foundation is trying to think about this — and obviously this is not trivial to fix; if it were trivial, maybe it would already have been fixed — there is some issue there that makes it a real question: do we invest in this? Will there be someone to maintain it after we invest in it? And one of the ways the Foundation makes such decisions is by asking how much of an impact it is really going to have. Is this extension used on a thousand pages or on a hundred thousand pages? That makes a difference, because everything the Foundation does always comes at the expense of other things it could be doing, so someone has to make that decision. One of the things that can help is stories like that: yes, it's used much more than you might think, and it's used in this way as well, not just for the graphs you can easily imagine, but for this other use case too. That's the kind of information that is sometimes difficult to see from the Foundation's headquarters, or that is obscured by assumptions we make about what is typical, what is likely, what uses we have personally encountered. That can really be augmented and influenced by community input. So again, I'm drawing two strong, bold underlines under what Gergő said. Oh, sorry, I didn't introduce myself: I'm Asaf, from the Foundation. Any other comments or thoughts from the room? If not, then back to the speakers; I think we have one more slide left.

Yes, thanks very much; thanks everyone, that was a very insightful conversation, and if you ever want us to participate in it, we would love to take part. We are very expert when it comes to tabular data, so if you need help, you know on which door to knock. But again, here is the link to the Etherpad: if you want to answer the questions there, I would gladly hear what problems you encounter with data. And thank you for joining this conversation with us today. There are also links to the Frictionless Data website; I really encourage you to go and have a look at the project website to see the whole universe that frictionless has to offer. We have a community chat on Slack, which is also accessible via a Matrix bridge if you prefer to use an open protocol. We have a Twitter account, and we also have a general Open Knowledge Foundation newsletter that I would really encourage you to subscribe to. Thank you very much; it was a pleasure to be with you today, even if only online. Again, if you're interested in the discussion around the open definition, stay with us in this room; we will be back in 10-15 minutes. But yeah, thanks again for everything. It was a pleasure to be here with you today. Thank you.

Thank you very much to the speakers, and thank you to the audience for a great discussion. We're going to start the next session in the next 15 minutes, which is on updating the open definition to meet the challenges of today.
And we'll have some of the folks from the Open Knowledge Foundation back, and we have folks in the room as well who will be helping. It will be a workshop, so you'll have some groups and so on. So please do come back; we'll start the workshop at six, in about 14 minutes. Thank you very much.