So I'd like to say thank you to her afterwards, but thanks a lot to our local hosts. And yeah, enjoy the day, that's all I have to say. Thank you.

Thank you, Nadja, for giving us the context of this workshop. Now let's go to our keynote. Today's keynote speaker is Dr. Erik Mannens. He is a research unit leader of the Future Media and Imaging department at iMinds, a Ghent University department. He has managed many, many projects to date. He gained his PhD in computer science engineering, and his major expertise is centered around metadata modelling (very important for us), semantic web technologies, broadcasting workflows, iDTV, and web development in general. He is also co-chair of the W3C Media Fragments Working Group and actively participates in other W3C semantic web standardization activities. He also recently co-founded the Belgian chapter of the Open Knowledge Foundation. So Erik, please, I give the floor to you.

Thank you. All right, good morning everybody. Just to set the scene, I have four slides about myself. The first one is the web: it's about 22 years old, and I have worked with the web for 19 years. Now, I have known my wife for 15 years, so I have a longer relationship with the web than with my wife. And every day I spend more time on the web than with my wife. For the rest, there's nothing similar. Or rather, for the rest they are quite similar, right? They're complex, high-maintenance, and I cannot live a single day without the web. Now, the second thing you have to know about me is that I adhere to standards. We all have to adhere to standards, and so does the web. Without standards on the web, there would be little or no interoperability, right? And I adhere to open standards. That's why I actively participate in W3C and not MPEG, right? I really want open standards. And the last one, that's my research.
I'm very much into semantics, primarily the W3C stack, so everything that has to do with the semantic web. Hence this one: if my computer were a dog, my computer would understand garbage collection. That's a difficult one. Anyone? Okay, I can tell my wife there were no nerds in the room; I was the only one that got the one about garbage collection. Inge asked me to do a keynote on open science. So I started out thinking. By the way, it's my first keynote, so I do not know how I'm doing, but fine, I will enjoy it. It's maybe my last keynote, right? The first thing that came to my mind about open science is open data. Only half a decade ago, primarily in government institutions, they wondered why there were all kinds of silos of data out there that we had already paid for, and that were not reusable because they were not open. So Tim Berners-Lee was the first to say that we should stop hugging our data and really make it open. Now a side note: there will be a couple of times that you see that slide again. I learned something from the social scientists: whatever important thing I say sticks better if you are actively doing something. So whenever you see that slide, you hug your neighbor. Not this time, the next time. So Tim Berners-Lee said we should make it open. All data should be open, so we can reuse it for whatever purpose and gain extra knowledge. So much for open. And then there's the semantic web, my area of expertise, which was primarily used within the research area: we can infer new knowledge out of existing knowledge, and machines can do it for us. Now, the problem with both the open data area and the semantic web area is that there was not one killer use case, until the linked open data paradigm came along. With it, for both open data and the semantic web, we have a killer application, namely linked open data.
I was there when the linked open data movement was founded back in 2008 at the WWW conference in Beijing, and at that first LDOW workshop there were only, let's say, 25 linked datasets. This is the image from 2010, and since then they just stopped making that image, because by now it would be far, far too big: there is already an immense linked open data cloud out there in different areas, in library stuff too, but in all sorts of sciences. So we now have a paradigm to connect our silos: use URIs, use HTTP URIs so they can be looked up, provide extra knowledge via RDF, and then see that you interlink. These are the four simple principles of linked open data. My humble two cents on open data: thou shalt know your data, librarians. What you should do is make all the scientists aware that, before they hand their data over to you, they should understand all their datasets: how they were collected, generated, and so on. Of course, over 80% of their time is spent cleaning the data, so all the things they can do, you do not have to do anymore. And most importantly, also in the linked open data movement and beyond, as we will see later on with the read-write web, you have to make sure that you get all the missing metadata. Here provenance and versioning are very, very important, because as a scientist, how else could you redo an experiment? So this is information you need to disclose too. In the end, hopefully in the not so distant future, data should be a commodity, right? Just as the Ford Model T in the early 1900s became the commodity car for everybody, instead of a faster horse. So the first thing: you should embrace open data. I'm going to do that, and you too, every time. One: embrace open data. Now, the second thing I came across when thinking about open science is open software, right? What if all people were equal? What if all scientists had the same tools to use?
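The four principles just mentioned can be sketched in a few lines of Python, using plain tuples as triples. All URIs below are made-up examples for illustration, not real datasets.

```python
# Sketch of the four linked open data principles with plain Python
# tuples as (subject, predicate, object) triples. URIs are examples.

# 1. Use URIs as names for things.
# 2. Use HTTP URIs so people can look them up.
BOOK = "http://example.org/library/book/42"

# 3. When someone looks up a URI, provide useful information (RDF).
triples = [
    (BOOK, "http://purl.org/dc/terms/title", "Weaving the Web"),
    (BOOK, "http://purl.org/dc/terms/creator", "http://example.org/person/tbl"),
]

# 4. Include links to other URIs, so people can discover more things.
triples.append(
    (BOOK, "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Weaving_the_Web")
)

def describe(subject, graph):
    """Return all (predicate, object) pairs known about a subject."""
    return [(p, o) for s, p, o in graph if s == subject]

print(len(describe(BOOK, triples)))  # three facts about the book
```

The last triple is what turns an isolated silo into linked data: it points out of the local dataset into somebody else's.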
What if a domain expert who wants to work with some tool could augment it and make it better? Then he could give it back to the community so other researchers can use it. So I, for one, and the rest of my team are real believers in open source. A small side note, and this is the technical part of my presentation: let me briefly talk about two open source projects that my team is working on. The first one is The DataTank, together with the Library of Ghent's Kathmandu project. There will be a workshop later this week that two of my people, together with Patrick from Ghent University, will lead. So what is The DataTank about? In fact, it's a 15-minute open data publishing framework. It's already used quite extensively in Flanders; a few cities, the Polis, and a few other organizations use it. There we publish according to this five-star linked open data scheme of Tim Berners-Lee: we start from two stars and can go up to five stars. The bottom line is that we end up with a RESTful API for developers. So whatever comes in, whether it's CSV, XLS, or JSON, we can put out whatever you want; we even have a semantic version of it. But the bottom line is 15 minutes to publish open data, and you get a RESTful API so you can easily reuse the data in your applications. Now the next step, what we are working on right now and what we published just two weeks ago at the last WWW conference to great response, is R&Wbase. You spell it R-and-W-base, or you read it as "raw base", but it's a read-and-write base. It's a kind of git for triples, because the next thing in linked data is the read-write web. For the moment you only read stuff and link it, but we want to have a complete read-write web. Now, before you can write, you have to think about ownership, provenance, and versioning. Are the current triple stores up to it? I don't think so, so we came up with a solution.
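To make the "whatever comes in, a RESTful API comes out" idea concrete, here is a minimal sketch of the CSV-to-JSON part of such a pipeline. This is not The DataTank's actual code, just an illustration of the transformation it automates.

```python
import csv
import io
import json

def csv_resource_to_json(csv_text):
    """Turn a CSV payload into the list-of-records shape a RESTful
    API would serve as JSON: one dict per row, keyed by the header."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)

payload = "city,stars\nGhent,5\nAntwerp,4\n"
print(csv_resource_to_json(payload))
# [{"city": "Ghent", "stars": "5"}, {"city": "Antwerp", "stars": "4"}]
```

In a real publishing framework the same records could also be serialized as XML or RDF; the point is that the publisher uploads a file once and developers get a machine-readable API.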
We have a completely distributed triple version control system, like git: whenever we commit something, it is stored as a delta, but it is described as a virtual graph; this virtual graph identifies a version, and a version resolves to a delta. It's as simple as that. Of course, we also want to handle a huge amount of triples, and we want it streamed, with a lightweight algorithm to do so. This is a slide that I recovered from one of my peers; all the other images I have really relate to the material. Now I see three children, and I have to ask why it relates to triples or lightweight algorithms. What are they looking at? Okay, I don't know. Anyway, we store the triples as quads: you have normal triples as subject, predicate, object, and the versioning we store as the context. Very simple. All the additions get even numbers: zero, two, four. And if we delete something, we give it an odd number: one, three, five. That's, simply said, what's behind it. So in the end, R&Wbase provides version control for triple stores with direct provenance, and provenance, for you, is very important. So again, my humble two cents on this one, open source: thou shalt provide tooling and make the scientists aware that there is probably a lot out there. You have to consider security and privacy, and most important, having worked with a lot of organizations: even if you use open source, see that you have an IT partner that knows how to handle it within your organization. In the end: hug open source too. Well, we had open data and open source; what about open research?
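The even/odd delta scheme just described can be sketched in Python. The function names and store layout here are illustrative, not R&Wbase's actual API: additions are stored as quads with an even context number, deletions with an odd one, and materialising a version means replaying every delta up to it.

```python
# Sketch of quad-based versioning: the fourth element (context) encodes
# the delta. Adds get even numbers (2*version), deletes odd (2*version+1).

def commit(store, version, added=(), deleted=()):
    """Record one delta as quads in the store."""
    for t in added:
        store.append((*t, 2 * version))
    for t in deleted:
        store.append((*t, 2 * version + 1))

def materialise(store, version):
    """Resolve the set of triples visible at a given version by
    replaying all deltas from version 0 up to `version`."""
    triples = set()
    for v in range(version + 1):
        for s, p, o, c in store:
            if c == 2 * v:
                triples.add((s, p, o))
            elif c == 2 * v + 1:
                triples.discard((s, p, o))
    return triples

store = []
commit(store, 0, added=[("ex:erik", "ex:worksAt", "ex:iminds")])
commit(store, 1, added=[("ex:erik", "ex:likes", "ex:openData")],
       deleted=[("ex:erik", "ex:worksAt", "ex:iminds")])
print(materialise(store, 0))  # {('ex:erik', 'ex:worksAt', 'ex:iminds')}
print(materialise(store, 1))  # {('ex:erik', 'ex:likes', 'ex:openData')}
```

Because every delta is kept, no version is ever lost: provenance and history come for free from the context numbers.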
So, I worked in industry for 15 years and then I came to iMinds, where I worked in the engineering department, but from the first day I also worked with other scientists, like the social scientists of the gamma sciences. What I directly saw is that we should not stick within our own vertical. I'm a data researcher, so I'm probably best at statistics, but there's a data creator from the social sciences who is probably better at analyzing big data or doing maths. We become smarter if we learn from the other scientists. So for me, if you do science, it should be interdisciplinary and multidisciplinary by default. Now, as an example, we have the Large Hadron Collider at CERN. I just took an image here; it looks like a large hadron collider. I don't know if it is, but let's assume it is large and you can probably collide something in it. The Large Hadron Collider at CERN produces a vast amount of data, millions or trillions of particle collisions, all producing data, and they were not able to analyze it themselves. So they formed a consortium, the Open Science Grid: wherever there's a supercomputer somewhere in the world, it can get some data and process it. Now what happened? Some supercomputer in San Diego, without even knowing, found the Higgs boson. They just analyzed stuff, and then all of a sudden the Higgs boson is this red arrow. So let's just find the red arrow. It's simple, right? Not rocket science. So they found it. Now my humble two cents on open research, in two slides. Prepare to hug. Thou shalt be the catalyst of all the different sorts of science out there, the alpha, beta, and gamma sciences. If they publish in the right way, then you librarians are the catalyst for getting all this data out there. Maybe then it will be easier for scientists to get out of their comfort zone and try to work more across disciplines. So cherish open research. Hug only me. All right: open access. That's probably the most important one for you.
I only have one slide, and then it's already a hug again. I only have one slide on open access. It's my personal view: tear down that paywall. Because all research, primarily in universities, is already paid for. So why do we have to pay extra to Springer or whoever? So I'm all for open access. But not only the papers: also the accompanying data, also the algorithms, also a plan of how the algorithms were used to gather the data and arrive at everything written down in the papers, so somebody else can easily redo your work and see if it was correct. So my humble two cents on open access: embrace open access by default. Now, what I just heard yesterday (I was still preparing this slide set last night) is that before you, as a library, can open up both the PDFs and the data itself and all accompanying metadata, it seems you have to have an ingest metadata workflow in place; otherwise you have to deal with all the inconsistencies yourself. So at least have the authors and organizations on the front page of your PDFs rigidly formalized, and formalize them as early as possible, at the source, so you don't have to bother with it anymore. But love open access. So we had open data, open software, open research, and open access; for me, that's open science, but only as seen from within my own domain. If we link all these different open sciences, then we get linked open science, and that, for me, is open knowledge. Now, two extras. Once we have open knowledge, there should be a feedback channel so we can go to open learning. I have three little examples of it. What about the massive open online courses of MIT and Stanford, millions of open courses? Now, what if, one way or another, we could also relate those to everything you librarians put out in the open? For the moment that's not happening, but what if? And the second thing: I heard about this guy, Luis von Ahn, from Guatemala.
He got rich by selling his CAPTCHA software to Google, and then he did nothing for a while and thought: what can I do for humanity? And he came up with Duolingo. What is it? It's a completely free tool to learn languages, but as you learn the language, the exercises you get are parts of OCR'ed books where they really do not know what the text means. It's kind of a Wikipedia: it's completely free, but while you learn, you translate other books automatically, and of course behind the scenes there will be some algorithm to check that if, say, five people already translated it the same way, okay, this is the real translation. It's open and free, and at the same time it adds more knowledge openly back to the community. For me that's unbelievable. This is how it should be. I do not know how this next slide came about, but every time I give a presentation, this slide pops in. It's my team's neatest demo of what we can do with linked open data. We will put it out in the open, and if you try it at home later tonight, it's fun: you can log in with your Facebook account and then you will see something really interesting. So we will learn. It's not open yet, but you will learn. My second extra is about open ranking. We have all these university rankings. Are they objective? I do not know. Every year I get a document from Thomson that I have to fill in: who do you think is the best researcher in your area? What is the best team you work with? That's not objective. So I met a guy from Sydney University, and he wanted to see if he could make a ranking based on the linked open data cloud as it is right now. So he took some features from the linked open data cloud and said: well, okay, let's see what ranking I can come up with if I take this feature list.
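The crowd-agreement check described for Duolingo can be sketched very simply. This is a toy version under the stated assumption (accept a translation once some threshold of learners agree), not Duolingo's actual algorithm.

```python
from collections import Counter

def accepted_translation(submissions, threshold=5):
    """Return the most common submitted translation once at least
    `threshold` learners agree on it; otherwise None (keep waiting)."""
    if not submissions:
        return None
    text, count = Counter(submissions).most_common(1)[0]
    return text if count >= threshold else None

subs = ["the open web"] * 5 + ["an open web"] * 2
print(accepted_translation(subs))  # the open web
```

The same majority-vote pattern underlies many crowdsourcing pipelines: individual answers are noisy, but agreement across independent contributors is a strong quality signal.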
So: a doctoral student who is the author of a publication, what was a notable work, what is the alma mater, who is an employee of which institution. He made a kind of weighting system, and there were also in-links and out-links, so he knew where he could go further down the line. Now, he came up with this. His ranking is the pick score, and next to it you see the major ones, like QS, and you see that, well, it's not the same, but it might be a better ranking in the end, a more objective ranking than what we have so far. And this is just research; he will put it out openly and then probably adjust the weighting factors. There's no Belgian university in the top 100 of this open ranking. I could end with this bombshell: do something, get more open data out there, so we climb up there very quickly. But I will not end there. So we have open data, open software, open research, open access: open science. Is that good? This is better, right? We should not talk about open this and open that; it should be the default. Data, software, research, access: that's science. Everything should be open by default. So: open, period. All right, about time. Okay, I'm done. Yeah, we kept to the schedule, thank you very, very much. Do you have questions?