Kipaka Vigi for hosting this and for keeping us all awake. So probably it's not wrong to say good morning, everyone. OK, so all of this has been announced as a discussion, so there's probably no point in me talking to you for something like 55 minutes straight. I would just like to give you a couple of slides on what we could discuss and then see where we want to go with this.
OK, so to start off: who of you considers him- or herself to be a scientist? OK. Who has the pleasure to work within the European scientific system? OK. And within the German one? OK. And as a negative control: who knows what the capital of North Dakota is? OK, so there's no rigor mortis in your arms.
OK, so the topic today is free software for open science. And as I am somewhat associated with the Free Software Foundation Europe, we should probably start with the definitions. So number one, what do we consider to be free software? Here, it's pretty much every piece of software that is released under either an FSF- or OSI-compliant license. This is what most people also know as open source. And the main point here, as the FSF and OSI definitions pretty much standardize the same things and just have different ways to say it, is that the license guarantees the four freedoms to the user: to use, to study, to improve, and to share the piece of software. And of course, this does require the existence and openness of the source code and the ability to actually create derivatives. And I think for everyone who's been working in science, it's pretty clear that those four core freedoms are very well aligned with what we're trying to do in science. We're trying to build on the work of others, to get humanity along, and to increase our overall knowledge. So what we're doing there is exactly exercising those four freedoms, just not necessarily in a digital or code-based manner.
OK, so that's the first thing. Then, what actually is open science? First of all, open science is a class A buzzword. Nevertheless, the European Commission took the liberty to set up a committee, in this case the OSPP, the Open Science Policy Platform. And those people produced a lot of bits of paper. What they defined is eight key areas. Sometimes they are called ambitions, sometimes priorities. These are the key things that need to be addressed in the midterm to move European science to what they consider to be open science. And this is not only, and that's very important, about the classical things that you might know, like open access and open data.
Open access and open data are basically incorporated in here. So, scholarly communication: it says future of scholarly communication, which can be everything from open access to just going digital. However, we should all be aware that the European Commission has now endorsed Plan S, which is a rather far-reaching push towards a rather radical program in terms of publishing requirements. So we can consider that this part on scholarly communication is really meant to be open access. And then the other thing: open data is what is called FAIR data here, because the Commission typically tries to avoid the term open, because open, of course, is not FAIR, and FAIR, unfortunately, is not open. But this is where we lead our discussions. So this means that we only have two of the classical open science points in here.
Everything else is things like incentives. So this is how we can generate better citation, or rather how we can make sure that the people who do the work get the credit. So we might need some reform in how we do citations. Then indicators is... was that me, or was that...? OK. So indicators is a way to try to overcome the simple citation indices and, of course, especially the impact factor.
EOSC, for those of you who have not heard that term, is a very large project. That's the European Open Science Cloud. It's still rather ill-defined what it should be. It's getting better along the way, but the term has been out there for three years. In the end, what this is about is to really create a large federated European infrastructure for scientific data. The main funding for that will come from the national states. So, for example, the German implementation is called NFDI, the National Research Data Infrastructure, and will be heavily funded with nearly 1 billion euros over the next 10 years. So this is the scale that we are talking about. Integrity means how to assure integrity, skills is how to train the next generation of scientists, and CS is the abbreviation for citizen science. So with all of this, you see that open science is not just trying to tick boxes. What they're really trying to push for is a rather fundamental change in the way we do our work, towards really becoming a more egalitarian, more open, and more participatory system.
OK, so now the question is: what is the role that free software can play in this? And one of the things that we need to define here is whether we are talking about free software for open science, which is what this talk was announced as. But of course, there is also a general interest in talking about free software in open science, or in science in general. The distinction would be that with the "for open science" part we are mainly talking about software as a research product. So this is mainly domain-focused software that is created by the scientists themselves. And here we then have, of course, issues like how to sustain it, how to ensure quality, and how to choose proper licensing or licensing models for it, while the "in science" part is more generally talking about generic software tools.
So this is operating systems, office suites, and so on, that are just used by scientists more generally. In both cases, the main way free software can contribute to the scientific endeavor is, of course, by promoting reproducibility, because everyone can use these tools. There is no paywall in that case, so you don't need to purchase a given Microsoft Office version to recreate an Excel table or something like this. And of course, there is also the attempt to reduce black boxing. The other thing, which is more specific to free software for open science, is the general thing that we already said: somehow the ideas of free software align well with what you're trying to do in science. But more importantly, the question right now is: does it fit the policies under which we are operating?
And of course, the main policy that most people know is FAIR. FAIR stands for Findable, Accessible, Interoperable, and Reusable. It's a kind of paradigm that was published in 2016 and was in the making for a couple of years before that. And this is something that was primarily geared towards data. The nice thing about FAIR is that the 2016 paper also operationalizes this. So they give criteria on what you need to do or ensure so that, for example, a data set is findable, what it means and how it needs to be accessible, and so on and so forth. And of course, reusable also says something about, well, you need to put a license on it. But otherwise, it's not that specific.
OK, now importantly, stuff that is FAIR does not necessarily align with free software, because free software means that there are basically no restrictions on use, while reusability in FAIR simply says people somehow need to be able to reuse it. So there needs to be a clear pathway, but that can still be a proprietary license. And that license might still not allow you to do everything with it. There just needs to be this ability.
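The difference between "has a clear license" and "is free software" can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the `Record` fields, the three simplified checks, the tiny hard-coded license set, and the placeholder identifiers are hypothetical and are not taken from the FAIR paper or from any funder's tooling.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a tiny hard-coded sample of licenses accepted as free/open source.
# A real check would consult the FSF and OSI license lists instead.
FREE_LICENSES = {"GPL-3.0-or-later", "MIT", "Apache-2.0"}


@dataclass
class Record:
    """Hypothetical metadata record for a piece of research software."""
    name: str
    license_id: Optional[str]   # license identifier, or None if unlicensed
    identifier: Optional[str]   # persistent identifier such as a DOI, if any
    retrievable: bool           # can it be fetched through an open protocol?


def is_free_software(record: Record) -> bool:
    """True if the record carries one of the sample free licenses."""
    return record.license_id in FREE_LICENSES


def passes_basic_fair_checks(record: Record) -> bool:
    """Very rough stand-in for three FAIR aspects (not the full criteria)."""
    findable = record.identifier is not None
    accessible = record.retrievable
    reusable = record.license_id is not None  # any clear license counts here
    return findable and accessible and reusable


# Placeholder identifiers, not real DOIs.
free_tool = Record("free-tool", "MIT", "doi:10.0000/example.1", True)
proprietary_tool = Record(
    "proprietary-tool", "LicenseRef-Proprietary", "doi:10.0000/example.2", True
)

for record in (free_tool, proprietary_tool):
    print(record.name, passes_basic_fair_checks(record), is_free_software(record))
```

In this sketch both records pass the simplified FAIR checks, but only the first one counts as free software, because the reusability check only asks whether a license is stated, not which freedoms it grants.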
So that's one of the main things where FAIR does not fit the free software definitions. On the other hand, of course, free software doesn't say anything about... oh no, I killed the alpaca. OK, I'm probably going to be kicked off the stage any minute. OK, sorry. All right, so on the other hand, I can write beautiful code, put it under an open source license, put it on a USB stick, and bury it somewhere in my garden. Then it's neither findable nor accessible. So this is where the classical definitions of free software don't necessarily match these two criteria, which nevertheless do make sense for software as well.
Finally, one last thing is that FAIR defines a product. It says the outcome of your research needs to comply with certain criteria. And that's, of course, a relatively easy thing to test. What it does not do, and maybe from a software development perspective this is more important, is define a process, how we do things. And this is one of the things that one of the German committees, the RfII, has recently started to criticize about FAIR. They say, OK, FAIR data just says this. But you can have completely rubbish data, and it can still be FAIR. What we want to have is high-quality FAIR data. So FAIR clearly is some kind of minimal consensus. It's a conditio sine qua non, but we probably need to extend it at this point. And of course, we can also discuss how we want to align this with free software.
OK, so that's more or less the brief introduction. Now, there are a couple of things that we can discuss further, depending on your interest: the current European policies in a brief overview, the current German policies, or generic free software tools. But maybe that's the point where you could say something to get us going a bit.
I think it's working.
You mentioned that the current software standards might not be in line with the policies. What exactly were you referring to?
Can you repeat this?
You mentioned before that the current software procedures or standards might not be in line with the policies in the European Union. What exactly did you mean by that?
So the thing is that I can comply with the OSI regulations for open source software, but none of our funding bodies says you need to be OSI compliant. What they typically say is that you should do stuff that is FAIR. But right now, one of the issues, and this is basically what this slide says, is the question of whether any of the policymakers really define code as a primary research object. And that's right now not the case. So everyone assumes that code behaves like data. And to equate code with data is something where some people get cold shivers. Others don't, because it's an operation that you can do. It's a lossy operation, but it might help us in some ways. The main point here is that code has some idiosyncrasies that make it distinct from data, and this is where our policies break. On the other hand, some of the policies that we came up with from the free software perspective, not for research but in general, didn't make it into the policy documents and are therefore not incorporated there. So the FAIR criteria and the other ones don't completely overlap. Most people might write code, but it still wouldn't align with the FAIR criteria if you took them one to one.
So, a question about the topic raised at the start, the licensing. Say we have a commercial company like Microsoft that develops an office package. When you say free software for open science, it would be better to invest the money not into license costs, which are recurring, but, for a bigger entity like a country, to invest more in open code or open programs. Is this tackled by what you mean with FAIR or open source?
This is one of the things that is not necessarily... so you could construct it in a way that it actually overlaps with FAIR, because you're talking about reproducibility. FAIR doesn't say reproducibility, but it says accessibility. And if you're using formats that are proprietary, you could say, OK, well, this is not accessible to everyone because you need to pay for it. Now, the thing is that there are a lot of things that you have to pay for. So this was one of the things that was never on the agenda to be eradicated. The generic software part is just something that came into this whole process later. Initially it was really geared towards how scientists can make sure that the software they produce is both free software and contributes to open science, and what we need to do to create potentially additional funding opportunities for it. Because this is where it typically breaks: I can write better code if I have more man or woman power, if I have people who curate, if I have people who fix issues, and so on and so forth. Right now that is not considered part of the research process by the policymakers, but in reality it already has become that. Now, if you're saying, well, you're using generic office suites, then yes, we are investing a lot in these things in tertiary education and in the research sector. And, personal opinion, yes, we should spend this on things that don't nudge people towards proprietary solutions. But that's something that has a stronger education component, also for student education. So I wanted to bring it up here because I thought, OK, maybe it's something that more people here are interested in. But I agree that it doesn't strongly overlap with the open science part, right?
OK, I've heard some people work on FAIR principles specifically for software.
Have you heard about it, and do you know what the differences are?
Yes, so thanks for this input. So let me check. OK, I missed that one. So yeah, there's a recent paper that just came out a couple of weeks ago by Anna-Lena Lamprecht. She's from the Netherlands eScience Center. What they try to do is to take the catalog of the original FAIR criteria and check for each one: does it apply to software, yes or no? And then they amend them to make sure that they better fit the process. So they, for example, say there needs to be some kind of documented quality control. They're talking more, of course, about software repositories. They then include versioning, which is one of the huge things that sets code apart from data, which, once it's released, is typically a rather static object. So they're trying to get somewhere, and I think it's a good document to start with. But in my personal opinion, it wasn't bold enough. You might have been there, I mean, we had this discussion at the RSE19 conference also, where Anna-Lena was there as well. It tries to stick very closely to FAIR because they assume that this is what people know, which I think is good. On the other hand, there's a very clear recommendation from most bodies that FAIR should not be extended. So, as they say, we don't need additional letters for FAIR. And they really want to keep this basically as one concept that sticks with data. So I think it would have been necessary to take a bolder step and try to work in all the established development policies that we already have, rather than to stick as closely as possible to FAIR and just change the nitty-gritty details, which is what they did. But nevertheless, I think it's something that's clearly worth reading.
Thanks a lot for your talk. This resonated a lot with me.
And as someone working in research infrastructure, I think it's super important that we focus on recognizing research infrastructure. So all kinds of services like sustainable data storage for researchers, tools that help make data discoverable, and things like that should be considered a public good, right? And next to what you mentioned, and rightly so, with Microsoft, the other risk that I currently see is that legacy publishers like Elsevier, Springer Nature, and so on try to capture the whole market. So they are all trying to deliver on all the needs that researchers have in the digital area with huge platforms. And this is a battle that we have almost lost already, as it seems. There are many interesting, very good, free and open source alternatives to what they deliver. But it's really not recognized very well why this is so important. This is my impression.
Yeah, I mean, I would second that. I think it's interesting to see the large publishing companies now really moving away from their traditional business, because apparently they have recognized that they might be on a losing path there, and instead offering wholesale data management solutions to institutes. I mean, this is probably just an anecdote, but apparently Elsevier made an offer to, I think, the Dutch government where they said, OK, we do all of your data management, basically you get everything for free, but each and every institution has to deliver, and we become your central data deposition platform. Which, well, unfortunately might appeal to some politicians. I think it doesn't appeal to anyone else in this room, given that Elsevier is probably a company that is even more hated than Microsoft. For reasons completely unknown; I mean, they just make a revenue of 35% every year. So maybe we should just buy stock options.
Hello, thank you for your talk.
What I don't completely understand is why we use the FAIR concept as a point of reference at all, because I feel like the concept of open access in science is far more applicable to code. In the end, code is text, and it's part of the scientific publication system. We have references from and to code and such things. And the concept of open access has the same ancestors as the scientific publication system, with the Mertonian norms of science and such. So why don't we treat code like a scientific publication?
OK, honestly, I'm relatively open to this idea, because this is, I mean, the reason why we're having this discussion. What I'm presenting to you now is mainly developed out of the existing EU policies. And the EU talks about FAIR a lot because, for them, it's an operationalized thing. It's something that they would like to test. In the end, it's something that they would like to score, and so on and so forth, so that paper pushers have something to work with. But I agree that we can simply say, well, in the end, openness is more important, and FAIR, as we already said, isn't open. So open access would maybe be the better point to hook this up to. So, yeah, I agree on that.