Yeah, thanks a lot to the Wikipaka WG for hosting this and for keeping us all awake. So it's probably not wrong to say good morning, everyone. Okay, all of this has been announced as a discussion, so there's probably no point in me talking to you for something like 55 minutes straight. I would just like to give you a couple of slides on what we could discuss and then see where we want to go with this, okay?
So to start off: who of you considers him- or herself to be a scientist? Okay. Who has the pleasure to work within the European scientific system? Okay. And within the German one? Okay. And as a negative control, who knows what the capital of North Dakota is? Okay, so there's no rigor mortis in your arms.
Okay, so the topic today is free software for open science, and as I am somewhat associated with the Free Software Foundation Europe, we should probably start with the definitions. So number one, what do we consider to be free software? Here, it's pretty much every piece of software that is released under an either FSF- or OSI-compliant license. This is what most people also know as open source. The main point here is that the FSF and OSI definitions pretty much standardize the same things; they just have different ways to say it. The license has to guarantee the four freedoms to the user: to use, to study, to improve and to share the piece of software. And of course this requires the existence and openness of the source code and the ability to actually create derivatives. And I think for everyone who's been working in science, it's pretty clear that those four core freedoms are very well aligned with what we're trying to do in science. We're trying to build on the work of others, to get humanity along and to increase our overall knowledge.
So for that reason, what we're doing there is exactly exercising those four freedoms, just not necessarily in a digital or code-based manner.
Okay, so that's the first thing. Then, what actually is open science? First of all, open science is a class A buzzword. Nevertheless, the European Commission took the liberty of setting up a committee, in that case the OSPP, the Open Science Policy Platform, and those people produced a lot of bits of paper. What they defined is eight key areas. Sometimes they are called ambitions, sometimes priorities; they are the key things that need to be addressed in the midterm to move European science to what they consider to be open science. And, this is very important, this is not only about the classical things that you might know, like open access and open data. Open access and open data are basically incorporated in here. Open access sits under scholarly communication. It says "future of scholarly communication", which can be everything from open access to just going digital. However, we should all be aware that the European Commission has now endorsed Plan S, which is a rather far-reaching push towards a more or less radical program in terms of publishing requirements. So we can assume that this part on scholarly communication is really meant to be open access. And then open data is what is called FAIR data here, because the Commission typically tries to avoid the term open, because openness of course is not FAIR and FAIR, unfortunately, is not open. But this is where we lead our discussions. So this means that we only have two of the classical open science points in here. Everything else is things like incentives: how can we generate better citation, or how can we make sure that the people who do the work get the credit? So we might need some reform in how we do citations. Then indicators is... was that me or was that?
Okay, so indicators is kind of a way to try to overcome the simple citation indices and, of course, especially the impact factor. EOSC, for those of you who have not heard that term, is a very large project: the European Open Science Cloud. It's still rather ill-defined what it should be. It's getting better along the way, but the term has been out there for three years. In the end, this is about creating a large federated European infrastructure for scientific data. The main funding for it will come from the nation states. So, for example, the German implementation is called NFDI, the National Research Data Infrastructure, and it will be heavily funded, with nearly 1 billion euros over the next 10 years. So this is the scale that we are talking about. Integrity means how to assure integrity, skills is how to train the next generation of scientists, and CS is the abbreviation for citizen science. So with all of this you see that open science is not just trying to do tick marks. What they're really trying to push for is a rather fundamental change in the way we do our work, towards a more egalitarian, more open and more participatory system.
So now the question is, what is the role that free software can play in this? One of the things that we need to define here is whether we are talking about free software for open science, which is what this talk was announced for. But of course it's also of general interest to talk about free software in open science, or in science in general. The distinction would be that with "for open science" we're mainly talking about software as a research product. So this is mainly domain-focused software that is created by the scientists themselves. And here we then of course have issues like how to sustain it, how to ensure quality, and how to choose proper licensing or licensing models for it.
The "in science" part, on the other hand, is more generally about generic software tools, so operating systems, office suites and so on, that are just used by scientists more generally. In both cases, the main way free software can contribute to the scientific endeavor is of course by promoting reproducibility, because everyone can use these tools. There is no paywall in that case, so you don't need to purchase a given Microsoft Office version to recreate an Excel table or something like this. And of course there is also the attempt to reduce black boxing. The other thing, which is more specific to free software for open science, is what we already said: somehow the ideas of free software align well with what you're trying to do in science. But more importantly, the question right now is, does it fit the policies under which we're operating? And of course the main policy that most people know is FAIR. FAIR stands for Findable, Accessible, Interoperable and Reusable. It's a kind of paradigm that was published in 2016 and was in the making for a couple of years before that. And this is something that was primarily geared towards data. The nice thing about FAIR is that the 2016 paper also operationalizes it. So they give criteria on what you need to do to ensure that, for example, a data set is findable, what it means for it to be accessible, and so on and so forth. And of course reusable also says something about, well, you need to put a license on it, but otherwise it's not that specific. Okay, now importantly, stuff that is FAIR does not necessarily align with free software, because free software means that there are basically no restrictions on use, while reusability in FAIR simply says people somehow need to be able to reuse it. So there needs to be a clear pathway, but that can still be a proprietary license, and that license might still not allow you to do everything with it.
There just needs to be this ability. So that's one of the main points where FAIR does not fit the free software definitions. On the other hand, of course, free software doesn't say anything about... oh no, I killed the alpaca. Okay, I'm probably going to be kicked off the stage any minute. Okay, sorry. All right, so on the other hand, I can write beautiful code, put it under an open source license, put it on a USB stick and bury it somewhere in my garden. Then it's neither findable nor accessible. So the classical definitions of free software don't necessarily cover these two criteria, which nevertheless make sense for software as well. Finally, one last thing is that FAIR defines a product. It says the outcome of your research needs to comply with certain criteria. And that's of course a relatively easy thing to test. What it does not do, and from a software development perspective this is maybe more important, is define a process for how we do things. And this is one of the things that one of the German committees, the RfII, has recently started to criticize about FAIR. They say, okay, FAIR data just says this, but you can have completely rubbish data and it can still be FAIR, and what we want to have is high-quality FAIR data. So FAIR clearly is some kind of minimal consensus. It's a conditio sine qua non, but we probably need to extend it at this point. And of course with this we can also discuss how we want to continue and how we want to align this with free software.
Okay, so that's more or less the brief introduction. Now there are a couple of things that we can discuss further, depending on your interest. That would basically be: what about the current European policies, as a brief overview; what about the current German policies; and what about generic free software tools.
But maybe that's the point where you could say something to get us going a bit. I think it's working out.
You mentioned that the current software standards might not be in line with the policies. What exactly were you referring to?
Can you repeat this?
You mentioned before that the current software procedures or standards might not be in line with the policies in the European Union. What exactly did you mean by that?
So the thing is that I can comply with OSI regulations for open source software, but none of our funding bodies says you need to be OSI compliant. What they typically say is that you should do stuff that is FAIR. But right now one of the issues, and this is basically what the slide says, is the question whether any of the policy makers really define code as a primary research object. And right now that's not the case. So everyone assumes that code behaves like data. And equating code with data is something that gives some people cold shivers. Others don't mind, because it's an operation that you can do. It's a lossy operation, but it might help us in some ways. And the main point here is that code has some idiosyncrasies that make it distinct from data, and this is where our policies break. On the other hand, some of the policies that we came up with from the free software perspective, not for research but in general, didn't make it into the policy documents and so are not incorporated there. So the FAIR criteria and the other ones don't completely overlap. Most people might write code, but it still wouldn't align with the FAIR criteria if you took them one to one.
So a question about the topic at the start, the licensing.
So say we have a commercial company like Microsoft that develops an office package. When you say free software for open science, it would be better not to invest the money in license costs that are recurring, but for a bigger entity like a country to invest more in open code or open programs. Is this kind of tackled by what you mean with FAIR or open source?
This is one of the things that is not necessarily... So you could construct it in a way that it actually overlaps with FAIR, because you're talking about reproducibility. FAIR doesn't say reproducibility, but it says accessibility, and if you're using formats that are proprietary, you could say, okay, this is not accessible to everyone because you need to pay for it. Now the thing is that there are a lot of things that you have to pay for, so eradicating this was never on the agenda. The generic software part is just something that came into this whole process later. Initially it was really geared towards how scientists can make sure that the software they produce is both free software and contributes to open science, and what we need to do to create potentially additional funding opportunities. Because this is where it typically breaks: I can write better code if I have more man or woman power, if I have people who curate, if I have people who fix issues and so on and so forth, which right now is not considered part of the research process by the policy makers, but in reality it already has become that.
Now if you're saying, well, you're using generic office suites for that, then yes, we are investing a lot in these things in tertiary education and in the research sector. And, personal opinion, yes, we should spend this on things that don't nudge people towards proprietary solutions. But that's something that has a stronger education component, also for student education. I wanted to bring it up here because I thought maybe it's something that more people here are interested in, but I agree that it doesn't strongly overlap with the open science part.
I've heard some people are working on FAIR principles specifically for software. Have you heard about it, and do you know what the differences are?
Yes, so thanks for this input. So let me check. Okay, I missed that one. So yeah, there's a recent paper that just came out a couple of weeks ago by Anna-Lena Lamprecht; she's from the Netherlands eScience Center. What they try to do is take the catalogue of the original FAIR criteria and check for each of them, does it apply to software, yes or no, and then amend them to make sure that they better fit the process. So they for example say there needs to be some kind of documented quality control. They're of course talking more about software repositories. They then include versioning, which is one of the huge things that sets code apart from data, which once it's released is typically a rather static object. So they're trying to get somewhere, and I think it's a good document to start with, but in my personal opinion it wasn't bold enough. It might have been... I mean, we had this discussion at the RSE 19 conference, where Anna-Lena also was there. It tries to stick very closely to FAIR because they assume that this is what people know, which I think is good. On the other hand, there's a very clear recommendation from most bodies that FAIR should not
be extended. As they say, we don't need additional letters for FAIR, and they really want to have this basically as one concept that sticks with data. So therefore I think a bolder step would have been necessary, to try to work in all the established development policies that we already have, rather than just to stick as close as possible to FAIR and then change the nitty-gritty details, which is what they did. But nevertheless I think it's something that is clearly worth reading.
Thanks a lot for your talk. This resonated a lot with me. As someone working in research infrastructure, I think it's super important that we focus on recognizing research infrastructure, so all kinds of services like sustainable data storage for researchers, tools that help make data discoverable and things like that, and that this should be considered a public good. And so next to what you mentioned, and rightly so, with Microsoft, the other risk that I currently see is that legacy publishers like Elsevier, Springer Nature and so on try to capture the whole market. So they are all trying to deliver on all the needs that researchers have in the digital area with huge platforms, and this is like a battle that we have almost lost already, as it seems. There are many interesting, very good free and open source alternatives to what they deliver, but it's really not recognized very well why this is so important. This is my impression.
Yeah, I mean, I would second that. I think it is interesting to see the large publishing companies now really moving away from the traditional business, because apparently they have recognized that they might be on a losing path there, and moving to offer wholesale data management solutions to institutes. I mean, this is probably just an anecdote, but apparently Elsevier made an offer to, I think, the Dutch government, where they said, okay, we do all of your data management, basically you get everything for free, but each and every institution
has to deliver, and we become your central data deposition platform. Which, well, unfortunately might appeal to some politicians. I think it doesn't appeal to anyone else in this room, given that Elsevier is probably a company that is even more hated than Microsoft, for reasons completely unknown. I mean, they just make a revenue of 35% every year, so maybe we should just buy stock options.
What I don't completely understand is why we use the FAIR concept as a point of reference at all, because I feel like the concept of open access in science is far more applicable to code. In the end, code is text, and it's part of the scientific publication system. We have references from and to code and such things. So the concept of open access has the same answers, I think: the scientific publication system with the norms of science and such. So why not treat code like scientific publications?
I'm relatively open to this idea. The reason why we're having this discussion is that what I'm presenting to you now was mainly developed out of the existing EU policies, and the EU talks about FAIR a lot because for them it's an operationalized thing. It's something that they would like to test in the end, something that they would like to score and so on and so forth, so that paper pushers have something to do. But I agree that we can simply say that in the end openness is more important, and FAIR, as we already said, isn't open. So open access would maybe be the better point to hook this up to. So yeah, I agree on that.