What are we doing today? We're talking about open source, because this is the open source conference, as applied to services: services that we oftentimes end up building or operating in a non-open-source way. So we're going to talk about how to change that. I'm going to talk about some of the theory: what principles we need to apply and how we might go about it. Michele is going to talk about the actual use cases: how it is happening in certain services, how open source is being used, where the gaps are, and so on. Kind of a scorecard analysis. So we have two back-to-back slots, and that's what we're going to cover in this session. So first I want to talk about what the hell a service is. I mean, we are all very familiar with this idea that some software is a service, but what does it actually mean? Someone asked this good question. It was Eric Helms, like, wait a second, tell me specifically: why is this software a service? So we figured out, and I think it's obvious once you think about it, that services are software that you run for someone else, rather than asking that person to run or operate the software themselves. So when you run software for yourself on your own laptop, that's not a service, but when you run it for someone else, when you're operating it for someone else, that's what we consider a service. And why is this interesting? Well, it turns out that operating software is hard, especially as it gets more and more complicated, and many people don't want to do it. They would like to have someone else, like this guy here, run the software for them. In fact, I think that's a fundamental principle that we have to remember: if we force people to run the software themselves, even if it's a service, we've lost the fundamental reason that they came and are using the service in the first place. If they wanted to operate it, they would have chosen software that they can run themselves.
And secondly, one of the big advantages of services is that you can bring a lot more benefits, a lot more value to the user of the service than just what is in the source code or what is in the actual application, the actual binary: the fact that it can be connected really easily with other services, the fact that you can do the operations for it, the fact that in the background you can scale it depending on its workload, the fact that you can have multiple users interact with each other in that service really easily. There are so many different advantages. Of course, some of the value of a service is in the source code, absolutely. But it's a far smaller percentage than in software that you run yourself. Let's imagine, on your laptop, when you're running software, the percentage of the interesting parts, the value that's in the source code and is providing you with some benefit, is very high. On a service, it's a much smaller percentage. There are many other things that provide the value as well. And fundamentally, this means that it is harder to take a service and make a second copy of it, because you not only have to make a second copy of the software, but you would also have to make a second copy of all of these things: the way it authenticates with other services, the way it's operated, which users are using it, the network effect, and so on. And for any interesting service, that becomes prohibitive to pull off. So those are the fundamental concepts around services that I wanted to share. So let's go into open source. We're all familiar with this about open source: we have people who come, we share the source code, and the ideal is that someone comes by and gives a patch, or checks it out and makes a change. We have a community, a community builds. There's interest around the software that we run. And this is a fundamental function: open source converts users into contributors.
A small fraction of the users, depending, of course, on the project: some convert most of the users, some convert a very small amount. But it does do that. The problem is that open source starves when you inhibit that function of converting users to contributors. So here's an example of an amazing community, Postgres. I don't know if any of you have, I haven't contributed in a while, but when I did contribute, I was super impressed with how they accepted me as a contributor, looked at my patch, actually worked on it, reviewed it, brought it through their cycle. I felt like suddenly I was part of the team. I was blown away by how that community works. We also have Amazon running Postgres for different users. And of course, there are tons of benefits there. There's high availability, there are backups. You don't have to operate it yourself. There's a lot less complexity, and so on. But it is not possible, if you're using Postgres in that way, as Amazon RDS, to figure out what it is doing, introspect the code, actually change something, make a contribution, and become part of the community. And so over time, if all the software is run in that way, we start to starve for contributors in open source. And this is a problem that I'd like to contribute to solving. You can see this diagram here. With open source software, we slowly whittle down the number of participants each time we take a step here from the top down. So the first time you might notice, okay, if the software is broken, I should look at it. Maybe it needs a change. And then the user figures out, oh, I can contribute. Then, okay, let's figure out what code I can change. What's going on here? Maybe I'll put in a printf line. How do I run this thing that I just changed? Oh, my change broke. How do I fix it? How can other people use my change? How can I contribute it to the project? And so on. And each time, there are fewer and fewer people that stay with it.
And that's why only a small fraction of the users become contributors. But with software as a service, it's way harder, and we lose pretty much everyone right away today. So how do we change this? If we regress from this open source development model that we depend on and take for granted (we take for granted that we can go and look at the code that's running and see what happens when we pass a different flag to an API, for example), then we lose the basic practices that we use to our advantage. So because services don't have a distribution where you copy the service to everyone's computer, licenses, that is to say copyright licenses, are insufficient to power open source in this context. They're necessary but insufficient, because a service is not necessarily copied multiple times, like we talked about earlier. There's oftentimes just one viable place for the service to run, or a couple. It's very difficult to copy all of those capabilities because it's not all in the source code. It's in the interconnections, it's in the users, it's in their data, it's in how the service is used. And so we can't rely completely on open source licenses to be the thing that powers open source in services. So we were talking about this with some of you and in the Operate First community. This is a community that has a goal of operating services together, not just in one company, but together with different companies, different communities, different participants. And we came up with two self-evident minimum open source requirements for services. And I'd like to share these with you. It took a lot of discussion back and forth on how this would work and what would come about. And I'd be interested in your contributions, your participation, or your feedback on these. But it's pretty basic and simple. Yeah? Okay, I'll definitely explain that. So the question is: why is this self-evident? What does that mean?
And I'll get back to that. So there are two principles, two requirements. The first one is that all of the service's code and assets necessary to operate the service are shared under an open source license and publicly accessible. It's pretty obvious. It's not just the source code, it's what is necessary to operate it, everything that's part of the service. And the second one is that a public contributor can use the same workflow as a typical team member to make a change to the service. These are really simple things. So the question is: why are they self-evident? We discovered this ourselves. We didn't start with the idea that we were gonna make some things self-evident. The reason I believe that they're self-evident is that by doing these two things on a service, you enable someone outside of the team working on the service, a public contributor, to take the open source aspect further: to fork, to build a community, to implement the ability to run it somewhere else on a different infrastructure if necessary, to operate it in a different way. These are the core capabilities: when a service has them, these two requirements, you enable others to progress it to fuller open source capabilities, a fuller open source methodology. So this is just the minimum bar that is necessary to enable that. Yeah, the minimum bar. This is not about the best practices in open source. We can all think about how we've experienced open source in much better ways than these minimum requirements. But these are the minimum requirements that enable others to take the project or service and pursue those better open source practices. So again: all of the service's code and assets necessary to operate the service are shared under an open source license and publicly accessible. And a public contributor can use the same workflow as a typical team member to make a change to the service.
There's of course a link down there to the full non-slide version of this with a lot more details in it. And I'll share that link at the end of the talk. So this leaves a lot of questions, of course. Like, wait, these are very, very broad requirements. They could be applied in different ways. What about X, Y, Z? So in that link, if you follow it, you'll notice that there is a FAQ with some basic questions and answers of, wait a second, what does this actually mean for this aspect or the other? I'm gonna go through some simple slide versions of those answers. Of course, they're much more fully described in the document, and as with all of this, it is open for contributions, for pull requests directly, and the link to do pull requests will be shared at the end of this talk.
So the first question: what is a service? We already talked about this. A service is software operated for the user. Software the user is operating for themselves is not considered a service. Pretty obvious. Moving on.
What assets (we talk about assets here) should be open source? And the answer is: make everything open by default, except where law, security, privacy or common sense says otherwise. It's not required that dependencies of the service, whether deployment dependencies or other services that it connects to, are themselves open source for the service to be open source. Could you speak up? So the question is: when we talk about assets, do we mean the hardware or other things that the software is running on? And no, we're using assets here in the sense that many developers use for different pieces, not just code, but the operational scripts, the images, the API definitions and other such assets. Right. Right. So the word is confusing because there's an initiative for open hardware, so one can think of assets in those terms. That's a good point.
Then the question: should tests be shared with the code?
Well, tests that are used during the operation of the service, maybe monitoring metrics, availability tests and so on, absolutely are part of the service, and according to requirement one should be open. And secondly, tests run when making changes to the service during the contribution workflow, the CI tests, all of that must be open source in order to meet these requirements. It's of course recommended that further tests are open source, that's always a good practice. But for further testing that the two requirements don't apply to, whether it's performance testing or intrusion testing or all sorts of other things, it's up to the service whether to apply open source there. So the minimum requirements just require that those tests that are used in those two requirements are open.
This is a common question that comes up: should anyone be able to trigger CI or test suites? Well, no, this is not required. Most of the projects that you run, if they have CI, have a small set of core contributors who are able to trigger the CI or let CI run on someone else's pull request. The same principle applies here.
Must a contribution workflow be documented? Well, documentation is typically necessary in order for a team member or someone to come up to speed and make changes to the service. So if that's required, then yes, we should share that documentation about how to contribute.
Must it be possible for anyone to operate the service themselves, to make another instance of it? Well, if a typical team member takes the source code or the service and runs it themselves in order to make a change to it, then yes, you should share that ability with folks in the public. But if that's not necessary, and there are many services where the team members are not launching a whole instance of the service in order to change it, then it's not necessary to make sure that everyone can operate it.
These requirements serve to make sure that someone else can add that ability if necessary: to run it somewhere else on a different infrastructure, or to run multiple copies of it, or to change it in some way. These are the minimum requirements that enable such a thing, but it's not necessary to meet that up front.
So must all the dependencies for a service be open source too? Well, no, this is not required. A service can be open source without all of its infrastructure, its dependencies, the hardware that it runs on, the APIs that it uses or other services that it interacts with being open source as well. But these requirements again enable a public contributor to abstract out things that are proprietary, to change them, to replace them with open source alternatives, and that is the goal here. So by meeting this minimum bar, you enable others to take this aspect further.
And so must there be a community for the service? It is amazing when a community forms around a project, but the reality is that most of our open source projects do not have a community around them, and many of my projects, your projects, have no more than a single contributor. So it is great when that happens, and we want to enable that for a service, but it's not necessary in order to apply the open source minimum requirements.
And so here are two links that go into the requirements, and of course to opening a pull request against these to change things, to have a discussion about, wait a second, here's an aspect that's not covered, or here's something that makes this not be that self-evident minimum set of requirements that we should meet in order to have open source applied to our services. And I would encourage anyone who wants to participate to participate in all sorts of ways on this. Michele's gonna talk about services that are applying this, how different services meet these requirements or don't meet them, where the gaps are, and specific applications of this. So how much time do we have?
Six minutes. So we'll have questions now and then questions after Michele talks about more of the concrete stuff.
Yep, that's a very good question and one that we should have discussion around. The place I would start is with the current licenses. The choice of license for the code already makes a prescriptive choice for the service about the libraries that are included. The question, I'm sorry for not repeating it, is: what if we include libraries in the service? Where do we draw the line between what we require to be open source and what we don't? And so I think we start with the licenses first. That is the first requirement. We choose a license that reflects the behavior that the service would like: whether everything that is launched with the service is also required to be open source, as for example with the GPL, or not. But the second thing is that things that are outside of this, I believe, are things that are deploying the service, things that are monitoring the service, things that are obviously outside of the service itself. Those things should be available to contributors so that they can contribute. But I believe the minimum requirements don't require that those things are open source up front. If we can replace them with open source alternatives, I think we have a real win there. And over time, we build more open source services. But to meet these minimum requirements, I don't think we have to be that contagious with everything that the service touches.
Next question? Yeah? Right. Right, yeah. So the question is: are we required to take contributions that we don't like, from a business perspective or for any other reason, into the service itself? And the answer is no, the same principle applies as in your other open source projects. These minimum requirements serve to enable a contributor who you don't agree with to fork your service. It's difficult, like we said, to make a second copy of the service or to duplicate it.
But these enable that behavior, just like forking an open source project. So that same open source dynamic applies here. We either use whatever mechanisms to come to an agreement, or we have branches or some other mechanism to involve other people in the project and the service itself, or it enables forking, so that we have that drastic escape hatch for people who really don't agree, and they can kind of do an evolutionary move of survival of the fittest and see who wins, right?
Next question, yep. So we've spoken about this at previous DevConfs. I mean, it was the remote DevConf in 2021, I think it was. I would love to see a world where we enable contributors to contribute to a service without having to operate it themselves, where they see their change running against their data. I believe that would be an amazing place to be. And I'm very excited by that kind of outcome. But that is far beyond these minimum requirements. So I shared this to kind of share a bit of the why. Why am I concerned about services with regard to open source? Why do they currently seem to be diverging and not reconciled? And I believe we have a lot of cool things we can do with this. We need to go far beyond these minimum requirements, but I think they get us started and enable the rest to be discovered. Cool, so I'm gonna hand over to Michael.
So, welcome to the second part. Steph did basically the motivational, management, make-it-happen part of it. I'm going to be the engineering complement: actually making those things happen, and how to make those things happen. I mean, the reasoning behind why we would want those things is very sound. It's basically the same reasoning as behind open source software itself. But it's not as evident to an engineering type of person how this can actually be implemented.
And so I think the discussion about minimum requirements might obscure that this is actually more of a journey than an end goal. A service can be partially open source. So it is an incremental thing, and that's what engineers do, right? We don't mind too much working on the same project for a couple of years. We know that it's not ideal, that's why we continue to improve it, and the same goes for the open sourcing of something like this. We're going to talk about the engineering side, but there's another aspect that needs to be taken into account, and that's the social aspect. If we are talking about incremental improvements, for an engineer that's kind of like changing code, changing deployments, changing whatever. But there are also people's attitudes involved, there are people's fears involved, and they might actually need changing as well to get to this place of open source services, and that's not included in the talk. So that's maybe another talk for another DevConf. So we thought about how you would actually strategically go about this whole thing, getting this marching order of "open source your service". And we thought that it would be interesting to take this general idea and decompose it into something that's actually actionable. So we tried to come up with this scorecard of how open source your service is. Kind of like a card game, right? You can say, my service is more open source than yours. And it should allow us to actually identify gaps, like the ones where we would lose, which pieces are not where they should be. And applying this scorecard to different services might allow us to pick up ideas from other projects, because they're all open source, right? So if somebody solves one of those aspects in a better way, then it actually might be something that we should adopt in one way or another. So I will put down the scope again.
So this is about services, but a lot of this is actually also applicable to other open source projects. Here it's just limited to services, where software is run for users. Now those users might not just be strangers on the internet. If you are a public service, yes, that's true, but if you are in a company environment you might have internal customers who are really angry if your service is down, and that might be a place where you would invite contributions as well to get stuff fixed. So if you could pick their brains, or if they could file a merge request to fix the issues that they see, you would already gain quite a bit. So thinking about community and what this might mean for whatever project you have might make sense. And this is a list of stuff. This is me at night, brainstorming after we had the talk settled, coming out of the questions and answers in the previous talk. So it was about assets like the code: everything that is in the Git repository, basically. The workflow: can people actually contribute in an open way? How do you deploy this service? That's the secret sauce of most services. How does Amazon deploy Postgres into their infrastructure? I don't know, nobody knows, I suppose, but it would be kind of cool. And it would be required to kind of hack on RDS. But it's also related to the communication style. Is this actually a transparent process? How hard is it actually to talk to people? Is there documentation available? Do they have procedures for running it? If something fails, what do they do? I mean, if you want to run the service, that's something you would also need. And if you have to develop this from scratch, that's basically not possible to do. Are issues tracked in a transparent way, or is it this secret stuff where you see issues and then 95% of the comments are private and you don't even see them? So you don't see that there's stuff happening on them.
Or you don't see those issues at all. And then, can you actually see people operating the service? Is this something where you can observe and learn how it's done so that you can improve on it? So I could go into the details, but maybe that's not the best. So we'll just skip directly to some examples. We are going to take two services and then try to figure out how they are doing. Luckily it's not Santa doing the evaluation. So we're going to look at two projects. One is the one that Veronica talked about in the previous talk, the continuous-integration-as-a-service project for kernel developers, which is the CKI project. It's an internal project. It's aimed at internal developers. That's the business reason for it, but there's an extended business reason: it tries to prevent bugs from getting into the upstream kernel in the first place, because, I mean, we are a Red Hat team. We want to prevent those bugs from actually hitting the Red Hat kernel, and we would like to get stuff fixed before it actually happens. And the other project we are going to look at is GitLab itself. GitLab is this other Git forge. I don't know whether people have experience with it, but it's not GitHub. They have a different business model. They are selling you local instances, and there is a community edition that you can install. It's an open core business, but they also have a managed instance, which is managed in a really interesting, very transparent way, and they consider themselves the open company. And the bottom line contains how we score this, right? We nowadays use emojis for scoring, right? We don't do plus one and minus one. So we use emojis for this one. And I'll just look at Veronica all the time as I talk about CKI, whether she agrees or not. So regarding open assets, CKI is pretty okay. They have all their source code on GitLab.com, but it took them 18 months to get there.
So this was a mixed model, stuff in the open, stuff internal, and especially moving stuff from the internal space and consolidating it into the public space is harder than it sounds most of the time, because people tend to get messy if nobody is looking, right? You mix internal information into it, and that makes it really hard to actually open it up. But this happened. Dependencies are open source as well, but then we test on RHEL, and RHEL is private. So this is something where external people may have problems, for example reproducing bugs on RHEL. And it would be kind of nice to have external tests available, but this doesn't really apply to CKI because there are no penetration tests or something like that running against it. For other services it might be applicable. So okay, that's a check mark. This is hosted on GitLab.com, and GitLab.com is actually pretty good at enabling a workflow for internal and external contributors. So it's the same workflow for everybody, obviously, because it's a small project. There's no documentation available on how to do this properly, but it's merge requests, so it's not hard to basically fork it and open a pull request. And there are bots coming into it and you can talk to them, and if you say please they will be in color or something, but there's something to trigger, for example, integration tests that is mostly self-evident, hopefully. Most of those features are gated by permissions, and I think it's very similar to what people would do on GitHub. So you try to limit the exposure of your internal infrastructure but still try to make it possible for people to post it. A pretty important aspect is that people feel welcome, so you need to be nice to them, but that also means that you're actually going to look at their contributions if those contributions come in. And for CKI that's pretty okay.
Just looking at the statistics, most merge requests are merged within a couple of hours, and this is the same for external and internal ones. So let's go to some uglier aspects of the whole thing. How do you deploy kernel testing as a service externally? Yeah, you just don't. I think there's one lab where people do this, but because it was never required, it's really hard to do. It's really annoying to set up. I don't think most team members would be able to do it. It's this thing that exists. It's GitOpsified, but still, it was never necessary. It's a huge barrier to entry to actually get this thing running. There's no way to run it on your laptop, while there might not actually be a good reason that this is that hard. You can run a GitLab instance on your laptop, as we will see. And the infrastructure repository is also not available. Why? Because it mixes secrets and internal information. It's this kind of mess that somebody would need to detangle, and nobody can be bothered. It's just thankless work because you will just break something. We have good documentation, not on how to set up the service, but on how to operate the service, because that's what we are doing. There are some company-internal guidelines that can't be opened up, but other than that everything is visible. And the operating procedures are actually accessible too. It's possible to do this: even if there's internal stuff in there, you can actually tease those things apart, and that is something that is done. But as a company-internal project we don't have a code of conduct. Now if you talk about communicating, it becomes interesting to discuss who you want to communicate with. So if you're talking about communication with upstream, this is one of the goals of the CKI project, and it's actually pretty hard to do: it's a mailing list and that's basically it. You can't look at the source as it's made or the source address it's made. It's very transparent process.
But if you're talking about internal kernel developers, which is a really important audience, also for contributions, we have a channel for the CKI project that you can jump into, with pretty friendly folks. So that's a pretty okay experience, but then there are easy gains, like having a meeting where people can just jump in and ask questions, like in-person video meetings. We are a remote team, so. And then an interesting aspect is that next to official issues and the team channel, a lot of discussion happens in those documents that people just create. I mean a Google document, a Markdown file or something. And that's also something that can actually be solved, and it would lower the barrier to entry, but it's not something that's done properly. All the issue tracking is on GitLab, so that's fine. Even though it's an internal team and most external contributors will never look at it, it still shows how it works. And then yeah we just skip it because it's the very entrance parameter. I'm not going over the list, but how CKI is run is not visible from the outside. So this is a huge gap that would prevent people from actually running it, because they can't see how it's done. They can't observe. So we have two parts: there's a good part, there's a bad part, and we are not talking about the ugly part, there's no ugly part. So let's look at something else, which is the GitLab project. They consider themselves the open source company, or the open company. They have everything on GitLab itself, that's their main product. They are their first user. They are most likely the heaviest user. Everything that they need to run is open source as well, because you can just spin it up on your laptop. Whether there's anything missing from the public view, you can't figure out from the public view. So we don't know whether there's anything secret for testing, for example, but let's say for open assets, it's there. We have people contributing from Red Hat to GitLab.
So it's this contributor workflow: Red Hat people are annoyed, it doesn't work, and they come in and fix it. Lukas is actually a kernel developer, basically contributing Ruby code, working on those issues that annoy us as users, that GitLab doesn't care enough about, but we care, and we have a developer basically fixing those issues, and they accept them. They have a really good workflow. They have coaches to move those things along. So they spend a huge effort and it works. It's something that is open for contribution in the best way. You can just go through their documentation on how they do deployment. You can go through various possibilities of how they do deployment. There are some things that are missing, but it's an engineer's paradise if you want to go in and figure out how they do it. You can just spend weeks and weeks reading code and documentation. And so it's just, yeah, it's available, and nothing bad happens. There's internal information in those operating procedures, for example, and they just accept it. It's something that they accept as the price for the openness. They have something interesting from the engineering, from the management perspective. They have their management handbook online. So you can look at how they actually want people to work, and it's a very interesting one. It tells how people should communicate, that they should be open, that it should happen in a public venue. There might be internal aspects to it, but it's not something that seems to be missing. And yeah, obviously issues are also tracked. And you can read about their strategy. So if you're wondering as a customer where GitLab is going, whether it is moving in the direction you care about, you can actually look at those documents. So there's no management secret sauce either.
Right, this is not something that you would normally require from an open source service, but it's very interesting for a customer to know where the company's going and whether this actually matches your use case. And you can look at them while they run it. Especially if it fails, it's beautiful. It fails, your monitoring goes off, you're using it and our monitoring goes off, and you go in there and you can look at them freaking out or not freaking out. I mean, these are people that don't freak out. And you can watch how they triage issues in real time. And it's highly interesting for learning, if you want to do it yourself, for example. So they are doing some really, really beautiful things. And some of those things, there's mostly behind some of those points. So sometimes even they have a limit where they say, okay, we are not going to share this. But let's recap that, because I'm an engineer. I'm trying to run a service as an open source service. And having this scorecard was actually quite eye-opening, pointing it at one of our services to find out where the gaps are and what kind of things we would need to fix to actually move it to the minimal requirements, or to move it further. And it shows that it's more of a journey than really reaching this bar and then basically being done with it. It's never done. And I think stuff has this tendency to regress in places, where people keep stuff internal and don't share certain information. It's never finished. We heard this sentence before about Linux, and running an open source service is also something that needs to be reinforced once in a while. And it would be interesting to continue this, because just having a scorecard is one part, but most likely you could actually make something like a playbook out of it, where, if somebody comes and says, I want to open my service, you could actually tell them the different steps that they would need to follow to get somewhere.
So yeah, the earlier you start, the better it is. So that's basically the thing. And so if somebody has a service that they want to evaluate in the following couple of minutes, I have empty slides with all those points, so if somebody is up for it, we can evaluate the openness of the service within a couple of minutes. Otherwise, that's it. I don't dare to admit it, but this was like me at night coming up with slides. But I think it is quite, yeah, you could actually, but then you would need to do it properly. This was like a wrong person. Let's let people review it, the scorecard. But then Steph didn't really look funny while I presented it, so most likely it's not too bad. But I think it would need to be done properly to actually be something to move into policy. Oh yeah, and the question was, is this public? This is basically what there is. So there's not more than this.