So, Operate First: how to open source... We can see you, Marcel. I joined late. Sorry. Great. Everything looks good. And I can see you. That's awesome. And now I will leave. Bye-bye. Okay. So, Operate First: how to open source cloud operations. That's a pretty big theme, and it takes some time to understand what I'm talking about, because what we're trying to accomplish here has multiple facets. It took more than a year to articulate it in our teams. So I'm trying to go slow, and I'm trying to focus on just one or two of the aspects. And don't worry if it doesn't click immediately, because it requires some out-of-the-box thinking. Now, if I could just click here. So, that's me in the corporate world. I'm a senior manager in the AI Center of Excellence in the CTO office at Red Hat. I'm working remotely out of Germany, and I'm looking into Red Hat's AI strategy. For three years I've been working on the topic of AIOps, and now, with Operate First, also on the topic of operations. On the internet, obviously, you have to be a little bit cooler. So that's my tag name, durandom, and I call myself an old-school open source hacker and daemon and zombie slayer. And you old guys, or the young guys too, because I don't think it's a matter of age but of how close you are to the processes, you know what daemons and zombies are. So that's where we also need to go back to, right? It all started back in the days with Unix, when we had these old mainframes, where you basically bought a room full of silicon, full of machines, and what you got handed was a green-screen monitor with a huge manual to program the machine. But you could only program the machine. You could not actually change the machine, or change the hardware, or change the operating system.
So, that changed when communities were founded around Linux, around open source software, and these were founded because people wanted to tinker with the machines, wanted to tinker with the operating system, wanted to improve it and add features to it. So that's the typical user-to-contributor funnel that we have in open source software. Typically you have a user, or 100 users, that love your software, that love your operating system. And then they start having issues with your software. So out of these 100 users, you have 10 users that are not content or that experience a bug, and they open an issue with you. And then maybe out of these 10 users, you have one user actually fixing this issue, because he cannot wait until the issue is fixed or he has a real interest in doing this. And then he becomes a contributor. So out of 100, maybe you get 10 people reporting back and one person becoming a contributor. And that funnel is basically like a sales funnel. You want to have as large a community as possible, so that everybody can use your stuff, so that eventually you're standing on the shoulders of your community, and it's not just on you to support this stuff anymore, but you can do it as a community. And it's not just about reporting bugs; it's obviously also about driving the feature set of that software and shaping the future of that software. And although it started at universities and with private people doing stuff in their spare time, nowadays most open source software is actually being driven by people who are paid for it. So we changed the default model for writing software and for collaboration to the open source way. That's a matter of fact. Recently, something changed. In the age of cloud, I think, and a lot of other folks also think so, it's suddenly more important to operate software than the actual software itself. Databases are ubiquitous.
You can go to your hyperscaler, to your cloud provider, and order compute, order a queuing service. So a lot of people are thinking in services, are thinking in commoditized software, and somebody has to make sure that the software is up and running. This is a tweet from a cloud open source executive at AWS, and he posed the question: what actually happens if you open source everything? He came up with this example of the company Yugabyte, which had a software product and open sourced it, and it was great, because they realized that the software doesn't really have value if people cannot use it. So, are they really open sourcing everything? Although the whole software stack of the database that they have is open and free for inspection, the platform where it runs is not open. So in the cloud world, in the as-a-service world, "everything" is everything but the service itself. So how do I get this contributor funnel for something running as a service, for the operations? Software as a service is, if you want, by definition some software that you offer as a service, and the added value is your operations. And that contributor funnel has essentially dried out, because yes, you can say you have a problem, and you can meet on Stack Overflow to figure out how to work around it, and maybe you can even open an issue with the provider of that service, but you cannot contribute something back in terms of operational code. So if the value in IT is in operations and operations are proprietary, then open source has a problem. And all operations are proprietary by default, because you have private data in your logs, in your metrics, in your configuration. I don't know of any corporation, of any cloud, that is giving people on the internet, or even researchers, access to its production cloud setups or its production environments. And Operate First is something intended to fix that.
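As a rough illustration of the funnel argument, here is a minimal Python sketch. The 100, 10 and 1 figures come from the talk; the `Funnel` class, its field names, and the zero contribution rate used for a proprietary service are assumptions made purely for illustration, not measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Funnel:
    """Illustrative user-to-contributor funnel; all rates are assumptions."""

    users: int
    report_rate: float      # share of users who open an issue
    contribute_rate: float  # share of reporters who go on to submit a fix

    def reporters(self) -> int:
        return round(self.users * self.report_rate)

    def contributors(self) -> int:
        return round(self.reporters() * self.contribute_rate)


# Open source software: 100 users -> 10 reporters -> 1 contributor.
software = Funnel(users=100, report_rate=0.10, contribute_rate=0.10)

# Proprietary operations: users can still report issues, but there is no
# operational code they could contribute to, so the last stage is modeled as zero.
service = Funnel(users=100, report_rate=0.10, contribute_rate=0.0)

for name, funnel in (("software", software), ("service", service)):
    print(
        f"{name}: {funnel.users} users, "
        f"{funnel.reporters()} reporters, "
        f"{funnel.contributors()} contributors"
    )
```

The only point of the sketch is that growing the user base does not help if the last stage of the funnel is closed: with a contribution rate of zero, no number of users produces a contributor.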
So, Operate First is a concept to incorporate operational experience, meaning how you operate that software, into the software development cycle itself, by extending development to include operating, testing and providing code in the production environment. In essence that means: shift left, and work with your developers and your ops guys to make the software itself more operable, so that it becomes easier to operate, and thereby encapsulate that operational excellence and ship it with the software itself to the users. Ideally, Operate First becomes a partner to upstream first as a basic tenet of our workflow. Upstream first is the workflow that Red Hat operates by, but a lot of other companies operate by it as well. It basically means that every single line of code should make it into the upstream project, so that we don't have a gap between the upstream code and the productized version. You don't want to have a fork of something, because that increases the maintenance burden and has a lot of bad side effects, and obviously the bigger side effect is that you're not sharing stuff with others, so you deviate from your upstream community, and at some point you're essentially not the same project as upstream anymore. But still the question remains: how do we do something like this with something that is not software, that is not a line of code, but is encapsulated in operational experience, in the minds of the people, in data, right?
So let's look back at what open source did to free software from the chains of proprietary enslavement, basically: how did we turn users into contributors? It meant that we had read-only access to all the data; it meant that we had read-only access to all the source code. So we have to have something similar for operations. That means having read access to all the config files, to all the metrics, logs and all the stuff being produced and accumulated while running such an environment. It also means that we must be super inclusive in terms of onboarding, not only for your customers, the projects that will come to you as a cloud provider, but also in terms of contributors. And contributors start at different levels of experience and with different ideas of what they want to do. So you have these power users that want to dig right into the core of something, and you have these beginners that just want to get practice and get some learning experience with the software itself, or with how to operate software, right?
You remember that maybe from some projects that have nice-to-start labels on issues, or getting-started issues, or where you have community architects, community people, selecting stuff to grow that community, so that in the end you go from reading something, reading source code, reading issues, to reporting issues and then resolving issues. So we want to create a community that is inclusive to all personas. It's inclusive to the people that operate the cloud. It's inclusive to those who develop workloads on the cloud or that develop the cloud stack itself, meaning Kubernetes these days, or something on the Linux layer. It must be inclusive to the users, so that we get as many users as possible running their workloads there. It can include product supporters, so if we see a problem in that open source cloud environment, we can replicate that same problem in a commercial support setting, or maybe you have a commercial supporter that also does some community support in his spare time. You have architects using these Lego building blocks to develop something new, and now suddenly they have a place to host their long-living reference architectures, instead of setting something up for a customer or for a demo, tearing it down again, and then coming back half a year later just to realize that it doesn't work anymore. Now you could have a long-running demo in such an open cloud environment. And hopefully, finally, we also please our AI overlords: feed them all the data, and then we can do all the good stuff and the cloud will be operated by machines. So finally, in one sentence, what do we want to build, what's the implementation of that operate-first mentality? We need to have a cloud with full visibility into the operations center, and that cloud can be anywhere. It's really cloud in the broadest definition: it's not just something in a real cloud environment, but you can also have a hybrid cloud where you have something running on-premise, it could even extend out to the edge, it could
even extend out to your small Raspberry Pi running in the cellar. Some communities are doing something like this already: the home automation community has a lot of user-contributed hardware with a really huge variety, so that you have this primordial soup of ideas challenging each other. So really, cloud here is a synonym for all the stuff that we can do to operate machines and stacks in a forward-looking fashion. That was the philosophical background, and that's something we've talked about for some time. And for almost a year now, or maybe three quarters of a year, we've also really been growing a community doing this, and we have an actual environment for doing stuff. So let's get real. The first setup is at the MOC, which is the Mass Open Cloud; it's not the Massachusetts Open Cloud anymore, but it's hosted at Boston University. So in this beautiful city, where we're virtually having this conference now, there's a data center, and there are some racks with some donated machines, which are quite beefy. It's in the ballpark of 300 CPU cores and 3 terabytes of RAM, so like, really, nine-node clusters on bare metal. You have a Kubernetes slash OpenShift environment running on these machines. We have another setup at Hetzner in Germany, which is something like a Rackspace, and which is a small environment. And we have clusters running on AWS which are managed by that environment. As a matter of fact, the workshop that is running right now, which is about how to do cloud native data science, is running in an Operate First environment hosted on AWS, which was set up in just a couple of days as a fully functional environment because we had all the pieces available. And in the future we're working to get more and more providers connected to that environment. So it's really multi-geo, multi-, yeah, multi-hardware, what do you say, it's a hybrid cloud. It's a hybrid cloud environment. And on top of that we obviously have workloads. As the most prominent workload we have
this Open Data Hub initiative, or this Open Data Hub project, which is a collection of tools for doing AI, machine learning and data engineering on top of Kubernetes. We have projects like Apicurio or Quarkus or Pulp hosting workloads in that environment. We have set up some management and automation tools. We have Advanced Cluster Manager, which is as a matter of fact the productized version of the OCM stuff that the previous speaker talked about, so that's already somewhat included in that environment. We have Argo CD, and we use it for setting up clusters and maintaining clusters; we use Argo CD for deployment of workloads and for configuration management of the clusters themselves. We use Prow from the Kubernetes community to do CI/CD testing, we use Tekton pipelines, etc. And we try to treat everything as a service. So Open Data Hub is being treated as a service: you can come to it as a user and just use Open Data Hub resources. We have Kafka, so you can just use Kafka. We have Prometheus and Loki for your monitoring and data dashboard needs. And you can actually install, or work with us on installing, every operator, every community operator out there, and get it installed in that cluster, although it might break the cluster. But then we have figured out that it might break the cluster, so we're giving something back. And operators are that notion of operational excellence encapsulated in a piece of code. And obviously we want to create operational data sets, because data is everything these days. So all the alerts, all the issues, all the logs, all the metrics, everything that we produce while setting up these clusters and running these clusters is collected for posterity, so that we can train machine learning models on top of it. Really, that's the longest-running question I've had since I started working in that space: where can I get a decent data set for my machine learning stuff? That was always the longest-running question, and I think now we're
getting to a situation where we can actually create useful data sets, and obviously we're using that for community building. So, community: where are we right now? These are pretty recent numbers, as of yesterday, the first of September. We have about 26 repositories in the Operate First organization, we have 108 individual external contributors, and we have 26 internal contributors. So that's already pretty inclusive, I would say. We decided who's internal and who's external not just by company affiliation, but by who is part of the Operate First team in GitHub and who we can assign issues and tickets to. Those 26 folks are mostly people from Red Hat, mostly people from the AICoE, or people that I manage, but we're growing and we're trying to get more and more folks into it. In terms of issues created over time, you see that we're going to the upper right corner, which is what you want to see on every chart, and at the beginning of this year, 2021, we broke the 200 issues barrier. That's issues created per month. So you might ask, what kind of issues are these? Are these issues opened by machines, like alerts or incidents? No, they're actual issues created by humans. And if we look at the repositories here, it's the apps repository and the support repository that get the most attention. Obviously that's because the apps repository is our GitOps repository, where all the configuration is created, so every user request probably also results in a change to that repository. And the support repository is just there for user support, so if somebody has a question or somebody wants to onboard, they go to that repository and create issues there. The orange color here identifies the number of external people creating issues. It's a nice indicator that we also get external folks doing GitOps in that organization, so reusing the infrastructure that we provide. In terms of contributors to these issues
it looks just the other way around, obviously. So 13% of the issues are worked on by external people and 86% of the issues are being worked on by internal people, and it's obvious that it goes this way. Hopefully this is a trajectory that we carry on, because you want to include most of the people in your community, so I don't know if this external/internal split even makes sense for what we're looking at. I don't know, but these are some first good numbers. So, at the end, a call to action: I don't want you to go away with nothing, I want you to go and try out stuff. So go to the operate-first.cloud website and you will see two call-to-action buttons there. If you are an open source developer, and that means if you are running some projects, or if you're developing projects that can run in a cloud native environment, or if you're working on components that make up Kubernetes, you can go there and deploy some of your operators in that environment, you can deploy some of your workloads into that environment. It takes you by the hand and explains how you get some of your workloads deployed on Operate First. If you want to help operate these workloads, or if you want to see how these workloads are being operated, if you want to dip your toes into SRE best practices, this is the other path that you can follow. Click that button and we take you to the back office, where you see all the good stuff that I was talking about, and where you can ask questions and collaborate. Asking questions means collaborating. In the upper right corner there are these nice little icons that everybody knows: the cat from GitHub, the Slack icon for chatting, and the YouTube icon for re-watching some of the sprint meetings that we did or some of the bits and pieces that we published on YouTube. There's also a mailing list that I urge you to subscribe to, so just go there and subscribe to the mailing list
so that you get updates delivered into your old-fashioned SMTP port 25 inbox to stay up to date. That's it, thank you. Two minutes for questions. Let me see if I see some. No questions? That's sad. Yes, you will find me in the breakout room, I guess. I hope so. Maybe you find me in the breakdown room. Actually, I'm here in EMEA, so it's 5:30 for me, and I'm not sure if I will stay long in the breakout room, but I'll check on this right now. All right, just one question: can we build an event out of this content to learn by doing? Yes, that's great, and we are curating a lot of the content that we have there. There will be a webinar. I don't have a URL for it, but monitor the Linux Foundation website for their webinars. It will be on the 5th of October, somewhere on the internet, where we have a cooking show for how to do stuff there. We have workshops, longer workshops, at DevConf.CZ and in the future, on how to do this. So there's already an awful lot of content. It's not in the best shape, because it's created by doers and it's created bottom-up. So if you want to participate in shaping that content, or in consuming that content and then feeding back suggestions, that's always appreciated. All right, thank you so much for coming to talk, Marcel. It's been a real pleasure. Okay.