I'm working as a site reliability engineer in the OpenShift Dedicated team. I joined from a software engineering role one and a half years ago, and since then some of the practices have been different. Today I want to share with you what I deem most useful for SREs from the classical agile practices, like the things you find in Scrum, and what was not too helpful for us as a team.
Teams of SREs typically consist of people from the software engineering world as well as people with more of an operations or administration background. You put them together in a ratio of 50-50, that's what usually happens. And then the hope is that, magically, the people with more of an operations background will learn from the software engineers, and the software engineers on the other side will learn from the operations folks what it means to maintain and run a system. And then you get that new species of engineer, the site reliability engineer, who strives to automate all the operations tasks that you have in an environment but, when an incident happens, has all the professionalism to resolve it as quickly as possible.
So the idea of site reliability engineering is to apply software engineering best practices to operations tasks. Some say it is what naturally happens when you tell a software engineer to operate a system, because the software engineer would then naturally try to automate things when they do them the second or third time. But what also happens is that the software engineers will bring their favorite agile framework, like Scrum, Kanban or Extreme Programming, and try to fit that into the operations world. Because of that, it felt quite natural for me when I joined the OpenShift SRE team, coming from my previous team where we were doing Scrum, that we had three-week sprints, daily stand-ups and all that stuff. So that felt very familiar to me.
But then what happened in probably the second or third week was that we all of a sudden dropped the stand-up meetings altogether. That was really hard for me, because in the six years of software development I had before, I guess on most days I had a stand-up meeting, and that was the most natural thing to me. We removed them because it turns out site reliability engineers are not very interested in having regular meetings on their calendar. That is because site reliability engineers are trained for and comfortable in very stressful situations and need to resolve incidents very quickly when they come up. When an incident appears, they start working on it and work on it until it's fixed, and there is no meeting that will stop them from doing that.
That means scheduling a meeting with the whole team, like a planning meeting, which for three-week iterations can take one or two hours, is very hard, because people will just drop out, or they are on shift so they won't even show up, or they are working on an actually important customer ticket. And you anyway need to make sure that all the information you have in the meeting is somehow persisted, so that whoever wasn't able to participate can still know what happened in the meeting.
Additionally, SRE teams are typically spread across different time zones because you want to have someone available 24/7 to work on issues. That means it can be hard or even impossible to find a spot in everybody's calendar if the team is spread across the globe. And that's especially true for meetings that happen daily, like stand-up meetings. You can't expect people to get up at night every day to join the team's daily stand-up, right? So here you see a typical Scrum calendar.
There are a lot of meetings in there: stand-ups, planning, grooming, review, retro. And many of them consume a lot of time. So does this just not fit SRE at all, or what do we need to do to have a good agile practice in an SRE team? We found it helpful to stick to the iterations, and we still do three-week sprints. But we dropped all the things that were not helpful to us, or adapted them. And today I want to share with you what, for our specific work environment, were the most helpful things to keep. Now that I've bashed meetings that much, let me say that three out of the five practices I want to share are actually meetings. But none of them necessarily has to be a meeting. You can also find ways to do them asynchronously, and if your team is very distributed, this is probably what you want to do.
So let me start with the first one, which might be a surprising one. The retrospective is often deemed not very important, but for me it is one of the most important tools that you have in the agile world. That is because the retro is the tool for the team to get together and talk about what happened during the sprint. Not just "we finished this item, now we can support this feature, and we fixed that bug", but also focusing on how we performed as a team, what went well and what went badly, and defining action items for how to improve how you work and also improve the agile practices themselves. For example, I said we dropped the stand-up meetings altogether, and we used the retrospective meeting to bring them back in, to talk about how we can make stand-ups useful, and to evolve the agile framework that we want to adopt in the team.
Often those retrospective meetings are considered very boring and not very helpful, and people come up with all kinds of ideas to make them more interesting and more fun, like putting retrospective games in place. You can look up several different games to play in retrospectives.
But I can't recommend that, because I think those retro games just make the retrospective less authentic. So I can only advise focusing on the content: what went well and what didn't go well. Then define action items from there, improve, and check in the next retrospective whether things are better. So really focus on the content instead of trying to play games to make it more fun.
So we talked about the stand-up already, and that we brought it back in. What we are doing now is holding stand-up meetings twice a week, not every day. This is in the end an outcome of the retrospective. I brought that point up because I think it's really important to stay in touch with the team and to have a feeling for what everybody is working on, and also to have a forum where everybody can ask questions, raise concerns, or say what's blocking them and where they need help. I think it's very important to have some meeting where the whole team gets together and talks about what's going on, so you have a feeling for the overall direction of the team that you're working in.
So how come we put daily stand-ups in question? As I said before, if you're spread across multiple time zones, it's hard to find a spot where you can meet every day. So some of the teams in the SRE organization decided to do the stand-ups not in person, but to have a daily reminder on Slack where everybody then puts their stand-up, that is, "what I did, what I'm doing today, what's blocking me", into that Slack thread. So everybody can do the stand-up when they join. But since we're not too scattered in this team, we do the in-person stand-up at a time that's suitable for everybody on the team, and do it only twice a week, which for us was enough to update each other and talk about what happened.
And as I said, I can only encourage you to use the retrospective to figure out the right way to do stand-ups rather than just dropping them altogether.
The next agile practice I want to advertise here is the planning meeting. It's also something we adopted from Scrum. Probably we should relabel it, because what you typically have in Scrum is a long planning meeting at the beginning of the sprint where you estimate and commit to a sprint goal. But what we do is hold a meeting every week on Monday, only half an hour long, to talk about the backlog items that are currently in the sprint and rebalance the priorities of those backlog items. What often happens in an SRE team is that the team on shift notices some bug, something that is a source of a huge amount of toil that you want to fight immediately. By rebalancing the sprint priorities every week, we can quickly react to such situations, create new backlog items and put them into the sprint, probably at a high priority, because they might be very painful for whoever is currently on shift. So it's not exactly a sprint planning; it's more a rebalancing, making sure that the top priorities the team is working on are understood by everybody, so everybody is able to pick up the next top-priority item when their current item is finished. Even though we don't commit to a sprint goal, we still estimate those backlog items, because estimating is useful to foster discussion about them. So it's not about finding out a velocity; it's more to make sure everybody understands what an item is about, and why somebody thinks it's complex and somebody else thinks it's less complex.
Okay, so that's enough about meetings. Another thing is testing. Why did I put this here? In the SRE world, we often write software that replaces things that were previously manual actions or bash scripts.
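As a minimal sketch of what testing such a replacement can look like (this is not code from the talk: the `Snapshot` type, the `snapshots_to_prune` function and the 30-day retention default are all hypothetical), a cleanup task that used to be a bash script can be written as a pure function that only decides what to delete. Keeping the decision separate from the actual deletion makes it easy to test without touching any real system:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


# Hypothetical model of a backup snapshot; not taken from the talk.
@dataclass(frozen=True)
class Snapshot:
    name: str
    created: datetime
    protected: bool = False


def snapshots_to_prune(snapshots, now, max_age=timedelta(days=30), keep_latest=1):
    """Return the snapshots that are safe to delete, newest first.

    The newest `keep_latest` snapshots and all protected snapshots are
    always kept, no matter how old they are.
    """
    newest_first = sorted(snapshots, key=lambda s: s.created, reverse=True)
    candidates = newest_first[keep_latest:]
    return [s for s in candidates if not s.protected and now - s.created > max_age]


# A fixed "now" keeps the tests deterministic.
NOW = datetime(2024, 1, 31, tzinfo=timezone.utc)


def days_ago(days):
    return NOW - timedelta(days=days)


def test_prunes_only_old_unprotected_snapshots():
    snapshots = [
        Snapshot("fresh", days_ago(1)),
        Snapshot("old", days_ago(60)),
        Snapshot("old-but-protected", days_ago(90), protected=True),
        Snapshot("ancient", days_ago(120)),
    ]
    pruned = snapshots_to_prune(snapshots, NOW)
    assert [s.name for s in pruned] == ["old", "ancient"]


def test_never_prunes_the_only_snapshot():
    # Even a very old snapshot is kept if it is the newest one we have.
    snapshots = [Snapshot("only-one", days_ago(365))]
    assert snapshots_to_prune(snapshots, NOW) == []


if __name__ == "__main__":
    test_prunes_only_old_unprotected_snapshots()
    test_never_prunes_the_only_snapshot()
    print("all tests passed")
```

The file can be run directly with Python. The second test is the kind of safety property that matters for code running unattended: whatever the input, the newest snapshot is never selected for deletion.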
And then sometimes you think, why should I test these rudimentary operations tasks? But now that we are automating those operations tasks in code, we actually have the ability to write tests for them, and we can make use of tests just as in any other piece of software. What tests do everywhere else is increase the confidence in the code that you write. They not only help ensure that there are no bugs, they also help structure the code and make it more readable, and they do just the same for that simple operations script that you're now automating. And when you have tests, refactoring your code to make it more readable is much easier, because you don't have to fear that everything breaks: you know you have a good test suite that will ensure that the base functionality is still in place. This confidence is even more important in an SRE team, where the code that you're writing runs unattended on a number of customer clusters and performs operations in parallel on all those systems. So when I push something to the production environment, I really want to make sure that I don't accidentally wipe customer data. I want to get all the confidence I can in my code, and that's what you get from a good test suite.
Okay, speaking of confidence, the next topic is also about confidence in the code and the quality of the code. But first let me share with you the worst and best onboarding experiences I've had in my career. The worst one was when I joined a team and my mentor gave me a book and told me, "Read this book, and we'll meet at the end of the week and you tell me what you learned." So I spent the whole week not knowing which parts of that book were important, which part of the product I was even supposed to work on, or whether I should reach out to my mentor with questions. I didn't have much confidence in what I was doing that week, also because it was my very first job, right?
So this was not the best experience. You shouldn't do that. The best onboarding experience was when I sat down with a member of the team and we worked together on the task that team member was currently working on. In doing so, we created all the accounts that I needed. You know, when you're joining a team you need to create, I don't know, GitLab accounts and an account here, and generate keys and distribute keys there. We did all that while working on that task. So I got used to the tools the team was using, and I built a relationship with the team members I sat together with. At the same time, all my accounts got created along the way, and I was productive and contributed to the team's work on the first day.
So the next practice that I want to encourage everybody to make use of is pair programming. That's especially useful for onboarding new team members, especially in a distributed environment where you can easily feel lost or disconnected, especially when you're new. There, sitting together with somebody to just work on whatever they're working on is very helpful.
It's not only helpful for onboarding; it's also useful for all the code that you develop in the team. The code will benefit and be of better quality. On the one hand, if one developer is working alone on some code, they get stuck easily, whereas when you're working with another developer, the other one might know an answer to your question or might have a new idea to follow. Also, when you're working in a pair, it's much less likely that you will take a shortcut and do it the quick and dirty way rather than the clean way, because the other one will point it out and say, "Shouldn't we do it like this?" And the knowledge of that part of the code that you're working on, that new feature or that bug fix, is automatically shared within the pair.
So you don't have to sit together later and describe it, because at least within that pair you already know what's going on, and so the knowledge is spread automatically. Even the code review as you have it today might not be too important, because you already have one review automatically. That doesn't mean you shouldn't perform reviews when you do pair programming. You can do that when you think you need it, but you don't need a review for everything: your code is automatically reviewed, or seen, by at least one other developer.
But pair programming is not only useful for development. It's also useful when performing manual interactions with a system, as SREs do from time to time to resolve incidents. It's useful there as well, again for the confidence that the commands you're entering on the command line don't break the customer system, because someone else will see them. So you already know that at least two developers think it's a good idea to do what you're trying to do.
Okay, those were the five practices I wanted to share with you. If I can give you only one to take home, I would say take the retrospective to your team and start talking about how you're performing and where you think you should adapt your practices to get to a better place, so that you all feel comfortable with them and have the right amount of meetings and asynchronous ways of communication. In the end that means the agile framework that you have shouldn't be static. You should evolve it over time, and after every iteration talk about how it went and improve how you're working.
So that's all I have for today. Thanks for listening. If you have any questions, you can ask them now or reach out on Discord or by mail.
Thank you, Manuel. We have one more question in the Q&A from Balash Bokharadi: would you prefer a retrospective with or without the manager participating?
So we do retrospectives without the manager. I think it's up to the team. You can ask the team to decide that. You can even do that in a retrospective, right? I've seen managers who wanted to take part in some of the retrospectives, but then the content is a bit different, because some people might not feel comfortable talking about some things while the manager is there, and think they need to look good in front of the manager or whatever. That might be true or not, but I think you should at least have the consent of everybody on the team. Everybody should feel safe to talk in the retrospective. That's what you need to strive for.