Hi, everyone. Folks, welcome to the session. We have our speaker for the day here, Bianca. She's a senior software engineer who lives in Rio, Brazil, and works at Red Hat. The stage is all yours, Bianca.
Hello. OK, can I start?
Sure. Please go ahead.
Sure. Sharing my screen here. So here, I wanted to give a talk about shutting down goroutines gracefully at DevConf. And thank you for the introduction. I am a senior software engineer at Red Hat, as previously introduced. And at Red Hat I'm helping build a product that will help customers manage their RHEL for Edge devices. And the inspiration for this talk actually came from something that happened during the early development of this product. Since it's an MVP, instead of creating a bunch of microservices, we decided to create one big Go application. And instead of using queues and doing development more asynchronously, we're using goroutines for some things. So this is an example that comes from the real world. Before Red Hat, I worked as an SRE tech lead manager at Stone. It is a Brazilian fintech that helps Brazilian entrepreneurs with some of their needs; receiving payments and banking are a couple of examples. I joined it first as a developer, and then eventually got interested in the SRE path. So I'm more like a back-end engineer that likes infrastructure and DevOps a lot, and is very interested in that. But I'm not as good an SRE as I am a software engineer. Python and Go are my two favorite languages. I've worked with a lot of languages and technologies in these last two jobs, and in the end, I think I found my passion in Python and Go. And if anyone has any questions, I'll be happy to answer them in the Q&A at the end, or afterwards, on Twitter, or whatever platform you find me on. I can probably chat and answer those questions. So first, I want to start with an agenda, so you can have an idea of what to expect of this talk.
We're going to start by having some fun with goroutines. I say fun because I have a lot of fun. But I don't think my definition of fun is the same as everyone else's. And then we're going to see a lot of live code. Hope that the goddesses of live code are with me today. If they're not, we're going to have a problem. And we will learn how to properly finish the work that the goroutines we created are doing. And then how to listen to system signals. And eventually, we'll get into a topic about how to handle interrupts in a production environment. Because in this whole example that we faced at Red Hat, everything was working properly locally. But then we got into a production environment, and it wasn't. So there are some gotchas that I kind of wanted to share. So first, I wanted to say hello. And by hello, I mean a hello world in actual Go. I'm going to be sharing the VS Code window and the Chrome tab a couple of times. There it is. Hope you can all see it. So I created a folder called DevConf here. If the font size is not nice, we can adjust. We're going to create a file named main.go. And there we are. So in our first hello world in Go, we're going to create a function main. And it's actually pretty simple. As a Python developer, I got pleasantly surprised when I first started working with Go, because it's actually pretty simple. You just need to declare your package, import the fmt package, and just do your hello world. There we go. So we have the first hello world running, which is super cool and actually super simple. And going back to the presentation, I'm happy to actually be interrupted at any time if someone has questions, so just shout out. And after our hello world, I wanted to spawn some goroutines. Like we said, we're going to have some fun with goroutines here.
And I wanted to start by saying, what are goroutines? Goroutines are functions that run concurrently within Go code. They are not exactly OS threads, but they are not green threads either. They're a special kind of coroutine. They are deeply integrated with Go's runtime. And instead of defining their own suspension or reentry points, the Go runtime will observe the behavior of those goroutines and automatically suspend and resume them depending on what is running inside of them. In the sense that when you make a blocking call in a goroutine, like a request to an external service, or you are waiting for something to happen, the Go runtime will recognize that this is a blocking call and that there is some I/O blocking it. And it will stop the execution and only come back when the result of this call is expected to be ready. So another goroutine can run instead of this one. So it gives the idea of parallelism because it feels like everything's running together, just like coroutines in other languages. It feels like they are all running together, but actually what's happening is that the Go runtime is observing what's going on in the background and stopping and resuming the goroutines depending on what's going on. So this is like the whole concurrency versus parallelism talk, which could be a whole one-hour talk. But the most special thing here is that we have a main function that is actually a goroutine itself. And it can launch other goroutines that run concurrently with it. And in turn, those goroutines can spawn other goroutines. So you can have a lot of goroutines running in your program, and they are super lightweight. They are not heavy things. You don't have to worry too much about creating too many goroutines. You do have to worry about memory and synchronization issues, as always when you do things concurrently. So since this is still the introduction, let's spawn some goroutines and see how they work.
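A minimal sketch of that behaviour, not code from the talk: the helper name elapsedMillis and the numbers are assumptions made for illustration. It starts many goroutines that each make a blocking call (a sleep standing in for I/O) and measures how long they take together. Because the runtime parks a blocked goroutine and runs the others, the total should be close to one sleep, not the sum of all of them.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// elapsedMillis (hypothetical helper) starts n goroutines that each block for
// sleepMillis milliseconds and reports how long it took for all to finish.
func elapsedMillis(n, sleepMillis int) int64 {
	var wg sync.WaitGroup // join point; wait groups are covered later in the talk
	start := time.Now()
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Blocking call: the runtime parks this goroutine and lets others run.
			time.Sleep(time.Duration(sleepMillis) * time.Millisecond)
		}()
	}
	wg.Wait()
	return time.Since(start).Milliseconds()
}

func main() {
	// 100 goroutines sleeping 50 ms each would need 5 s if run back to back.
	fmt.Println("elapsed ms:", elapsedMillis(100, 50))
}
```

The sleep is only a stand-in; a network request or a channel receive would be parked by the runtime in a similar way.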
Back to Go again, there we go. We're gonna use the go keyword to create this goroutine, and we can create an anonymous function to run it. Like, we can say, I am a goroutine. And since it's an anonymous function, I have to call it here, so it has the parentheses. I could also define a new function here, call it goroutine, and also run it like that. Both examples are perfectly fine. You're gonna see a lot of those two. And there we go. Oh, and there is a tricky thing here. We are spawning the goroutines, but we are not actually waiting for them to finish. So this is one of the main things that I wanted to talk about in this presentation: when we create goroutines, the main program that creates them is forking that goroutine, but we are not creating a join point. Go uses a fork-join model, and when we just create them, we're forking, but you have to actually create the join point so you can wait for the results of this goroutine before your program actually stops. So what we can do here is use wait groups for that. I'm gonna go back to the slides super quick to talk about wait groups, and then we go back to the code. Okay, so about finishing work properly. As I said, program execution begins with the main package and then invoking the function main. When that function invocation returns, your program is gonna exit. It doesn't wait for other non-main goroutines to complete. So the question is, what do we do if we wanna wait? If we wanna wait until the goroutine is done, we're gonna need to tell the main program that we wanna wait. We're gonna have to somehow warn it that, hey, I have some work going on inside of another goroutine, and I want you to wait for it before actually finishing and closing. Go has a bunch of synchronization techniques for this. It's super worthwhile to take a look at Go's sync package for that.
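The missing join point can be sketched as follows. This is an illustrative reconstruction, not the code typed during the talk; slowGreeting and finishedBeforeReturn are hypothetical names, and the 200 ms sleep is an assumption added so the effect is visible. It forks a goroutine from a named function and one from an anonymous function, then moves on without waiting for either.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// slowGreeting is a hypothetical named function used as a goroutine body.
func slowGreeting(finished *int32) {
	time.Sleep(200 * time.Millisecond) // pretend to do some work
	fmt.Println("I am a goroutine")
	atomic.StoreInt32(finished, 1)
}

// finishedBeforeReturn forks two goroutines and returns right away,
// reporting whether the named one had already finished by then.
func finishedBeforeReturn() bool {
	var finished int32
	go slowGreeting(&finished) // goroutine from a named function
	go func() {                // goroutine from an anonymous function, called with ()
		time.Sleep(200 * time.Millisecond)
		fmt.Println("I am an anonymous goroutine")
	}()
	// No join point here: nothing waits for either goroutine.
	return atomic.LoadInt32(&finished) == 1
}

func main() {
	fmt.Println("hello world")
	fmt.Println("goroutine finished before main moved on:", finishedBeforeReturn())
	// main returns here, so the goroutines will most likely never get to print.
}
```

When main returns, the process exits and any goroutine still sleeping is simply dropped, which is the behaviour the talk goes on to fix.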
You have mutexes, you have wait groups, channels, conditions. There are so many ways to do synchronization in Go. And depending on your use case, you might want one or the other. For this one, I think a wait group is perfectly fine. So let's give it a go and actually create that wait group, so we see our main goroutine actually waiting for the other ones. And one thing that is nice to see here is that in a real-world example, if this were an HTTP server that is normally always running until a deployment happens or it exits for some reason, you wouldn't notice that the goroutines are not being waited for, because something is blocking your main function, like the HTTP server itself. You would have the illusion that they always finish. And then when your program exits, whether it cut any of them off or not, you wouldn't be aware of it. It wouldn't be so clear. So this example makes it super clear. I'm gonna delete that one and just keep the anonymous function. Like we said, we're gonna create a wait group. WaitGroup is in the sync package. Like I said, we have a bunch of other tools that we can use, like Mutex. We have pools; if you're thinking of connection pools or database pools, they use the same idea. We can use Once to make something execute just once. And here we have the wait group. For the wait group, I'm gonna tell it that it needs to actually wait for my function to run before I can close the program. And another thing that I can do here is, every time that I create a goroutine, I can add something to my wait group. What I'm saying here is: hey, I'm spawning one goroutine, one goroutine is being created, and I want you to wait for it. All goroutines are asleep, deadlock. Oh yeah, of course.
Let's just add a time.Sleep of a few seconds, and import time. Okay, I'm just setting a time.Sleep here to make sure that the goroutine started, because everything was happening so fast. I don't exactly know why it's complaining about the deadlock, but oh yeah, sure. It's complaining about the deadlock because I forgot something very important here. On our wait group, we have to call Done to let the wait group know that our goroutine finished. And I obviously forgot it. So by the time it reaches this block of code, and there are no goroutines running in our program and Done hasn't been called, the Go runtime is gonna realize: what am I waiting for? I'm not waiting for anything. There we go. Now it's gonna sleep for 10 seconds, and it's gonna close as soon as it gets done. That is better, a lot better. Okay, and instead of creating just one, we can actually create a bunch of goroutines. Let's make sure we are creating a pool of goroutines here to make this example a little bit better for the next exciting thing. So I'm gonna use 50 goroutines running for 10 seconds each. So now I'm creating 50 goroutines, and I'm gonna add this time.Sleep inside of my goroutine to make sure it's a long-running goroutine. So each one of those will run for about 10 seconds. Let's make it five, to make it easier. Five seconds, 50 goroutines running for five seconds each, and in the end we have to wait for all of them, and we can say that, hey, it's done. Oh yeah, I did. Thank you. I wanted to use the time.Second constant. And now, all right, we're creating some goroutines. We're waiting for them. We have to make sure we are adding them to the wait group before the goroutine actually starts. If you do the Add inside of this block of code, inside the goroutine, Go might complain, and it is actually wrong, because you can reach the Wait call before you have added all of your goroutines to the wait group.
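The wait group pattern described above can be sketched like this. It is a reconstruction under assumptions, not the exact code from the session: the helper runWorkers and its millisecond parameter are hypothetical, and the sleep is shortened so the sketch runs quickly.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runWorkers (hypothetical helper) starts n goroutines that each sleep for
// sleepMillis milliseconds, waits for all of them, and returns how many finished.
func runWorkers(n, sleepMillis int) int {
	var wg sync.WaitGroup
	var completed int64
	for i := 0; i < n; i++ {
		// Add in the synchronous part, before the goroutine starts, so Wait
		// cannot be reached before every goroutine has been counted.
		wg.Add(1)
		go func() {
			// Done tells the wait group this goroutine finished. Forgetting it
			// is what triggers "all goroutines are asleep - deadlock!".
			defer wg.Done()
			time.Sleep(time.Duration(sleepMillis) * time.Millisecond)
			atomic.AddInt64(&completed, 1)
		}()
	}
	wg.Wait() // join point: block until every Done has been called
	return int(atomic.LoadInt64(&completed))
}

func main() {
	fmt.Println("goroutines finished:", runWorkers(50, 100))
}
```

Using defer for Done is a common choice because it still runs if the goroutine returns early.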
So this has to be done in the synchronous block, and then you can do your asynchronous block. So this is actually how we can wait for some work to be done in goroutines: wait groups. If you're writing some sort of program that is creating a lot of goroutines, you wanna make sure that inside of your goroutines you have a way of telling the main goroutine that the work is done in those goroutines that you created. Otherwise your program can just exit, and you are gonna lose the work that you're doing inside of them, which was what was happening with us in the beginning. Coming back to the slides. And there is a problem here: what if I wanna exit earlier? In what I just showed, we were creating the goroutines, we were doing something inside of them that ran for five seconds each, but I couldn't stop before that. Five seconds is not a lot of time, but what if those were goroutines that kept running for half an hour for some reason? Maybe I created something and I'm waiting for an answer from an external service. What can I do to exit earlier and still make sure that the work inside of my goroutines is not terminated in a way that is gonna somehow hurt my application, or get my data into a bad situation where I don't know the status of what was happening? Work can be simply lost, depending on what you're doing there. So can we act on an interruption? Can we actually exit earlier and say, hey, I wanna stop my program before my goroutines finish? And the answer is yes. We can do that by listening to system signals, and once we start listening to them, we can use them to handle the execution of our program. When I say system signals, I mean actual POSIX signals. You can go to any Linux man page that shows you what they are and what they mean. And I kinda wanted to briefly go over the first set of signals here.
There are so many of them, we definitely don't have the time to go through all of them. But the first three here, SIGINT, SIGTERM, and SIGKILL, are termination signals. When your application gets a SIGINT, it means that an interrupt signal was received, normally from a keyboard, some sort of input from the user. SIGTERM is a generic termination signal, and your program simply has to terminate. SIGKILL is more like an instant kill signal, and it cannot be caught. Normally, as early Linux users, when something is not working on our computers, we are very used to sending a SIGKILL by doing kill -9. And that is actually pretty bad for applications, because it sends an instant kill and does not give the app time to do anything. It's just an instant kill, no time to wrap up anything; it cannot be caught, it cannot be acted upon. So we should avoid SIGKILLs as much as we can. And one that I kind of wanted to talk about, but it's not gonna be super important for us right now, is SIGSTOP. It is not a termination signal, it's a stop one. And it means that whenever you want, you can send a continuation signal, a SIGCONT, to the program, and your program should resume execution. Just like SIGKILL, SIGSTOP cannot be caught or acted upon, because it's handled by the operating system. And we normally don't have to care too much about these system signals, because the language runtime and your operating system are gonna handle them for you. But if you're worried about having something happening in the background of your Go application that you actually wanna act upon before your Go application closes, system signals are the way to go, because you can actually listen to them, subscribe to them, and figure out what's going on.
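As a small reference for the signals just listed, here is a sketch that prints them with their numbers from Go's syscall package. The signalInfo type, the signals table and canBeCaught are hypothetical names made up for this illustration, and the code assumes a Unix-like system such as Linux or macOS, since SIGSTOP and SIGCONT are not defined in syscall on every platform.

```go
package main

import (
	"fmt"
	"syscall"
)

// signalInfo is a hypothetical record type used only for this illustration.
type signalInfo struct {
	sig       syscall.Signal
	catchable bool
	meaning   string
}

// signals summarises the signals discussed in the talk (Unix-like systems).
var signals = map[string]signalInfo{
	"SIGINT":  {syscall.SIGINT, true, "interrupt, usually Ctrl+C from the keyboard"},
	"SIGTERM": {syscall.SIGTERM, true, "generic request to terminate"},
	"SIGKILL": {syscall.SIGKILL, false, "immediate kill, for example kill -9"},
	"SIGSTOP": {syscall.SIGSTOP, false, "pause the process"},
	"SIGCONT": {syscall.SIGCONT, true, "resume a stopped process"},
}

// canBeCaught reports whether a program can register a handler for the signal.
func canBeCaught(name string) bool {
	return signals[name].catchable
}

func main() {
	for _, name := range []string{"SIGINT", "SIGTERM", "SIGKILL", "SIGSTOP", "SIGCONT"} {
		info := signals[name]
		fmt.Printf("%-8s number=%-3d catchable=%-5t %s\n",
			name, int(info.sig), info.catchable, info.meaning)
	}
}
```

The signal numbers vary between operating systems, which is why the sketch prints them instead of hard-coding them.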
So yes, we can do that, and we can listen to a signal by using Go's signal.Notify function and passing a channel to it. If you're wondering what a Go channel is, a channel is a mechanism for goroutines to communicate with each other. It's pretty common to see communication by channel amongst goroutines. It's actually the right way for goroutines to communicate. And it's mainly a publish-subscribe sort of thing, in memory, within the Go runtime. So whenever the signal arrives, you're gonna receive a value on that channel. So let's say that I have a SIGINT sent by my keyboard. We're gonna create a channel in Go and listen for that signal, and we are gonna be able to run some code before we actually exit the Go program. Like I said, communication by channel is super common amongst goroutines. And it is the right way to pass values from one goroutine to another, because if you just share a value and don't use a channel, like if you create a variable and use it inside of the goroutine, you don't actually know when the value is gonna be available inside of your goroutine, and vice versa. With goroutines, you don't have a guarantee about when the code is gonna execute, so it's very hard to know if your value is gonna be available. Channels are really good for knowing that you have a value available at a given time inside of your goroutines. So let's go to this. I'm gonna share the code window again. Like I said, we need a channel, so let's create it here. Right, that is okay. And it can be a channel of os.Signal, which Go allows us to use. Normally we make that a channel of os.Signal. So I can create that channel of os.Signal and, like I said, we can use Go's signal.Notify function with it. So go back here.
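To illustrate the point about channels, here is a minimal sketch, independent of the talk's code; squareInGoroutine is a hypothetical helper. One goroutine computes a value and sends it over a channel, and the receive blocks until the value is actually available, which is the guarantee a plain shared variable does not give.

```go
package main

import "fmt"

// squareInGoroutine (hypothetical helper) computes n*n in another goroutine
// and hands the result back over a channel.
func squareInGoroutine(n int) int {
	results := make(chan int) // unbuffered channel of int
	go func() {
		results <- n * n // send: blocks until someone receives
	}()
	return <-results // receive: blocks until the value is available
}

func main() {
	fmt.Println(squareInGoroutine(7)) // prints 49
}
```

The receive doubles as synchronization: when it returns, the sending goroutine is known to have produced the value.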
signal is the package in Go that we wanna use here, and its Notify function is gonna help us listen to that signal. And whenever an interrupt happens, that channel is gonna receive a value. So this is super cool, because we can actually listen to that channel now. And when we receive from it here, what we're actually telling Go is: wait here until I have a value for that variable. And everything here is gonna be synchronous from now on. So if I print something here, like I received an interrupt, this block of code will only run after the interrupt has been received by my application. So in the situation that we have now, I'm gonna add more seconds here, 120. We have super long-running goroutines, and I am not gonna wait for them anymore, because otherwise this is gonna take forever. I just wanna show that the interrupt is gonna be working. So, okay, when I say that the interrupt's gonna be working here, oh, yes, sure. I actually wanna do that inside of my goroutines, because if I do it outside... actually no, no, no, going back. I get nervous, I get super nervous. So we were seeing the interrupt, and I just wanna make sure that I'm gonna get that interrupt correctly. And we go back to the, yeah, we have the hello world and that's it. We have the hello world, and we have the signal channel waiting for the interrupts, and now we can press Ctrl+C, and we can catch that interrupt and actually do something after we get it, which is super cool. But now we have to integrate this with our goroutines, otherwise we're not waiting for them, we're not doing what we wanted. So let's do that now. I want to first create a better structure for the tasks, otherwise we're not gonna be seeing exactly what is going on here. So let's rewrite our code a little bit. So I'm gonna create a type Task struct with a status.
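The step just described, subscribing a channel with signal.Notify and blocking on it, can be sketched as below. This is a reconstruction, not the presenter's file. The helper names are hypothetical, and so that the sketch runs unattended it sends SIGINT to its own process instead of waiting for Ctrl+C; that self-signal uses syscall.Kill and therefore assumes a Unix-like system.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// waitForInterrupt subscribes a channel to os.Interrupt, runs trigger (if any),
// and then blocks until the signal arrives.
func waitForInterrupt(trigger func()) os.Signal {
	// Buffered so the signal package never has to drop a signal.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, os.Interrupt)
	defer signal.Stop(sigs)
	if trigger != nil {
		trigger()
	}
	return <-sigs // wait here until a value is available on the channel
}

// sendInterruptToSelf stands in for pressing Ctrl+C (assumption: Unix-like OS).
func sendInterruptToSelf() {
	_ = syscall.Kill(syscall.Getpid(), syscall.SIGINT)
}

// interruptReceived reports whether the self-sent interrupt was caught.
func interruptReceived() bool {
	return waitForInterrupt(sendInterruptToSelf) == os.Interrupt
}

func main() {
	fmt.Println("hello world")
	// In a real program, pass nil and press Ctrl+C instead of self-signalling.
	sig := waitForInterrupt(sendInterruptToSelf)
	fmt.Println("I received an interrupt:", sig)
}
```

Everything after the receive runs only once the signal has arrived, which is where cleanup code would go.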
And instead of just doing something here that adds one, I'm gonna create a new task for each one of those things. A Task: the ID is gonna be the integer, and the status is gonna be started. So if I have my task t with the status started, what I want is that after it's done here, I change the status to done. But if it gets interrupted by an OS signal, I want the status to be interrupted. So to achieve that, let's come here and create a list of tasks so we can keep track of them, and put task t inside of that list. I actually need to create a list of pointers to tasks here. Okay, I have to declare a size for this list, otherwise it's not gonna work. You can drop this, actually. So I'm gonna create a task list with the size n, which is the number of tasks that we are creating. And we're leaving the Wait here, and moving the code where we get the OS signal into a different goroutine. What we want with that is to make sure that, before we wait for our tasks, we have something running that will get the OS signals. And when we have that here, we can go through our tasks and check if they are not done. And if they are not done, we can actually interrupt them. So we can range through the tasks, and if the task is not done, for simplicity, we can mark it as finished. This is only gonna work because I am using a pointer to a task here. If I weren't, changing the status here wouldn't change anything, but since I'm using the same object through the pointer, everything's perfectly fine. And one thing that we do have to make sure of is that when we pass values into our goroutine, we send them as parameters, otherwise the value of t can change throughout the execution. You might not have the correct task here.
So what you wanna do is actually send the task in here as a parameter; the task here is t, I think, yeah. Okay, I wrote a lot of code. Now I'm gonna read this to see if I forgot something. Okay: creating tasks, adding to the wait group, a goroutine function that does a super long wait, changing the status when done. Okay, and we're gonna mark it as finished. Like I said, we can actually say that the work in that goroutine is done here. I think that makes sense to me. Don't know if it makes sense to everyone else at this point. Let's go. Okay, so something's happening in the background. Like I said, I send the stop signal, we received an interrupt, and the wait got done here; the Wait method returned because we called the Done method for each one of the tasks that were not finished. What we can do here to have a better overview of what just happened is print it. I'm gonna copy and paste because it's easier. We can say that task %d has the status finished. Let's go. Whoopsie. It actually finished running and said finished, finished, finished. Because I said finished here, but I wanted to say interrupted. In an actual example, an enum for the status would be good. I forgot to do that. Okay, good. So if we were to wait the 120 seconds, we would see the status is done. And if I make it five seconds, I'm gonna see the status is done. So this is just one of the ways that we can listen to a system signal, do the work that needs to be done regarding your goroutines, and then actually finish in a way that makes it easier to come back later. If this used a database, I would actually change the status in the database to say that it got interrupted, and then in a second execution, I could check which tasks got interrupted and resume from there.
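A compact sketch of the task bookkeeping described above, written under assumptions and differing from the live demo in one respect that is worth naming: in the demo, the signal-handling goroutine marked unfinished tasks and called Done for them, whereas here each task goroutine selects between finishing its work and a stop channel, so every goroutine calls Done exactly once. The names Task, runTasks and countStatus are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// Task tracks one unit of work; Status is "started", "done" or "interrupted".
type Task struct {
	ID     int
	Status string
}

// runTasks starts n tasks that each take workMillis milliseconds, unless the
// stop channel is closed first, and returns them once all goroutines finished.
func runTasks(n, workMillis int, stop <-chan struct{}) []*Task {
	tasks := make([]*Task, n) // pointers, so status changes are visible to the caller
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		t := &Task{ID: i, Status: "started"}
		tasks[i] = t
		wg.Add(1)
		go func(t *Task) { // pass the task as a parameter
			defer wg.Done()
			select {
			case <-time.After(time.Duration(workMillis) * time.Millisecond):
				t.Status = "done"
			case <-stop:
				t.Status = "interrupted"
			}
		}(t)
	}
	wg.Wait()
	return tasks
}

// countStatus counts the tasks that ended with the given status.
func countStatus(tasks []*Task, status string) int {
	count := 0
	for _, t := range tasks {
		if t.Status == status {
			count++
		}
	}
	return count
}

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, os.Interrupt, syscall.SIGTERM)
	stop := make(chan struct{})
	go func() {
		<-sigs      // wait for Ctrl+C or SIGTERM
		close(stop) // tell every task goroutine to wrap up
	}()
	// Press Ctrl+C within two seconds to see "interrupted" instead of "done".
	for _, t := range runTasks(5, 2000, stop) {
		fmt.Printf("task %d has the status %s\n", t.ID, t.Status)
	}
}
```

With a database behind it, the "interrupted" branch is where the status would be persisted so that a later run can pick the task up again.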
So this is actually something that was happening in the code that I'm working on at Red Hat. Since it's an MVP, we're not using queues yet, like I said, and we had tasks that were running and getting interrupted, but they got stuck in running forever, because nothing updated the status when the pod restarted or something else happened. And okay, so we introduced a new status. And I think that the last thing here is: what happens if we actually take this code and run it inside of a Kubernetes cluster? If we take it to a production-like environment, how is it gonna behave, and how can we make sure that it's going to work? One of the struggles that I had when I did that was that we have a super nice gotcha here. Kubernetes is not gonna send you an os.Interrupt, because that is something that you get from your keyboard. You need to change this to a SIGTERM. Actually, it's in a different package. You can check here: os.Interrupt is just a variable that points to syscall.SIGINT, and the os package does not expose a SIGTERM like that. So we go directly to syscall and get SIGTERM there. This is something that actually took me a long, long time to figure out, because this is one of those cases of it's working on my machine and I don't know why it doesn't work when I deploy my application. I really didn't know. So one thing that we can do is create a Containerfile here. This is a standard Containerfile for Go. It's gonna use a two-stage build to build our application, and then take the binary and run it on the scratch image. So it's a super tiny image. So we can see, oh yeah. I did not do something super important, which is to do a go mod init in my code. The build was using go.mod here, but I hadn't created it. So now it should be good. Yeah, I have my go.mod file. All right, nice. I have an image here that I can run if I want.
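The SIGTERM gotcha can be checked with a short sketch. It is illustrative only: the helper names are hypothetical, and it sends SIGTERM to its own process to stand in for what the talk describes Kubernetes doing to a pod, which assumes a Unix-like system.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// interruptIsSIGINT shows that os.Interrupt is the same signal as syscall.SIGINT
// (true on Unix-like systems).
func interruptIsSIGINT() bool {
	return os.Interrupt == os.Signal(syscall.SIGINT)
}

// catchesSIGTERM subscribes to both the keyboard interrupt and SIGTERM, sends
// SIGTERM to this process, and reports whether SIGTERM was delivered.
func catchesSIGTERM() bool {
	sigs := make(chan os.Signal, 1)
	// os.Interrupt covers Ctrl+C locally; syscall.SIGTERM covers the
	// termination request a container platform sends.
	signal.Notify(sigs, os.Interrupt, syscall.SIGTERM)
	defer signal.Stop(sigs)
	_ = syscall.Kill(syscall.Getpid(), syscall.SIGTERM) // stand-in for the platform
	return <-sigs == os.Signal(syscall.SIGTERM)
}

func main() {
	fmt.Println("os.Interrupt is SIGINT:", interruptIsSIGINT())
	fmt.Println("caught SIGTERM:", catchesSIGTERM())
}
```

Registering both signals in one Notify call lets the same shutdown path serve local Ctrl+C and a cluster-initiated stop.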
For the sake of time, I'm gonna actually take the image that I already have built and run it. I have an image on docker.io that I published, and this would be the Kubernetes deployment for that image. But if you build it locally with that Containerfile and host it somewhere, or use your local registry, you should be able to create a local cluster and play with it too. So I'm using the one that I already have here. And what I can do is apply that deployment, and I have a pod running here with that. Let me make it bigger. Okay, so I have a pod running. I can get the logs here, and in the code that I wrote before, I was printing that each goroutine started. So if I go over there and literally delete the pod, oops, I think it already stopped running. Is that it? Yeah, it's not that slow, apparently. I'm gonna have to be faster. kubectl get pods. Probably a minute or so. Okay. Then it actually prints the shutdown status, if you look up there. It has the task status interrupted and things like that. So it's catching the interruption signal, it's doing what it needs to do, and then closing your app. Another gotcha for Docker and Kubernetes and things like that depends on the time that you need to wrap up your application. This was super fast because it was an example of things that were just in memory. But if I needed to run a two-minute block of code here, I would need a higher termination grace period. That is because Kubernetes has a termination grace period by default, I think it's 30 seconds, something like that. So if you need more than that, you actually need to say, hey, I need two minutes here, thank you. And then Kubernetes will actually wait for you. After this grace period ends, if you haven't finished, your application is just gonna be killed with a SIGKILL that you cannot catch.
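On the application side, one way to respect that grace period is to bound the cleanup work with a deadline, so the program knows when it is about to run out of time. The sketch below is an assumption-laden illustration, not part of the talk: cleanupWithin and sleepMillis are hypothetical helpers, and the grace value would have to match whatever is configured for the pod (the pod spec field is terminationGracePeriodSeconds).

```go
package main

import (
	"fmt"
	"time"
)

// cleanupWithin runs cleanup in its own goroutine and waits for it, but for no
// longer than graceMillis milliseconds. It reports whether cleanup finished.
func cleanupWithin(graceMillis int, cleanup func()) bool {
	done := make(chan struct{})
	go func() {
		cleanup()
		close(done)
	}()
	select {
	case <-done:
		return true // wrapped up before the deadline
	case <-time.After(time.Duration(graceMillis) * time.Millisecond):
		return false // out of time; a SIGKILL would follow in a real cluster
	}
}

// sleepMillis builds a fake cleanup function that just sleeps.
func sleepMillis(ms int) func() {
	return func() { time.Sleep(time.Duration(ms) * time.Millisecond) }
}

func main() {
	fmt.Println("fast cleanup finished in time:", cleanupWithin(500, sleepMillis(10)))
	fmt.Println("slow cleanup finished in time:", cleanupWithin(50, sleepMillis(2000)))
}
```

If the slow case is the realistic one, that is the signal to raise the grace period in the deployment or to shorten the cleanup.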
So make sure that you have this set up correctly, and you can probably apply the same things to long-running tasks that you have in your Go application. And this is gonna be useful exactly when you wanna build some MVP, when you are still not ready to actually move to a queue system, when you know that it is some kind of process that you will wanna move to RabbitMQ or a Kafka sort of queue, but you are not sure if it is the right time yet. So you can use goroutines to do this kind of work that runs concurrently and can take longer. But you also have to make sure that, if it takes too long, you have some way of catching the system signals and understanding what's going on, whether your program waits or your program stops somehow. So I think that's it. I'm not sure if anyone has any questions, but I'm happy to take them here, or on Twitter, or somewhere else. I think that would be all.
Hi, thanks, Bianca, for your interesting presentation, for showing us the code and explaining the entire flow of it. I don't see any questions in the Q&A section at the moment.
All right, and that's it. I mean, I'm happy to just chat any time about Go, about goroutines, about any of that. Hello, Wagi. I have my manager here watching the presentation. And yeah, I mean, I'm happy to take any questions afterwards and chat about it any time. Just ping me and I'm gonna be happy to chat.
Sure, thanks a lot, Bianca, once again.
Thank you.