Hello everybody. How are we doing today, on the last day of KubeCon? Good? Yeah, I feel about the same. That was about as half-hearted as I was expecting. Wonderful. Thank you very much for joining me today. We're going to talk about Agones and Quilkin: open-source multiplayer game server orchestration and service mesh on Kubernetes. I think I managed to fit almost all the buzzwords in. Thank you very much for hanging out with me.

A little about me before we get started. My name is Mark Mandel, or Mark Mandel, depending on whether I'm using my American or Australian accent. I'm technically in developer relations for gaming on Google Cloud. I'm one of the founders, I should say, and a maintainer on both of the open-source projects we're talking about today, which are Agones and Quilkin. You can find all my details up here.

Before we get started, I would love to poll the room, because this is going to be a distributed audience and possibly a little different from some of the normal people I give this to. Who here works in the video game industry? Small pocket of people over here. Okay, cool. That sounds fine. Who here went to Joseph's talk earlier today about Agones and PlayStation? Okay, a lot of you, but not everyone. That's perfect. Who here likes video games? Okay, now I understand what's going on. Awesome. That's great.

We're going to be talking about some fun stuff today, and we've got half an hour, so some of it will be a little surface level, but hopefully I'll be able to give you enough information to go, "Oh, that's interesting." Cool. So, what do I want to talk about today?
I want to talk about fast-paced, real-time multiplayer games: your Overwatches, your Fortnites, your Rocket Leagues, all that kind of fun stuff. And I want to talk about two interesting problems: a little bit about how we host and scale them, but also how we look at securing, monitoring, and doing interesting service-mesh kinds of things with the UDP traffic that comes with game server workloads. We'll dig a little bit into both, and we'll talk about how some of that stuff works as well.

Before we go too deep: I am going to be doing a live demo, because I like being afraid. I'll be using an open-source game called Xonotic. Who here has heard of Xonotic? I'm guessing, like, three people. Yep. Okay, I counted that about right. Perfect. If you remember games from circa 2000, like Unreal Tournament 2000, it's very much in that vein. It's lots of fun.

Okay, cool. So let's talk a little bit about real-time games. I'm going to talk about one particular pattern, which we call dedicated game servers. We're going to talk about some of the challenges therein, and hopefully how the two projects we want to talk about today, Agones and Quilkin, help mitigate some of those challenges.

So, dedicated game servers, a very quick primer. Think of it this way: you are playing a game with some people, and there is a full simulation of that game happening somewhere, in a process running in the cloud. That is what we call a dedicated game server. Many players connect to it, it holds the state of what's happening inside the game, and it communicates that state back down to the players. How dedicated game servers work is a whole talk in itself, but that's the basic deal. Now, there are some challenges that come with dedicated game servers.
I love dedicated game servers. I live with them all the time. But they are very much a monolith in the classic sense: one big thing that encapsulates pretty much everything that happens inside that game. You'll get everything from communication runtimes to simulation to player authentication, and it all sits inside this dedicated game server process. That means, yes, it is a single point of failure. Dedicated game servers usually hold a wide variety of in-memory state, because they're running that whole simulation inside of them. So if a game server goes down, all the players connected to it have a bad experience, which is not great.

Unfortunately, some people in the gaming scene aren't the nicest of people, so game servers can be a target for attack. They can get DDoSed. You can have all kinds of horrible stuff, people going, "I'm losing, I don't want to." It's fine.

So there are interesting problems about how to orchestrate and scale them. There was a talk earlier today about Agones. I'm going to talk a little bit about Agones, but we're going to focus probably more on the Quilkin side today. For those who aren't familiar, one of the projects I work on is called Agones, and we'll show some examples of it. Agones is an open-source project that teaches Kubernetes how to run game servers. Basically, the major thing it does is that it's aware that, hey, if you have players playing on a game server, don't shut it down. But if you don't have players playing on it, that's okay. We can shut it down.
We can do stuff to it. It introduces new concepts to Kubernetes, like GameServers and Fleets, and we'll have a look at some of that as well. It's nice. We've been working on it for probably about five years now. We've had a wide variety of contributors, and thanks to the extensibility of Kubernetes, with CRDs and operators and controllers, the sort of stuff we can do in there is really nice.

But let's talk about some more challenges with dedicated game servers, because I want to talk a little bit about UDP game protocols. If you're not familiar, pretty much all your multiplayer games, generally speaking (as with everything, it depends), are UDP based. Just take it for granted that that's the way it is. It's a low-latency thing, it works better. That's, again, a whole other talk. But the fun problem with dedicated game servers, or just gaming in general, around UDP game protocols is that there are no standards. Not a thing. Many use some kind of quote-unquote reliable UDP. That doesn't necessarily mean a re-implementation of TCP, but that is a thing that happens. Security and encryption can vary greatly.

You'll find there's a variety of, not really protocols, but libraries. Unreal Engine has some stuff, Unity has some stuff. There are libraries like netcode and KCP. Steam has GameNetworkingSockets. A lot of the time you'll just end up with a proprietary solution, because it's better for your particular game to get better bandwidth or throughput or anything else. And security, like I said, varies really widely. Some people will use DTLS, which is awesome. Some people take stuff from libsodium. I've seen projects where they're just like, "That little bit of QUIC is really nice, I'll take that, thanks." All the way down to, "Who's going to look at my traffic anyway? It doesn't matter." So it's very different from what many of us who work in the cloud native landscape are probably used to. There's no HTTP.
There's no gRPC. None of that exists. So the only guarantee you have in this environment is that you will get an array of bytes. Cool, awesome, thanks.

Okay, so how do we start looking at the problems we're talking about, like communication, and making some kind of standards or useful tools for this kind of stuff, so it isn't just baked into the dedicated game server all the time? I will definitely not be the first person to point this out. We've all seen this happen before: proxies. Proxies are great, and we see them all over the place, from sidecars to everything else.

In this environment, if we have a UDP proxy (just talking theoretically at this point), we start to get some nice things. We have redundant points of entry. We can't really get away from the monolith per se, but at least we can hide the information about where those game servers are. Usually we're making a direct connection to an IP and port. The game client gets that information, and suddenly it's available to you, so anything we can do to hide that is nice. We can put some commodity stuff in it, and we'll look at that: some monitoring, some information. If we want to manipulate the UDP packets, we can do that too. And it just generally makes game servers much harder to take down. We can distribute load. If we're getting DDoSed, we can put smarts in the proxy that otherwise we would have had to put inside our dedicated game server or run in a different process. So, yay, proxies. Definitely not a new concept to anyone here.

This is where we get into Quilkin. Over the last couple of years I've been building this with Embark Studios, and we've had some other people float through, which has been lovely. Quilkin is a non-transparent UDP proxy specifically designed for use with dedicated game servers. Again, there is no protocol. We want to do stuff like access control, telemetry data, metrics, and more. I will say, Quilkin is not yet GA.
I think that's the best way to put it. It works. We keep finding new stuff, so we break APIs on occasion. But I've been working on it with Embark Studios, and I think it's safe to say that they've been using it for their upcoming game, because they've published stuff about that. And we've had some uptake in some other places as well. So I'm going to take you through an example of these two working together, and how we can both orchestrate game servers and do access control on them, ideally without actually touching our game server code, and potentially not even our game client.

So, a little bit about Quilkin. Quilkin really has two concepts, and you'll pick up that they're kind of the same concepts as in Envoy. Thank you very much to that team for opening up that proxy space. That's awesome. We lifted a bunch of stuff from there. We have endpoints, which are basically addresses, and we have filters, strangely enough. We have a variety of filters, and we'll probably expand them over time. We can do all kinds of stuff: basic firewall rules, compression, routing to different endpoints (we'll look at that a fair amount), adding to packets, removing from packets, all that kind of fun stuff. But everything we do inside our filters really is about manipulating an array of bytes, because that's all we can guarantee we get.

If we have a look at a sample config here, this is just a static Quilkin config. It's YAML, just like everything else we do with our time. We can specify a set of filters. Here we're concatenating bytes, and we'll look at this again later. So if we wanted to add, say, a routing token to the end of our packets, which is exactly what we're doing here, we can do that through a concatenate bytes filter. We can say, hey, append, and add our routing token, which is not very secure.
Please don't do this. The token, 456, goes onto the end of every packet that comes through our system, and then we can pass it on to an endpoint. Nothing more fancy than that. So we can start layering these filters to manipulate our packets, and even though each one is fairly rudimentary, we can start to do some relatively sophisticated things.

Okay, so that's the static configuration. But you might be saying, "Okay, cool, Mark. You've got game servers. That's dynamic. They're spinning up, they're going away, people are playing on them, you're scaling them up and down. How do we integrate this with something like Agones, an orchestrator that manages game servers?" Well, again, we borrowed a little bit from Envoy. We implemented an xDS API. It's an xDS-compliant API. If anyone's looking for a Rust implementation of xDS, we have one, I guess. That might be useful for some people.

Is anyone here not familiar with xDS? Which is also totally fine. Some people. All right, cool. xDS is a relatively standard API for configuring proxy servers. It has cluster discovery, listener discovery, and endpoint discovery, so you can write programmatic configuration of how you want however many proxies to be configured: their filters, where they're sending traffic, and that kind of stuff. But that's hard, and we don't want to do that, right? I know a bunch of hands went up, like, "I don't know how xDS works, that seems hard." So we looked at it and we were like, okay, we are often building stuff with Agones, which we'll actually look at in action in a bit, and we want to run Quilkin with it.
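Editor's sketch, not from the talk: the static configuration described earlier, where a chain of filters appends a fixed routing token to each packet, can be modelled in a few lines of Python. This is only an illustration of the idea. The names `PacketFilter`, `append_bytes`, and `run_chain` are hypothetical and are not Quilkin's API; Quilkin itself is configured through YAML rather than code like this.

```python
from typing import Callable, List

# A filter is modelled as a function from packet bytes to packet bytes.
# This is an illustrative model only, not Quilkin's real filter interface.
PacketFilter = Callable[[bytes], bytes]


def append_bytes(suffix: bytes) -> PacketFilter:
    """Build a filter that appends a fixed suffix to every packet."""
    def apply(packet: bytes) -> bytes:
        return packet + suffix
    return apply


def run_chain(filters: List[PacketFilter], packet: bytes) -> bytes:
    """Pass a packet through each filter in order and return the result."""
    for packet_filter in filters:
        packet = packet_filter(packet)
    return packet


# The demo token from the talk; a real token should be random and secret.
ROUTING_TOKEN = b"456"
client_chain = [append_bytes(ROUTING_TOKEN)]

if __name__ == "__main__":
    # The payload is opaque to the proxy: it only ever sees an array of bytes.
    print(run_chain(client_chain, b"game-payload"))
```

Because every filter has the same bytes-in, bytes-out shape, filters compose by simple ordering, which is the layering the talk describes.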
So yes, you can write your own programmatic configuration of how this stuff works. But since this is a common pattern, and we like both of these open-source projects and want to use them together, we created a set of providers inside Quilkin. This means you can run Quilkin in proxy mode, but you can also run it in a manage mode, which says, hey, we're going to take all our dynamic information from what's inside the Kubernetes cluster, or what's happening inside Agones itself. So we can see things like, oh, there's a bunch of game servers running in your cluster, and those become endpoints. We'll have a ConfigMap so you can see what's going on inside the system and which filters we want. We'll have a look at that in action, because this is one thing I think is actually quite exciting: when you start building layers on top of these layers, suddenly you have systems that can talk to each other without having to know the complexities of things like xDS, although that capability is still there. I think that's really, really cool.

So let's have a look at how that works. All right. If you have Quilkin in manage mode, you drop a ConfigMap in your cluster. Cool, ConfigMap. Here we just have another set of filters that we want to apply. Now, you saw before that we sent in a routing token. What we're going to do here is capture that routing token. It's the last three bytes of the packet as it goes through. Again, we just get an array of bytes, nothing more than that. We're going to take those three bytes and strip them off the packet. The capture filter will pass that routing token to the next filter. The token router will be like, oh, cool.
I see you have a routing token. And as part of the metadata for endpoints, you can attach tokens to them, strangely enough. It'll look through the endpoints and see which one the token matches, and if it matches an endpoint, that's where it's going to send the packet. We'll see this in action in a sec. But this is basically all I need to configure the filters inside N number of proxies in my cluster when I'm running in manage Agones mode.

Next, we create a Fleet. I know some of you weren't here for the previous talk about running Agones. Agones has GameServers and it has Fleets. In Kubernetes you have Pods and Deployments; GameServers and Fleets are kind of the same. Fleets are just big groups of game servers. Usually you want to run a bunch of them, because they take a while to spin up, and you want a warm set so that you can basically grab one of them, which we'll talk about in a minute. It's a CRD. Basically what we're saying is, hey, how many of them do we want, and what does the container spec look like? It's a pod template spec. We do a lot of port management and all kinds of other stuff; for today's talk, I'm going to hand-wave a lot of that away. But essentially I'm going to have a bunch of game servers running inside my system, spun up using this Fleet. We're going to have three of them. They're going to come up in what's called a Ready state, which basically means they've loaded all their information and they can take players, and we're going to have them available.

But there is one other concept that's going to tie Quilkin and Agones together in a really nice way. I was saying before that when Quilkin is running in manage Agones mode, endpoints get picked up when game servers exist.
They actually get picked up when game servers move into what we call an Allocated state. Allocation is kind of a weird, special thing that Agones does that breaks the Kubernetes paradigm just a smidge. We'll see it in action, but it's more of an imperative command that basically says: go find me a game server from the set that matches those selectors we have up there, find the best one for me, and mark it as Allocated. Allocated is the special state we were talking about before, the one that says, "I have players, don't mess with my game, because I need them to finish." Otherwise they post mean things on Reddit. We don't like that. That's bad. Nobody likes that. Or they put things on Twitter, or whatever people are using these days.

The other thing Quilkin does that's quite nice is that it looks for annotations on those game servers to know what the routing tokens are, so it manages that for us. One of the extra things we can do during allocation is attach an annotation to the game server that's being requested. So we're like, oh, cool, go get me that game server. Sweet, I'm going to grab it, I'm going to mark it as Allocated, and then I'm also going to say, these are the routing tokens that are valid for it right now. I've done some random number generation, hopefully something cryptographically secure, and I'm going to hand those over. I could do that at allocation time. I can actually do that whenever, as long as I have access to the Kubernetes API. And then in theory all of that should roll through together, and we suddenly have access control. We can do things like: if this player has the right token, they get access to their game server, and if they don't, they get booted off. If they're doing something nefarious, we can stop their access pretty quickly across a whole range of proxies.

So, should we try it in action? Cool, this will be fine.
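Editor's sketch, not from the talk: the capture and token-routing steps described above amount to splitting a fixed-size suffix off each packet and looking it up against the tokens attached to each endpoint. The Python below is a minimal model under stated assumptions. The names `capture_suffix` and `route` and the endpoint table are hypothetical, the first matching endpoint wins (a simplification), and this is not Quilkin's implementation.

```python
from typing import Dict, List, Optional, Tuple

Address = Tuple[str, int]  # (host, port) of a game server endpoint


def capture_suffix(packet: bytes, size: int) -> Optional[Tuple[bytes, bytes]]:
    """Split the last `size` bytes off a packet.

    Returns (payload, token), or None when the packet is too short
    to contain a token at all.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if len(packet) < size:
        return None
    return packet[:-size], packet[-size:]


def route(
    packet: bytes,
    endpoints: Dict[Address, List[bytes]],
    token_size: int = 3,
) -> Optional[Tuple[Address, bytes]]:
    """Return (endpoint, payload without token), or None to drop the packet."""
    captured = capture_suffix(packet, token_size)
    if captured is None:
        return None
    payload, token = captured
    for address, tokens in endpoints.items():
        if token in tokens:
            return address, payload
    # No endpoint holds this token: this is the access-control drop.
    return None


if __name__ == "__main__":
    # Hypothetical endpoint table: one game server that accepts token b"456".
    table = {("10.0.0.5", 7001): [b"456"]}
    print(route(b"hello456", table))
    print(route(b"hello999", table))
```

Removing a token from the table is enough to cut a client off, which is the revocation behaviour the talk attributes to updating annotations at runtime.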
I also realized that my cluster is still in the US, so the gameplay traffic might be a little slow, but it should be fine. All right. Oh, let me go... I have no idea, is that big enough, or do I need to make it bigger? Bigger, bigger. All right, let's go. I'm also going to do this over my shoulder, just for funsies. We're good? Thumbs up. Excellent. All right, this should work. Where's my mouse? Perfect. Okay, I'm in the right spot. I am not. Fantastic.

Okay, so I have nothing in my hat. I have a cluster that has Agones already installed on it. I don't have any game servers. There's nothing there. So the first thing I'm going to do is create that Fleet of game servers we saw before, with three game servers. And the internet here is good. Yes. If we have a quick look, we can see we have three of them. They're going to take a little bit to spin up; they're about a gig in size each. This is the dedicated game server for that game we were talking about, Xonotic. You'll see the state here, and they each have an address and port. If you've ever played on dedicated game servers, behind the scenes you usually get a direct address for the client to connect to. Quick look. Beautiful, they're all up and Ready. Fantastic.

All right, so the next thing we need to do is install Quilkin in its manage... what's it called? Manage Agones mode. What files do I have in here? Yeah, there it is. So that's going to install that ConfigMap we saw before with all the filters in it, and we're going to see that the xDS control plane has spun up. Basically it's just a gRPC endpoint that each of the proxies is going to connect to so that it knows what configuration to get. Beautiful. Yay, up and running. Fantastic. Good start. You can also see there's a Service for it too. Yeah, perfect. Okay, next we need some proxy pools to connect to, and a UDP load balancer to throw in front of them. Those will be set up so that... where are the proxy pools?
Yeah, there we go. Those will be set up so that they connect back to that manage Agones mode, and they'll be looking at the configuration we've set up in that ConfigMap and at what game servers we have available. Now, all our game servers are Ready, so we won't have any endpoints yet. We can actually inspect that. Let's just have a quick look at the Pods. Beautiful, that's all up and running.

All right, let me show you what that looks like. We have a config endpoint, so we're going to have a quick look. Can we port forward? Yeah, give me that. Beautiful. We'll just hide that for a sec. No, wrong one. So this now means we can have a look at our config, and that's a lot, so let's scroll up a little bit. We can see that the config on that proxy has the filters we were talking about before. There's our capture, there's our token router, so it's going to grab the end of the packet and pass it along. We can see we're connected to our Quilkin management system, but we don't have any endpoints or anything like that set up. So let's get that going.

I'm also just going to quickly check whether we have an external IP. We do. Okay, so that's where we're actually going to connect to when we send traffic for our game. Rather than connecting directly to the game server, we can send it through a load balancer, scatter the load across the proxies, and hopefully save ourselves some headaches.

All right, let us allocate a game server. I'm going to be the manual matchmaker that says, okay, I need a game server, let's get this up and running. Usually this is something a matchmaker does through an API or something like that, but we'll do it on its behalf. So I'm just going to create this GameServerAllocation. It's going to look through my game servers and return me the state of whichever one it picked. It's Allocated, so we can actually look at it. This, again, means we have players on it. It's marked as special.
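Editor's sketch, not from the talk: the relationship shown in the demo, where Ready game servers produce no proxy endpoints and an Allocated one does, can be modelled as a small state change plus a derived endpoint table. Everything here is a hypothetical illustration in Python: the `GameServer` class, `allocate`, `proxy_endpoints`, and `make_fleet` are invented names, the addresses are made up, and picking the first Ready server is a simplification of however Agones actually chooses "the best one".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class GameServer:
    """Toy stand-in for a game server record (not the Agones CRD)."""
    name: str
    address: str
    port: int
    state: str = "Ready"
    tokens: List[bytes] = field(default_factory=list)


def make_fleet(replicas: int) -> List[GameServer]:
    """Create `replicas` Ready game servers with made-up addresses."""
    return [GameServer(f"gs-{i}", "10.0.0.1", 7000 + i) for i in range(replicas)]


def allocate(fleet: List[GameServer], tokens: List[bytes]) -> Optional[GameServer]:
    """Mark the first Ready server as Allocated and attach routing tokens.

    Returns None when no Ready server is left.
    """
    for server in fleet:
        if server.state == "Ready":
            server.state = "Allocated"
            server.tokens = list(tokens)
            return server
    return None


def proxy_endpoints(fleet: List[GameServer]) -> Dict[Tuple[str, int], List[bytes]]:
    """Only Allocated servers are exposed to the proxies as endpoints."""
    return {
        (s.address, s.port): list(s.tokens)
        for s in fleet
        if s.state == "Allocated"
    }


if __name__ == "__main__":
    fleet = make_fleet(3)
    print(proxy_endpoints(fleet))   # no endpoints while everything is Ready
    allocate(fleet, [b"456"])
    print(proxy_endpoints(fleet))   # one endpoint, carrying its token
```

In this model the proxies never need to know about matchmaking; they only see the endpoint table that falls out of the allocation state.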
It's it's demarcated a special Awesome, so it's allocated. We can see that there so we get it to scale up and down We could actually do rolling updates and all kinds of other fun stuff to the cluster without having to worry about Whether or not something bad was gonna happen. Okay, cool. Oh No, now we have to actually connect to it and play a game But before we do that if you remember we need to attach that routing token to the end of our packets before it goes through Now I have this open-source game. I don't want To mess with the code on this open-source game that seems scary and I don't like that So we can run an instance of the proxy to do this too, right? In production you may do this as part of your game client Just because things like consoles are really restrictive on like what you can do with them But PC and other places this would probably be perfectly fine What I need to do is Get my external IP Looks pretty good. This is gonna be great for my neck later. So you'll you'll uh Where are we let's scroll down you'll recognize this from before this is literally the same can catnate bite Filtering that we're talking about previously so rather than Let's put our load balancer. I can beautiful Awesome. All right, so theory being is what we can do is we can spin up an instance fair Quilkin locally Cool, that's gonna accept our game client traffic Which is gonna stick our client token on it to a routing token on it That's gonna send to the load balancer then from the load balancer will go to one of the proxies the proxies Can be like cool. I see you have one of the tokens. That's fantastic. 
Actually, let me show you something real quick before I do this. The proxy will look at the token, pass the token along, and be like, oh, that's the endpoint I send to, and we'll be able to play a game. But if we have a look at our config real quick, we can see our endpoints now, right? We allocated a game server, and the management system sees that there's an address, and there's our token on it, 456. Very secure stuff. So in theory this should all work now.

Okay, moment of truth. Let's run our proxy. That's all up and running. Let's pop over here and run Xonotic. This will be fun. All right. So rather than connecting to a remote address, I'm going to connect to 127.0.0.1 on port 7000, which is local. And everything is working. Black screen, which for whatever reason in this game is actually a good thing. Don't ask me why. But yes, now this is connecting to my Agones cluster, which is running over in, I think it's us-west, actually, but it's going through my local proxy. And I got dead already. Wow, that was really fast. That was on purpose, for sure. It's because I'm playing on a trackpad, it's not because of how bad I am at video games. But yeah, it's running the traffic through the local proxy, through the UDP load balancer, and then into the game server.

People always ask, "Mark, what's the overhead for this?" Usually we see a p99 of about half a millisecond through the proxy. That's sort of the worst-case scenario, which is reasonable and fine. It's not too bad. But it does mean we can now do nice things like manipulate our traffic. We can do the access control we were talking about before. We can remove or add stuff. If we want to remove that annotation at runtime from that game server, because I'm doing something I shouldn't be doing inside the game, we're able to do that.
And you can start building out architectures of proxies as well. But there is actually something else I want to show you that's also cool. Again, we're talking about breaking apart the monolith. It's very much that traditional monolith-to-microservices kind of conversation. I will exit this, that is fine. But if I come over here and we refresh this page (it should be fine), yeah, suddenly we can do stuff like have generic metrics for our UDP communication that otherwise wouldn't necessarily have been as easy to get out of a game server. I don't know if you know this, but there's not really a Prometheus exporter for, like, Unity or Unreal. It'd be really nice if there was. Actually, I should talk to those people, that'd be really handy. If any of you are here, can you write one of those? Thank you. But we can start seeing things like network traffic rates, packet drops, errors, how many sessions we have, and their length. Again, all that nice stuff you get from open-source standards, where you're like, oh look, I'm running basically a service mesh, and I can start to pull this kind of information out. We have lots of other cool stuff.
That's all stuff we're looking at doing. One of the things I'm actively looking at right now is being able to talk to external systems. Like, if you're rate limiting something and you've got someone who's sending way too many packets, and you know they shouldn't be doing that, we should tell a firewall to stop them, or something like that. We can do all kinds of interesting stuff there. But again, we start to break apart the monolith, which I think is really exciting. It's particular to this use case, but we can start to set some standards as well.

Everything we're building here is also very... oh, you don't see that. Very simple in its usage, if you do need to build it into your game client, which in some cases you will. Again, what we're doing here is adding a routing token to the end of a packet. We're trying to be as simple and composable as possible. We also have some Unreal clients in there as well, if you want to do that, for those of us who work in video games. I know there's just a small pocket over there. I just wanted to mention that.

All right, cool, let's wrap it up. We saw a nice demo showing that we can implement UDP proxies. And I definitely want to say I'm not the first to do this. There's a bunch of prior art here that we've leveraged, across both wider tech and the gaming industry. But suddenly, yeah, we can start to talk about open-source, standard tooling for this kind of work. We can make it so that game servers are harder to take down.
We can have more redundancy. We can give some of what have traditionally been more proprietary proxying solutions for games out to the wider video game industry, which makes me happy. And it does also mean that, in a space where there are no standards, when you start building proxies (we've seen it with Envoy and Traefik and the whole variety of proxies and sidecars in the cloud native space) and you make it easy to do, I don't want to say the right thing, the better things, then you can start to set those standards. That also really excites me.

So yeah, this is literally what we ended up with in the demo I showed. This isn't the only way you could configure this. You could do all kinds of other fun stuff. You could be like, here, I'm going to give the game client three proxies out of the set of 20 that I have. Maybe you'll get some of them, maybe you'll get all of them. Maybe I'll do fun things like honeypot proxies, when I know some people are not very nice people, and be like, I'm just going to send all the bad traffic over here, because I know it's bad traffic. You can start doing all kinds of other interesting things. But yeah, we have the Agones xDS provider. It talks to the dedicated game servers, sees what's going on there, and can manipulate all the rest of the proxies. So yeah.

Finally, to wrap up, I will also mention my business cards are down here at the front. So if you think, "Oh, maybe in the future I will have a question for Mark," my contact details are there. We have active communities across both projects. If you're interested in game server orchestration, or just in general operators and CRDs and extending Kubernetes in weird and wonderful ways, go to agones.dev. We have lots and lots of contributors there.
We have a Slack that's available as well. Quilkin has been running for a couple of years, but it's still, I would say, relatively new. It's been growing quite nicely over the last little while. You'll find a Discord there. You can tell the age of each project by which chat platform everyone uses. We would definitely love to have you contribute.

I do want to give some credit as well to the external community I work with on Quilkin, especially those from Embark, Luna Duclos. I started the project with Ifeanyi Ubah, who did a lot of the initial work, and Erin Power, who's been doing a lot of the work on Quilkin in the last little while and is absolutely killing it.

Okay, finally, I know it's almost lunchtime and you all want to run, but if you do have questions, please walk up to the microphones and ask them. Otherwise, thank you very much for your time. I know it's precious, and I really appreciate you spending your KubeCon with me. I hope you learned something interesting.