Let's go. Hello everyone, thanks for coming out today. My name is Joseph Irving and I work at PlayStation. To give you a little bit of context about what I do: I work in a centralized team at PlayStation called Online Technology, and we assist the various PlayStation studios with doing online stuff. That's generally hosting services, like back-end services for games, and also the infrastructure and best practices around doing that kind of thing. By studios I mean studios such as Guerrilla Games, who make Horizon, they're based in Amsterdam, some of them are here today, and studios like Santa Monica (God of War), Insomniac (Spider-Man), et cetera. These are what we collectively call PlayStation Studios, so there are many of them, but when I talk about studios, that's who I'm talking about. We don't make any games ourselves, we just assist these teams with the online side of things, especially as PlayStation is, at least these days, considered more of a single-player company, but we're moving into doing more live-service, multiplayer-style games. And that's what the talk today is going to be about: what are the challenges of running these multiplayer games, especially if you want to do it in a Kubernetes cluster, like the madmen that we are. So here is the high-level table of contents for today, which I'm going to take you through, about the challenges of running these things in Kubernetes. But like all video games, we have a mandatory unskippable tutorial that we're going to do first, and that is just to explain: what is a real-time game server? What am I talking about? And I will start by saying what isn't a real-time game server. So in a lot of games, the game client is the thing that actually runs on your machine, whether that be a PS5, a PC, or your phone; that's what we call the game client, and that runs the game.
These often talk to a variety of different back-end services that are running in the cloud or a data center. These could be something like a leaderboard service with the highest scores, or the fastest times for a racing game; it could be things like in LittleBigPlanet, where you've got player-generated content, player levels; or it could be an in-game store where you can purchase items. These can be run quite easily in Kubernetes. They lend themselves to it: there's an API, probably a gRPC or HTTP API, running in a Kube cluster, and we already use this for a variety of different titles that need to talk to that kind of thing, whether they're single-player or multiplayer games. We often call them async; it's a term we use internally, because these calls often happen in the background. You might be loading this in while the player is playing; it doesn't necessarily present itself to the player immediately, and therefore we don't consider it latency-sensitive. If these things take a little bit longer to respond, it's not going to completely ruin the player experience. It might just be a bit annoying, but not in a way that will ruin your gameplay experience. So if that is what isn't real time, what is real time? Real time is something like this: it's when you've got a group of players all playing together in the same physical or virtual space, if you like. In this example, we've got a lot of players in classic World of Warcraft who are all gathered at their PCs; they've connected together, they're playing with each other, except for Leeroy, who at least has chicken. Now, what might this look like? How do players actually connect with one another? In some models you have what's called peer-to-peer. This is when your consoles or PCs directly connect to one another and send packets to one another. If I do something on my console, that gets sent to your console, and vice versa, and that's how they communicate.
Now, this can work with two people, in just a 1v1 or co-op, let's say. But then we can introduce another player into the mix, and it becomes a little bit more complicated. In this scenario, you might have one console that's been elected the leader, and this is what everyone connects to. What happens in this person's game world is what goes; effectively, it relays back to the other people what happened. Now, you might realize that in this scenario, if that person's internet connection is unstable, or if their console shuts down or whatever, all the other players would at that point lose connection to the server and be booted out, which is one of the issues with scaling this to lots of players. There's also the problem of trust. Everyone's playing on their own console, and at the end, let's say they're playing a round of a game, they can all then report back to the server: I won. Right? If I've hacked my client, how do you know who to trust? All the gameplay was happening on people's personal machines, and at the end of the gameplay session, they could all report back different things, and we're not sure who to trust, who is correct. This can be quite problematic in a game where you've got ranked matchmaking, or league systems where people are going up in ranks; you want to be sure that what happened is what actually happened. Or it could be a loot-based system where you're saying, oh, the best legendary item ever just dropped for me, and then you tell the server that, and it has to believe you. So this is one of the problems with using this approach for more competitive games; you're probably going to want to avoid it. One way you can perhaps avoid the single point of failure is to get all the consoles to connect to each other.
So you could form a kind of mesh network, if you like, where everyone's sending packets to one another, and this can work a bit better, but you've still got a lot of network complexity, potentially. For example, if I were to add just two more players to this diagram, it starts to look like this. And as you can imagine, as we scale this up to a 15v15 game, it wouldn't be practical for all of them to be connecting to each other through people's personal internet routers, et cetera. So this is where the concept of a dedicated game server comes in. This is a game server running in a cloud, a data center, wherever, and all the players just connect to this and this only. They send all their information to that game server, and it sends back to them what's happening. So you no longer have the problem where, if one player's internet connection is unstable, it affects the other players' experience as much; one player can drop out and it won't cause everyone to be kicked out. It also acts as the source of truth. At the end of a match, it can say: okay, well, I saw all these things happen, therefore they are true, and yes, this person won or this person lost. What this looks like is that the game server is actually simulating the game world itself. So if you have your game world on your console, and another person's game world on their console, the game server is also simulating the same game world. If one player moves their car, in this example, their console will send that to the game server: hey look, I've moved over here, I had this collision. The game server will then do the same thing in its game world, and then pass it on to the other players and show them that's what happened. But at the same time, everyone else is also sending their information to the game server, saying, well, I've moved over here.
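As an aside, the mesh-scaling problem above is easy to quantify: a full mesh between n players needs n*(n-1)/2 direct links, while a dedicated server needs only n. A quick sketch (the function names are just illustrative):

```python
def mesh_connections(players: int) -> int:
    # Full peer-to-peer mesh: every player connects to every other player.
    return players * (players - 1) // 2

def dedicated_connections(players: int) -> int:
    # Dedicated server: every player holds exactly one connection.
    return players

# A 15v15 match (30 players):
print(mesh_connections(30))       # 435 links through home routers
print(dedicated_connections(30))  # 30 links to one well-connected host
```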
So maybe the player on the right thought they weren't in collision with that car, and it's up to the game server to determine who was right and who was wrong. This can come down to a variety of different logic that's way too complicated to get into, but it's the game server that ultimately decides what happened, and so it's in charge. And this is basically how it works. There are some clever things to smooth out latency problems; a common one is to predict the trajectory of player paths. If you imagine a player running in a straight line, the game server will assume they will continue to run in that straight line until it receives a signal telling it otherwise. This is why, if you've ever played an online game and someone suddenly starts running into a wall non-stop, it's because the game server is thinking: well, they were going in that direction and they've disconnected, so they're probably still going in that direction. And they'll teleport somewhere when they come back, because it now knows where they actually are. That's the gist of what the game server is. It's something we run remotely, and it's recording moment to moment; every action the players take, it has to record and track. It normally runs over UDP, because we need it to be low latency, and it's a persistent connection. And when I say persistent connection, I mean persistent for the session of a game, a game session. If you imagine one match in an FPS game, one round of a deathmatch, that might be one persistent connection to a game server. At the end of the match you would disconnect, but for the entire match you need to be connected. If you get disconnected at any point, you're going to get an error, or horrible lag, or something on your end. And as I said, it's very latency sensitive. Any kind of latency problems, and you get players teleporting around, or a slow frame rate on your end.
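That extrapolation trick, often called dead reckoning, can be sketched in a few lines; this is a simplified illustration of the idea, not how any particular engine implements it:

```python
from dataclasses import dataclass

@dataclass
class PlayerState:
    x: float   # position (game units)
    y: float
    vx: float  # last known velocity (units per second)
    vy: float

def extrapolate(last: PlayerState, dt: float) -> PlayerState:
    # No packet received for dt seconds: assume the player kept moving
    # in a straight line at their last known velocity.
    return PlayerState(last.x + last.vx * dt,
                       last.y + last.vy * dt,
                       last.vx, last.vy)

# Player was at (0, 0) running right at 5 units/s, then went silent.
guess = extrapolate(PlayerState(0.0, 0.0, 5.0, 0.0), dt=0.2)
print(guess.x, guess.y)  # 1.0 0.0 -> keeps "running" until a packet arrives
```

When a fresh packet finally arrives, the server snaps the player back to their reported position, which is the teleporting you see in game.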
It can present in different ways, but it's not going to be a fun experience, so we want these to be fast, with as little as possible in between. So some of you may already be thinking: this sounds like a terrible idea, trying to run this in Kubernetes. And that is what we are going to do. So, running in Kubernetes. Out of the box, it effectively won't work, and that is where a project called Agones comes in. Agones is an open source tool by Google; you can find it at googleforgames/agones. It introduces a load of new concepts and extensions on top of base Kubernetes to make running game servers a much more pleasant experience. One of the creators of Agones is in the audience, so please pester him, not me, afterwards if you have any questions. And yeah, I'm going to take you through how it does that, and I'll do that as I talk about some of the problems inherent in running a game server in Kubernetes and how it can solve them. So the first problem we'll talk about is termination. Pod termination is something that happens all the time in a lot of Kubernetes clusters. The two most common reasons for a normal pod termination would be either the cluster autoscaler kicking in, or deployment rollouts. In the first case, it scales down a node because it's no longer needed, or the cluster shifts things around: it will drain all the pods off a node and move them somewhere more efficient. You don't want this to happen if players are connected to the pod, because as soon as it happens, they'll be chucked out of the game, right? And the match won't finish or something. The other one is deployment rollouts. If you're updating the version of your application, rolling out a patch fix, whatever, it will spin up new pods and, as they become healthy, delete the old pods. Again, you don't want this to happen while players are playing there. And Kubernetes has no real concept of that. It doesn't know who's playing where. It's just not a concept it has.
So Agones has a new concept called a GameServer, which is a custom resource in Kubernetes. In Kubernetes you can add extra resources that don't exist by default using custom resource definitions, and Agones adds quite a few; we're going to cover some of them. This is the most basic one: the GameServer. The thing to draw your eye to is the spec, which you might notice is very similar to a pod spec, and that's because it is very similar to a pod spec, in that each GameServer is one pod. It maps one to one. If I create a GameServer, it will spin up a corresponding pod, and in many ways it behaves exactly the same as a pod, but it has some additional metadata around it that's more game-server specific. There are a few extra things in the spec. I won't go over all of them, but one thing to draw attention to is the container ports. Generally we're running game servers as host-network pods that just expose a specific UDP port. So if we're trying to stack a load of game servers on one node, they each need to know what host port to use. They can't have a static host port defined; it has to be something that's generated for them, and this is where Agones helps: it can pick a port for you, one that's not being used, and the game server can spin up on that. We do everything host network, which is quite common, because of latency; there's no service mesh we're going through to hit our pods here. Generally a game client will be connecting directly to a pod running on a cluster somewhere with nothing in between. No load balancers, no nothing. Effectively, though, this is a pod. You can see that if I get gameservers, it responds with some game servers, and each one of them has a pod that corresponds with it. There's something else they have.
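A minimal GameServer manifest along these lines might look something like the sketch below (field names follow the Agones API; the name and image are hypothetical placeholders):

```yaml
apiVersion: agones.dev/v1
kind: GameServer
metadata:
  name: my-game-server
spec:
  ports:
  - name: default
    portPolicy: Dynamic    # Agones picks a free host port on the node
    containerPort: 7654    # the UDP port the server binds inside the pod
    protocol: UDP
  template:                # an ordinary pod template lives underneath
    spec:
      containers:
      - name: game-server
        image: registry.example.com/my-game-server:0.1  # hypothetical image
```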
They have extra states, which are more game-server specific; in this case they're in a Ready state, which is quite normal, but we'll see the difference. How this works in practice is that a player, person, computer, whatever, can request a server from Agones, and it will go and have a look whether there are any game servers in a Ready state. If they are Ready, that means they're not currently occupied by anyone; no players are playing here, it's just ready to go and play some games. It will return that to the player, and at the same time it'll change the state of the game server to Allocated, and this means that players are playing on it. Only by some kind of interaction can you change the state; for example, the game server could say at the end of a match, okay, I'm done, there's no one on me anymore, or it can just terminate itself and a new one can spin up. But at this point we know that there are players playing on the game server, and therefore we should not terminate it. And the way this is done, from the user's perspective, or whoever does it, is by using something called a GameServerAllocation. This is how you actually ask for a game server: you create another CRD, called a GameServerAllocation. A good analogy is that it works quite similarly to how a service selector works in Kubernetes, in that you have selectors and match labels, similar to how you would map a service to a collection of pods. It's similar with this GameServerAllocation: you say, hey, I want a game server that has these labels on it; for example, we're doing a matchLabels on this one. There's also some extra, more game-specific stuff with it. At that point I've got players: how many players can this game server fit? Maybe I've got five players and I'm trying to find them a game server pod, so I need one that's got five spots available, or something like that.
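A GameServerAllocation along these lines might be sketched like this (the labels are made up for illustration); once Agones processes it, the object comes back with a populated status carrying the address and port:

```yaml
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
spec:
  selectors:
  - matchLabels:
      game: my-game        # hypothetical labels identifying the game servers
      mode: deathmatch
# After creation, the returned object carries a status along these lines:
# status:
#   state: Allocated
#   gameServerName: my-fleet-abcde-xyz12
#   address: 203.0.113.10
#   ports:
#   - name: default
#     port: 7104
```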
So it has some extra metadata, but you create this, and what happens is it returns back an updated version of the GameServerAllocation with, in the status field, the actual address of the game server added. The most important things here are the address and the port, because this is what the game client is actually going to connect to. So this is an allocated game server; it's ready to go, connect your players to this. Now, everything I've talked about so far was just one static, solo game server: you create one GameServer, you get one pod. But just like in Kubernetes you don't generally create one pod at a time, you use a Deployment or a StatefulSet or something to create a collection of pods, Agones introduces the concept of a Fleet. The Fleet is very similar in concept to a Deployment; it looks quite similar from a spec perspective. But the difference between a Fleet and a Deployment is that it creates GameServers, which then create pods. So you create a Fleet, and it creates a load of game servers in a Ready state. Again, this is all very similar to a Deployment, but the main difference is it's aware of the fact that these are game servers and need to be treated a little bit differently than a standard deployment. Just to give you an example of how it understands that: here we have a scenario where I've got four game servers, two of them in an Allocated state, so players could be playing on those; we don't actually know, but potentially people are playing on them. And two of them are in a Ready state; that means at no point has this game server been given to anyone as an address to connect to.
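A Fleet wrapping a game server template like the earlier one might be sketched as follows (again, the names and image are illustrative):

```yaml
apiVersion: agones.dev/v1
kind: Fleet
metadata:
  name: my-fleet
spec:
  replicas: 4              # desired number of game servers
  template:                # a GameServer template...
    metadata:
      labels:
        game: my-game      # hypothetical label, matched by allocations
    spec:
      ports:
      - name: default
        portPolicy: Dynamic
        containerPort: 7654
        protocol: UDP
      template:            # ...which itself wraps a pod template
        spec:
          containers:
          - name: game-server
            image: registry.example.com/my-game-server:0.1  # hypothetical
```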
So if I were to run a kubectl scale to one replica, in a normal Kubernetes scenario with a Deployment, you would end up with one pod, right? But in Agones you don't; you end up with two, and the reason for that is it won't delete the pods if there are players playing on them. It understands that there are people playing here, and therefore it should not terminate those pods. In the same way, if you were to do a deployment rollout, it won't terminate pods while players are still playing on them. So it's very much like a Deployment, just a lot more careful. It's not going to terminate stuff when people are playing on it; it's generally up to the pods themselves to say: I'm done, I'm ready, take me down. So that's the first issue kind of solved, of how we stop the pods being terminated all the time: using these Fleets and the game servers they create. The next problem we encounter is autoscaling. In a standard Kubernetes cluster, the normal way to autoscale your pods is either using an HPA or potentially a VPA, an HPA being a Horizontal Pod Autoscaler, which is where you scale up the replicas of your pods depending on some kind of metric; by default you can use CPU or memory, and you can also use custom metrics if you want. But this concept doesn't make as much sense with game servers. If a game server is running at 80% CPU, spinning up another game server isn't necessarily going to reduce the load on that game server; you can't just keep adding new game servers. It's not like HTTP requests, which you're just evenly spreading across the cluster. There's a load of players on here, and they're playing until they've finished; adding a new game server isn't going to help. So it's more about the general state of the game: how many players are playing, who needs a game server. These are more the problems you encounter with game servers.
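As a preview of the answer covered next, a buffer-policy FleetAutoscaler might be sketched like this (the numbers are illustrative):

```yaml
apiVersion: autoscaling.agones.dev/v1
kind: FleetAutoscaler
metadata:
  name: my-fleet-autoscaler
spec:
  fleetName: my-fleet      # the Fleet to scale
  policy:
    type: Buffer
    buffer:
      bufferSize: 5        # always keep 5 Ready game servers spare
      minReplicas: 5
      maxReplicas: 50
```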
So again, we're going to have another CRD, which is the FleetAutoscaler. As an alternative to a Horizontal Pod Autoscaler or a Vertical Pod Autoscaler, we have the FleetAutoscaler, which is a way of autoscaling your Fleets. This one is the default, standard, out-of-the-box one you can use, which uses the policy type Buffer. All this does is make sure you always have a buffer of, in this case, five Ready game servers. So if you've got five game servers that are Allocated, you'll have another five that are Ready, and as the sixth one becomes Allocated, another one will get spun up in a Ready state. You've just always got a little buffer of game servers ready to rumble as and when you need them. Now, this can work quite well out of the box, but potentially you need something a bit more advanced, a bit more specific to your game's exact needs, and that's where the Webhook type of FleetAutoscaler comes in. Think of it as similar to an admission webhook; it's not really the same, but it's a similar concept, in that you can tell Agones: go talk to this service and it will tell you what to do, which is kind of how validating and mutating webhooks work. Effectively, you can point Agones at a custom service that you have designed, and it will tell Agones how many pods it wants at any point. The Agones controller will periodically get updates from this fleet autoscaler webhook, and it will tell it how many of each type of game server it needs. This means you can code a lot more game-specific logic into this fleet autoscaler webhook. Maybe you've got a lot of players coming in at the moment, and there's a specific type of content or a specific type of level they need spun up, and therefore you need to spin up those kinds of game servers. It just allows a lot more fine-grained control compared to just
using the buffer, which is a fairly simple way of responding to demand. So that's how you would scale game servers in a Kubernetes cluster. Then the final thing we'll get onto is multi-region. When you play a video game, you could be playing it from anywhere in the world; people play games all over the place. And as we know, it's very latency sensitive when you connect to a game server, so you don't want to be connecting to a game server in the US if you're in Europe, for example, because the experience will not be fun. So the general approach to this, with Agones in this case, is that you're going to have your game servers geographically distributed; you'll have them in all different parts of the world, and the player can therefore connect to one that's close by, so that they have a good experience. There are a few different ways this can be implemented in a game. You might have played some games where they just ask you straight up: what region are you playing in? I'm playing in Europe, and it will point you towards European servers; or I'm playing in North America, and it'll put you on some North American servers. But you don't have to do this, and an approach that Agones recommends and can help you with is a ping-based approach. This is where you actually ping from your game client to a load of different servers, and you work out which one gives you the best response time: okay, I will pick this one then, as it's giving me the best latency. This can allow for more flexible deployments of game servers globally, because you're not relying on someone selecting exactly where they are, and you can spin these up on demand, depending on how you do it. It also allows for players to play across regions, because in some games, you know, if you're playing in North America, that's it, you're in North America, whereas if you're playing in Europe it's a different
set of servers and a different set of people you can play with. This can allow you to play with a wider pool of people. Obviously you're still going to have some problems if you're in, say, the UK, trying to play with someone in Australia; it might find a game server in the middle or something for you, and you might both have a bad time. But it's still going to work better if you're close by; it depends on the game. So that's perhaps how you can determine what region you're in. But as for how you get a game server, so far it's been: I created a CRD and Agones gave me back a game server. And really, obviously, we can't have every PlayStation in the world start doing Kubernetes CRD requests against our cluster. So instead we're going to use something which we call a matchmaker. Going back to our earlier example: in there, Leeroy had a group of people to play with, but afterwards he might not have done, at which point you need to find some people to play with, and this is where a matchmaker comes in. A matchmaker is a piece of custom software whose job is to group players together based on a variety of different parameters: what kind of content they want to do, whether they want a player-versus-player experience or a player-versus-environment experience, a different type of raid or content or game mode, whatever. It'll take all these parameters into consideration, as well as the skill levels of each player, because generally matchmakers are meant to try to make matches, especially player-versus-player matches, as even as possible, to give both sides a fighting chance. Though if you ever look on Twitter, that's never true, but it's the goal. A whole talk could be done on matchmakers; they're a complicated piece of software. But this is what we're going to use to actually talk to Agones. So a player will talk to a matchmaker asking for a match, and the matchmaker is what will actually go and find a server and return it to you. You're not going
to be doing that integration yourself, because that's too complicated. But even then, having the matchmaker go and talk, if we're talking about global clusters all over the place, do we want the matchmaker talking to a Kubernetes API in all these different regions and having to integrate with them using RBAC or something? It's perhaps too complicated. So Agones has a way of exposing the bit you care about, the allocation bit, via a service, and this is called the allocator service. You can see in the top left of the screen we have the Agones allocator service, which in this example we've put behind a load balancer. What this does is expose an API, either HTTP or gRPC, which a matchmaker, for example, could query and send requests to, and it will go and do all that Agones allocation stuff on behalf of the matchmaker and then return the result to it. It authenticates by mutual TLS by default; that's its authentication method, so as long as the matchmaker and the allocator service agree with each other, they'll be happy. This allows you to abstract a lot of the Kubernetes of it all away from the matchmaker; the matchmaker just needs to hit an API, and it will get given back a Ready game server that this cluster has. So if you imagine this in practice with multiple clusters: you might have the matchmaker, perhaps running in its own cluster or somewhere more global, and it will have a variety of regional clusters, each of which is exposing an allocator endpoint. If a player comes in and they say, oh, I'm in US West, for example, the matchmaker would know: okay, I will go and ask the allocator load balancer in the US
West cluster for a game server. And this is how you might do a kind of global, regional deployment; you've abstracted away the individual clusters of it all, it's just: go hit these allocator endpoints. So all the matchmaker needs is a list of allocators and what region they correspond with. It also works nicely for doing upgrades, using an allocator load balancer, because that's another thing that's kind of important for these. As many of you may have experienced, doing a Kubernetes cluster upgrade can be a traumatic experience, and doing this while people are playing a live game could be very bad if they all get disconnected all of a sudden because we balled something up. The general approach Agones recommends, when you're upgrading either your Kubernetes version or your Agones version, is to spin up a new cluster. If you're doing this with the allocator service, you could just have, say, a Route 53 DNS entry, or wherever, for the allocator service that points to one cluster, and then when you've got a new cluster spun up, you just swap it over. Then all the old players will continue to play on the old cluster until their game sessions have finished, that cluster can slowly scale down and disappear, and all the new sessions will be getting allocated onto the new cluster over time. It's a relatively safe way of testing that upgrade process, so hopefully we don't get review-bombed on Metacritic. Okay, so to conclude: that was a whirlwind tour. Agones introduces a lot of new concepts to Kubernetes, such as the Fleets and the GameServers, which are ways of actually running game servers in a Kubernetes cluster. There are these autoscalers for scaling them based on player demand, versus your traditional metrics. And you've got these allocator endpoints, which you can use to abstract Agones away, for something like a matchmaker to connect to, and to have a kind of multi-region deployment. You may ask, at the end of the
day, though: why did you try to do this on Kubernetes? Which is a good question, and the answer would be that a lot of these things we've described, while some of them are unique to the nature of running in Kubernetes, a lot of them we would have had to do anyway if we were just trying to run game servers in the cloud. If we were doing this on normal VMs or something, we would still need a mechanism to know: okay, can I shut down this VM, are players playing on it? How do I upgrade this? How do I autoscale this? A lot of these questions and problems still exist. So we're standing on the shoulders of giants by leveraging Kubernetes, and then Agones on top of that, as a way of simplifying the process. And that pretty much covers it. There's a QR code you can scan for the feedback, and there are the slides if you want to see them; they've been uploaded. I'm around, and some of my colleagues from PlayStation are around, if you want to chat. We are hiring, I have to plug that. The Guerrilla folks, who are based in Amsterdam, are also around, so please talk to them, they're also interested in Agones. And go play Horizon Burning Shores, which is launching today. Thank you. That's all; we have time for questions, so if anyone has a question, fire away. Stunned silence. Okay, I see some hands. Is there a mic? Is there a mic person? I don't think there is, so I will just point. Okay. Oh, how does the pod know? Yeah, so the question was: how does the pod know about being allocated and so on? Generally the game servers will integrate with the Agones API using the SDK, so they'll be aware when they've been sent a signal to say, effectively, you've been allocated, and they can also talk to this API to then say, I've finished, or something. There's also a sidecar that can be used if the game server might struggle to have its code modified to integrate with the Agones SDK, but we would recommend... well, it's
up to you, but yeah, it has an SDK that you can use in the game server to do that kind of thing. Cool, another question. Oh, there's a mic there, I think. Does that work? One, two, three. One, two, three. Yeah. So, what about data collection? If the player is playing, you need to store the information somewhere. Is it an API, is it persistent storage, is it a database? How do you manage that, because the containers are not storing information, right? Yeah, so the question, just in case it wasn't picked up on the recording, was about how you store the players' data while they're playing on the game server. And the answer is, generally the game servers are stateless, so they won't be holding any state; they'll finish at the end. In terms of where the data comes from, that varies greatly depending on the game. We have player information, like their ID and so on, which will be coming from the centralized PlayStation Network APIs. But for more game-specific stuff, like, I don't know, what's their loadout, there'll probably be a service responsible for that, like a loadout service, for what weapons this person has, and that will be storing the data somewhere. Again, it depends on the game; it could be a database, it could be any kind of back-end store. But generally there'll be some other microservices that the game server will go and talk to to find this stuff out. Sorry, was there another question? Fire one more question, please. Yeah, here on the right side, up there we go. My question is basically: how do you know, or is there a concept or a process for, scaling a game server during the game? So for example, if you're running a game with six players and suddenly ten other players are joining, you would probably need to scale. Is there some process for this? Is the game server replaced, or how is that handled? Yeah, so the question was: if you need to scale up a game server during play, is
that possible, if more players need to join a game server? And the answer would be: maybe, it depends. I have not personally done that. Generally the game servers are set when they're started, and they would know how many players they can support. If you needed something different, let's say you're playing a game which is by default 3v3 but some modes are 5v5, you'd probably just have different sets of game servers for each one. It would be uncommon to switch it during a match; that's not as common. So if you have a game where, say, lobbies could be between 10 and 20 players, you'd probably just start it up supporting 20 players. But there is in-place vertical pod autoscaling and stuff you can do in Kubernetes these days, and that might be an interesting thing to look into, maybe. Yeah, thanks. Do you have any experience in running Windows pods as game servers? The question was: do you have any experience running Windows pods as game servers? Not personally, no. Generally, a lot of game servers aren't even containerized, right? So some of it's just that; one of the challenges has been creating headless versions of game servers that can be run in containers and such. But no Windows ones that I've encountered so far personally. We do support Windows, according to the Agones man. Is there time for one more question? Yeah. So you showed the DNS Route 53 switchover from the old cluster to the new cluster. I was just wondering, in terms of marking that cluster as no longer allocating new sessions, how exactly does that mechanism work, to prevent new game servers constantly popping up on the old one, and to make sure it slowly drains down to the last sessions? Yeah, so the question was: how do you make sure that the old cluster gets drained when you want to shut it down, and make sure
people don't keep playing on there? It kind of depends on the game, I guess; I could use that answer for a lot of these questions. It depends on what type of game it is. If it's a match-based game where you just play to the end of the round, at that point they get booted off the server and they would have to get a new server, and at that point you send them to the new cluster. It's a much more interesting question when you get to more open-worldy games where players can hang around for a long time, and I'll get back to you on that one, because we haven't worked that one out yet: waiting for that one last player to leave the server, like, yeah, leave now, please. These kinds of thoughts keep me up at night. Okay, thank you very much. All right, cool. I think we're out of time, but I'm mostly around, so if you want to ask me any questions, feel free. Thank you.