I want to talk to you about implementing game servers in Erlang and OTP. Game servers are really interesting because they're kind of like telephone systems: you have lots of people connected, and they're all in separate bubbles. In a telephone system it's mostly two people talking; in the game it's four people per game. And you obviously want to isolate those bubbles. Erlang is a perfect fit for that, naturally. And unlike the legacy telephone systems that exist, we can use WebSockets and other cool stuff.

And who am I? I'm a freelancer doing Elixir, Erlang, and Ruby. I began working with Erlang one and a half years ago, and did Ruby and iPhone development before that. One of the projects I've been working on is Sauspiel. That's the biggest online version of the popular German card game Schafkopf, which translated means "sheep's head", but there are no sheep involved in the game. It's a card game. I took the definition from Wikipedia: it's a so-called trick-taking card game, meaning hands are dealt to players, followed by a finite series of rounds called tricks, which are each evaluated to determine a winner or taker of the trick. And at the end, you know who won the game.

So this is the game client. You see the people sitting at the table playing this game. The company behind Sauspiel is also the company behind three other sibling sites where you can play different traditional German card games. That's an interesting requirement for the game server: four instances of this game server run different games.

Another project I've worked on is Bolt Poker. When you play poker, you have to deal cards, and dealing cards can be tedious. So why not get out your iPhones and use an Erlang server to deal cards for you? And that's a product. You can buy it. It works. It makes money. And hence the tagline: deal with it.

That's what I want to talk about today. First I want to give you an eagle's eye overview of the whole system, then a deeper look into the finite state machine that is one of those games. Then I want to manually trace one in-game event through the processes it touches. Then I'll talk about some real-world concerns, like people that reconnect and how we deal with that, or how we move people over to a new server when we deploy a new version. And then I'll talk about our plans for Bolt Poker, and some Q&A.

So this is the physical setup of one of those games. There's a cluster load balancer that gets the incoming connections and forwards them to whichever Erlang physical server is currently marked as active. Let's focus on one of those Erlang servers. There are four OTP apps. When you do mix new --umbrella, mix creates a folder structure for you where you have different sub-applications. These are so-called OTP applications. They have dependencies between them, which means the Erlang VM can determine the proper startup order and the proper order to shut them down. So you need to have the storage application online before you can have a card game, you need to have the network active, and so on.

Let's look at the card game OTP app. When the Erlang VM starts it, it starts a supervisor. Like in the Learn You Some Erlang for Great Good! book, I'm always using the same green for supervisors. So this is the top supervisor process. It gets started by the VM, and it's a simple-one-for-one supervisor, which means you can just tell it: start me a new child with some arguments. Each of these children is one of those table processes, and they are identified by a global identifier; there's support for that in the Erlang standard library. It's a tuple: {:table, 6}, {:table, 7}, and so on.
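A rough sketch of that supervision setup in Elixir. Module names here are made up, and this uses the Supervisor.Spec helpers of the Elixir of the time; it's an illustration, not our actual code:

```elixir
# Hypothetical sketch: a simple-one-for-one supervisor whose children
# are table processes registered under global names.
defmodule CardGame.TableSupervisor do
  use Supervisor

  def start_link do
    Supervisor.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init([]) do
    children = [worker(CardGame.Table, [], restart: :transient)]
    supervise(children, strategy: :simple_one_for_one)
  end

  # "Start me a new child with some arguments."
  def start_table(table_id, settings) do
    Supervisor.start_child(__MODULE__, [table_id, settings])
  end
end

defmodule CardGame.Table do
  # Registering under {:global, {:table, id}} makes the table reachable
  # from every node in the cluster as {:table, 6}, {:table, 7}, and so on.
  def start_link(table_id, settings) do
    :gen_fsm.start_link({:global, {:table, table_id}}, __MODULE__, settings, [])
  end

  # ...gen_fsm callbacks (init/1, state functions) omitted...
end
```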
We've got the network app, and it uses Cowboy and Ranch. Cowboy and Ranch are third-party libraries; Plug, the Elixir web thingy, uses them too. When an incoming connection gets terminated at this Erlang node, Cowboy starts one separate process per incoming connection. So we've got WebSocket connection handlers, basically. And since we have a Flash client that uses a legacy protocol, sending raw TCP instead of WebSockets, we use Ranch, the socket acceptor pool library that Cowboy builds on, to deal with those raw TCP connections. The connection handler process gets the data from the socket from the Erlang VM, deserializes it, hands it further along, and presents a consistent API to the rest of the system.

The storage app is also rather simple. It spawns a fixed-size pool of PostgreSQL workers, and we've got the table ID sequence GenServer that we use to get monotonically increasing table IDs. It talks to the database and says: give me new IDs, give me new IDs, and so on.

The game server OTP app is basically a grab bag where we put everything else that's important. The things that are grayed out are the leagues and the lobbies and stuff like that. It also has a child supervisor, which is again simple-one-for-one. The player processes live under that, and they are, again, globally registered, so within the whole Erlang cluster you can always check whether or not a player process for player one exists.

And yeah, when implementing a card game, it's really nice to have a finite state machine, because it fits pretty well. And the Erlang standard library has a pretty great finite state machine called gen_fsm. As I said, you've got this table where four people sit, but in the beginning there's one guy sitting there, then the next one arrives, and so on. Once four are there, the game can start. And once it's done, it restarts, until someone leaves. So I drew up this state diagram.

The challenge we had was that we've got lots of house rules that lead to different things that need to happen, and we've got these four different games that are also quite different from each other. We needed to be able to customize that. So we've got the outer state machine, which is a gen_fsm, which just flips between idle and playing. Once four people are there, it goes into playing. And then it starts what we call stages, which is basically a stack of modules that it uses as callback modules. A gen_fsm has its own callback module; this thing again has more callback modules where it forwards events, and those modules then deal with them. So when it transitions into playing, it calls create_stages for the game settings, and that returns a list of callback modules, each with some configuration. This is for Schafkopf: it starts with dealing some cards. Then everyone has 5,000 milliseconds to raise the bet. Then again dealing some cards, then a negotiation phase. And then there are a lot of play stages where everyone has to play a card, and so on, until we've got the result.
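A hedged sketch of what such a stage stack could look like; all module names and options here are invented for illustration:

```elixir
# Hypothetical: on entering :playing, the table FSM builds a stack of
# stage callback modules, each paired with its own configuration.
defmodule CardGame.Schafkopf do
  def create_stages(_game_settings) do
    [
      {Stage.DealCards,   count: 4},
      {Stage.Knocking,    timeout: 5_000},  # everyone may raise the bet
      {Stage.DealCards,   count: 4},
      {Stage.Negotiation, []},
      {Stage.PlayCard,    []},              # one of many play stages
      # ...
      {Stage.Result,      []}
    ]
  end
end
```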
So when you send an event to a running gen_fsm process, the callback module's function with the name of the currently active state gets called; for us that would be Table.playing. It gets the current state data of the process, and it can return a tuple: either next_state with the next state name and the new state data, or stop if it wants to shut down. So that's idle and playing.

And then we've got the nested stages again. What we did is: we always look at the top of the stack of nested stages, and we do the same as the gen_fsm does. We call the current state name on them, like playing, but we also hand them their own state that we carry for them, just like the gen_fsm carries our state for the table. And we support a few more return values, like the tuple next_stage, which pops the current stage off the stack and goes to the next one, or end_game, which ends the game. There's also the function GenServer.reply in the Elixir standard library, which is interesting: it basically allows you to reply out of band to a call into a GenServer. We did something similar for our table process; there's, for example, Table.broadcast_event, which broadcasts an event to all players.

So it starts with deal cards; when that's done, it goes into knocking; when that's done, deal cards again, and so on. When an event arrives, the table checks: do I handle that myself? For example, it handles things like people going offline or people requesting a pause of the game itself. And if not, it just forwards the event to the current stage to do whatever.

So, following an in-game event through the system, there are two things to keep in mind. These are all the processes involved: we've got the table FSM, and it communicates with the four player processes, P1, P2, P3, P4. Each of them communicates with a connection handler process. But these connection handler processes aren't actually the WebSocket themselves. The actual socket is an Erlang port, which is what the Erlang VM gives you when you have a network socket. It behaves mostly like an Erlang process: you can send it messages, and those messages then get sent over the network. And when data arrives at the socket, the port sends its so-called controlling process a message saying: hey, there's data here, do stuff. Secondly, if you remember the OTP apps from before: all these processes live in different OTP apps, which means their supervisors are different, but they can still communicate without issues.

So the table is in a state where someone has to play a card. It sends this game event, must_play_card, to the player process that has to play the card. Once it has arrived there, the player process sends it on to the connection handler. The connection handler process knows: I'm a WebSocket, I speak JSON, so let's serialize that. It sends it through this Protocol.JSON.write function that we have, which spits out a JSON string, and then it sends that down the socket. The player process also stores the event, so that when the player later reconnects, it can send it again; but I'll talk about that later.

So the message arrives at the client. The client does something, updates the UI, waits for the player to play a card, and sends back some JSON. It arrives at the connection handler. The connection handler calls Protocol.JSON.parse and gets back a tuple. This is really awesome, I think.
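A minimal sketch of that boundary module; the names are illustrative, and it assumes a JSON library such as Poison:

```elixir
# Hypothetical: serialization lives only at the edge of the system.
# Inside, events are plain tuples we can pattern match on.
defmodule Protocol.JSON do
  # Outgoing: internal event tuple -> JSON string for the socket.
  def write({:must_play_card, player_id}) do
    Poison.encode!(%{"event" => "must_play_card", "player" => player_id})
  end

  # Incoming: raw bytes from the socket -> internal event tuple.
  def parse(data) do
    case Poison.decode(data) do
      {:ok, %{"event" => "play_card", "card" => card}} -> {:play_card, card}
      _other -> {:error, :invalid_message}
    end
  end
end
```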
Decoding this stuff at the farthest border of the system makes everything easy, because within the system we only have Elixir data structures, like tuples and so on, which we can easily pattern match on; we don't send JSON strings around. In the past we had, for example, Protocol.XML, so we could support Flash clients that spoke XML, which we had. But then we were like: XML is horrible, let's use JSON, and we threw that away.

So then the player process knows: I'm playing at table number nine, let's forward this event to the table. The table forwards it again to the current stage, and so on and so forth.

Two things to keep in mind in the real world: as I said, people might get angry during the game, mash the keyboard, and hit F5, and the browser reloads and closes the WebSocket connection. Stuff happens. Also, we've got this one server that everything fits in, which is really good because it's nice to deploy and such. But what if we need to do a kernel upgrade? We can't really ever do that, because people are playing 24/7, and you don't want to tell them: oh wait, we're rebooting now, you can't play for 10 minutes. So we need to figure out something for that too. The same goes for rolling out new code. We do both hot code swapping and simply starting a new server, moving people over, and having them play there.

So first, when a user disconnects: at first the port closes down, because the WebSocket connection is gone. Then the connection handler finds out about that and goes: okay, I'm gone too. Then the player process finds out, because it monitors the connection handler. Usually the player process would now shut down, because why keep a player process around for someone who's just in the lobby and disconnected? But because the player process knows that it's currently playing at this table, it can't shut down. Instead, it sends a message to the table, like: hey, I'm offline now, and it stays alive until the game is over.

Now the player reconnects. We get a new connection handler process. This connection handler process does the authentication over the WebSocket: hey, what user ID do you have? Send me your cookie, and so on. Then it knows: I'm the connection handler for player number one. And it checks whether the player-number-one process lives. It does, so it says: hi, I'm back. The player process then looks at its state and knows which events were sent for the current stage in the game. It first sends some fake events, like: you are sitting at this table, and these are the people you're playing with. And then it just sends all the events for the current stage that it had cached. And the player is back.

The second part is changing active servers, as I said, for deploying new versions of the code or because you need to reboot one of them. What we do is: we have our load balancer, and we can just point it at another active server. And we do that. That doesn't affect the connections that are still connected to the old node; only new connections arrive at the new node. And we use the Erlang distribution protocol, symbolized by that line, to share state between those two servers, which are basically both doing stuff at this time. There's this Erlang module, :rpc. It has this function multicall: you give it a module, a function name, and arguments, and it will call that module's function on all nodes within the cluster.
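For example, flipping the active/standby roles might look something like this. :rpc.multicall/3 is real OTP; Deployment.set_active/1 is a hypothetical function:

```elixir
# Runs Deployment.set_active(:"game@server2") on every connected node.
# Each node compares the argument against Node.self() and switches
# itself into active or standby mode accordingly.
{results, bad_nodes} = :rpc.multicall(Deployment, :set_active, [:"game@server2"])
```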
So this is good. The old active server knows: oh, that's no longer me, I need to go into standby mode. The standby server knows: oh, I now need to be active.

Looking at the old server, suppose there are seven players: players one to four are playing at table number nine, and players five to seven are just in the lobby. What we can do is just kill players five to seven; they're in the lobby, they don't care. Let's get rid of them. They will reconnect and arrive at the new server. But table number nine is still going.

Let's do a split view of old server and new server. We start a table migrator process on the new server. This process monitors the table process on the old server, and it monitors the players on the old server, so it knows whether they still exist. I've removed some of those lines from the diagram because they were cluttering things. Meanwhile, the WebSockets from the people in the lobby reconnect, and their connection handler processes authenticate them. Then they check: do we have a player process number five? No, we don't; let's start one. The same happens for players six and seven, and they're back in the lobby.

So far, this is all awesome. But what happens when one of the players at table number nine, who's still connected to the old node, dies? As I said, they get angry, they hit F5, their browser window reloads, the WebSocket connection is gone. The player process and table process are still on the old node, but the new connection handler arrives at the new node. What happens is: it performs authentication as always. Then it looks: where is the player-number-four process? Should I start one, or should I reconnect? And it sees that there is a player-number-four process, but it's somewhere else in the cluster; because of the global registration, it's visible. So it starts a connection proxy process on the old server and just forwards it the raw data it gets. That way we can change how we serialize and deserialize the JSON, and if you're still playing on the old server, you get the old code that did that for you; we don't have to keep the two versions compatible. So the WebSocket message arrives at the new connection handler, which just forwards the binary data to the connection proxy on the other node; that connection proxy pretends to be a WebSocket, deserializes the JSON, and forwards it to the player. That way we still have the guarantee that only one player process per player exists in the cluster, but people can reconnect and continue playing on the old server.

So, back to the running game. Let's forget about players five to seven, and suppose table number nine has now finished the game. It's in the result stage; it displays this "player so-and-so has won" message. Usually it would now transition back to idle: it would check whether there are still four people sitting at the table, and if so, just restart the game. But because it knows it's going to be migrated, it doesn't do that. Instead, it gathers the minimum amount of state it needs to be recreated somewhere else and sends that to the table migrator. It's a map with the table ID, the players, the positions at the table, the house rules that are active, the last game ID, and so on.
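Roughly like this; the field names and values are illustrative, reconstructed from the description above, and :global.send/2 is the real OTP call for reaching a globally registered process on another node:

```elixir
# Hypothetical portable table state: the minimum needed to rebuild the
# table process on the new node.
portable_state = %{
  table_id:     9,
  players:      [1, 2, 3, 4],
  positions:    %{1 => 0, 2 => 1, 3 => 2, 4 => 3},
  house_rules:  [:knocking_allowed, :short_deck],
  last_game_id: 48_213
}

# Hand the state to the globally registered migrator on the new node.
:global.send({:table_migrator, 9}, {:portable_state, portable_state})
```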
And then the table just shuts down. The player processes similarly see that they're on the old node and no longer in a game, so they just shut down too, and the connection handler processes do the same. So now only the table migrator is left.

Usually the player would now see "you've lost your connection", everything would go dark, and we would reconnect them. But we obviously don't want that, because we want this to be seamless for the players. So what we did was: before we shut down on the old node, we sent a message over the WebSockets saying: you will be migrated; don't display the connection-lost dialog for the next 10 seconds, just reconnect. So the players just hang there for a second and don't notice anything.

Now the connections arrive at the new node, because the load balancer forwarded them there. They authenticate. Suppose player one was the first to reauthenticate. It then looks in a global ETS table: hey, I'm player one; am I currently being migrated, and if so, where to? It sees it's being migrated to table migrator nine, and it says to table migrator nine: hey, I'm here. The same happens for the other players. Once they're all there, the table migrator just takes the portable state that was sent by the table on the old node and creates a new table with it. And that's it. People transitioned seamlessly, and at this point we can shut down the old server.

So, Bolt Poker. Bolt Poker is way simpler and smaller than Sauspiel; it's the perfect staging ground for new ideas. For example, it doesn't need a database, because you're just playing poker; we can hold all the state in memory. There are no user accounts or avatars or anything. Sauspiel already uses Elixir for integration tests, so we write our new integration tests in Elixir, and we use mix for building and dependency management, because mix is really, really awesome. So we want to look into using more Elixir code in production.

What we're going to do is rewrite all of Bolt Poker in Elixir, which is really awesome, and I'm looking forward to that very much. Why? Because, as José Valim said in his keynote, the extensibility goals of Elixir are awesome, and we want to leverage them. The standard library of Elixir is so much better than the standard library of Erlang. Erlang has accumulated this cruft where you don't know whether to use this module or that module, or whether the subject you're working on is the first argument or the last. You don't know if indices start with one or zero; you have to look that up all the time. That's annoying. Elixir does away with that. This is great. Also mix: mix is the best thing ever. The way it does dependency management is like Ruby's Bundler, but way better, because it has optional dependencies; that's what I learned yesterday. And ExUnit: ExUnit is also great. There's EUnit for Erlang; I don't like it as much as ExUnit. ExUnit is better. And the business case for this is that Sauspiel is way bigger and more important than Bolt Poker. We want to use these technological advantages; will we run into any issues? Let's find out by trying it on Bolt Poker first.

One of the things I think might be a bit annoying is that the Elixir standard library doesn't yet wrap all the important Erlang modules. It does wrap gen_server, and it does wrap supervisor and provides a really great DSL for that. But it doesn't wrap gen_fsm or ETS. So we'll probably end up using something like alias :ets, as: ETS. Yeah, we'll see.
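Not that calling unwrapped Erlang modules from Elixir is hard; a quick sketch using the real :ets API:

```elixir
# Erlang modules are plain atoms in Elixir, so :ets can be called
# directly; an alias just makes it read a little nicer.
alias :ets, as: ETS

ETS.new(:player_cache, [:set, :public, :named_table])
ETS.insert(:player_cache, {1, :playing})
[{1, status}] = ETS.lookup(:player_cache, 1)
```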
Another thing is PropEr, an open-source library that lets you do property-based testing. If you know Haskell's QuickCheck or Erlang QuickCheck: Erlang QuickCheck is actually proprietary, and PropEr is open source. So we use this, but it uses Erlang macros, like ?FORALL. It's only about 15 lines of macros, but we'll have to port them to Elixir and maintain them. It's not much, but it has to be done, and I hope to open source that once it's done. And I hope that PropEr otherwise works perfectly with Elixir.

The other thing: when I was looking into the proprietary Erlang QuickCheck, because Erlang QuickCheck has PULSE, which is kind of like Concuerror, which José Valim talked about today too, I noticed that PULSE uses a parse transform on your Erlang code to instrument it. And that parse transform is basically bytecode; you can't edit it. So, for example, if we were to use PULSE, we couldn't migrate, because we wouldn't get the parse transform. Maybe there are other libraries we'd want to use that do this, but I don't know of any, so I think not.

Some lessons learned. I think performing encoding and decoding as far out at the border of the system as possible is really great; it allows you to abstract it away. It's awesome. Creating your own behaviours, like we did for the nested finite state machine, lets you test things way more easily, because your behaviour covers one small aspect, and you can just send it test data and see what it does. And it's not that hard to create your own callback modules; just because gen_server and gen_fsm do it, don't be intimidated by that. It's easy, it's awesome. And distributed Erlang makes hard things so much easier, like reconnecting and forwarding that connection to the old node after it's authenticated and we know which player it actually is; with Erlang, that's easy.

Yeah, one more anecdote. We were thinking about how to handle database downtime, and we've got this data locality where all the information we need for a game is already in the Erlang server. So if we want to persist something during the game but can't, because the database is currently down, we just let the players finish their current game, dump the results we would have saved to the database onto disk, and shut down that one table. And we had some really interesting experience with a kernel bug that shut down our database, I think, twice. So this helped immensely. Yeah, thank you. Any questions?

Great talk, that was really nice to see the whole design. And I have a curiosity: what are the delivery guarantees to the browser when you push over the WebSocket? There is a chance that you send a message and the browser doesn't actually receive it. So do you ack back or things like that when you're pushing stuff to the browser?

So, we use TCP keepalive to ensure that we know the WebSocket connection is open, and if it's gone, we shut it down. And there are timeouts on the server: if the client doesn't reply with an answer, like which card to play, then stuff happens; in this case, we presume he's away from the keyboard, and after a pause we automatically play the next card according to our AI, and so on. But we don't do any real "here's the thing, ack it", because if you reconnect, we just send you all the events we know you need to restore your state anyhow. And that has worked.

In the Elixir IRC channel, I was asked to write about how we use mix to build our Erlang project. So I'm going to write about that, and I hope to have it online on Monday so you can read about it.
Briefly, you mentioned how you're generating monotonic IDs with a single GenServer. I was wondering how you deal with that in a clustered environment? Is there a specific way you're able to ensure that it's always monotonic throughout the entire cluster?

So, the GenServer talks to the Postgres database, and there's a sequence there, and that is the thing that actually generates the unique IDs. The GenServer just always gets like 50 of them at a time and caches them.

All right, thank you. Thank you.
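A rough sketch of the ID-caching GenServer described in that answer; module, sequence, and query names are hypothetical, and it assumes the Postgrex driver:

```elixir
# Hypothetical: hands out table IDs one at a time, refilling its cache
# from a Postgres sequence in blocks of 50. The sequence guarantees
# cluster-wide uniqueness and monotonic ordering of the blocks.
defmodule Storage.TableIdSequence do
  use GenServer

  def start_link(conn) do
    GenServer.start_link(__MODULE__, conn, name: __MODULE__)
  end

  def next_id, do: GenServer.call(__MODULE__, :next_id)

  def init(conn), do: {:ok, {conn, []}}

  # Cache empty: fetch the next 50 values from the database sequence.
  def handle_call(:next_id, _from, {conn, []}) do
    result =
      Postgrex.query!(conn, "SELECT nextval('table_ids') FROM generate_series(1, 50)", [])

    [first | rest] = Enum.map(result.rows, fn [id] -> id end)
    {:reply, first, {conn, rest}}
  end

  # Cache non-empty: serve from the cache.
  def handle_call(:next_id, _from, {conn, [id | rest]}) do
    {:reply, id, {conn, rest}}
  end
end
```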