I'd like to welcome all the elevator enthusiasts here. And you Elixir folks can listen in too. I'll start off with my little embarrassing story of the day. First, understand I'm not a morning person. I like mornings. I just wish they happened later in the day. So I'm leaving the hotel room this morning, getting on the elevator, and my mind goes blank. It's like, what do I do? I press a button. Which button do I press? And I'm supposed to talk about elevators this afternoon. There were actually a couple of other attendees on the elevator with me, so that was witnessed, which made it even worse.

I'm here to talk to you about OTP, with an elevator system as my main example. I'll start off with the obligatory "Who am I?" Name's Greg Vaughn. I've been doing Elixir in my hobby time in the evenings for a little over a year. That's a copy of when my first commit to GitHub was, for my first Elixir code. I'm currently employed by LivingSocial, where I work primarily in Ruby. And here's the obligatory bit; if you've been to a Ruby conference, you might have seen this before. We're hiring. We're not doing Elixir today, but I'd love to get more enthusiasts on board. Talk to me later if you're interested. And that's all of the commercial portion of the program.

OTP. This is what attracted me to Elixir. I love the ability to know I can scale up. But as James was pointing out, it scales down really well, too. So you can start simple, and you have that breathing room at the upper end. OTP is a lot of things, and I'm going to focus on a few of them. It's been around as open source for 16 years. And I read an internal history of Erlang, a PDF file from Bjarne Däcker, if I'm pronouncing it right, who said it was even identified by name a couple of years before that. So it's been around for longer than a lot of people's careers. But it doesn't feel stodgy. It feels really powerful.
A lot of lessons that have been learned have been codified in there. It's buzzword compliant. Our industry loves to reinvent things and give them new names. And to me, this is what microservices really is. It's really describing an actor framework, but missing an important operational side of it, which I'll talk about some more later on.

And I thought I was going to get to be controversial here, but really, props to all the other speakers. OO is not a bad word. But you have to change your thinking about it a little bit. You can't think in classes. The core ideas of OO are separate from classes. I'll defend this a little bit more on the next slide, but it doesn't need as much defense as I thought it might.

Actor-based: we've heard lots of talk about actors. An actor is basically a process in a receive loop that can respond to messages and store its state inside of it.

I already used the word framework, but I like to make the distinction between libraries, which are code that you call to do some particular task for you, and the Hollywood principle: don't call us, we'll call you. Your OTP behaviors are like an actor's manager. They're the ones that handle that run loop, handle exceptions, handle trapping exits, timeouts, all that sort of logic. That's handled for you.

It is design patterns. I forget which speaker this morning was saying, oh, it's functional, no design patterns. I think it was Richard. I think it is design patterns, but a very simplified version of them. The actor pattern itself is a design pattern. The Hollywood principle, the template method, these sorts of things all appear in the framework, so you don't have to throw away any past you may have with object-oriented programming. You have to be careful that you don't take what you've learned before too far, but there's still value in some of those things you've learned. And it's robust.
It has the answer to what happens if it fails, because it actually embraces that it's not an if, it's a when. If you scale large enough, it will fail at some point. Even if you write bug-free code, hardware problems, networking problems, these sorts of things happen. So you have to be prepared for that failure, and OTP makes it easy for you to separate the considerations of those two things.

So I'll start out with Alan Kay quotes, which is not gonna surprise many of you, on the core of what object-oriented means. A lot of people like to take object-oriented and draw it in contrast to functional programming. I think it's better to contrast functional programming with imperative programming, in that they're both primarily about how you deal with state. Now, I've heard Erlang described as a concurrency-oriented language, and I think Elixir inherits that just as much. So we have those processes right there at the VM level, with totally isolated memory spaces, and messaging is the only way to get in or out of them. When you combine all three of these concepts together, actors are the natural culmination, and they're a very powerful abstraction to build your distributed systems around.

I'd like to dig a little bit into the core concept to understand about OTP, which is the behaviors. Behaviors versus your callback modules. I got really confused about these and some of the terminology when I started out, and I'd like to help some of you avoid that. This is about the simplest GenServer you could write. It doesn't do anything, but it will instantiate. It's actually the GenServer.start call, with your callback module being passed in, that starts everything up. That's what instantiates it. The use GenServer that you put in your callback module really is just syntactic sugar. It prepares you to be a callback module; it's not what actually instantiates you and starts things up.
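A minimal sketch of that point, not the code from the slide: the module name `Simplest` and the `:no_state` argument are made up for illustration, and `init/1` is written out only because recent Elixir versions warn when it is missing.

```elixir
defmodule Simplest do
  # `use GenServer` only prepares this module to act as a callback module;
  # it does not start any process.
  use GenServer

  # Written out explicitly because newer Elixir versions warn if init/1 is absent.
  @impl true
  def init(arg), do: {:ok, arg}
end

# The process is created here, when the behaviour is handed the callback module.
{:ok, pid} = GenServer.start(Simplest, :no_state)
true = Process.alive?(pid)
```

Nothing in `Simplest` mentions processes at all; the behaviour owns the receive loop and only calls back into the module.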
And this callback is all about the contract between the behavior and the callback module. I'd like to peel back the curtain a little bit here. I copied this from Elixir's GitHub repo a few weeks ago and reformatted it a little bit. This is what use GenServer does. It calls this macro. Now, if you don't know macros, it's not important. It's pretty clear what's happening here. It's giving you default implementations of the six callback functions that the GenServer contract requires. They're mostly do-nothing implementations, but they're all implemented there. It also adds that behaviour attribute to your module, which, if I understand right, down at the Erlang compiler level makes sure that all these functions are available. Well, Elixir has kindly provided them for you, so you're set. defoverridable is the piece that says: if you write this function in your module directly, use it; we're just the fallbacks. So it's not a big mystery what's happening here.

I want to note that the contract between the behavior and your callback module is not just the functions that you write, but also involves what you can return from them. Usually tuples; sometimes it's just raw atoms. We have some examples here. These are all documented pretty well, and I'd encourage you to read through those. There are some interesting, I'll call them corner cases, interesting unfamiliar corners of it, and hopefully I can show a couple of these a little bit later on.

Next slide, okay. So I've taken that simplest GenServer module and expanded it a little bit. Now it can actually respond to a message. But I want to point out a couple of potential confusion points, to understand how data flows through the system. When you call GenServer.start_link, your second term there is what gets passed into your init function. It's a single term, and your init function can do whatever it wants to with it. But typically you'd return an {:ok, state} tuple.
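The data flow just described can be sketched roughly like this. The `Counter` module and its `bump/1` function are hypothetical stand-ins, not the module from the slide.

```elixir
defmodule Counter do
  use GenServer

  # The second argument to GenServer.start_link/3 is handed to init/1
  # as one single term.
  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  def bump(pid), do: GenServer.call(pid, :bump)

  # Typical return: {:ok, state}. Here the term is used as the state unchanged.
  @impl true
  def init(initial), do: {:ok, initial}

  # The return tuple is part of the contract: {:reply, reply, next_state}.
  @impl true
  def handle_call(:bump, _from, count) do
    new_count = count + 1
    {:reply, new_count, new_count}
  end
end

{:ok, pid} = Counter.start_link(10)
11 = Counter.bump(pid)
12 = Counter.bump(pid)
```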
In this case, this one is doing the same thing that the default implementation from the macro gives you, but I just want to point it out here. And then that state is what gets passed in each time one of your callback handlers gets called. This way you sequentially process messages, and the state you return from one message is what comes back to you on the next message. That's the functional programming way; we're not modifying anything. You're just getting a new copy of this data, a potentially changed copy of this data, coming in to you each time around.

And there's one little gotcha here; I don't know if anyone else has hit it. I was using a handle_cast at one point and I said, no, I want to change this to handle_call. So I changed the function name, and my code wouldn't execute. It's important to understand that handle_call has this extra from parameter it's going to receive. There are some advanced things you can do with it. You can actually choose to defer responding to the caller. You could spawn a task and give it this from parameter, and it could do something computationally intensive and then reply to the caller. Most of the time you don't need it, but it's there for expansion purposes, and it allows you more possibilities for how to structure your particular problem.

And this is one of the places where I said this is not class oriented. Don't think of modules as classes or your brain will hurt. They're not classes, they're namespaces. And it's important to understand that even in this namespace, certain functions may execute in different processes. The start_link up at the top there is really a convenience, a convention. It's not required by the APIs, though you normally see it in the code. And it is typically going to execute in the supervisor process that's going to start you up. Furthermore, then your init method... I say method. I knew I was going to mess that up. I've been trying really hard to get my terminology straight.
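The deferred reply mentioned a moment ago can be sketched as follows. This is an illustration under assumptions: the `SlowAnswer` module and its squaring "work" are invented, and `Task.start/1` is just one way to hand off the `from` value.

```elixir
defmodule SlowAnswer do
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok)
  def compute(pid, n), do: GenServer.call(pid, {:compute, n})

  @impl true
  def init(:ok), do: {:ok, nil}

  # handle_call/3 takes the extra `from` argument that handle_cast/2 lacks,
  # which is why renaming a handle_cast clause alone is not enough.
  @impl true
  def handle_call({:compute, n}, from, state) do
    # Give `from` to another process, which replies once the work is done.
    Task.start(fn -> GenServer.reply(from, n * n) end)
    # :noreply leaves the server free to take other messages in the meantime.
    {:noreply, state}
  end
end

{:ok, pid} = SlowAnswer.start_link()
49 = SlowAnswer.compute(pid, 7)
```

From the caller's side nothing changes: `GenServer.call/2` still blocks until some process replies using that `from` value.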
The init function is executing within your actor process, your OTP process. Then you have the convenience "do it" function, which is going to execute in some calling client. So it's important to get your brain around this: different parts of your module can be intended to execute within different processes, and the basic conventions kind of demand that. I included a couple of private functions here, but quite frankly, you could have other private functions that are intended to be called by one of the client-facing functions. So you can't even say all your private pieces are going to execute in one process or another. It's something that requires a change of thinking to get your brain around.

So I promised you elevators. I'd like to step back and share a little bit of a story. About two years ago, I was at a user group in the Dallas area that we call Hack Club. We get together practicing our craft of programming. Usually we're pair programming with someone. We might want to learn a new language. We might want to try test-driven development techniques, or try some new library. And the idea behind it is to have a problem that you can solve in an hour or two, and then we all look at how each other solved it. Well, I was sitting there with someone who wanted to learn Ruby, so I was teaching him some Ruby as we were going along. And the problem that night was to simulate an elevator. It was only after I had done 30 minutes of it or so that I realized this tick method kept coming up in all of our tests. And it dawned on me that this is inherently a concurrent problem. Elevators have to be able to move around while people on other floors are saying, hey, come here, come here, I'm pushing buttons. So these things have to happen at the same time, at least conceptually. And I thought from that, perhaps I should spend some time digging into threads and fibers in Ruby, but that's pretty anemic. It wasn't really worth spending time on.
I thought maybe I should learn Celluloid. I read up about it, and I learned there that it was strongly influenced by OTP. So then when I finally heard about Dave's book, and he was saying Elixir's the greatest thing since... I forget how you phrased it. It's made him the happiest; it has that same sort of feeling as when he first met Ruby. And Ruby treated me pretty well. So I decided to dig in, and I wanted to solve this elevator problem in a good concurrent fashion. I like this problem because we're all familiar with elevators, and it has a few more nuances than a lot of the other example OTP apps that I've seen before. I've been spending my hobby time learning about it, and now I get to share that with you.

Now, I've not gone crazy trying to model elevators as a state machine: when are the doors open, when are the doors closed, what happens when you press the little fireman's hat button. I just kept it really simple here. The idea is that a rider process will send a floor hail. I wanted to call this a floor call, but OTP already had claims on call. It sends it to a hall signal. I wanted to call this a hall monitor. So naming has been a bit of a challenge. I want to say, curse you, OTP, for getting all the good names.

So a rider process is going to send the floor hail to a hall signal. Elevator cars are going to poll that and retrieve that hail, and then a car is gonna travel to the floor and notify the rider process: hey, I've arrived. It's gonna pass its PID back in that call. Then the rider can tell that particular car that arrived where it wants to go. And when the car gets there, it notifies the rider. So this is the message passing that's going on here. Now, I don't want to lose sight of the fact that we're still in a functional language. And in a functional language, if you can define your data structures well, it'll make your life a lot simpler.
A lot of my core logic focused around this hail struct, which is gonna consist of a floor, a direction, and an optional rider PID. With that, I was able to build out helper functions around it and really keep the main logic in my OTP callback modules pretty simple. And I would like to jump to the code now.

I had grand designs, great ideas of doing something wonderful, and I seem to be... Exit. Here we go. Is that legible, or should I go to a light background? Okay, I'm not getting my input. Better? Size good? Bigger, bigger. Okay.

So this is the hail struct. As I said, it's a struct with a direction, a floor, and an optional caller. And I have matching and sorting types of functions available on this. Most of them take a list of hails as the first parameter and then a specific hail to match or sort around as the second one. I really wanted to go deeper into the code, but as I was rehearsing I ran into timing issues. So it's gonna be a fairly cursory glance, but if anyone's interested, I'll be happy to talk more about this anytime during the conference.

So the hail is the core data structure part of it. And the hall signal I mentioned is really simple. I possibly could have done this with an agent, but I wanted it to respond to a couple of messages. So this is how a person waiting for the elevator can hail it. See, I called it a floor call here instead of a floor hail. I started out naming it a call and realized the confusion later. So it basically has a state that's a list of hails, and it will know all the people on different floors and which direction they want to go. I didn't make the logic smart, where it tries to decide which elevator car to send a hail to. I let the cars actually poll it.

So let me show you the car. This guy also defines a struct. I could have used a map just as well. I was able to reuse the hail to serve as the position of an elevator car, because it's got a direction and it's got a floor.
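A rough sketch of what such a struct and its list-first helpers could look like. The field names (`dir`, `floor`, `caller`) and the two helpers are assumptions for illustration; the talk's actual code may differ.

```elixir
defmodule Hail do
  # dir is :up or :down; caller is the optional rider pid (hypothetical field names).
  defstruct dir: :up, floor: 1, caller: nil

  # Hails on the same floor and heading the same way as `hail`.
  def matching(hails, %__MODULE__{dir: dir, floor: floor}) do
    Enum.filter(hails, fn h -> h.dir == dir and h.floor == floor end)
  end

  # Hails ordered by distance from the floor of `position`.
  def nearest_first(hails, %__MODULE__{floor: floor}) do
    Enum.sort_by(hails, fn h -> abs(h.floor - floor) end)
  end
end

hails = [%Hail{floor: 5, dir: :down}, %Hail{floor: 2, dir: :up}]
# The same struct doubles as a car position: it has a floor and a direction.
position = %Hail{floor: 1, dir: :up}
[2, 5] = hails |> Hail.nearest_first(position) |> Enum.map(& &1.floor)
```

Taking the list as the first parameter keeps these helpers friendly to the pipe operator.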
The elevator car also keeps track of the stops that it has to make, which may be ones that people who got on the elevator pressed, or ones that came from people outside of the elevator. And it has a tick value. This is my way of getting this car to be processing things in the background regardless of what messages it receives at the time. It's one of those lesser-used corners I mentioned of the contract between the GenServer behavior and your callback module. If you return an integer as that extra last element of the tuple, you get what's known as a timeout value, and your GenServer behavior is going to watch that. If you don't receive a message in that amount of time and you feel lonely, it will send you a new message. What that looks like is down here: you get a handle_info call and you just receive the :timeout atom.

And I'll tell you, after Dave's talk, I feel really smart for that line of code right there, because my main process in here is a pipeline. I'm able to retrieve a call, which is me polling that hall signal to say, you got anywhere else for me to go? Then I can do a check arrival to see, am I at a floor anyone cares about? And then I can move. I'll receive this, at least in my case, every one second, or maybe longer than every one second because of other messages coming in. It's not a perfect heartbeat tick, which you could do with a separate timer process, but I liked exploring this part of the API. And that means I have to store the tick in my state, and I have to return it as that extra element from all those callbacks that the GenServer behavior will call in me.

And this is a really simple application; I'm gonna show you new versions of this as we go along. This was the simple code I was able to put together just at the start of learning OTP. It's pretty simple; it just calls the start_link on my two modules.
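The timeout corner of the contract can be sketched like this. `TickingCar` is a hypothetical, much-reduced stand-in for the car: it only counts floors upward, with no hails or stops.

```elixir
defmodule TickingCar do
  use GenServer

  # `tick` is the timeout in milliseconds.
  def start_link(tick), do: GenServer.start_link(__MODULE__, tick)
  def current_floor(pid), do: GenServer.call(pid, :current_floor)

  # The extra last element of the return tuple is the timeout.
  @impl true
  def init(tick), do: {:ok, %{floor: 1, tick: tick}, tick}

  @impl true
  def handle_call(:current_floor, _from, state) do
    # The timeout has to be returned again from every callback,
    # which is why the tick lives in the state.
    {:reply, state.floor, state, state.tick}
  end

  # Sent by the behaviour when no other message arrived within `tick` ms.
  @impl true
  def handle_info(:timeout, state) do
    {:noreply, %{state | floor: state.floor + 1}, state.tick}
  end
end

{:ok, car} = TickingCar.start_link(10)
Process.sleep(100)
# Some floor above 1; the exact value depends on timing.
IO.inspect(TickingCar.current_floor(car))
```

Because any incoming message restarts the countdown, this is an idle timer rather than a strict heartbeat, as noted above.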
They don't need any parameters; they're hard-coded to know what registered names to talk to, and we're gonna expand that in a little bit. I have a simple little test function I used, so that I can go over here into IEx... and my window is too big now. You guys still seeing enough of that? So I created a sanity test which is just run through mix. I can do a mix run -e and just tell it to run that test function at my application level. That was my way of debugging things and just getting them basically working. So real basic stuff, nothing too exciting. There are actually two different people on different floors calling, and the elevator processes each of those.

And I wrote a test, which I'm gonna start up now because it's really slow, because this was the very novice-level test. I know better than to do this. This is a horrible code smell: timer sleeps in your test. Don't do this. We're gonna make it better in just a minute. I don't wanna offend anyone here. So my test has to wait for the elevator car to do all of its thing.

Now, in order to make the test more comfortable to run at a faster velocity, I needed to parameterize some of these things that were hard-coded in my callback modules. So I'm gonna jump ahead to my next prepared git commit here. Let's see if my... there we go. So now in the test I was able to use a setup block, or I guess it's basically a function call, that will actually create a separate hall signal and car. Notice that in the last parameter to the car I'm passing :infinity. So the timeout that the car is going to give to its behavior is now infinite, and the behavior is not gonna be pinging it. So I can handle the ticking here in my test and speed up time. To make this work, the hall signal's name also comes in to the car as another parameter. And of course the hall signal itself had to be given that name to know how to register itself.
And with this you can now see that my mix test shoots out that it's done, and I even threw an extra test in here. What I was able to do with that is create my own helper function to assert arrival, which is what I cared about here. That looks something like this. It just keeps calling this continue function, which is going to send that tick, just a raw send to the process. And what happens with all these raw sends is that they get converted to handle_info calls. So I'm gonna receive that :timeout, and my business logic's gonna execute. I found this handy way to find out the message queue length. So I just keep calling it until the test process has something in its message queue, because it registered itself as the rider.

I found another little handy thing that's just scaring up the state here. Don't do this in production either, but you can actually pull the state out of one of these actors. I was digging into APIs. Like I said, don't do it in production. Everyone who maintains your code will curse you and question your ancestry and who knows what else. They say to write your code assuming that the person who maintains it is an axe murderer who knows where you live.

So to make this work, there were a couple of extra parameters added to the start_links. And of course at the application level I use some attributes to keep track of these things and pass them in when they start. I didn't mention earlier that in this test function I made, it's just spawning a raw process as the rider. And this guy just makes sure that the car arrives at the floor he started at, then tells the car to go, and then makes sure he arrives where he wanted to go. So this is my simple way of doing some sanity testing. But the downside now is all these IO.puts everywhere. I like my test output to be nice and neat. So my next thing to explore was GenEvent. And this ended up being really simple.
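The test-driving approach described above, sketched under assumptions: `ManualCar` and `TickHelper` are invented names, the car only travels upward to one target, and the `:sys.get_state/1` call is added here purely as a synchronization point so the loop does not flood the car with ticks (a test-only trick, in line with the warning above).

```elixir
defmodule ManualCar do
  use GenServer

  def start_link(rider, target), do: GenServer.start_link(__MODULE__, {rider, target})

  @impl true
  def init({rider, target}) do
    # :infinity means the behaviour never sends :timeout on its own.
    {:ok, %{floor: 1, target: target, rider: rider}, :infinity}
  end

  # A raw send(pid, :timeout) arrives here exactly like a real timeout would.
  @impl true
  def handle_info(:timeout, %{floor: floor, target: target} = state)
      when floor + 1 == target do
    send(state.rider, {:arrived, target})
    {:noreply, %{state | floor: target}, :infinity}
  end

  def handle_info(:timeout, state) do
    {:noreply, %{state | floor: state.floor + 1}, :infinity}
  end
end

defmodule TickHelper do
  # Tick the car by hand until the calling process has something in its mailbox.
  def tick_until_message(car) do
    case Process.info(self(), :message_queue_len) do
      {:message_queue_len, 0} ->
        send(car, :timeout)
        # Returns only after the tick above has been handled (test-only).
        :sys.get_state(car)
        tick_until_message(car)

      _ ->
        :ok
    end
  end
end

{:ok, car} = ManualCar.start_link(self(), 4)
:ok = TickHelper.tick_until_message(car)

receive do
  {:arrived, 4} -> :ok
end
```

No sleeps are needed: simulated time advances exactly as fast as the test sends ticks.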
So jumping ahead to the next version of the code: up here in the application start, I'm able to just start a GenEvent. I was maybe too cute calling the attribute @venue, but I thought about who hosts events. A venue. Again, naming is hard sometimes. What is it, one of the two hardest problems in computer science? And I'm cheating here: if the Mix environment is not equal to test, I'll actually stream the GenEvent and dump it out. And that means all the IO.puts that I had before, I've converted to things like GenEvent.notify. I changed them all to tuples, because I'm gonna do a little bit more with them later and it just seemed to be a little bit more consistent. So I can still execute that same sanity test thing I showed you, and it's still gonna show me all my output. I can see what's happening, but on the flip side my test output is nice and neat: six tests, zero failures, in 0.00 seconds. So that's a pretty good test suite there.

And now I wanna go back to some slides. Supervisors are one of the big things you hear all about with OTP, and they're what enables all of the robust behavior. The slogan: let it fail. I love it. It's shocking the first time you hear it. It really got me to pay attention, but I realized it's only half the story. You let it fail, but you have to have a response. That means realizing it's not a matter of if it fails, it's a matter of when it fails, and realizing what depends upon what and how things have to be restarted. Ironically, I found myself thinking longer and harder about failure modes than I normally do, because OTP separates those concerns. That's some of the design patterns of OTP. Everything I showed you before was pretty much business logic. I didn't really have to think about all of the robust behavior yet. I can think about that separately. And that's really powerful, and not something I've seen in other frameworks. And this is where my quip early on about microservices comes in.
I think where they're gonna fall down is this: if you've got 2,000 operating system processes across 200 physical machines, something is going to die. They're gonna have to restart. There's Monit and God and other operating-system-level tools that do that, but they won't understand all of the dependencies between the parts of the overall system. And that's where OTP gets it right. They've put that sort of knowledge in the same language, in the same domain, as the responsibility of the same team. There's no separation of ops and developer teams that don't always communicate well. I know there's a push for DevOps, and that's a great idea. And maybe some of you are on teams where that's not an issue at all. But this brings some of those operational concerns right to the application developer, the one who really understands those dependencies. And you have the fine-grained control. With microservices, people like to talk about 100-line services. Well, I just showed you three of them right there. Each one of my business logic modules was no more than 100 lines of code.

So let me build up a supervision tree for this system. I'd like to start at the elevator car. The samples I showed you so far just had a single car, but you would expect there are buildings that have two, four, six elevator cars that all serve the same floors. So I wanna have a car supervisor. Notice my clever name there. It starts multiple copies of this callback module. These don't depend upon each other, so a one-for-one restart strategy works great for them. But we can't forget that the cars still depend on a hall signal. And both types depend on that GenEvent now. So we can build up these supervision trees, and you build them the way your problem dictates, based upon dependencies. So I came up with another clever name. I called it an elevator bank, and it's going to start the GenEvent, start the hall signal, and then start the car supervisor.
And with this, because of that order of dependencies, I'm able to use a rest-for-one strategy. If you haven't encountered that, it means that, going left to right, if any process fails, the supervisor will restart that one and all the ones to the right of it. So you don't have to restart the world. It doesn't have to be a one-for-all. It's a nice thing if you can order your dependencies that way; these sorts of things are really cool.

But that wasn't enough for me. I wanted to take this to the next level. You ever been in a tall office tower that has one bank of six elevators that serves floors one through 15 and another bank of six that serves 16 through 30? Well, one building's elevator system may have multiple banks, so I wanted to allow for that in my code. And it's simply a matter of getting another supervisor in there. So now I have the top-level elevator supervisor that can supervise multiple banks of elevators. These again are totally isolated from each other, and a one-for-one restart strategy works great for them.

I wanna highlight something else here, playing with GenEvent. Right now I have each bank supervisor creating a GenEvent. But if I were the superintendent of some large hotel or something, I might want my different elevator banks to share some status indicator, maybe something my security guards are watching to see the status of all the elevators in the building. So again, without changing code, but by using config, I'm able to make that happen.

And I'm getting short on time, so I'm gonna move quickly here. I don't wanna skip this: if you haven't seen it before, the call to the worker function is the primary way you create a worker specification. And the supervisor is then told to supervise that based upon the specification. There's a bit of an impedance mismatch here that bit me, and hopefully I can short-circuit it for some of you who haven't run across it yet.
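The three-level tree described above could be sketched as follows. This uses today's child-spec API rather than the worker function shown on the slide, and plain Agents stand in for the event venue, hall signal, and cars; module names and ids are assumptions for illustration.

```elixir
defmodule CarSupervisor do
  use Supervisor

  def start_link(num_cars), do: Supervisor.start_link(__MODULE__, num_cars)

  @impl true
  def init(num_cars) do
    children =
      for n <- 1..num_cars do
        # An Agent stands in for the real car; each copy needs its own id.
        Supervisor.child_spec({Agent, fn -> %{floor: 1} end}, id: {:car, n})
      end

    # Cars are independent, so a crash restarts only the car that died.
    Supervisor.init(children, strategy: :one_for_one)
  end
end

defmodule BankSupervisor do
  use Supervisor

  def start_link(num_cars), do: Supervisor.start_link(__MODULE__, num_cars)

  @impl true
  def init(num_cars) do
    # Listed in dependency order: a crash restarts that child and all later ones.
    children = [
      Supervisor.child_spec({Agent, fn -> [] end}, id: :event_venue),
      Supervisor.child_spec({Agent, fn -> [] end}, id: :hall_signal),
      {CarSupervisor, num_cars}
    ]

    Supervisor.init(children, strategy: :rest_for_one)
  end
end

# Top level: banks are isolated from each other, so one_for_one again.
banks = [
  Supervisor.child_spec({BankSupervisor, 2}, id: :low_rise),
  Supervisor.child_spec({BankSupervisor, 3}, id: :high_rise)
]

{:ok, top} = Supervisor.start_link(banks, strategy: :one_for_one)
%{supervisors: 2} = Supervisor.count_children(top)
```

With this ordering, a hall signal crash takes the cars of that bank down with it, while the event venue and the other bank keep running.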
Right here, this call to the worker function takes a module name and a list of parameters. These are going to be applied. That means the list of parameters becomes multiple parameters to your callback module's start_link. You decide how many parameters it has, you supply that list, and things work great. But then look down here at where you call GenServer.start_link. For that second argument, you can pass in a list of parameters, but your init function has to pattern match on the list. That's because this is not really an apply situation. It's actually using a single term. You're allowed to pass a single term into that init, which has an arity of one. So it's important to remember that you need those square brackets in your init if you're gonna pass things around in this fashion.

Just as I was putting this together, I came up with something I think I'm gonna play with some more, and I'll be happy to hear feedback from you guys. A convention that I'm gonna try using is to have my callback module's start_link function take just two parameters. One of them is gonna be a tuple of the things init cares about. And the other can be the keyword list of options passed to GenServer.start_link. I think it's a little bit more parallel, a little bit clearer what's going on: you have to write a tuple when you create your worker, and you're matching on a tuple in your init. So I'll be happy to talk to anyone more about it later in the conference.

And now let me jump back to the code. I have to punch a button on here that says stop. There we go. So, back in the code, I wanna jump ahead to my next commit. At my application level, I've started using configuration now. So I've pulled all those hard-coded references out of even the attributes, and now they've gone into my config. The application environment is how you get at those. I might wanna connect in a distributed fashion to nodes.
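The mismatch between the two argument styles, and the tuple convention floated above, can be sketched like this. Child-spec maps are used in place of the worker function from the slide; the `start` entry is a module, function, and argument list that the supervisor applies, which is the same "apply" behavior. `Car`, `TupleCar`, and the argument values are hypothetical.

```elixir
defmodule Car do
  use GenServer

  # The supervisor applies the argument list, so two separate arguments arrive here.
  def start_link(hall_signal, tick) do
    # This list is NOT applied; it reaches init/1 as one single term.
    GenServer.start_link(__MODULE__, [hall_signal, tick])
  end

  # Hence the square brackets: init/1 pattern matches on the list itself.
  @impl true
  def init([hall_signal, tick]) do
    {:ok, %{hall_signal: hall_signal, tick: tick}}
  end
end

defmodule TupleCar do
  use GenServer

  # Convention from the talk: one tuple for init/1, plus GenServer options.
  def start_link(init_args, opts \\ []) do
    GenServer.start_link(__MODULE__, init_args, opts)
  end

  @impl true
  def init({hall_signal, tick}) do
    {:ok, %{hall_signal: hall_signal, tick: tick}}
  end
end

children = [
  %{id: Car, start: {Car, :start_link, [:hall_signal, 1000]}},
  %{id: TupleCar, start: {TupleCar, :start_link, [{:hall_signal, 1000}, []]}}
]

{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)
%{workers: 2, active: 2} = Supervisor.count_children(sup)
```

In the second style the tuple written in the child spec has the same shape as the tuple matched in `init/1`, which is the parallelism the convention is after.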
And I can take a variable list of banks defined in my config, loop over those, and create some bank supervisors. The bank supervisor pulls out of that bank definition all the various pieces it needs: venue, display, number of cars, the tick amount. I've wrapped the GenEvent in this elevator status module, because I actually allow different display types now. It creates the hall signal and creates this car supervisor. And these guys have a lot more parameters than they had before. You do have to pass these in, but since you're not hard-coding them, it gives you flexibility in the things you can do in your config.

And the car supervisor is not particularly exciting. It takes the number of cars it was told to supervise and creates a worker spec for each one of those. This is a case that I didn't see in some of the early OTP tutorials. I can't name this car process, because I have multiple copies of it running, so you can't register it under a name. So this ID is how the supervisor keeps track of its various copies of that process.

And how are we doing on time? Okay, five minutes. I'm gonna jump to the grand finale here. I'd be happy to talk to anyone more and look at more details of the code. But I have created another one of my cheater things here: a shell script that sets a certain Mix environment name. That tells me which config is going to be used. This is the config called visual node, and I can show this. It's pretty simple, but note that since I've taken the names we register these processes under out of the hard-coded values, now I can register a process and give it a tuple that's a globally registered name. So I'm set for distribution just by changing in my config how I wanna register these things. And I'm gonna go ahead and start that node. Over here I have another configuration that tries to register an event name under the same global name. I was able to write code for that case when I start up the GenEvent.
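A small sketch of global registration, with a plain GenServer standing in for the GenEvent-based status process. `StatusBoard` and the `:elevator_status` name are invented for illustration, and everything runs on a single node here, where `:global` names also work.

```elixir
defmodule StatusBoard do
  use GenServer

  # In the talk the name comes from config; here it is passed in directly.
  def start_link(name), do: GenServer.start_link(__MODULE__, [], name: name)
  def report(name, event), do: GenServer.call(name, {:report, event})

  @impl true
  def init(events), do: {:ok, events}

  @impl true
  def handle_call({:report, event}, _from, events) do
    {:reply, :ok, [event | events]}
  end
end

# A {:global, term} tuple registers the process across all connected nodes.
name = {:global, :elevator_status}
{:ok, pid} = StatusBoard.start_link(name)

# A second start under the same global name reports the existing process.
{:error, {:already_started, ^pid}} = StatusBoard.start_link(name)

# Callers address the process by the same tuple, wherever it lives.
:ok = StatusBoard.report(name, {:arrived, 3})
```

Swapping a local atom for a `{:global, ...}` tuple is a change to the name only, which is why it can live in configuration.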
If it's already registered, you can actually return an :ignore atom back to the supervisor, which says, hey, it didn't start up, but you don't have to supervise it, and life goes on. And what this lets me do, with another one of my cheater shell scripts here, is start a separate node that's actually gonna execute things. I've got a function defined in my elevator module that just randomly chooses floors and trips for people to take on the elevator. So I can execute this on a separate node. This is gonna start up. It's gonna find the GenEvent on the first node that I started. Supposed to. There we go. I played with some UTF-8 characters and created a simple little visualization of what's happening to the cars in the elevator bank. Sometimes they get synchronized, and then they'll go out of order. I've had fun just sitting here staring at it for like 10 minutes, just to go, look what I've built.

And now back to the last wrap-up slide. Some of the takeaways. I wanted to make a distinction between the behavior module that OTP gives you, the generalized decades of experience in building these systems, where you take the generic behavior out and reuse it, versus your callback module, which is the specific behavior that your application needs. There are a couple of things that can trip you up in how you do these start_links and the initialization steps. I will tweet out a link to these slides later. Parameter and return value contracts: this is how you communicate between the behavior and the callback module. Set up your supervisors in your tree according to your application. Restart strategies: there aren't very many to choose from, but that, coupled with the tree structure of supervisors, gives you a lot of flexibility in responding to failure and getting your system back into a known state. And with config, you can do a lot of really cool stuff.
Try not to hard-code things; all that knowledge you've learned in past OO types of languages is still important here. Move the things that make sense out into configuration, and your app becomes more flexible. And that's the end. I do have the exact code that I showed here up at this link. Thank you for your time.