Alright, our next speaker is Harrington Joseph. He's going to tell us about when Booleans are not enough: state machines. Let's give him a round of applause.

Thanks, everyone. Can everybody hear me okay? Yeah. What about now? Louder. Louder? Alright.

Okay, I'm going to talk about when Booleans are not enough and how we can use state machines in those cases. My name is Harrington Joseph. I'm a senior software engineer at Netflix, where I work on the data platform team. My work is mostly focused on data orchestration and event-driven architecture. Feel free to find me after the talk if you want to know more about what I do at Netflix, or if you want to share your favorite show or something like that.

Expectations. This talk is about why, how and when to use state machines. This talk is not about the internals of a state machine. I want to make that really clear.

Let's start with a very simple analogy. Imagine that you have a closet where you store tools and sports equipment. But this closet is so messy that you don't really want to look at it. You don't want to think about it. Reality hits you the moment you have to get something from that closet and it turns out that it's all the way at the bottom of the messy pile you have in there. You don't really know if you can pull it out or not, because everything may fall apart. Things get even worse when you have to ask someone to get something from that closet for you. That person is probably not going to find what you need, and is very unlikely to feel comfortable moving things around. This is exactly what happens when we use multiple Booleans to represent states and enforce business rules. Our code gets messy and no one wants to touch it.

So please raise your hand if you have a class with any of the following attributes. Right. I think everybody can relate. Using multiple Booleans to represent states and enforce business rules. Very simple. Very concise. So let's take a look.
Let's bring it into the code. Here we have a Video class that receives a source, which is the source of the video. It initializes a source attribute with the given source, but it also initializes a Boolean attribute, is_playing = False, because at the beginning the video is not playing. This class also provides three functions: pause, play, and stop. I'm not going to dive into the details of how to play, pause or stop a video file. What is important here is the last line of each function, which sets self.is_playing. When you call pause, self.is_playing is set to False. When you call play, self.is_playing is set to True, and when you call stop, self.is_playing is set to False.

So we can go and create a video instance. We can play the video. We can pause the video. We can stop the video. And we can check if the video is playing or not. But video.is_playing, what does it really mean? When it's True, it definitely means that the video is playing. But what happens when it's False? Is it paused or is it stopped? In reality, we don't have enough information to answer this question.

So naturally, we will go and do something like this. We create a new attribute called is_paused, which by default we initialize as False, because when we create the instance of the video, the video is not paused either. Now, when we call pause, we update this attribute and set it to True. When we call play, we set it to False, and when we call stop, we set it to False. We can still answer whether the video is playing by just checking video.is_playing. We can answer whether the video is paused by checking video.is_paused. But when it comes to knowing if the video is stopped, we actually have to check that the video is not playing and not paused. This is very ugly, it's not elegant at all, and it's also really fragile. It's prone to errors. The moment you introduce a new state, any condition that you have built around this logic is going to be broken.
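
A minimal sketch of the two-flag class described above. The slide itself is not part of the transcript, so the attribute names is_playing and is_paused, the file name, and the omitted playback calls are assumptions.

```python
class Video:
    def __init__(self, source):
        self.source = source
        # Two independent flags stand in for three states.
        self.is_playing = False
        self.is_paused = False

    def pause(self):
        # ... pause the file (omitted) ...
        self.is_playing = False
        self.is_paused = True

    def play(self):
        # ... play the file (omitted) ...
        self.is_playing = True
        self.is_paused = False

    def stop(self):
        # ... stop the file (omitted) ...
        self.is_playing = False
        self.is_paused = False


video = Video("movie.mp4")  # hypothetical file name
video.play()
video.pause()
video.stop()

# "Stopped" has no flag of its own, so it has to be inferred from the other two.
is_stopped = not video.is_playing and not video.is_paused
```

Every place that needs to know whether the video is stopped has to repeat that last expression, which is the fragility the talk points at.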
Let's talk about some business rules. Rule number one, a video can only be played when it's paused or stopped. Rule number two, a video can only be paused when it's playing. Rule number three, a video can only be stopped when it's playing or paused. Very simple rules, just like a regular video player.

So, rule number one: a video can only be played when it's paused or stopped. We go and modify our play function, and now we check if the video is not playing or if the video is paused. Then we make the call to play the video, set our is_playing attribute to True, and set the is_paused attribute to False. If this condition is not satisfied, we raise an exception saying that you cannot play a video that is already playing, assuming that the video is already playing. So there is an assumption here that leads to fragile code again, the moment you introduce new states or new business rules.

For rule number two, we say a video can only be paused when it's playing. So in this case, we check if the video is already playing. If it is, we make the call to pause the video and update our flags. Otherwise, we raise an exception saying that we cannot pause a video that is not playing.

And for rule number three, we say that a video can only be stopped when it's playing or paused. Again, we check if the video is playing or if the video is paused, and update our flags. Otherwise, we raise an exception.

So the code is rapidly becoming complex. Our play, pause, and stop functions are not actually focusing on what they're supposed to be doing. They are checking the state and validating business rules instead of focusing on playing, pausing, or stopping the video. It's bloated. We added a bunch of code and we didn't get any new functionality. We're just checking the state. It's repetitive. We keep checking the state and raising exceptions. Even though it's not exactly the same condition each time, it's repetitive.
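
The guarded play, pause, and stop functions described above might look roughly like this. It is a sketch under assumptions: the attribute names, the use of RuntimeError, and the exact messages are illustrative, and the playback calls are omitted.

```python
class Video:
    def __init__(self, source):
        self.source = source
        self.is_playing = False
        self.is_paused = False

    def play(self):
        # Rule 1: a video can only be played when it is paused or stopped.
        if not self.is_playing or self.is_paused:
            # ... play the file (omitted) ...
            self.is_playing = True
            self.is_paused = False
        else:
            # Assumes "not playable" can only mean "already playing".
            raise RuntimeError("Cannot play a video that is already playing.")

    def pause(self):
        # Rule 2: a video can only be paused when it is playing.
        if self.is_playing:
            # ... pause the file (omitted) ...
            self.is_playing = False
            self.is_paused = True
        else:
            raise RuntimeError("Cannot pause a video that is not playing.")

    def stop(self):
        # Rule 3: a video can only be stopped when it is playing or paused.
        if self.is_playing or self.is_paused:
            # ... stop the file (omitted) ...
            self.is_playing = False
            self.is_paused = False
        else:
            raise RuntimeError("Cannot stop a video that is not playing or paused.")
```

Each method now spends more lines on state checks and flag bookkeeping than on its actual job.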
It's something that could probably be automated. And it's definitely hard to test. In order to write unit tests for this, you have to write unit tests for play, pause, and stop with all the possible values that these flags can have. And we're talking about only two flags. Imagine the moment you add more flags for different states.

Here's a different approach. You can have a Video class where you define some constants for your states, and then instead of having the Boolean flags, you have a single state attribute. You initialize it as stopped, because the video is not playing when you create the instance. Now you say a video can only be played when it's paused or stopped. The code actually didn't change much. Instead of checking Booleans, we are now checking constants, and we have to update the state. We keep raising the exception in all the cases. And we still make assumptions. We say that if the video is not playing, we can definitely play it. But is that true? It is true at the moment, with the three business rules we have. But the moment we introduce a new rule or a new state, this code is going to be broken. Same thing for the pause case and same thing for the stop case.

So let's talk about state machines. What's a state machine? A state machine is a mathematical model of computation, with a finite number of states, transitions between states, and a machine that can only be in one state at a given time. All this sounds very complex, theoretical and mathematical. But in reality a state machine can be seen as a directed graph, where each of the nodes represents a state in the machine and the connections between the nodes represent the transitions. Basically, when two nodes are connected, it means that you can transition from node A to node B. And then you just need a pointer that tells you what state you are in at a given time. So here is a very simple example of a state machine.
This is a state machine for when a user lands on a website. Initially the user is in the logged-out state, when you land on the website for the first time. Once you are in the logged-out state, you can perform the login transition to move to the logged-in state. Once in the logged-in state, you can perform the logout transition to move back to the logged-out state. In this case, logged out has this bluish glow, representing the initial state, which is the current state at the beginning.

So how do we design a state machine? First we need to define the states that we want to deal with. What are the states that our object is going to represent? Then we need to lay down the transitions between states. What are the business rules that we want to enforce? And finally we need to pick the initial state.

Step number one, define a finite number of states: playing, paused and stopped.

Step number two, lay down the transitions between states. What we are doing here is translating the business rules into edges in the graph. Rule number one said that a video can only be played when it is stopped or paused. That's why you see two incoming arrows to the playing state, one from stopped and one from paused, and the name of the transition is play. Rule number two said that a video can only be paused when it's playing. That's why you see only one incoming arrow to the paused state, coming from the playing state, with the name pause. And rule number three said that a video can only be stopped when it's playing or paused. That's why you see two incoming arrows to the stopped state, one from playing and one from paused, with the name stop.

And finally we just need to select the initial state. That is stopped, because the video is not playing at the beginning.

So let's take a look at the code. I'm going to rely on a library called Transitions. It's an open source library that you can find on GitHub. The code is fairly easy to read and understand, and you can find similar libraries in many other languages as well.
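
Before turning to the library, the graph just described can be written down as plain data. This is only an illustrative sketch in plain Python, not how Transitions represents it internally; the names TRANSITIONS, INITIAL_STATE and next_state are hypothetical.

```python
# Each edge of the graph: (current state, transition name) -> next state.
TRANSITIONS = {
    ("paused", "play"): "playing",    # rule 1
    ("stopped", "play"): "playing",   # rule 1
    ("playing", "pause"): "paused",   # rule 2
    ("playing", "stop"): "stopped",   # rule 3
    ("paused", "stop"): "stopped",    # rule 3
}
INITIAL_STATE = "stopped"


def next_state(current, transition):
    """Follow one edge of the graph, or fail if no such edge exists."""
    try:
        return TRANSITIONS[(current, transition)]
    except KeyError:
        raise ValueError(f"Cannot {transition} while {current}.") from None


# The "pointer" is just a variable holding the current node.
state = INITIAL_STATE
state = next_state(state, "play")
state = next_state(state, "pause")
```

The business rules live in one table instead of being scattered across if statements, which is the idea a state machine library builds on.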
So the first thing that you need to do to use Transitions is to import Machine. Then you define all the states that you want to deal with. See, this doesn't look very different from the last approach that I showed you.

Next, we need to define the transitions. This is the part that looks a little bit complex when in reality it's not. What we're doing here is describing the graph. The trigger is basically the arrow that represents the connection, and the value of the trigger is going to be the name of the transition. Source is going to be the origin of the transition and dest is going to be the destination of the transition.

For rule number one, that a video can only be played when it's paused or stopped, we basically have two transitions: one going from paused to playing and another one going from stopped to playing, and the name of the transition is play. For rule number two, that a video can only be paused when it's playing, we have one transition going from playing to paused, and the name of the transition is pause. And rule number three, that a video can only be stopped when it's playing or paused, means that we have two transitions, one going from playing and one going from paused, both ending in the stopped state, and the name of the transition, or the trigger, is stop.

Another thing that is important to mention about this definition is that the trigger name should match the name of the function that you're going to call. I'll go over that in a moment.

Finally, you need to create the machine. When you create the machine you say model equals self, and I'll explain why this is important. It's related to the trigger name. Then you pass the transitions, the list of transitions that we want to enforce or that we want to provide, and the initial state, which in this case is stopped.

Now our code actually looks like this. As you can see, there is not much there, because there is nothing related to the state anymore. We don't have to keep track of state.
We don't have to check the state. We don't have to raise exceptions. The code is actually focusing on what it's supposed to be doing. Pause is going to pause the video. It doesn't care about the state at all. Same thing for play and same thing for stop.

So now we can create a video instance. We can play the video, and our state should transition to playing. If we really care, we can actually check the state of the video. As you saw, I didn't create a state attribute, but the machine itself injects this attribute into the object because of model equals self. You don't really have to access this attribute at all, because technically you shouldn't deal with the state. The machine should handle that for you. But if you need it for some reason, it's available. We can pause the video, and the state should transition to paused. We can stop the video, and the state transitions to stopped.

But what happens if we call pause again? Remember, a video can only be paused when it's playing, but we are in the stopped state. This is going to give us a MachineError. The machine is taking care of validating that this transition is valid. And we get this pretty much for free. All we had to do was define the transitions; we didn't have to write a single condition.

Here is the way this works. Remember that I mentioned that the trigger name was important and that model equals self was important. When we initialize the machine, the machine is going to look at all the trigger names, look for all the functions that match a trigger name, and wrap them. So when you call play, you're actually going through the machine first. The machine is going to validate the current state and the state you want to go to, and if that is valid, then it calls the actual play function. If it's not, it raises the MachineError.
Another thing that is important: let's say you call play and you try to play a file that doesn't exist, or something goes wrong with the file. If your function raises an exception, the exception is going to be propagated by the machine, and therefore the state is not going to change.

So how do we test this? We don't. If you're using a state machine library, you don't really have to test that the library is validating that you can transition from state A to state B, or that you cannot move between two states that are not connected, because that's the state machine's job. Instead, you can test that the machine was initialized with the right transitions and the right initial state, and then you can focus on testing real functionality. You can test that the play function actually does what it's supposed to do, and the same for pause and stop. Nothing related to the state. You don't really have to care about that anymore.

So let's add a new state, and let's call it rewinding. I'm highlighting in red all the changes that I made to the rules when I added this state, plus a brand new rule. As you can see, there is a new node in the graph called rewinding, and you see a bunch of new arrows. Rule number one, saying that a video can only be played when it's paused, stopped or rewinding, means that there's a new arrow going from the rewinding state to the playing state with the name play. Rule number two, that a video can only be paused when it's playing or rewinding, means that there's going to be an incoming arrow to the paused state from the rewinding state, named pause. Rule number three, that a video can only be stopped when it's playing, paused or rewinding, leads to that big arrow on the right, going from the rewinding state to the stopped state with the name stop.
And finally, a video can only be rewound when it's playing or paused. That leads to the two new incoming arrows to the rewinding state, one from playing and one from paused, and the name is rewind.

So let's take a look at the code. In this case, all we had to do was this. We defined a new state called rewinding, and then we modified our transitions. For rule number one, we added one new transition going from the rewinding state to the playing state, and the name is play. For rule number two, we did exactly the same, going from the rewinding state to the paused state, and the name is pause. Similarly for rule number three, going from the rewinding state to the stopped state with the name stop. And then for the brand new rule number four, that a video can only be rewound when it's playing or paused, we added two new transitions, one going from playing and one going from paused to the rewinding state, and the name of the transition is rewind.

That's all we have to do. The only thing left is to write the rewind function, but we didn't have to change any of the play, pause or stop logic. We don't really have to care about the state in any of these functions. The other good thing is that the unit tests are not going to be broken. The only unit test that can break is the one that tests that the machine was initialized with the right transitions. The other unit tests for play, pause and stop remain the same.

So when are Booleans not enough?
When you have multiple Booleans representing a single state. If you find yourself checking multiple Booleans in order to decide what the state of an object is, something smells fishy. There's got to be a better way of doing this.

When business rules are enforced by multiple Booleans. Along the same lines, if you find yourself checking multiple Booleans, multiple constants, or multiple attributes in order to decide if you can perform an action or not, then you're probably in the wrong place. You probably have to change your approach.

So when should you use a state machine? When states are not binary. If you have something more than true or false, one and zero, yes or no, you may want to consider using a state machine. It's not a hard rule, but you may want to consider it. When you have to account for future states. This is not about over-engineering, by the way. It's about knowing for a fact that you're going to add new states soon, for example, or even remove some states. When you have to enforce a complex set of business rules. As you saw in the presentation, we were talking about three or four states and three or four rules, and it got fairly complex. The moment we add more states and more business rules, the complexity simply increases. By delegating all this complexity to a state machine, we can focus on implementing the code that we want to write. We can focus on creating the functionality that we need to provide.

So in summary, I would like you to consider using state machines to represent states and enforce business rules. As we saw in the presentation, when we use Booleans to represent states and enforce business rules, the code gets messy, unmaintainable, prone to errors and really hard to test, just like in the closet analogy. On the other hand, if we use state machines, we delegate this complexity, and it also helps us decrease the number of unit tests that we have to write.

It is important to mention that state machines are not a silver bullet. They are not the one solution that solves all these problems. So know your tools, think about the problem that you're trying to solve, and decide if it makes sense to add a state machine to your solution. Be mindful about it and be unbiased in your decisions.

That's all I have. Thank you very much.