 Hi. I'm Will. I'm a developer at Carbon 5 here in our Santa Monica office. As my colleague Andrew said earlier, we're an Agile product development company and work with clients ranging from new startups to large enterprises who are looking to build a thing and get a little better at Agile while they're at it. We also often host the Elixir LA meetup in Santa Monica, so perhaps hopefully I'll see some of you there. So today we're going to talk about ants, swarms of ants. And because you paid to go to an Elixir conference, we'll talk about some Elixir stuff too, like processes and gen servers and registries and dynamic supervisors and contacts and types, but mostly ants. Now a great man once said, a person is smart, people are dumb, panicky, dangerous animals and you know it. Luckily we're not talking about people today, we're talking about ants, and they're just the opposite. An ant is dumb, apologies to any ants in the room. A colony is a smart, efficient, coordinated group. So how do they do it? Well, the answer in short is pheromones. Now ants are a diverse set of creatures, but one, here's a common strategy. Let's follow an ant named Alice. Alice, who like all worker ants is female, wanders the world in a semi-random way, searching for food. But wherever Alice goes, she's generally able to keep track of where she is. Scientists think that some ants can keep track of landmarks, but that others actually count their steps and know their location by dead reckoning. We know this because some scientists glued tiny stilts to the legs of desert ants, gave them food and let them go back to their colony. What they found was that the ants totally overshot the colony because their step and distance counts were off. They also found they had been gluing tiny stilts onto ants, which is a pretty awesome way to do science. Anyway, so once Alice comes across food, she picks up a piece and then she heads back home. And since she knows where she is, she's able to take a direct path back. When she's traveling with food, she leaves a pheromone trail behind her. For some ants, the mechanism is just the food weighs them down enough that their stinger drags across the ground. But however they do it, other ants are able to smell this pheromone trail. And so once Alice gets back to the colony, she can drop the food off and then head back out. So now we can follow Bobby, another maybe less talented ant. She follows the same steps as Alice, but happens to find food that's twice as far out. Still she takes it back, also leaving a pheromone trail behind. But there's an important caveat. Pheromone trails evaporate over time. So because Bobby's trips take twice as long as Alice's, Bobby's laying down half the pheromones that Alice is for a given hour of collecting. And because the pheromones evaporate over time, Bobby's trail will never get that strong. This is important for our last ant, Eve. It turns out there's another step in the ant algorithm that Alice and Bobby were following. If while wandering, a ant comes across a pheromone trail, they may decide to follow it to food, depending on how strong the trail is. So now we can see how this comes together. If in Eve's wandering, she comes across Bobby's weak trail, she may decide to pass it by. But when she sees Alice's strong trail, she's much more likely to get on it, follow it to food, and then bring it back, leaving her own pheromone trail on top. Pretty quickly, more and more ants make that same choice as the path is reinforced, till eventually the whole colony has focused up on this, the closest source of food. Once the food's gone, the pheromone trail evaporates, and then all the ants can go wandering again, perhaps now finding Bobby's slightly weaker trail and focusing on that. This is known as a recruitment strategy. It's a way for members of a swarm to tell each other, hey, I found something good, help me out. So that's more or less how at least some ants work, which is interesting, maybe, but as a developer, what is maybe more interesting is that what Alice and friends were doing sounds a little bit like code. In fact, it's pretty straightforward to translate our ants' instructions into some pseudocode like this. There are two main cases here, one where the ant doesn't have food and one where she does. In the no food case, if Alice sees food, she grabs it. If she sees a strong pheromone trail, she probably gets on it, and if she sees a weak one or nothing, then she carries on randomly walking. If Alice is carrying food on the other hand, then she just keeps going, depositing pheromones, and going towards home. When she gets home, she drops off the food and heads back out. So this is pretty algorithmic. As it turns out, we're not the first people to notice that ants seem sort of similar to computer programs. Back in 1992, a man named Marco Dorgio, who, as you can see, is into both ants and robots, which I can relate to, came up with what he called the ant system for his PhD thesis. This was a start of what's now called ant colony optimization, or ACO, which is a method of applying Alice's ant algorithm to tricky problems like the traveling salesman and NAPSAC problems. This is the general formula for ACO. There's a lot of Greek in here, but all it really means is, for a set of possible moves, the probability of picking one of them is the amount of pheromone deposited on that move to the power of some influence factor, which is 2.0 in the literature by default, times the desirability of the move, which in traveling salesman might be related to distance or in NAPSAC related to value. Also taken to a factor divided by the value of all the other possible moves. So for instance, in the traveling salesman problem, this is a classic hard problem in computer science. A traveling salesperson wants to visit a bunch of cities and do it in the least amount of time possible. So how do they choose which cities, which order? Well, it turns out there's no easy way to figure it out, but ACO provides a pretty good approach. With an colony optimization, you have a bunch of virtual ants randomly traverse a graph of cities, making sure to visit each one once. Afterwards, each ant drops a pheromone trail on its path with the strength of the trail corresponding to how short the trip was. Since the ants use the ACO algorithm to choose moves over multiple iterations, the ants coalesce onto the optimal solution. Pretty smart for some ants. So, ACO is an interesting technique. But if you want to do that kind of number crunching, Elixir is not your language. But a few months ago, I was reading about ACO, because that's how I roll on a Tuesday night, and it got me thinking about something else. An ant has a little bit of state, it knows its location, and if it has food, it can take a limited set of actions, like moving, depositing pheromones, and grabbing and dropping off food. And the ACO algorithm that determines the ant's next state is just a function of its current state, plus its surroundings and a little bit of randomness. That means ants are like actors, not that kind. But they do fit pretty well into the actor model that Erlang and Elixir use for concurrency, which got me wondering, can you simulate a forging ant colony using Elixir processes with GenServer? So the short answer is yes, it works, and we'll see that later on in the talk. But that's not really the point here. The point is that if you're like me, you spend a lot of your time writing CRUD applications, even in Elixir, and don't actually explicitly drop into OTP that often. When I'm trying to solve a problem, my instinct is to reach for a database table. So today I want to walk you through a sort of sufficiently complex example of how you might go about building a program where you're thinking in actors, supervisors, and data types instead. You might not be writing an ant simulation because you won't be able to compete with this one, but maybe you'll be inspired for your next problem. So in Phoenix 1.3, as we heard earlier today, introduce the concept of context as a way to organize your code. As a reminder, instead of putting all your logic in a flat hierarchy of schemas and controllers, 1.3 encourages you to pull your business logic out into separate cohesive bundles of structs, functions, and modules called context. So following 1.3 best practices will start with defining some context. Coming up with good context for your domain is definitely an art, and it's one that I personally find kind of challenging. But these are the three that I came up with for this project. Ants, worlds, and simulations. Ants are definitely their own thing. The ant's context will have logic for choosing a move based on locations it receives and keeping state stuff that ants know about. But outside of the ant's site line, there's a whole world. The world knows about locations of food, even if no ants do, and can report to an ant what's immediately around it. That should definitely be separate from the ant code, and ants ought to only be able to query data from the world using its defined API. So that's clearly another context. And then finally, outside of the simulated world, they're the mechanisms of actually running a simulation. Spinning things up, shutting things down, assigning IDs, sending messages to cause the ants to do things. So we can keep the things that we're actually simulating focused by taking the simulation itself out into its own context. I also ended up with sort of a shared context that holds a grab bag of stuff that was used across the app, which is maybe a code smell? Or is it a code pheromone? No, that's not good. Okay. So now that we've got our base context figured out, the next step is to set up our data types, the structs and other things that define the domain. Then we can figure out how the data will be interpreted and transformed as a simulation progresses. So a reasonable place to start might be to define our humble ant. Like we discussed earlier, the ant only really knows two facts, where they are in the world and if they're carrying food. So our ant model is equally simple. It knows its x, y coordinates and it has a food Boolean. You'll notice I also defined a type for our ant. Types obviously aren't required in Elixir, but declaring types for structs lets the static type checker dialyzer be a lot smarter about knowing, are you using them, and is also a nice bit of documentation. In the Elixir standard library as well as type functional languages like OCaml, it's idiomatic to name the main type in a module t. That way you can say that a function takes an ant.t like in this function and it's clear what you're talking about. So the few other types in this domain will focus on the world of the simulation. And how an ant will interact with it. So what does that world look like? Well, for this simulation, the world is pretty simple. It's a grid of tiles surrounded by impassable rocks. The world is mostly empty land, but it has a few brown food tiles for ants to find and a pink colony for them to start at and return the food to. So given this, how would we represent in types? Well, they're pretty much just the English descriptions. Land tiles can have pheromones on them. Food tiles can have food to be collected. The home tile has food that's been collected, and rocks don't have any particular data associated with them. For brevity, I've skipped the type declarations here, but pheromones float and food is an integer. So given those types, we can define a tile.t type, which can be any one of the four sorts of tile. This is a key concept in type functional programming, and it's called a discriminated union, or a tag union, destroy union, variant, or some type, depending on the language. But whatever you call it, the point is that a tile.t could be any sort of tile, but we can use pattern matching to figure out which one it is. Being able to pattern match out, the type is what makes a tag union different than a simple union type. So, for instance, because we know that we have a tile.t in this rating function, we can make some decision based on what type it is. Obviously, this kind of pattern matching on a thing is common and elixir, but it's nice to give the technique a name, or several in this case. And because I've declared the possible types of a tile, it's easy to verify that the rating function is a total function, in that it handles all possible inputs. A couple other types, points, and moves round out the main domain model. Both of these are defined as pairs of integers, but points are x, y coordinates on the grid, whereas moves are changes to x and y that go from negative one to one. So even though they're internally represented in the same way, they have different modules so that we have a clear place to put functions that operate in one of the other, we have types that say what an x or a y means in particular function, and if someday we want to change one of the data structures, we can do all that work in one place. So now that we've got some types, we can sketch out how the system will work. For the most part, I'm not going to dive into the actual implementation of the modules, but you might be interested in seeing some of the mechanics of managing a bunch of processes. So here's the supervision tree for this application. I want to be able to run multiple simulations at once, so the two top-level processes are a SIMID agent that keeps track of the IDs for the different running simulations, and then a simulation supervisor, which is a dynamic supervisor that can spin up many child simulation supervisors and restart them when they have errors. A quick note on dynamic supervisors. They've been mentioned a couple of times today. They're a new addition in Elixir 1.6 and replace the old simple one-for-one supervisors. Both kinds let you create many trial processes at runtime rather than supervising a static list of processes. The only difference is that the new dynamic supervisors have a little bit nicer syntax, and they can supervise multiple different types of children. But whatever sort of dynamic supervisor you use, there's always a problem. There's naming. To work with an Erlang process, you need to know its name or PID to send it messages. That's easy for processes that you only have one instance of, since you can give a global name, like often the name of the module it's defined in, like with this simulations supervisor. But for a process that you're going to have a bunch of, like an ant, that doesn't work. Instead, we'll need a registry and a via-tubble. So you can do a lot of different things with registries, but the one that matters here is naming processes. Using the registry, you can construct a via-tubble, which is a data structure that can be used to uniquely identify a process. That way, even if a process is dynamically started and then later restarted with a new PID, you can still pass messages to it. A via-tubble has this structure. It's a three-tubble of the Atomvia, the registry module, which is Thing and Elixir, and then some tuple that starts with the name of a module you set up as a registry, and then some uniquely identifying information. So with that, we can start simulation supervisors. Each one of these controls the supervisors for a given simulation, super well-named, so they can be taken down as one unit. Simulation supervisors can be referred to by their via-tubble, which includes the SimRegistry, that's the registry for this app, the simulations ID, and then an Atom saying, hey, this is a simulation process. For free simulation, we have a tile supervisor that supervises a bunch of tiles, 400 for a 20 by 20 grid. Each tile is a gen server that holds the tile.t data in state and can be told to add pheromones or remove food from its state. The via-tubble for tiles includes the x, y coordinates of the tile. That's important because it means that for an ant, at a certain set of coordinates, it can easily look about the tiles around it. For an ant at 1, 1, all the tiles from 0, 0 to 2, 2. That kind of lookup would be really inefficient if the tiles were kept in a big list. So we also have a process for each ant, since that was sort of the point of the exercise. But since there's nothing identifying about an ant that's coordinates change, there's no x, y that you can send a message to, it needs a little anti-D agent to give a unique identifier for each ant and then be able to tell you which IDs you can loop through to send a message to all of them. The ant gen servers, like tiles, keep an ant.t in memory. They can take external commands to move themselves or to deposit pheromones to decide where to move each ant to ask the world's context for its surroundings, which is just a list of tiles. The ant then does a weighted random selection from that list using the ACO algorithm and goes in that direction. Since the tile gen server logic is in the world's context, the ants are able to work with tiles without needing to know how they're persisted. So if we wanted to change our mind and decide to store tiles in a map or a database, because that's in the world's context, the ant's context wouldn't have to change. So those are the highlights. With a simple tree, we're able to spin up hundreds of ants and tiles, have them work together in a useful way. So now that we've gone through that, let's actually see this thing in action. OK, it worked. So this is a gift, but of the running simulation, that you can see that the ants at first spread out into the world, quickly found the first one, and you can see the yellow trails as they lay down pheromones and focus up, getting further and further ones away. For the each turn, it takes about six milliseconds to run, and the way it works is the app is waiting every 50 milliseconds, is requesting another turn from the server, it's doing all these calculations, and then sending back some state that's rendering. Since we heard about Elm earlier, I'll mention that this is a Reason React app, which is another super functional front-end language, but a topic for a whole nother conference. Yeah, so I think that's kind of neat. I don't know. All right, so what did we learn? First of all, this is how Skynet starts. And OTP supervisors already know how to terminate it. We've learned that dynamic supervisors, registries, and via tuples are useful for handling large numbers of dynamic processes. We've learned that answer pretty cool, good note. We've learned keep your business logic in your context. We didn't even look at the Phoenix web code in this talk because there's nothing to see. The turn end point just immediately calls out to the simulations context, which orchestrates telling all the other contexts to do their work, and then kicks it back to the client. And it's useful to declare the types of your data structures, including structs, and you can name the main type in a module, t. And finally, you can do a lot with processes. The funny thing about this program is even though we're running all these things, it's essentially synchronous. Every turn we wait for all the ants to move, then we have them place their pheromones, then gather everything up for display, then respond to the client. So why go through all the trouble with gen servers? One potential benefit would be concurrency. If you can, on a multi-core processor, you can have ants all working at once, maybe you'll get some performance benefits. In this case, things actually got faster when I turned the concurrency down a little bit. But a more important gain is fault tolerance. Without processes, if one of these ants had a bug, no pun intended, they could corrupt the state of the entire simulation and bring it down. But with every ant in its own process, if an ant manages to wander off the edge of the world, as they did for a while, the supervisor is able to kill that ant and restart it back at home. If we needed to keep this thing up for a long time, that would be really useful. Finally, it's a nice way to think. I'm simulating independently acting ants, so it's nice to be able to model my code that way. In some ways, this almost feels object-oriented. Essentially, we've got a bunch of instances of ant and tile classes, each with its own state and then methods that you can call to update them. So we get some of the benefits of OOP, nice models, but without the drawbacks of mutable state and crazy inheritance trees and all the other reasons that we are all at an elixir conference. So going all in on processes isn't going to be the right fit for every project, but I think it's a good fit for more projects than we realize. That's it. If you want to crib any code or play with the ants yourself, that's all in the repo here. Thanks.