Hi everyone, my name's Chris, and I'm here today to talk to you all about selling food with Elixir. That's a bit of a weird title, I know, but it'll make sense once we get into the talk and you see the meat of what we're covering. I work at a company called Made By Many; we're based in New York and we have an office in London as well. But I'm not actually here to talk about Made By Many. I'm here to talk about a project we've done for a client of ours called Carver Grill. Carver Grill is a fast casual restaurant chain based out of Washington DC (no one says it the other way around, sorry, Britishism). "Fast casual restaurant chain" is a lot of words, but you can think of it basically as a Chipotle for Mediterranean food, and if anyone here lives in DC, I hope you've been there and I hope you're into it. Did I hear a woo? Oh, there's a woo. Okay, now we're rolling. So this is the application we ended up building, which is basically a food ordering app. You can place a food order, build up these bowls, add to the cart, and check out. The whole point of the app is that you're placing an order for lunchtime and doing a pickup at the store. You can see it's a really great looking app. There's obviously a lot of JavaScript, but don't worry, I'm not going to talk to you about JavaScript today, so I'm sure everyone's happy about that. Basically, the first thing the system we built for them needed to do was process and send these orders to a point of sale system.
And you have to remember those point of sale systems are based in a physical store somewhere across the US. The second thing we had to do was throttle the volume of orders going to a store. There's nothing kitchens hate more than having more orders than they can process in a given window, so a lot of our requirements around this system were all about throttling those orders and making sure we're not delivering too many at one time. And the third thing we had to do was build in resilience against these stores being down. Because it's a point of sale system sitting in a store, those stores have really flaky internet access; we're talking DSL from 10 years ago. Something we had to bear in mind is that you still want your lunch, and you don't really care whether that order went through behind the scenes, so we have to do a good job of making sure it gets to a store. So today I'm going to cover three points. We're going to run through some of the application design we used to design the system, we're going to talk about some stories from production, and then I'm going to wrap up by touching on what we'd do differently if we did it again. So the first thing is the application design and the design of that system. Before I get into all the Elixir parts, and I'm sure there are a lot of post-Rails or current-Rails developers in the room, I wanted to walk through how we might have approached this in Rails previously. You're all probably familiar with a step-by-step approach like this: do a bit of rails new, bundle install Sidekiq because we want to send some things in the background, then start generating our models and just start coding. And you know, that's great.
And we've all built great systems like that, I'm sure. But if we build a system like that, what we're really doing is putting all of our state in the database. We're thinking in terms where all of the state of this application lives in this database and we're always using it to read and write, and in an application like that, all roads lead to an ActiveRecord object. And you know what? We could do this again in Phoenix, right? We could do exactly the same thing. We could mix phoenix.new carver_grill, install a background job library (there isn't a nice mix command we can run here, because you obviously have to add it to your deps), and start coding again. And that's okay, and I'm sure some of us would have success doing that. But what we're really doing there is making this mistake of appropriating the shape of the previous technology as the content of the new technology. We're taking all of our old baggage with us into this new world. We can actually approach this in a very different way, and I guess that's what I want to talk to you about today. So, some Elixir design principles we used to design the system. There's a wealth of information in the community, and really I'm tapping a lot of what's already out there to come up with these design principles. The first one is: Phoenix is not your application. And I think Chris McCord (yeah, we've got another whoop, here we go) did a fantastic job today of reinforcing this point. The changes coming in Phoenix 1.3 really mean you're separating out this idea: Phoenix is just a web interface, right? Phoenix is not your system. Your system is the thing it needs to do.
So in our case, it needs to send food orders to a store, and it needs to track those orders, and things like that. But if we start thinking about Phoenix being the entire application, we're constraining ourselves into that world of MVC. And that's an okay world to be in, but I think we can do a lot more in Elixir. The second principle is embracing state outside of the database. As I alluded to earlier, in Rails we're really thinking that all of our state goes through the database. But in Elixir we have mechanisms to store state in processes, keep it in memory, and be a bit more stateful in our services. State doesn't have to live only in a database; we have many other places to put it. The third principle is: if it's concurrent, extract it into an OTP application. And there's an inverse of this too, which is: don't go overboard extracting things into OTP applications, at least not at the beginning. What I mean by the concurrent aspect here is that all those things you'd probably use workers for in your Rails app are probably really good candidates for something that could be extracted into an application of its own. And the fourth design principle is: don't just let it crash. If you've come into this world, you've probably heard the term "let it crash" bandied around, and it's fantastic for system design, with all the guarantees that supervision trees and things like that can bring. But for us especially, we had to think about failure and what happens when these things don't work. We have to think about the expected failure cases and handle those.
And that's really some of what I'm going to get into as well. So given all of that, the system design we ended up with looks a bit like this. We have basically four main components in the application. We have a scheduler whose job is to schedule and send orders to the point of sale system. We have our order tracker (you can probably guess what that one does): it tracks the state of orders from the store. We have our store availability managers that keep track of capacity so we can limit the number of orders going to a store. And then we have our web part as well. There's a bit more to the system; I'm simplifying it here for the sake of brevity in these slides. The first thing I wanted to dig into today is the order scheduler part of the system. What we do with the order scheduler is just-in-time delivery of an order to a store. The entire job of this application is to take an order and try to send it to a store. The point of sale provider we're using doesn't actually have a means to queue up those orders, so we effectively built our own queuing mechanism. We batch up those orders 15 minutes ahead of time and then send them to a store. So if your order is for 12 o'clock, we're going to try to send it 15 minutes before, so the store has enough time to build it. You saw from the video earlier how complex it is to build one of these bowls; now imagine that in the real world, where they have to go along a conveyor belt. They need a bit of time to make these orders. And the system has to deal with stores being down and orders being delayed in sending to a store, without having an impact on any of the other stores. So we want to do these things concurrently, but we want to isolate their failures.
So the actual supervision tree structure looks a bit like this. The rounded boxes represent supervisors, and the circles represent the workers and GenServers in the system. It's actually not too complex a supervision tree; it's quite simple, really. But you'll notice we have two trees going on here. The reason we do that is that we're setting boundaries around these stores. We're saying failures can happen, but they're going to happen on a per-store basis; we're creating nice boundaries around the different stores. Say store one represents a store in DC and store two is somewhere in LA. If we stop sending orders to store one in DC, we can still be sending orders to that store in LA. So how does this work? How do we actually schedule these orders to get sent to a point of sale? Well, we don't use any cron jobs or anything like that in the system; we use the building blocks that Erlang and Elixir give us. Elixir has this great mechanism to send yourself a message after a given time, and that's the Process.send_after call you can see here. In that call, we're saying: send myself a process message and wait a certain amount of time. And that time is in milliseconds, not seconds, just in case you mess that one up like I did at first. Then we process our orders, and then we enqueue ourselves to do it again; it's a recursive function where we keep calling ourselves. And what that actually looks like: we have our store supervision tree, and a store manager, which is just a GenServer that sends itself a message like you just saw. What we do there is request some data from the database, and we get these orders back.
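The self-scheduling loop described above can be sketched like this. This is a minimal illustration, not the production code: the module name, the :process message, and the process_orders/1 stub are all assumptions, and the interval is just an example value.

```elixir
defmodule StoreManager do
  use GenServer

  # Tick interval in milliseconds: Process.send_after takes ms, not seconds
  @interval 15_000

  def start_link(store_id) do
    GenServer.start_link(__MODULE__, store_id)
  end

  @impl true
  def init(store_id) do
    schedule_work()
    {:ok, %{store_id: store_id}}
  end

  @impl true
  def handle_info(:process, state) do
    process_orders(state.store_id)
    # Enqueue ourselves again: this is what makes the loop recursive
    schedule_work()
    {:noreply, state}
  end

  defp schedule_work do
    Process.send_after(self(), :process, @interval)
  end

  # Stub: the real version would fetch due orders from the database
  # and spawn a worker per order
  defp process_orders(_store_id), do: :ok
end
```

Because the timer is re-armed only after the work finishes, slow ticks delay the next tick instead of piling up, which is usually what you want for this kind of batch send.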
And then what we're basically doing is taking those orders back from the database and creating, effectively, a worker per order that we're sending out. You can see here this is us creating all these different workers, and each one sends to the store independently. The idea is that each one of these workers can fail without having an impact on any other worker in the system. And at the same time, the store could fail, all of those orders could stop sending, and we'd still have no impact on any orders going out via another store's supervision tree. So, talking about failure: what happens when failure occurs in the system? Like I said, we're expecting failure and designing for it. We want to make sure you get your lunch, right? That's an important thing. You're going to be pretty annoyed if you turn up at the store and there's no lunch waiting for you. So we actually use a library called GenRetry, by Pete Gamache and the folks at Appcues, that handles a lot of this retry logic for us. GenRetry will do things like exponential backoff, it will handle jitter on your retries so you don't just retry everything at the same moment in time, and you can set limits on how many times you want to retry sending something. You might think that all sounds really complicated, but in actual fact the code is literally like this. We have a function that may or may not blow up; it might succeed or it might throw. Then we just pass GenRetry that function and give it some options. You can see we're saying we want to delay by three seconds each time, and that will be exponential: it starts off with a delay of three seconds and then we apply an exponential curve to that backoff.
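To make the backoff behaviour concrete without pulling in the dependency, here's the arithmetic that an exponential-backoff-with-jitter scheme applies, as a plain-Elixir sketch. This is my own illustration of the idea, not GenRetry's internals; the actual call in the app is roughly GenRetry.retry(fun, delay: 3_000, jitter: 0.2) per GenRetry's documented options.

```elixir
defmodule Backoff do
  # attempt: 1-based retry count
  # base_delay: initial delay in milliseconds
  # jitter: fraction of the delay to randomize by (e.g. 0.2)
  def delay_for(attempt, base_delay \\ 3_000, jitter \\ 0.2) do
    # Exponential curve: 3s, 6s, 12s, 24s, ...
    exp = base_delay * :math.pow(2, attempt - 1)

    # Spread retries out by up to +/- (jitter * exp) so that many
    # failed orders don't all retry at the same instant
    noise = (:rand.uniform() * 2 - 1) * jitter * exp

    round(exp + noise)
  end
end
```

With jitter set to 0 the delays are exactly 3000, 6000, 12000 ms for attempts 1, 2, 3; the jitter term then smears each of those by a bounded random amount.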
And then we're applying a bit of jitter here as well; we're saying add 0.2 jitter to this retrying. Honestly, we've been running this in production and it's been fantastic. So if you have similar retry needs, definitely check out GenRetry. Very much recommended. Now, without GenRetry, it's actually quite hard to reason about exponential backoff in the supervision world. Under the hood, GenRetry uses a library called DelayedOtp, which has a supervisor that will allow a child to "die a slow death", which sounds really brutal, but basically that's what adds the exponential backoff. GenRetry wraps all of that up: under the hood it's actually a supervision tree with its own supervisor and its own workers. I'll answer more questions at the end, though, if you have them. So the next part of the system I want to dig into is the order tracker. You can probably guess, like I said earlier, what the order tracker does: every time we send an order to the store, we want to update you once that order has actually been processed, and send you push notifications and things like that. Again, the order tracker supervision tree is very similar to what we had before; you'll see this idea used time and time again in the system, where we're divvying things up by store. In this case we have a task supervisor here, and I'll get into why we're doing that in a second, and then we have some worker processes under that as well. Each of these store managers fetches a feed of order changes from a store. That feed basically says what's happened to each order and at what time it happened. But we have to fetch a big feed, we have to give it the point in time we last fetched from, and it can have multiple pages.
So these store managers basically ingest that feed, read all the pages, and then process each one of those events in turn. What we do here is basically map over all of those events and use a task supervisor to start a child for each. The reason we use a task supervisor is that if we just used Task.async here and one of those tasks blew up, it would take down this calling process, and we wouldn't be able to process anything else. By using a Task.Supervisor in our supervision tree, we're making sure that if one of these things dies, it's a supervised child, and it's not going to take down the calling process. We keep track of the last time we successfully fetched that feed in the state of this process. It's just a GenServer, right? We basically have a timestamp that says this was the last time we could fetch it. But the problem is: what happens when this process goes down? We can't guarantee this thing isn't going to go down, so what happens on restart? We want to restart from a last known good state, so we want to persist that timestamp somewhere with more permanent storage, outside of a process's state. To do that, if you haven't seen this before, GenServers have a really useful terminate callback. Terminate allows you to perform any cleanup you need before the process goes away, so we can use it here to get at that last date and persist it to the database. It's a great place to do any cleanup you have before the process dies. One caveat: terminate is only reliably called when the process is trapping exits or shuts down cleanly; a brutal kill will skip it, so don't treat it as an absolute guarantee.
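Both ideas, supervised per-event tasks and the terminate callback, fit in one small sketch. The module name, the OrderTracker.TaskSupervisor name, the message shape, and the persist_last_fetch/1 stub are all illustrative assumptions, not the production code.

```elixir
defmodule OrderTracker.StoreManager do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(_opts) do
    # Trap exits so terminate/2 runs on shutdown; note terminate is
    # NOT called if the process is brutally killed
    Process.flag(:trap_exit, true)
    {:ok, %{last_fetched_at: nil}}
  end

  @impl true
  def handle_cast({:process_feed, events, fetched_at}, state) do
    # Supervised children: a crash while handling one event kills
    # that task only, not this calling process
    Enum.each(events, fn event ->
      Task.Supervisor.start_child(OrderTracker.TaskSupervisor, fn ->
        handle_event(event)
      end)
    end)

    {:noreply, %{state | last_fetched_at: fetched_at}}
  end

  @impl true
  def terminate(_reason, state) do
    # Last chance to persist the high-water mark outside process state
    persist_last_fetch(state.last_fetched_at)
    :ok
  end

  # Stubs: the real versions would update the order and write to the DB
  defp handle_event(_event), do: :ok
  defp persist_last_fetch(_timestamp), do: :ok
end
```

In a real tree you'd start {Task.Supervisor, name: OrderTracker.TaskSupervisor} as a sibling of this GenServer so the children are supervised independently of the caller.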
And the last part of the system I want to dig into is probably the most complex part of what we've built, and the most complex application in this tree: the store availability system. As I said earlier, we have a certain number of time slots you can pick from to place your order, and the store availability system keeps track of those time slots and how much capacity a store has. By capacity we basically mean: how many orders can I process in a given window of time? For us, that window is 15 minutes, so we're saying how many orders can we process in that 15-minute window. Again, you can see there's very much a recurring theme here: our supervision tree is set up by store, with each store as its own tree. We have this funky thing called an EtsTableManager, which I'll talk about in a minute, and we're backed by an ETS table here. Let me explain why we do that. There's really high demand for these time slots during the lunch rush, because everyone clearly is going crazy for Carver food. They really want it, and everyone's pretty frantic trying to grab that elusive 1pm slot to pick up their lunch. We could use the database to get all this information, but we'd actually have to do quite a lot of queries to aggregate it, and making that call is fairly expensive. At the end of the day, all we're really doing is keeping track of some integers, right? We want to say: there's a time slot, this is how many orders have happened, and this is how many are left. So what we can do is use ETS to store that capacity. If you haven't seen ETS before, it's basically like a Redis for Erlang. It's built into Erlang itself; it stands for Erlang Term Storage. It's a key-value store you can use with no dependencies in your application.
So for us, ETS was a really good candidate for storing this data. The data looks like this: it's just a tuple, and in ETS you can have any Erlang term as the key for your data. For us, that key is actually the time slot. So we say that at 11:45 we have a capacity of 15; we can process 15 orders at 11:45. (This is made-up data, by the way, so don't try to hack the system or anything.) Then we have the pending and confirmed order counts: the number of orders that have been confirmed in the system, and the ones that are pending. I'll explain that pending state in a moment. So this is all very well and good; we have lots of these data structures representing capacity. But how do we retrieve them back out of ETS? In that case we actually use a function called :ets.tab2list, which basically dumps out the state of your table. We're doing this in a GenServer callback here, saying: just give me the entire availability matrix for the store, all of that data you just stored. It's keyed by the datetime, so we can order by that as well. This becomes a really, really fast way to say "give me all this data", and then we can reply back to the client with it. Now, when we want to actually update the availability of a time slot: you saw we effectively had a counter, and ETS has a great function built in, :ets.update_counter, for doing atomic counter updates, so we can make sure our reads and writes don't race. The syntax is a little weird, I know; it's Erlang syntax and it takes a while to get used to. But basically what we're saying here is: for a given datetime key, update the fourth element in the tuple and increment that value by one, leaving the other elements alone.
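Putting those pieces together in a minimal, standalone form (the table name, tuple shape, and slot times here are illustrative, not the production schema):

```elixir
# Value tuple is {slot, capacity, pending, confirmed}; the key is the
# first element, the time slot itself
table = :ets.new(:availability, [:ordered_set, :public])

:ets.insert(table, {~N[2017-03-01 11:45:00], 15, 0, 0})
:ets.insert(table, {~N[2017-03-01 12:00:00], 15, 0, 0})

# Dump the whole availability matrix; :ordered_set keeps rows sorted
# by key, so this comes back in time-slot order
matrix = :ets.tab2list(table)

# Atomically bump the confirmed count for the 11:45 slot.
# {4, 1} means: add 1 to the fourth element of the stored tuple.
:ets.update_counter(table, ~N[2017-03-01 11:45:00], {4, 1})
```

Because update_counter is atomic inside ETS, two checkouts hitting the same slot at the same moment can't read-modify-write over each other the way two separate lookup/insert calls could.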
At least, that's my understanding of why you have to spell it out that way, unless someone here knows differently. And then we can just reply; in this case we reply :ok with that. Okay, so I talked a bit about pending orders earlier, and about how there's high demand for these time slots during the lunch rush. How do we stop that impacting the user? Everyone's going for that one o'clock time slot, and it's basically first come, first served. So we need to make sure that if you've selected a time slot but you've spent ages putting in your card details, you're not going to miss out just because someone else got to the checkout button faster than you did. Just as a refresher, this is what our checkout looks like: you have all these time slots along here, and then you put in your payment details and click checkout. The way we model this is with a countdown; you've probably all used Ticketmaster or something like it before. Once you click on one of those time slots, we start a countdown that says you've got seven minutes to confirm that order, and we hold that time slot for you. So when we report capacity, we include those pending time slots as well. The code to do that looks a little something like this. We say: hold that time slot for a given store, and we pass an order ID just as a reference back. What happens in there is that we actually model all of this in a process. Processes are great for storing state, but we can also be actor-based here: we can say that one process represents one held time slot in the system. So for your held time slot, we create a process, and then we monitor that held-time-slot process. By monitoring it, what we get back is a reference to that held time slot.
And then from this point, we can store that reference in the store manager's GenServer state. In that held time slot process we start a timer, using Process.send_after, that says: after X amount of time, let's say five minutes, kill this process. Then we listen for the result of that termination back on the store manager. So we're modeling this whole idea of you placing an order and holding a time slot in this one process. Eventually that process terminates, after five minutes or so, and then we can listen for the :DOWN event on the store manager: because we're monitoring the process, we get notified when it dies. The callback looks a bit like this. You can see we handle a :DOWN message, implemented on our store manager; any time we're monitoring a process, we can receive these :DOWN messages. Then we can basically say: hey, that time slot you held, we can now remove the hold and decrement the pending order count as well. Now, you might be thinking: this is all well and good, but you're storing all of this state in memory, right? An ETS table goes away when its owning process dies, so if our server restarts, we lose the whole availability schedule we had in memory at that moment. What we do to get around that is read the state from the database on application start and recreate all of it: which orders were held, and the capacity of a store. This is literally ripped straight from the code base: we start one of these things in the system, read in all of this state, and then start the store manager with that availability matrix already pre-built.
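The hold-and-expire mechanics boil down to a few lines. This is a standalone sketch with a made-up module name and a shortened hold window; the real version keeps the monitor ref and order ID in the store manager's state rather than doing a blocking receive.

```elixir
defmodule HeldTimeSlot do
  # One process per held slot: it waits out the hold window, then
  # exits, which triggers a :DOWN message to whoever monitors it
  def start(hold_ms) do
    spawn(fn ->
      Process.send_after(self(), :expire, hold_ms)

      receive do
        :expire -> exit(:hold_expired)
      end
    end)
  end
end

# In the store manager: hold a slot, monitor it, react when it lapses
pid = HeldTimeSlot.start(100)
ref = Process.monitor(pid)

result =
  receive do
    {:DOWN, ^ref, :process, ^pid, reason} ->
      # Here we'd remove the hold and decrement the pending count in ETS
      reason
  after
    1_000 -> :timeout
  end
```

If the user confirms before the timer fires, the real system just stops the held-slot process early (and converts pending to confirmed); the :DOWN path only has to handle "this hold lapsed".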
So this will do a bunch of database lookups and things, but really we're just using the database as a bootstrap mechanism to get that data into the system, into this process. One other pro tip, which you probably know if you've worked with ETS before: like I said, if the process that owns an ETS table goes down, you kill the table with it and lose everything. What you can do here is use a library called Immortal, which has a thing called an EtsTableManager. When you start up your application, the EtsTableManager creates the ETS table for you and then hands the table off to your process. The EtsTableManager then listens for :DOWN events on the process that was using the table, so when that process goes down, the EtsTableManager takes the table back; and when the new process comes back online, because it's in a supervision tree and everything magically works again, it hands the table back to it. Honestly, that sounded really complicated, but it's really simple to get up and running, so definitely check it out if you're going to be using ETS. Okay, so that walks through a lot of our application design. What I want to share with you next are some stories from production. Application design is all well and good, but when you run these things in the real world, things happen, right? The first thing I wanted to talk about was turning down the concurrency. This is an actual email I got from our API provider saying, hey, you're making too many requests. In Elixir you think, oh, this is great, this is awesome, but there are always going to be limits on the concurrent requests you can make, and usually it's going to be someone else that's the bottleneck.
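That table-manager pattern is built on a primitive ETS already provides: the "heir" option plus give_away. The following is a sketch of that underlying mechanism, not Immortal's API; the table name and transfer data are made up.

```elixir
# The manager process creates the table and names itself the heir...
manager = self()
table = :ets.new(:availability, [:public, {:heir, manager, :availability_table}])

# ...then hands ownership to the worker that actually uses it
worker =
  spawn(fn ->
    receive do
      # give_away delivers this message to the new owner
      {:"ETS-TRANSFER", tab, _from, _data} ->
        :ets.insert(tab, {:slot, 15})
        exit(:boom)
    end
  end)

:ets.give_away(table, worker, :availability_table)

# When the worker dies, ownership returns to the heir, data intact
received =
  receive do
    {:"ETS-TRANSFER", ^table, ^worker, :availability_table} ->
      :ets.lookup(table, :slot)
  after
    1_000 -> :timeout
  end
```

Immortal wires this into a supervision tree for you, so the table survives a crash of the worker and gets handed to the restarted replacement.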
Or that bottleneck might be something in your own system. But basically, to resolve this, we just had to stop making so many requests. The next one I want to talk about is sending orders twice. I talked about how kitchens get really annoyed when they have too many orders; they get even more annoyed when you send the same order multiple times. When we first launched the app, we were seeing all these duplicate orders come in, and I was like: we can't do that, that can't happen, we built this amazing system, there's no way we could be sending duplicates. I probably had a bit too much hubris about it, but what we ended up finding was that there were actually multiple versions of our order scheduler running per node. And this is a really trivial thing to fix: you just name your processes. We had one of those store managers, which I thought was already unique per node, but something else was starting another one, and that led to all these duplicate orders being sent, at the exact same moment in time as well, because concurrency, awesome. All you have to do is name your process: if you name it, it's guaranteed to be unique on that node. If you want global uniqueness, you can use something like the :global module, which people were talking about earlier. So that's literally the fix we did: we gave it a name and passed that name to it on start. The second thing here is that Erlang has all these really fantastic introspection tools, but unfortunately we deploy on Heroku, and Heroku doesn't give you runtime access to those tools, because every time you do heroku run with a bash session or something, you're actually spinning up a new dyno.
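The fix really is just a name. With a local name, a second start attempt on the same node fails fast instead of silently double-scheduling orders (the module and naming scheme here are hypothetical):

```elixir
defmodule Scheduler.StoreManager do
  use GenServer

  def start_link(store_id) do
    # Registering under a per-store name guarantees node-local uniqueness
    GenServer.start_link(__MODULE__, store_id, name: :"store_manager_#{store_id}")
  end

  @impl true
  def init(store_id), do: {:ok, %{store_id: store_id}}
end

{:ok, pid} = Scheduler.StoreManager.start_link("dc-1")

# A duplicate start is rejected with the pid of the existing process
{:error, {:already_started, ^pid}} = Scheduler.StoreManager.start_link("dc-1")
```

For cluster-wide uniqueness you'd register via {:global, name} (or a registry) instead of a plain atom, at the cost of coordination between nodes.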
So there's no way to introspect the state of that system at runtime on Heroku, which made debugging this really hard. I basically had to do it locally, but we got there in the end. The third one I want to talk about is a really annoying issue with the point of sale API provider we used. We were sending these orders to the store and the provider was saying: nope, that order didn't go through. But then we'd see the order still arrive at the kitchen, which was really confusing. We had this wonderful system of retries I showed you earlier, with GenRetry in place: if their API says it didn't go through, we retry and send the order again. But what ended up happening was we had to completely turn that system off, because our API provider wasn't actually atomic in the way they processed their orders, so we couldn't guarantee whether an order was or wasn't there. Yeah, it wasn't a great API, basically. I think the lesson learned here is that your failure model is only as good as the API you're calling or the system you're interoperating with. We can design these great systems, but we have to think about the limitations of the third parties and the other things we're calling too. The fourth and final lesson learned is this request timeout error. We were seeing it when making requests to our loyalty provider, which we used to take payment. That's quite a big deal; you don't want to mess up the payment part. We were seeing it intermittently, and it was really difficult to debug: I'd see it in the logs, try to recreate it locally, and couldn't. So I assumed it was the third-party API provider that had the issue with these request timeouts. But in actual fact, we use an HTTP library called HTTPotion.
I'm sure a bunch of you in here are also using a library like that, or HTTPoison. HTTPotion, specifically, uses an Erlang library called ibrowse under the hood, and ibrowse pools connections per host, where the host is just the address string of the API you're calling. By default it allows 10 sessions per host, each with a pipeline of 10 requests queued behind it, so if you have a lot of slow-running requests, what ends up happening is you see these request timeouts. The fix is kind of trivial: you just say, hey, for this host, set a much bigger max_sessions. Honestly, remember this if you're using this library and designing around it. But really the better solve here is thinking about pools: pools of connections that you can use. Confusingly, there are two very similarly named libraries here, but HTTPoison actually relies on Hackney under the hood, and Hackney has a great way to say: for this host, create a pool of HTTP connections. Then we could say we're only going to use that pool within the very defined boundaries we have in our system. I think the lesson learned here is really understanding the process design and the bottlenecks of the libraries you're using, not just the process design of your own system. We're all interoperating with a lot of other libraries, and probably making use of great Elixir or Erlang libraries under the hood. Take a minute to look at what the supervision tree looks like for those libraries, and make sure you're not introducing some huge bottleneck into your system that you didn't know about ahead of time. So lastly: doing it again.
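For reference, the two approaches look roughly like this: raising ibrowse's per-host ceiling, versus giving HTTPoison/hackney a dedicated, bounded pool. The host, pool name, and limits are made up, and the option names are from my reading of those libraries' docs, so double-check them against the versions you're running.

```elixir
# ibrowse (under HTTPotion): raise the per-host session ceiling directly.
# Defaults are 10 sessions with a pipeline of 10 per connection.
:ibrowse.set_max_sessions(~c"api.example-pos.com", 443, 100)

# hackney (under HTTPoison): declare a named, bounded pool up front...
:ok = :hackney_pool.start_pool(:pos_pool, timeout: 15_000, max_connections: 50)

# ...and opt into it per request, so POS traffic gets its own boundary
# and can't starve requests to other services
HTTPoison.get("https://api.example-pos.com/orders", [], hackney: [pool: :pos_pool])
```

The pool approach is nicer precisely because the limit lives in one named place you chose, rather than a library default you discover in production.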
So I think there's three things here. The first of which is: feed work, don't read work. What I mean by this is you saw earlier that we have lots of things connecting to the database and pulling from that database. Now, you know, that's fine, but what we've done there is introduce a dependency on the database in that application. So what that meant for us was we literally had to extract the database into its own OTP application and use that as a dependency of these other applications in our umbrella app. Whereas another way that we could model this would be using something like GenStage to feed work to these systems, and then they process that work there. The second one is: start with an umbrella app. We didn't actually do this. We literally did mix phoenix.new app and then we put everything in lib. And that was fine, it was totally fine. But I think next time, and especially seeing what Phoenix 1.3 has coming, with the ability to structure things as an umbrella app from day one, that's awesome, and we could have definitely made use of that when we built this. And the third one, this might be controversial: yeah, maybe don't use Heroku. Like, Heroku is fantastic. It lets you get up and running really fast. But you don't have the ability to do OTP releases, which means you don't have these great introspection tools. And also on Heroku you can't do anything multi-node, right? Heroku doesn't allow access to EPMD, the Erlang port mapper daemon, so you basically can't do all of the really cool node connection stuff, and you have to use something like Redis to be the middleman there. So, I don't know. Obviously there's a lot of extra complexity if you're not using Heroku. But approaching this next time, I think we'd look at using something like EC2, or we're actually looking at Docker quite a lot as well.
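As a rough sketch of that "feed work, don't read work" idea, a GenStage pipeline might look something like this; the module names and the Orders/PointOfSale functions are hypothetical stand-ins, not code from the real app:

```elixir
defmodule OrderProducer do
  use GenStage

  def start_link(_), do: GenStage.start_link(__MODULE__, :ok, name: __MODULE__)

  def init(:ok), do: {:producer, :no_state}

  # The database is only touched here: demand flows up from consumers,
  # and orders flow back down, so nothing else needs a DB dependency.
  def handle_demand(demand, state) do
    orders = Orders.take_pending(demand) # hypothetical repo call
    {:noreply, orders, state}
  end
end

defmodule OrderSender do
  use GenStage

  def start_link(_), do: GenStage.start_link(__MODULE__, :ok)

  def init(:ok) do
    # max_demand caps how many orders are in flight at once, which also
    # gives you the throttling behaviour for free.
    {:consumer, :no_state, subscribe_to: [{OrderProducer, max_demand: 10}]}
  end

  def handle_events(orders, _from, state) do
    Enum.each(orders, &PointOfSale.send_order/1) # hypothetical POS client
    {:noreply, [], state}
  end
end
```

The nice property is that back-pressure is built in: a slow point of sale just means the consumer demands less, rather than work piling up in the database pollers.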
So in conclusion, I would definitely use Elixir again. I just want to make this point. Elixir was really, really well suited to this job that we had: dealing with failures, lots of concurrent work going on. And honestly, the programming model was really simple. We ramped up members of our team to be writing Elixir as well, and some of them are here today. It's been absolutely rock solid in production; we've had very, very few issues with it so far. And the performance is basically dreamy. Everyone talks about it, but it's so awesome to think, oh, we don't have to put caching in front of absolutely everything to get below a 200 millisecond response time. And honestly, working with it every day has just been fantastic. I've spent the last six months of my life basically building this thing, writing Elixir every day, and some JavaScript, but let's not talk about that. And it's been awesome. So yeah, thank you so much for coming, and if you have questions, I'd love to take them. Oh, and also, sorry, just an awkward apology there: Carver Grill are actually hiring Elixir engineers in DC. If anyone's interested in doing Elixir full time, get in touch with me, come grab me, or check out CarverGrill.com. And yeah, thank you. Does anyone have any questions?

So when you're using ETS tables, they exist per process, and I'm assuming you're running multi-node, multi-dyno for capacity and persistence and all that stuff. How do you handle synchronizing to keep the ETS tables between nodes in sync?

Yeah, it's a great question, and I can reveal the dirty secret: this is on one node right now. I was literally waiting for that to come up. So yeah, we basically don't need multi-node right now. It's barely making any use of resources; it's on a 2x dyno on Heroku, which is 1 gig of RAM, and it uses like 100 meg of RAM and barely any CPU.
So yeah, when we do need it, what I would think about doing is probably using something like Phoenix Presence and then sharing that state between the nodes, rather than actually using ETS like that. So yeah.

One of the quotes that you said really jumped out at me: your failure model is only as good as the API you're calling. What would you have done differently if you'd known in advance how much your API sucked?

That's a really good question. Probably consider a different provider is the honest answer. But I think maybe some even more robust failure handling, and probably some more manual processes to actually deal with that. So yeah. Next. Desmond, there's one here as well if you want to run to the back.

Hi. You mentioned that when you needed to do a cold start, you could pull all the data for the ETS table out of the database. You didn't show how you put it into the database. I'm curious if you had some means of just sort of dumping it in there, or did you have more of a relational schema?

Yeah, it's more of a relational schema. Those orders effectively have a time slot, so we're always persisting that order to the database, and then we can use that to recreate the state again. It's very, very classic: orders and line items, the kind of model that we've probably all used a thousand times if you've built this kind of system. So yeah.

I don't know. I have, yeah. Until someone kicks us out.

That's a good question. I actually only looked at it very briefly, I'm sorry. So the question was: any reason why you didn't use Mnesia? And Mnesia is a distributed Erlang database, effectively. I've heard horror stories with it as well, apparently, but this is complete, you know, on-the-grapevine kind of stuff. So yeah, definitely one to consider.

Yeah, one more here. I guess that will be the last one, because I think we're over time now. You mentioned that you have to limit how hard you hit the API.
Is that limit only set in the ibrowse line of code that you showed us? Yeah, so we could do it a couple of different ways there. We could have a pool of workers, so we know that we're only ever making a certain number of concurrent requests at once. Or, yeah, we could just set that limit in ibrowse, effectively. Honestly, I think the preferred approach is probably a pool, so you know that you've only ever got, like, 10 things hitting that API at one time. But also we use Process.send_after to schedule API hits as well, and we just increased the send_after time to make sure we weren't doing it too much.

Second part: there's a host parameter, so does that mean the limit only applies to that host?

Exactly. Under the hood, ibrowse uses an ETS table that keeps basically the host, the port, and then the number of max connections that you can have to that host. So yeah, you always want to do it per host, effectively. Cool. Thank you, everyone.