Thanks for having me. Hi everyone. My name is Francesco Cesarini. I started off my career as an intern at Ericsson's Computer Science Laboratory, with Robert, Mike and Joe, and became very passionate about Erlang: not just the language per se, but a lot of the features which came along with it. I'll get back to that later on. Having left Ericsson, I went ahead and founded what has today become Erlang Solutions, with offices pretty much all over. There are five offices right now, with three more opening this year, focusing on building distributed, massively scalable systems.

The reason I'm here today is to bring a very, very important message. When C++ came along, everyone told you: hey, guess what, you know C, so learning C++ is easy. When Java came along, they told you: hey, guess what, you know C++, so learning Java is easy. Come over to the dark side. What are the Scala people doing now? They're telling you: hey, guess what, you know Java, so learning Scala is easy. And on and on. Now I keep hearing: you know Ruby? Guess what, learning Elixir is easy. Beware. I don't want to scare you, but when you learn C++, you need to change the way you think and reason and how you structure your programs. Going from C++ to Java, the same. And Scala, going from OO to an OO-functional paradigm, the same. Very much the same applies to Elixir. You know Ruby, so the syntax might be familiar, and that might make it easier to get into Elixir. But that's not where the journey stops.

The real power of Elixir, in my view, is the concurrency model. There are a lot of functional paradigms too, and I think Dave has covered them really, really well. What I'm going to focus on today is concurrency, and how you start thinking concurrently. But I'll also be covering the distributed aspects, the architectural aspects. And I'm going to do that using Erlang examples from the last 15 years. By explaining how the Erlang runtime has evolved, how the virtual machine has evolved, and how, as we got more powerful computers, more powerful VMs and more features, we started doing things differently. The way of thinking in the Erlang world has evolved over the last 20 years, and it will continue evolving and changing, because the problems we solved 20 years ago are very different from the problems we're solving today.

I do a lot of talks, and for once I can say I've actually not prepared a single slide for this talk. The cover and the last slide are the only ones I've prepared; all the other slides are taken from other people's presentations, and I will try to remember to acknowledge those presentations as we look at them. The reason is that if you go back 15 years and start looking at these architectures, that's where those presentations came from. Also, this isn't the first time I've given this talk. The first time was in 2009 at QCon in London, and quite a bit has changed since. The company was then called Erlang Training and Consulting; today we're called Erlang Solutions. Lesson learned: never name a company after the services you provide, because as soon as you do, a couple of weeks or months later you start providing new services on top of them. The logo changed too.
We took in a graphics designer who convinced us to pay them a lot of money because the logo wasn't elegant enough, and this is what we've got today. My title has changed as well, but that wasn't on the slide.

Now, I first gave this presentation a few months after I'd returned from my very first OSCON, the Open Source Convention in Portland, which José and Robert and quite a few others attended this week before flying down for ElixirConf. At that OSCON, during the keynote, Tim Bray, who at the time was Director of Web Technologies at Sun, stood in front of thousands of people and flashed this slide right here. What it showed was Joe Armstrong sitting on a Royal Enfield, a motorbike he'd discovered they were still producing in India, adapted so that the exhaust emissions would conform to EU regulations. Tim flashed up a bit of Erlang code and followed it with this quote: after you open up the top of your head, reach in and turn your brain inside out, this starts looking like a natural way to count integers. And: Erlang does require some fairly serious mental readjustment. At that point I was just thinking, okay, Tim, I'm watching you. However, he added: having spent some time playing with this, I tell you, if someone came to me and wanted to pay me a lot of money to build a large-scale message handling system that really had to be up all the time, could never afford to go down for years at a time, I would unhesitatingly choose Erlang to build it in. Ironically, a few months later, around the same time I was giving the first version of this talk, Brian Acton got turned down by Facebook and founded what has today become WhatsApp.

After that quote, someone sitting right next to me went: yeah, yeah, the syntax really turns my brain inside out. Fact is, the person sitting next to me was lost in a world of curly brackets, semicolons, commas and Prolog-inspired syntax. What he failed to do was see beyond the syntax and actually look at the semantics. What Tim Bray was talking about here was not the syntax. What really requires the mental readjustment, in the way you do things, is the concurrency model. You need to start thinking concurrently, because that's how you go in and structure your programs.

Was any of this new to me? How many of you know how I first got into Erlang? There are a few of you. I got into Erlang at Uppsala University in 1994, in the parallel programming course. The teacher came in, waved the first edition of Concurrent Programming in Erlang and said: this is the book, read it; these are the exercises, do them. And that was it. Off he went and lectured on the horrors of parallel programming: deadlocks, mutexes, semaphores, critical sections, race conditions, all of these things which really scared us. Then we started working on the exercises. In this exercise, we had to write a simulated world. We had carrot patches growing, and we had rabbits going around eating the carrots.
If a rabbit ate a lot of carrots and got fat, it would split in two; this is politically correct Sweden we're dealing with here. If a rabbit didn't get enough carrots, it would run out of energy, starve and disappear. Then we had wolves, which would go around looking for rabbits. If a wolf ate a lot of rabbits and got fat, it would split in two. If a wolf wasn't able to catch any rabbits, it would die. There was also a bit of intelligence on display: if a rabbit saw a carrot patch, it would broadcast to all the rabbits within a certain range, saying hey, come eat these carrots. Likewise, if it saw a wolf: warning, wolf, wolf! And all the rabbits would start running away. The same with the wolves: if a wolf saw a rabbit, it would tell all the wolves, there are rabbits here, let's go chase them.

It took me 40 hours to get this lab completed, with documentation and a nice Tcl/Tk screen with dots going around, so it wasn't that much. Every carrot patch was an Erlang process. Every rabbit was an Erlang process. And every wolf was an Erlang process. It was really easy to model, really easy to think about conceptually. I clearly remember, on the HP workstations we were using, which at the time allowed a maximum of 128 threads, going in and typing ps -ef, expecting to see an OS thread, or a job, for every rabbit and every wolf. Instead, there was only one: the thread of the JAM, the Erlang virtual machine we used at the time, JAM standing for Joe's Abstract Machine. And I remember thinking: oh, this is kind of cool. I didn't think that much about it until a few months later, when we had to solve exactly the same problem with an object-oriented language called Eiffel, invented by Bertrand Meyer. It was exactly the same exercise, exactly the same specification. The only difference is that they paired us up, so there were two of us working on it. Despite reusing a lot of the algorithms, despite having solved the problem once already (and in my view, an engineer never really understands a problem until he's actually solved it, and once he's solved it, he'll think of a million better ways to do it), and despite there being two of us, it ended up taking us 60 hours. Three times the effort it took to do it in Erlang. That's when I started thinking in terms of: there's a right tool for the job, and there are wrong tools for the job. That in itself is a completely separate talk.

What I hadn't realized then is that the whole concurrency model comes down to three things. They've been covered already, but: processes do not share data; they communicate through message passing; and they monitor each other for failure. Those are the three key ingredients which give you this whole concurrency, along with the fact that processes are decoupled from the underlying OS threads. Think of the Erlang virtual machine as an operating system on top of the operating system.
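To ground that, here is a minimal sketch of what one of those rabbits might have looked like as an Erlang process. The module, the messages and the numbers are my own illustration, not the original lab code: each rabbit is a process with private state (its energy), talks to the world purely by message passing, and is monitored so that its death by starvation is observed rather than shared:

```erlang
-module(rabbit).
-export([start/0]).

%% Spawn a rabbit as a lightweight process and monitor it: the caller
%% receives a {'DOWN', ...} message if the rabbit starves and exits.
start() ->
    {Pid, _MonitorRef} = spawn_monitor(fun() -> loop(10) end),
    Pid.

%% The rabbit's entire state is the argument of its loop; nothing is shared.
loop(0) ->
    exit(starved);                     % out of energy: the process dies
loop(Energy) ->
    receive
        {carrots, Patch} ->            % a nearby rabbit broadcast a carrot patch
            Patch ! {eat, self()},
            loop(Energy + 1);
        {warning, wolf} ->             % run away; fleeing costs energy
            loop(Energy - 1)
    after 1000 ->
        loop(Energy - 1)               % time passes, energy drains
    end.
```

With a process per carrot patch, rabbit and wolf, the program maps one-to-one onto the problem description, which is exactly why it was so easy to model.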
So I left university, joined the Computer Science Lab, had a great time there, after which I started working with Ericsson's training and consulting arm, helping projects start using Erlang. The year was 1996, and they had just started the AXD 301 switch project. The AXD 301 was one of the many Erlang success stories within Ericsson. The first release consisted of about one and a half million lines of Erlang code (I think it's way beyond two million now), plus about half a million lines of C++, and a bit of Java and JavaScript.

The way they approached the AXD 301, and the way Ericsson approaches any Erlang project, is design by prototyping. When these large corporations kick off a project, they'll usually have maybe 200 people on it: about 100 developers and testers, and then 100 other people; no one really knows what they do, but I'm sure they're important in some shape or form. And when you put 100 developers on something, you need to make sure that what you've got is solid. So before committing those vast resources, they always start by designing through prototypes. To quote one of my mentors, Mike Williams: it's not good enough to have ideas, you need to make sure they work. And most importantly, you want to make your mistakes on a small scale, not in a large-scale production project.

So on this first major Erlang project within Ericsson, they went in and started deciding: how do we actually model the concurrency? How do we model the setting up and tearing down of these calls? The AXD 301 is an ATM switch: you set up calls and you tear them down. That's a simplistic view of what was happening. And remember, this was the first time we had access to a virtual machine with an incredibly powerful concurrency model. Back in 1995, I think the VM limit was 30,000 processes: you could have 30,000 processes running in your Erlang virtual machine. Compare that to the 128 OS threads I mentioned; the delta was huge. So the question they were trying to answer in this proof of concept was: how do we use this concurrency to our advantage? No one had any idea. No one had ever done it before.

In the very first prototype, every call they set up used a total of six processes. Now, the goal for the AXD 301 was that every processor pair had to handle 30,000 calls: 30,000 calls handled simultaneously on a pair of boards. You're following me here: you had basically two blades, and both blades together, with no single point of failure, had to handle 30,000 calls. So in the first prototype they had six fairly complex finite state machines, each running in its own process. It worked well: a few milliseconds to set up a call, a few milliseconds to tear it down. The problem is, at six processes per call, the 30,000-process limit meant you could only have a maximum of about 5,000 calls in your system at any one time. So it didn't solve the problem.
So they got a pretty good programmer who started merging these finite state machines together, resulting in two processes per call. That gives you 15,000 calls, still far from the final goal of 30,000. So someone said: hey, let's merge these two finite state machines into one process, have only one process deal with all of the calls, and save the state in an ETS table. Set up a call: the process would go in, do all of the complicated call setup, and once it was done, save the state into the table and take the next call. Two problems arose with that approach. The first was that setting up a call all of a sudden became a sequential bottleneck: you could only set up one call at a time, because one process was setting up all the calls. The second was that the code was so complex it became completely unmaintainable. You really didn't want to go there. Maintainability is critical when it comes to telecom products: usually about 80% of your costs are support and maintenance. So that wasn't an option.

Then someone came up with the idea: how about two processes per call transaction? You have one process for setting up the call; when the call is set up, you store the state in the table and terminate that process. When you need to tear the call down, you spawn off a new process, have it tear down the call, and then store the new state in the table. That was the point where they started thinking concurrently and embracing concurrency the way they should have. From two processes per transaction, processes became cheaper and cheaper, and they eventually moved to four or five processes per call transaction. That meant that at any one time you could be setting up about five to ten thousand calls, but the number of simultaneously open calls all of a sudden depended only on your memory and the physical constraints of the machine: networking, I/O and so on. And that's the way to think: create a process for every truly concurrent activity in your system, which in this case was setting up a call and tearing it down. That's the architecture that ended up sticking to this day.

So that's the concurrency side. On the distribution side, this was very much old-school telecoms: you had an active and a standby. Every time you set up a call, you communicated with the device board, which set up the end-to-end connection. When you were done, responses were sent back and copied to both the active board and the standby board. So if the active board terminated, all the calls currently being set up would still complete successfully, with the results already copied to the standby, which would automatically become the active in case of failure. That's how you got no single point of failure right there: even when you lost hardware, the call setups wouldn't be affected. This was the first major project which worked.
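Here is a minimal sketch of that final shape: short-lived processes per call transaction, with the call state parked in an ETS table in between. The module, the table layout and the stand-in state are mine, not the AXD 301 code:

```erlang
-module(call).
-export([init/0, setup/2, teardown/1]).

%% One ETS table holds the state of every open call; the processes
%% that set calls up and tear them down are short-lived.
init() ->
    ets:new(calls, [named_table, public]).

%% One process per call setup: do the work, park the state, terminate.
setup(CallId, Endpoints) ->
    spawn(fun() ->
        %% stand-in for the real, multi-step call setup
        true = ets:insert(calls, {CallId, Endpoints, connected})
    end).

%% A fresh process per teardown: fetch the state, release, delete.
teardown(CallId) ->
    spawn(fun() ->
        [{CallId, _Endpoints, connected}] = ets:lookup(calls, CallId),
        true = ets:delete(calls, CallId)
    end).
```

The point of the pattern is that no process lives for the lifetime of a call, so the 30,000-process VM limit only constrains how many setups and teardowns are in flight at once, not how many calls are open.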
It was the project where Ericsson started embracing concurrency and the concurrent way of thinking. About a hundred developers, and I think it's still selling today, all over the world.

Now, if we fast forward another five years, here's another very interesting case study. Doing consultancy, I got called in by my first customer outside of Ericsson. I went to Paris and started working with a group of developers who were building an instant messaging proxy. At the time you could have a maximum of about 512 simultaneously open socket connections on a single machine, and they wanted to increase that. If they ran Jabber servers (there were open source servers from Jabber Inc. available at the time), that meant every server could only have 512 users. So their idea was to create a proxy on cheap commodity hardware and put it in front of the Jabber server. This proxy would handle all of the TCP/IP connections from all of the clients, and then channel them through one single TCP/IP connection to the Jabber server. You're following me here; you see what they were doing.

So I got called in; they sent me the code ahead of time, and I reviewed it. And this is what I discovered, right here: this is how they had set up their whole concurrency model. Now, this was a very talented team, but they'd been working in C++, and they were thinking in terms of OS threads. What they'd done is, for every client, they had two processes: one to handle the inbound messages and one to handle the outbound messages. Once you received a message, you sent it to another process which handled the decoding. That process would decode the message, after which it would pass it on to a third process which dealt with the state: it would do all the error handling and whatever rerouting needed to be done. From there it was passed on to a fourth process, which would encode it again and send it off to the process managing the socket connection towards the Jabber servers.

Can anyone see how many simultaneous messages we can handle with this architecture at any one time? This was pre-multicore, so there was only one thread; but think in terms of concurrency, think in terms of the virtual machine handling a lot of messages in parallel. The total is three; well, actually four. We could be receiving one message; as there was a single process handling the decoding, we could only decode one message at any one time; with one process handling the state, we could only handle one message at a time there; we could only encode one message at a time; and then send one message to the server. So at any one time we could handle a maximum of four messages in parallel. Not that efficient. And this is the way I've seen a lot of users think when they come from the C++ world, but also from the Ruby world: you're thinking in terms of threads, not in terms of lightweight processes. Remember, lightweight processes are cheap; it takes sub-microseconds to create them, and they consume very little memory. So try to think concurrently; in this case, the truly concurrent activity was dealing with the socket, right here. So we went in and refactored the code so that we had one process connected to each client, and we used that process to call library modules instead. So what did we do?
That process would first call a library module to decode the message. Then it would call a library module to do all the state handling, and then it would call a library module to encode the message. Once it had encoded it, it sent it to the green process up there, which would forward it on to the Jabber servers. So all of a sudden we went from being able to handle four messages simultaneously to handling 512, or about 500 messages, which was the maximum number of open sockets the OS would allow us to have. Just by moving to this architecture, back then, we got a 400 percent increase in throughput and a sizeable reduction in memory. Certain things had to be serialized, and that's why we had the blue process right there: every time we needed to serialize something that required a particular, deterministic order, we'd just send a message and use that blue process to synchronize. You following me here? Okay.

So this is the way we started reasoning around the year 2000. Even when we were stress testing this system, we couldn't get it beyond, I think, 70 or 80 percent CPU. What we'd done is push all of the bottlenecks down to the underlying hardware, operating system and external dependencies. That's really the goal when you're dealing with Erlang, and that obviously also means the Erlang virtual machine, the BEAM emulator: you push everything down so that your bottlenecks are outside of your system. Usually, once you hit OS, networking or hardware-related bottlenecks, that's when your job is done.
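Here's a rough sketch of that refactored shape: one process per client socket doing the whole pipeline as ordinary sequential library calls, plus one process for the single ordered resource, the connection to the Jabber server. The module name and the codec / session library modules are my own illustration, not the original code:

```erlang
-module(im_proxy).
-export([client_loop/2, serializer_loop/1]).

%% One Erlang process per client socket. Decode, state handling and
%% encode are plain library calls inside this process, so 500 sockets
%% means 500 messages can be in flight at once instead of four.
client_loop(Socket, Serializer) ->
    receive
        {tcp, Socket, Bytes} ->
            Msg     = codec:decode(Bytes),    % hypothetical library modules
            Actions = session:handle(Msg),
            Reply   = codec:encode(Actions),
            Serializer ! {send, Reply},       % only ordering funnels through here
            client_loop(Socket, Serializer);
        {tcp_closed, Socket} ->
            ok
    end.

%% The "blue" process: the one place where a deterministic order is
%% required, writing to the single upstream connection.
serializer_loop(ServerSocket) ->
    receive
        {send, Reply} ->
            ok = gen_tcp:send(ServerSocket, Reply),
            serializer_loop(ServerSocket)
    end.
```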
We moved on, and that team left the company and took ownership of, I think, one of the first major open source Erlang projects: ejabberd, which at its height, and probably still today, accounted for about 60 to 70 percent of all Jabber-based, XMPP-based traffic. They started working on a distributed XMPP server. Back in 2008, you could manage about 30,000 users per node. We've since forked that project into what we've branded here at Erlang Solutions as MongooseIM, and earlier this year it actually reached one million simultaneously connected users on a single node.

Now, ejabberd, and even MongooseIM, is very much the way we used to design systems in 2002, 2003. We did it by clustering: we'd cluster Erlang nodes and then share load and state across them. What would happen is that a connection would come in and be rerouted by the load balancer to one of, let's assume, three ejabberd nodes. These ejabberd nodes would have a fully replicated Mnesia database, so the data would be exact copies on all three nodes. Whenever you logged on with your client, a process would be created which lived for as long as your session kept going. Every time a client sent a message, it would be sent to the process handling your session, and the checks would be done there. Say you wanted to forward a message to someone else. The server would check: is this user on the same node? No, he's not; okay, let's forward it to the node he's actually on. So the message went client to server, and that server would forward the message on to the final client. We'd basically have different hops in between the servers. We'd also have chat rooms, federation and session management processes. But once again, this is the way we did it in 2002, definitely not the way we do it today.

Now start imagining that all of a sudden you've got one million connected users on a single node. That means you've got a million processes running, not necessarily doing anything; that's just wasting resources. The second problem: what happens if the node fails? You get a million users kicked out. What happens if a million users simultaneously try to log on to your system again? They hit the remaining nodes; maybe a good load balancer sends half a million here, half a million there. Logging in is the most expensive operation when you're dealing with XMPP, so that would cause a node to crash. All of a sudden you'd have one and a half million users trying to connect to the next node, and you get a cascading failure. Back in 2002 you didn't have to worry: in the worst case you got 30,000 users logging on again, if that. You didn't have to deal with the scale you need to deal with today. And that prompted us, when we got another project, to start rethinking instant messaging architecture.

This here is a slide from a customer of ours. They approached us in 2007 and wanted us to build a scalable instant messaging and email gateway for the predecessors of smartphones. What they wanted was to provide connectivity for email and instant messaging, highly optimized for mobile. First, that meant dealing with the connection: even over an unreliable link, the last thing you want is to have to log on and log off all the time as you lose connectivity. That was the first thing you had to think about with mobile. The second thing is bandwidth: you want to reduce the protocol overhead, compress your data and only send the necessary amounts. And the third thing with mobile is battery life, especially in 2007; we've come a long way in the last 10 years, but those were still early days.

This is what the architecture looked like. Your mobile phones connected to a set of front-end nodes, all Erlang nodes. Their goal was just to manage TCP/IP connections: they were stateless, with the exception of managing the TCP/IP connection. They'd receive the requests, parse them, and send an Erlang term to one of the three back-end nodes. Whenever one of the back-end nodes received a request, it would spawn a new process which lived for as long as that request was being handled. So try to picture instant messaging. What can happen? Well, you send a message. The message gets received by the front-end node. It gets sent to the back-end node.
A process is spawned, it handles that message, and it lives for as long as it has something to do. In the case of instant messaging, you'd handle the message, log it, make sure the session was still alive, and then forward it on to, I don't know, AIM, Google Talk, ICQ, or MSN, Windows Live. Once ICQ or Windows Live or whoever you were forwarding to had acknowledged it, the process would send an acknowledgement back to the cell phone saying yes, the receiving party has received the message, and at that point you would terminate the process. So what does this mean? If you were receiving 20 status updates from your buddies, that meant 20 processes all being handled at the same time, all being processed at the same time. When I talk about making a mental change or a mental shift in how you do things, this is it: do not have one process which handles your session and then serializes all 20 status updates from your buddies. No: you have 20 separate processes, each dealing with one status update. Why go down that route? I'll show you in a second. Any questions on this architecture?

Okay, to add to this architecture: we were not sharing all the data. The only data we were sharing was the session data. If a user logged on, we had a process which would authenticate them, create a session record, store it in the Mnesia database and then send back: okay, you're logged on, all's well. That record in the Mnesia database would then automatically be replicated to all of the nodes. What does that mean? If we lost a front-end node, we'd get a million users reconnecting, but they didn't have to log on again. All we had to do was set up a million new TCP/IP connections on the remaining nodes, and we had load balancers which would throttle and make sure a wave wouldn't come in and sink your front-end nodes. When a user then tried to reconnect, having set up the TCP/IP connection and saying hey, I want to log on, and you'd lost the first node, the connection would be hashed to the second or the third node. And there we'd check: this user wants to log on; no, he's already logged on, his session is active, because the session is replicated across all the nodes. So we'd just send back an acknowledgement: yes, you're logged on, without any of the complex authentication, contacting of back-end servers, or database lookups. Sending back an okay is very cheap; you could handle hundreds of thousands of these requests per second in this cluster. So you'd lose a node, and within a couple of seconds everything was back to normal and the whole peak of re-creating the sockets was done. Also, if you lost a back-end node, all that would happen is that a request coming in from a front-end node would be routed to one of the two other nodes. We had two-level hashing so that we wouldn't go in and rehash all the requests: all the requests going through the two surviving nodes would still be going through those two nodes.
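A condensed sketch of that back end: a RAM-replicated Mnesia session table, plus one short-lived process per incoming request. The record, module and function names are mine, not the customer's code:

```erlang
-module(backend).
-export([init/1, handle_request/1]).

-record(session, {user, token, node}).

%% Create the session table, replicated in RAM on every node in the
%% cluster, so losing a node never loses session state.
init(Nodes) ->
    mnesia:create_table(session,
        [{ram_copies, Nodes},
         {attributes, record_info(fields, session)}]).

%% A front-end node has parsed the request into an Erlang term.
%% We spawn one process per request and let it die when it's done.
handle_request({msg, User, Payload}) ->
    spawn(fun() ->
        [#session{}] = mnesia:dirty_read(session, User),  % session must exist
        ok = deliver(User, Payload)
    end).

deliver(_User, _Payload) ->
    ok.   % stand-in for the real logging, forwarding and acknowledgement
```

Because the table is replicated on every node, a surviving node can answer "you're already logged on" with a single local read, without redoing authentication: that's the cheap reconnect path just described.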
Going down that route, if you lost a node, the request was simply sent to one of the two other nodes; the session record was still stored in the replicated table, so you'd just look it up and handle the request. Fairly straightforward. And so with this architecture we removed the single points of failure we'd had with the architectures of around 2002.

So we were happy: we'd removed our single points of failure and had a fault-tolerant architecture. The next thing we had to do, contractually, was stress test the system and show that with no single point of failure we could handle 15,000 transactions per second. Now remember, this is mobile: every transaction was at least four HTTP requests, three destructive database operations and seven log entries, so there was a lot of complexity and business logic to it, a lot of XML parsing and so on. Contractually we had to show we could handle 15,000 transactions per second, and that included killing machines off, pulling network cables out, shutting the power off, while the remaining hardware still had to maintain and handle that load.

The way we used to stress test our systems was fairly simple. Back in the day, we'd start the load testers and get the system to run at about 100% CPU; we got the system to become CPU-bound. We'd follow the performance, and as soon as we started seeing a degradation, we'd go in and look for the bottlenecks, using different techniques which we don't have time to go into now, and remove them. The goal was a stable throughput: no matter how many simultaneous requests were going through the system, your throughput had to remain constant, with no degradation as more users logged on. Just to give you an example: if your system handled a thousand requests per second and you had a thousand requests going through the system, each request would take one second, right? If you now increase the number of simultaneous requests to 2,000, the only difference you should see is the latency: every request should now take two seconds. That's what we mean by constant throughput. You want your system to be predictable under heavy load, to know exactly what happens, and you want it to behave. And that's what we'd done with all our systems in the past.

This is an old example of a benchmark on the Yaws web server, which Joe Armstrong did and, I think, published in his PhD thesis, looking at the throughput, 800 kilobytes per second through the server. He measured the throughput at 10,000 connections, 20,000, 30,000, all the way up to 80,000, which in 2002, when the measurement was done, was pretty much the physical limit of how many connections your standard Linux machine could handle. And you notice here that irrespective of the number of connections, the throughput is always the same. This is the pattern we're used to dealing with.
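The talk doesn't name it, but the arithmetic in that example is Little's law, relating in-flight requests, throughput and latency; a quick check of the numbers above:

```latex
L = \lambda \cdot W
% L: requests in the system, \lambda: throughput, W: latency
% \lambda = 1000\ \text{req/s},\ L = 1000 \Rightarrow W = 1\ \text{s}
% \lambda = 1000\ \text{req/s},\ L = 2000 \Rightarrow W = 2\ \text{s}
```

If throughput stays constant as load doubles, latency doubles; a system whose throughput degrades instead is the unpredictable one.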
What was happening in 2006, 2007, though, was that multicore was coming about. So we started stress testing the system with our load machines. The first thing we did was crash the firewall, so we shut the firewall off. We started loading again, and the ejabberd cluster we were using in the back end wasn't strong enough, so we had to put more machines there. We started load testing again and managed to crash the load balancers, so we got rid of those. We started stress testing again and just couldn't create enough load, so we had to add two more load machines. And we got to borrow some F5 load balancers, which worked much better; at the time they were the best out there, really able to handle the load, but even then we still got them to crash every now and then. It was a nightmare. We thought it would take us two weeks to stress test the system; three months on, we still weren't done, because all of a sudden we were dealing with multicore. We hadn't anticipated the power that the Erlang virtual machine, the BEAM emulator, would unleash on multicore environments. We were stuck dealing with TCP/IP congestion, memory spikes, I/O starvation, OS limitations and a lot, lot more. It became really, really tough.

Let's move on to about two years ago. There's a web framework called Chicago Boss, which tries to emulate Ruby on Rails. And there's a company called Concurix up in Seattle, which at the time was focused very much on optimizing the Erlang VM for multicore. They took a Ruby on Rails application, which at the time was running 30 requests per second, and rewrote it using Chicago Boss. Everyone was telling you at the time: guess what, you're running Erlang on multicore, you're fine, it will scale. So they ran it on two cores and got about 40 requests per second. They ran it on four cores: 40 requests per second. They ran it all the way up to 64 cores and still got 40 requests per second through. That's when they started scratching their heads and came to the realization that they were being hit by Amdahl's law, which in layman's terms tells you that your program will only run as fast as its slowest component, in this case the sequential code. What they discovered is that they were serializing requests through a single gen_server. They had a lot of workers all doing the work concurrently, each worker serving a web page concurrently, but when it was time to send everything back to the web server, it all went through one single gen_server, which created the bottleneck. They went in, removed the bottleneck, and look at that: from 40 requests per second on two cores they reached about 800; on four cores they were over a thousand, after which they hit the next bottleneck, and the next one. So there's still a lot of headroom as you add more cores to your machines. But the message here is: when you're dealing with multicore architectures, you need to start thinking in terms of the bottlenecks in your system, because the bottleneck is what's going to stop your system from scaling.
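In symbols, Amdahl's law says that with a parallel fraction p of the work running on N cores, the speedup is capped at S(N) = 1 / ((1 - p) + p/N), which tends to 1/(1 - p) no matter how many cores you add; one serialized process shrinks p for the whole system. Here is a minimal sketch of that anti-pattern, not Chicago Boss's actual code, with names of my own invention:

```erlang
-module(reply_funnel).
-behaviour(gen_server).
-export([start_link/0, send/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% The bottleneck: every worker blocks in this synchronous call, and a
%% gen_server handles one message at a time, so all replies serialize
%% through this single process no matter how many cores are available.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

send(Socket, Reply) ->
    gen_server:call(?MODULE, {send, Socket, Reply}).

init([]) ->
    {ok, no_state}.

handle_call({send, Socket, Reply}, _From, State) ->
    ok = gen_tcp:send(Socket, Reply),   % every core waits in line here
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The fix in the story amounts to deleting the funnel: each worker
%% owns its socket and calls gen_tcp:send/2 from its own process, so
%% replies go out in parallel and the sequential fraction shrinks.
```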
And what does the future hold? This is Riak; the slide is from a Basho presentation, and if you've ever been to a Riak or Basho presentation, you'll know the standing joke: if Basho people don't show the ring, they don't get paid. It's basically using consistent hashing to distribute data across a cluster of nodes. So that's one direction we're looking at. I think we've solved the concurrency aspects: you have a process for every truly concurrent activity in your system. We're now aware of the bottlenecks which happen in your system, and a lot of tools are being built to find those bottlenecks, give you warnings, and help you remove the sequential code from the process and parallelize it as much as possible. What we're dealing with now, the next big challenge we in the Erlang world and you in the Elixir world have, is distributed architectures, massively distributed architectures. Consistent hashing, the way Riak does it, is one way. Robert earlier mentioned SD Erlang, which allows you to create clusters and is based on the concept of process groups. And then there's one last concept, which I still haven't found the slide for, which is Erlang on Xen. You've got instances of the LING virtual machine, which is not the BEAM; they've written their own virtual machine which runs in a Xen instance on the bare metal. And what's happening there (I recommend you go and listen to Stu Bailey's keynote at the Erlang User Conference) is that they've taken the LINC switch, which is an OpenFlow switch, and are integrating it into the Xen instance. So you can then fire off tens of thousands of Erlang on Xen instances which start communicating with each other using software-defined networking, where connections are set up and torn down based on your layer-2 and layer-3 traffic, optimizing your traffic according to the real-time patterns in your system. That's really where we're heading now, and those will be the types of problems we'll be thinking about and solving. But once again, this will require another mental readjustment in how we do things, because being able to fire off Xen instances in a few milliseconds, tear them down, and optimize the traffic in between them means we'll start architecting our systems, and structuring our code, in a completely different way.

All right, I think we're out of time. I'm around all day; I don't bite. Feel free to come and ask questions, I'd love to answer them and brainstorm. And a shameless plug: I'm writing a book with Steve Vinoski, Designing for Scalability with Erlang/OTP. What we look at is what happens behind the scenes when you're dealing with OTP, and how you structure things and how you think and reason. Everything in there is just as important, if not more so, for those of you working in Elixir, because it's the same concepts, the same principles. We're trying to put all the lessons learned from the last 20 years into this book, so that others can use them. All right, thank you very much.