Yep, I think we are live. Okay. We'll wait a couple of minutes and see how many people we have here; I'm not sure of the number. Okay, we'll start in like two minutes. I guess you can start now.

Hello everyone, welcome to the first episode of the Scaling from First Principles series. Thanks for hosting this, and thanks for joining the first session and coming up with two case studies in no time. My pleasure, I'm happy to do this.

Before we start this episode, I want to set the context for the series, since it's the first one, before we get into today's agenda. Everyone wants to build applications at scale. Software is all about trade-offs, and designing for scale comes with its own trade-offs: it comes at the expense of increased developer and operational complexity. So do we really need to scale from day one? Will the application ever see the scale we designed for? How do we know whether we are over-engineering? There are a lot of these questions. We keep hearing stories about how different companies scale their software, but most of the time these are big organizations, the Googles and Facebooks. They're at a different level of maturity, so what worked for them may not work for a small or medium-sized organization. So what are the different approaches to scaling? How do you decide which approach is good for you? Are there any general guidelines? There are so many questions. Wouldn't it be nice to have a forum where we bring in practitioners and hear from them — not only the success stories but also the horror stories, the things that didn't work, the mistakes they made, so that we can all learn from them? That's one of the ideas behind this series.

The other important theme of the series is first principles thinking. Many seemingly complex problems can be solved with very simple solutions if you apply first principles thinking. Only if you study and understand the problem at a deeper level and ask the right questions will you be able to find these simple solutions. The end result may look very simple, but that wouldn't be obvious when you actually start. So we want to talk about these kinds of solutions, bring in practitioners who have been solving problems this way, and have a discussion in this forum.

To summarize, Scaling from First Principles is a forum to bring practitioners together to discuss and learn from each other, and to promote first principles thinking in building and scaling software. That's the idea behind the series. I'd like to pause for a minute and see if there are any questions. The other thing I want to say is that we want to run this as a regular series, with one session every month. Initially we'll invite people to come and speak, but over time we'll figure out how to take it forward. If any of you have ideas you want to discuss, or someone you want to invite, please get in touch with me and we can coordinate. This is a community effort, and we want all of you to come forward and suggest topics you'd like to see. I think the Hasgeek platform has a place where we can have these kinds of discussions.
But we'll figure that out as we go forward. I'll pause here for a minute and see if there are any questions. If anyone wants to speak, you can unmute and speak. If there are no questions, I'll move on with the agenda for today.

Anand, do you want to explain why we are doing this in a case study format, for the audience? Yeah, sure. The idea is to look at different problems and see how they can be solved. When you look at a problem, you may think it's too complicated, and then you apply first principles thinking and something very simple may emerge out of it. But there are also many cases where we look at a problem and don't apply first principles thinking, or we miss some of the aspects, and the solution comes out very complicated — and there are lessons to be learned from that. With a case study, everyone can come into the same context and understand what went wrong or what worked well. That's the idea behind the case study format. Vish, do you want to add anything to that? No, I think by having the case study format things become slightly more practical, and some of you will be able to relate to them better; the takeaway is something closer to day-to-day engineering work. Rather than a theoretical discussion, we wanted a more practical session. Maybe the case studies will help us verbalize our thoughts and experiences, both the success stories and the failure stories.

Good. Then I'll set the agenda for today and start with the first case study. Can you see my screen? Yeah, I can see it. Okay. The agenda for today: first the context for the series, which we've already done. Then three case studies — the first will be taken by me and the next two by Piyush. Once we're done with the case studies, we're open for discussion. The session is planned for an hour; we'll take about five minutes of questions after each case study, and even after the hour, if people are interested, we can continue for some more time.

So let me get started with the first case study. This is about building a web application to show the results of an exam. It started like this: back on June 13th, when the CBSE Class 12 results were announced, a friend posted on Twitter saying the website was down. This is a common thing that happens every time exam results are announced in India. I remember from 2007, when the GATE results were announced, I looked at IIT Kharagpur and IIT Kanpur — all the websites were down, they couldn't handle the traffic. IIT Madras somehow seemed to work at the time. This happens every single time results are announced: the website goes down. So how hard is the problem of building a reliable system for this? And how much does it cost to build such a system? That was one of the things that came up in the discussion on Twitter, and I thought it was an interesting problem to take up. So let's examine the problem, look at the scale of it, and see if we can come up with a good solution.
Some quick numbers. The number of students who appeared for the CBSE exam in 2020 is 1.2 million. Let's add a small factor of safety and say 2 million people will visit the website to check results. Since everyone is anxious to see their results, they'll probably visit as soon as the results are announced — so let's say all of them visit in the first hour. What that means is we need a system that can serve 2 million requests in one hour. That's about 555 requests per second. Does that look like a very big number? It's not trivial; it seems like a big number.

The next question is how we build a system like this. What should the tech stack be? What programming language? What kind of database? What deployment strategy? Which cloud should you pick? I'll pause for half a minute and ask for input from the audience: what do you think the tech stack should be, and how should we approach this? Someone says Python is maybe too slow a language — should we just use Go? Anyone else want to suggest something here? Folks, you can send your message in the chat as well if you want. Asif says the tech stack doesn't matter. Okay. Someone says COBOL — okay, COBOL. Someone points out the I/O incurred in each request. Elixir for concurrency out of the box, plus some in-memory DB since it's just static data; the DB will be the bottleneck, and so on. Erlang, since it's strong for concurrency. Okay. Nabarun is asking, can you show the data again? Yeah, I'll go back.

So the numbers are: 1.2 million students appeared for the exam. I'm rounding that off to 2 million to add a factor of safety. That's about 2 million requests per hour, because we're assuming everyone comes and checks the results in the first hour — which means we should be able to support that many requests in that first hour. Per second, that works out to about 555 requests per second. Sandeep is saying I'm assuming a uniform request rate, and the 2 million may come in a burst as well. Well, we've taken a factor of safety here and there, so it shouldn't be that much of an issue. Let's try to build a system keeping these assumptions in mind — that the traffic is roughly uniformly distributed across the hour, and so on. It's also not the case that literally everyone will visit in the first hour, so we have some cushion there as well.

Someone says a hash map in memory may suffice. Okay. I think, Anand, we should move forward. Node.js and MongoDB — okay, sounds good. So there are a bunch of ideas. Another suggestion: it's read-only, mostly read operations, a data-intensive workload, so for the backend Node.js is what I would prefer, and to improve performance we can use a cache like Redis to store the results. That's good, thanks for the suggestion.

So let's look at it. Some of you have already made the observations; I also have some. It's read-only, no transactions — that's one observation. Now, we have to ask the right questions.
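(To pin the arithmetic down — a rough back-of-envelope sketch; the 2 million visitors and the one-hour window are the assumptions stated above:)

```python
# Back-of-envelope capacity estimate for the results site.
visitors = 2_000_000        # 1.2M students, rounded up for a safety margin
window_seconds = 60 * 60    # assume everyone checks within the first hour

print(visitors / window_seconds)   # ~555 requests per second
```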
We haven't asked what data size we're dealing with here, and that is also an important thing to consider. We have 2 million students — let's do some quick math to figure out the data size. The fields we have are roll number and name; let's take 10 subjects and maybe add a couple more fields. The data size per student would be about 100 bytes, and for 2 million students that's roughly 200 MB of data in total. We started out thinking this is a very big problem to solve, but it looks like it's not really that big. 200 MB is something that fits comfortably — even the simplest VM you can find has 10x more RAM than that. So 200 MB of data can sit in RAM quite comfortably. Do we really even need a database here? Can't I just load the entire thing into memory and keep it there?

So what I did is a simple Python program — nothing fancy, the simplest thing I could think of. On startup, it reads a CSV file, and there's a function to get a row given a roll number. It's actually a pandas DataFrame, nothing fancy. Then there's a simple web application that takes a roll number from the query string, gets the row, converts it to JSON, and returns it. That's all I've written. Let's see what kind of performance this simple application gives, and then we'll see whether we need to scale up or do some optimizations on top of it. Any guess what kind of performance this may give? You can respond in chat or speak up.

Someone says it will scan all the rows if I haven't used an index or some kind of primary key. No, this is indexed — when you set an index column on a pandas DataFrame, it creates an index internally, so it works like a dictionary lookup, effectively a constant-time operation. Okay, so if it's constant time, it will serve each query very fast — but the question is how many requests it can handle in a second. That depends on the server: the lookup itself is fast, but how many concurrent connections can it support? That is the problem. Let's talk about single-core performance — take a reasonably good CPU; how many requests can a single core handle? A single core processes only a single thread at a time, and for database connections it depends on... We don't have a database here; everything is just in memory. Yes, yes, sorry. So what do you think the number is — 10, 100, 1,000? Python is perceived to be a very slow language, and we're not doing this in Go or Erlang. Anyone want to guess? 10k-ish? 10k-ish is pretty big for this. Thousands? It will still be network-I/O bound, so around 500 to 1k? Yeah, let's look at that. This is a benchmark I ran on a $5 DigitalOcean droplet: it's able to handle about 800 requests per second. That's about 3 million requests per hour — 50% more than the requirement we started with.
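(For reference, a minimal sketch of the kind of program being described — the actual code wasn't shown on the call, so the file name, field names, endpoint path, and the use of Flask here are assumptions:)

```python
# Minimal sketch (assumed details): load the whole results CSV into memory once,
# index it by roll number, and serve lookups as JSON from a tiny Flask app.
import pandas as pd
from flask import Flask, abort, request

app = Flask(__name__)

# ~100 bytes/student x 2 million students ~= 200 MB, so it all fits in RAM.
results = pd.read_csv("results.csv").set_index("roll_number")

@app.route("/api")
def get_result():
    roll_number = request.args.get("roll_number", type=int)
    if roll_number is None:
        abort(400)
    try:
        row = results.loc[roll_number]   # index lookup, effectively constant time
    except KeyError:
        abort(404)
    return row.to_json(), 200, {"Content-Type": "application/json"}

if __name__ == "__main__":
    app.run()
```

In a real deployment this would sit behind gunicorn and nginx, as discussed later in the Q&A.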
For context, if you just look at nginx serving static content, it can serve up to 6k requests per second — that's over 20 million requests per hour. And here we're talking about a single $5 DigitalOcean droplet that can serve 1.5 times what we started with. We're not doing anything fancy; I haven't done any heavy engineering here. It's a script I could write in half a day — that's all it took to come up with a solution, because we started with first principles thinking. See, Redis and all these things are fine, but do we really need Redis here when we can do it even simpler? Do we need any moving parts at all? So this is what I got.

If you look at the cost per day, it's about 12 rupees 55 paise to serve the entire traffic for a full day (a $5-a-month droplet works out to roughly that much per day). When the discussion started on Twitter, I said I could achieve this for 100 rupees, and people were surprised — does it really take only 100 rupees? But eventually we did it for about 8x less than what I'd aimed for: it costs only about 13 rupees to serve the entire traffic for one full day. Had we picked a database and so on, that would have added more complexity and more overhead, because every request would have to go over the network and all that.

So the takeaway is: when you approach a problem, actually understand what it is, and apply first principles thinking, you should generally be able to come up with a solution which is quite a bit simpler. When I started, I didn't think I could come up with such a simple solution. In fact, I was thinking I should maybe use Parquet instead of CSV because reading CSV would be slow — then I realized the whole file is read only once at startup, so it's not something I should worry about. Keep it as simple as possible; that's what worked out. We have a solution which is very simple, and when I look at the problem statement now, it doesn't look that complex anymore, because we've seen a solution that is very, very simple. That's all for this case study, and I'm open to questions. We can take questions for five minutes and then move on to the next case study; if you have more, we can take them at the end.

Anand, thanks for this presentation. I'm just wondering — your approach is fine, you explained everything, but what happened to the CBSE site? Why could they not handle that workload? We may not think of a solution like this in the first place, but they must at least have used a DB. So what was the problem they faced — maybe the number of concurrent requests killed it? Well, I can only guess; I don't really know what happened to them. And this is not just CBSE — it's been happening for every exam result. I think the typical setup is PHP plus MySQL everywhere. Maybe that can't handle this kind of concurrency, or the DB isn't tuned, and the DB becomes the bottleneck, thrashes the entire system, and brings the whole site down. That makes sense.
I don't know how they might have coded it — maybe they were making a DB call every time, I don't know. We can only guess; we don't have inside knowledge of how their system works. And it's not about one CBSE site — this happens every time exam results are announced, whether it's IIT JEE or GATE; I've seen the same thing happening year over year, again and again. Even the same thing happened with Flipkart and Amazon during the Big Billion Days last year, when their sites got thrashed — although that's a slightly more complex problem, because there the data modeling is more involved than in this CBSE example.

Can you go back to the slide where you mentioned the concurrency, how much can be supported? This number, 816.40 — how did you test it? I ran Apache Bench. I started two virtual machines: one was running my application, the other was running the benchmark program, and Apache Bench gave me that number. I just copy-pasted it from the Apache Bench output.

A small point related to this: during the last elections, the Election Commission of India website, which was continuously updating results from all the voting centres, used static pages. I think it was scheduled to refresh every one minute or ten minutes or whatever. All the news websites were using some kind of database-driven setup and some of them were slowing down, but this one held up completely throughout, and it was fast. That would have been my first thought when I saw this.

Yeah. So when you get into creating static pages, you also have to think about the fact that creating 2 million files in one directory is going to create trouble again — you'd have to make it a two-level directory structure or something. And there's also a privacy issue in keeping exam results as static pages: you don't want them to be crawled by bots. Right now this API just takes a roll number, but the original CBSE site actually asks for your roll number plus an admit card ID, some kind of second factor. The API can use the roll number to fetch the row and then check that the second factor matches before returning the result. This approach can still do that, whereas with the static approach — it's a wonderful approach, but not for this particular use case. Makes sense: if it's public data which anybody can access, then maybe it makes sense, but here it's probably not the right approach. You're right.

And this benchmark is for the JSON API, right? Yes, this is the benchmark for the JSON endpoint. So how does serving the page affect that? The plan is to have a static HTML file that calls the API. The static HTML is served by nginx, and that's covered by the roughly 6k requests per second figure, so that takes care of that load. I'm putting gunicorn in front of the Python process, and that sits behind nginx. So requests to the API path go to gunicorn, while the root path is served by nginx itself, which can serve static content close to 8x faster than the Python process.
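(For reference, the deployment being described — nginx serving the static page directly and proxying only the API path to the gunicorn-managed Python process — would look roughly like this; the paths and port are assumptions, not the actual configuration:)

```nginx
server {
    listen 80;

    # Static HTML/JS served directly by nginx (the ~6k req/s figure).
    location / {
        root /var/www/results;
        index index.html;
    }

    # Only the result lookup hits the Python app behind gunicorn (~800 req/s).
    location /api {
        proxy_pass http://127.0.0.1:8000;
    }
}
```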
So the static serving shouldn't be an issue. Makes sense. One more thing: how do you create randomness in the requests you're sending? It takes a candidate ID and returns the JSON — while running Apache Bench, how do you know it's not being served from some cache, a template cache or something? Okay, so I wouldn't call my benchmark very scientific — take it with a pinch of salt. I could have used a slightly more elaborate setup. But think about it: even if randomizing the IDs brings the number down by a factor of two, we're still in pretty good shape, pretty close to what we need. Everything is in memory within the process, so the query itself shouldn't take time — but no, I haven't verified that specifically; my gut feel is that it won't change much. If this were a real system, I would probably use something more sophisticated — hit random IDs and generate the benchmark using realistic data — but here I decided to take the simplest approach. Thank you.

I had one question — quick time check, sorry to interrupt. Sure, we'll take one last question. The question is: did you find the bottleneck in this case? Yes — I basically opened htop and saw where it was going; the CPU was maxed out. Okay, thanks.

So I think that's it for this case study. We have a couple more case studies by Piyush. Piyush is VP of Engineering at Capillary Technologies, and he has two case studies from his experience. I'm going to hand over the stage to Piyush now. I'm stopping my screen share, Piyush — you can take control. Sure, I'm just sharing my screen. Can you see it coming up? Okay. Folks, what I'll do is take a pause at the end of each case study and take questions then, so that we stay on track — so please hold your questions till we finish each case study.

So I have two case studies to present. One focuses on what did not work for me, where I think I went wrong: the initial approach was okay from a first principles perspective — analyzing the problem and breaking it down — but I ended up over-engineering it altogether and the end result was a lot messier than it should have been. The second is more of a success story, where again, applying first principles, a simple thought process, looking at the actual problem, understanding what we actually need to solve, choosing the simplest solution — and the results that it yielded.

So let's jump into the first one: what did not work for me. This happened way back, sometime in August 2012, so it's been almost eight years. Keep that in perspective — I'm talking about the technology landscape as it was eight years ago, and all the decisions taken then should be seen in that frame.
Capillary provides cloud-based customer engagement solutions to large retail chains across almost 35 countries. One of the modules we have is called the in-store application. It is essentially a desktop application which gets installed on the POS terminals where individual cashiers and store associates generate invoices, where you make payments, and where all the engagement happens after you have made a transaction. A large percentage of these systems are Windows, so we have a .NET-based desktop application which gets installed on these individual POS terminals, talks to the actual POS software, and syncs all the data back to the cloud platform in real time using APIs. We had about 10,000 installations of this application across almost 15 countries at that point in time.

One very common ask was that, given such a large deployment of client applications, some application or the other would misbehave — the deployed version misbehaving, or some functionality breaking. So our application support team had a very common requirement of arranging debugging sessions with the particular store reporting a problem. At that time, one of the means was TeamViewer-based access: get the cashier in the store to install TeamViewer on their machine, and then the support team logs into that box and sees what's happening. We also had an implementation where the application would sync its logs roughly every two to three hours. We didn't do it at a higher frequency because we also have to be conscious about the amount of network we consume in the store — most of these stores run on very low-bandwidth connections.

So we thought about how to solve these problems. Essentially there are three use cases: we should be able to send a command to the application to sync the log files immediately, as soon as the command is received; or to refresh the configuration files by making an API call to the cloud platform so it picks up the latest configuration; or to sync telemetry data — telemetry meaning things like the size of the SQLite database currently in the application, when the last sync happened, and what the latest version of the data the application has. I hope you get the gist.

So let's look at the first principles here. One: it's clear we need a custom command protocol between the desktop application and the server, so that from the server we can push a command, and the application understands it, takes some action locally, and then syncs back to the cloud or does whatever it needs to do. So how do we send these commands between the desktop and the server? The first thought that comes to mind is: why can't the desktop application just poll? Again, very rough estimates.
Let's assume my application polls the server every five seconds. Assuming that with some algorithm or some jugglery we're able to spread the requests evenly across those five seconds, with 10,000 installations I'm still looking at 10,000 requests every five seconds — about 2,000 requests per second — coming onto my server. Do I actually need to build a server for that kind of request rate, where the request processing is essentially: given this desktop client ID, check its command queue, see if there's any command queued for it, and send it back? That seems like not quite the optimal way to go about it.

How about long-polling connections? Long polling could also work, and the number of connections would be reduced. However, long-polling connections are essentially one-directional traffic: you send a request from the client and the connection stays open until the server has something to send back, and if the connection gets dropped the client reconnects. But it's not a bi-directional channel where the server can push data to the client at any time. So it only partially solves my problem; it's not a 100% right fit.

WebSockets make a lot of sense: have long-lived, persistent sockets open between client and server and let the server send commands onto the socket as and when you want a command delivered to the desktop application. This seems very straightforward, right? But back then — not as mature, fewer years of experience in the industry, hadn't broken enough software in life — I wanted to build something with very fancy tech.

So what approach did we take? One thing we realized is that XMPP, the protocol on which a lot of chat software used to run — Hangouts and Google Chat also ran on XMPP before other protocols came in — was a very fascinating thing at that time, and one property stood out: it gives you persistent, long-lived connections out of the box. So at least one checkbox is ticked: I need long-lived connections. XMPP has another capability, the concept of extension protocols, where people define their own extensions on top of the underlying XMPP specification. There's already an extension, XEP-0050, which allows you to send ad-hoc commands over the connection, and your client library can delegate them back to your application layer. And what's more, we could embed our own chat client too — so cool, right? If we embed an XMPP client and run our own server, we could build another feature where store associates can chat among themselves, and we might solve more use cases for our customers in the future. The mistake here is that we were inventing a problem that did not exist. All we needed to do was solve the problem of sending commands from the server; we ended up solving a different problem altogether.
Now, coming to the choice of server on the XMPP side: ejabberd was one option, but we didn't know Erlang, so we chucked it. Next came Openfire. Openfire was a fairly popular XMPP server — even now it is maintained under Ignite Realtime, so you see a lot more improvement in it, but at that time it was a fairly open-source, community-driven project. It allows you to write your own custom plugins, so you can write a plugin to implement your own custom command protocol. All the checkboxes were getting ticked for what we needed — except for that problem which did not exist for us at all. And .NET XMPP libraries were easily available to embed in the client application. So what's the big deal? We should be able to solve the problem with an XMPP-based approach.

What did go wrong? Openfire documentation was extremely, extremely sparse at that time; we overlooked that. New technology: nobody on the team knew it, everybody was learning it on the fly. Very basic plugins — all we had was the source code of three or four plugins on GitHub; you had to reverse-engineer the code by digging into Openfire to understand how plugin lifecycle management happens, how connection management happens, how message management happens. Scaling challenges, again because of the lack of documentation: the concurrency constructs inside Openfire — how it handles multiple requests, connections, and messages — were totally confusing because the documentation didn't cover them. So we ended up writing a plugin, but the plugin was not aligned with the concurrency model of Openfire. Many times we would hang the Openfire server altogether: it would stop accepting new connections, it would stop broadcasting the messages we were sending from the server side. Poor documentation literally killed us. Unknown unknowns galore — there are known unknowns and there are unknown unknowns, and in this situation there were so many unknown unknowns that we were pretty much battling them on a day-in, day-out basis.

It took us almost three months to stabilize it in production. I almost lost the team towards the end, and our motivation was completely gone. Finally we got it working after about three months. It used to work beautifully — until it stopped. And when it stopped, it was again scratching your head, trying to reboot the XMPP server, taking a dump of the JVM, trying to figure out which variable is stuck and where the threads are blocked. Honestly, it was a complete mess. If you go back to our original problem, all we needed was a simple Node application on WebSockets. It could have easily solved this — less than 200 to 300 lines of JavaScript code, live in less than three weeks, not more than that. A low-maintenance stack, fewer unknowns, and we wouldn't have had to solve problems that did not exist at the time.
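(The eventual fix, as mentioned in the Q&A below, was a small Node-based service. Purely to illustrate the shape of that simpler design, here is a rough sketch of a command-push channel — written in Python with the `websockets` library for consistency with the earlier case study; the client-ID handshake, message format, and port are assumptions, not the actual implementation:)

```python
# Illustration only: a tiny "command channel" server. Each desktop client keeps a
# persistent WebSocket open, identifies itself once, and the server can push a
# command such as {"cmd": "sync_logs"} to a specific client whenever support asks.
import asyncio
import json
import websockets

connected = {}  # client_id -> websocket connection

async def handle_client(websocket):
    # First message from the desktop app identifies the store/terminal (assumed format).
    hello = json.loads(await websocket.recv())
    client_id = hello["client_id"]
    connected[client_id] = websocket
    try:
        async for message in websocket:
            # Acknowledgements / telemetry coming back from the desktop app.
            print(client_id, message)
    finally:
        connected.pop(client_id, None)

async def send_command(client_id, command):
    # Would be triggered by the support tooling, e.g. via an internal API.
    ws = connected.get(client_id)
    if ws is not None:
        await ws.send(json.dumps(command))

async def main():
    async with websockets.serve(handle_client, "0.0.0.0", 8765):
        await asyncio.Future()  # run forever

if __name__ == "__main__":
    asyncio.run(main())
```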
So again, this is one of the horror stories I have lived through, and I was one of the main actors, one of the main decision makers, behind going with the XMPP-based solution at the time. That's all on this particular case study. Any questions, any comments? You can type your questions in the chat window or speak — either works.

On the point you mentioned at the end, that you solved a problem that didn't exist — I don't completely agree with you on this, and maybe I'm wrong. My thought is that when you're designing any solution, you have to think about what the next problem, the future problem, could be. No, I agree. But the point is that the immediate problem statement was to get this command-based protocol in place. The chat application was something we thought of as an extra value-add on the decision I was making — something that would solve a problem that might come in the future. That problem didn't exist for me right then, and even in the next four to six months I was not foreseeing it coming. So essentially, just by looking at a benefit of a technology and architecture — a benefit I'm not going to realize in my application — I end up choosing that technology. I don't think that's a wise decision, honestly. Unless, as I said, it's about known unknowns versus unknown unknowns: in this particular case it was a new technology, we didn't know what unknowns we were going to run into, and in that case it was a slightly unwise decision, in my opinion. Yeah, correct, thank you.

There are a couple of questions in the chat window, so I'll read them out. Do logs have to be synced all at once, or can it be done incrementally? Yes, the logs were synced incrementally, but the point here is that when a support person is trying to troubleshoot an issue in one particular store, the customers are on a tight SLA; they're looking for a quick resolution. If I say I'll send the command and wait an hour for the logs to sync, and call you back after one hour to debug — that's the kind of support customers won't appreciate. Look at it from your own perspective: when you raise a ticket with any provider you use, you look for quick feedback. You wouldn't be happy if support said they'll wait for the logs to sync up and then check and get back to you. So we wanted it such that when a command is issued, within a matter of a few seconds — or at most one to two minutes — the support team has access to the logs and to what's happening on that remote machine.

The other question: is today's solution closer to what you've described here — did you eventually switch to something simpler? Yeah, we did. We finally moved to a simple Node-based application. It's still running, and it's running flawlessly — nobody is even interested in what's happening there, it just runs. The next question: how do you prevent running into this problem again? Good question. As I said, approach it from first principles. First of all, lay down the actual three or four problems you are trying to solve.
One thing I've realized over time is that the best thing is to write down the problem and break it into the dimensions you want to attack. Don't just go by an abstract definition. I think we'll see that in the next case study — one of the success stories there is how we broke the problem down into its simple dimensions. You look at that and check whether a simpler solution ticks all the dimensions and attributes of the problem; if it does, there's a high probability it will work for you. Unless — again, as Anand said at the start of the talk — just because Google is using a fancy technology like Kubernetes, I may actually not have a use case for Kubernetes. Just because it allows you to scale to larger systems: do I actually have the requirement of going to that scale? If the answer is no, it may not be the right choice for me. That's how I would approach any problem these days. I think there are more questions, but we can take them after the other case study is over. Sure, sure.

Let's move on to what worked. This is a case study from one of my earlier companies — I was heading the engineering team at TravelTriangle, and the time period was somewhere around December 2016. TravelTriangle is essentially a marketplace for the travel industry: we connect travelers with travel agents who are local to a particular destination, and along with that we have an in-house operations team which helps facilitate all the interaction and engagement in the marketplace.

Given the product portfolio, we're talking about three different products here. One is the consumer-facing B2C application, where customers come, browse the content, look at destinations and the kinds of packages and offers they can get, place a lead, and move on. The next is the seller-facing product, essentially a lead management system, where at that point in time, around December 2016, we had roughly 2,500 daily active users. The consumer-facing application is essentially read-heavy — a system where you can serve as much data as you want out of your cache. The seller-facing system is both read- and write-heavy, because it's a lead management system: users are acting on leads and updating their status again and again. It's also very heavy on listing and ranking use cases, because you have to show the seller which lead to work on — it's not just taking the leads and showing them in ascending order of creation time. The number of DAUs might look small here, but we're looking at a product where a user is actively using it for eight to ten hours a day, so the page views those 2,500 users translate into go up to three to four million requests a day. Similarly, the operations product we had internally is a system which is read- and write-heavy.
It is also heavy on listing and ranking, as well as on suggestion use cases, because the system has to be smart enough to suggest to the operations person which particular lead to work on, so that the efficiency of their day is high and the chances of converting the leads are high — conversion of a lead leads to revenue, and that's what we're optimizing for. Again, long-running sessions: the operations team works eight to nine hours a day, and you have to scale it up from 20 to 50 DAUs to about 1,000 DAUs. All in all, adding up the page views at that point in time, it's roughly three to four million across all products.

Then the marketing and business teams came back saying they wanted to invest heavily in promotions, do a lot of search engine marketing, and have the system scale about five to six times. So roughly six million page views just on the B2C application, and consequently around five to six million on the seller-facing as well as the operations product. You have to scale the system about five times.

This was the tech stack: the application code is essentially a monolithic Ruby on Rails app, although we had a separate service deployment for each product, each behind its own load balancer. A very straightforward web application — MySQL, Elasticsearch, Redis, message queues — hosted on Amazon Web Services with roughly 30 EC2 instances, about 16 of them on the application services layer, plus databases on RDS, ElastiCache, load balancers, and image and CDN content served out of Cloudinary and Akamai.

Now look at the following statement: we were fairly certain that our database would not be a bottleneck, because the RDS metrics and all the data showed we still had a lot of headroom before the 5x traffic would become a problem. So, assuming your data sources can handle the extra traffic, you scale up the application services in front to meet the 5x growth.

Now let's throw in the constraints we were living with. We had limited dollars at our disposal to spend on infrastructure — in a smaller company, a smaller setup, you try to optimize every dollar you have. And I had only one senior engineer who understood the AWS infrastructure, the AWS APIs, and how the scaling pieces work — only one engineer available to me at that point in time. That's pretty much the story of most startups in their early stages.

So let's open it up — what are your thoughts from the audience? What do you think we can do here? Quickly on the chat; we have a couple of minutes. Anand, can you hear me? Okay, I can't hear you. Okay — use ASGs, auto scaling groups — good. What else? Okay, let's move on to what we could probably do. Can we use containers? December 2016 was a time when Docker was very popular and LXC was slightly on the decline. Maybe containers can help.
If I go for containers, what can my cluster managers be — Docker Swarm, or Kubernetes with kops? The current managed EKS distribution was not publicly available at that time, in December 2016, so you had those two cluster managers. Mesos was also there, but Mesos was on the decline in the community at that point. Auto scaling groups are a good idea, easy enough to try — but again, we hadn't worked with auto scaling groups in the past either. Among the three options on the screen, auto scaling groups would have been the easiest, but all three were new technologies for us. Learning curve: yes, engineers love to tinker with cool tech, as I said in the previous case study, and most engineers can appreciate that. What we somehow end up overlooking is the maintenance cost — again, coming back to known unknowns versus unknown unknowns: how do you gauge the cost of those unknowns?

Let's look at the first principles. One thing is very clear: we need more servers to increase capacity. But let's take a look at the elasticity factor we need while deciding how to increase that capacity. This is a very rough hand-drawn sketch, just for illustration — the y-axis is not the actual numbers — just to give you a glimpse of what the traffic pattern used to be. Over a period of 24 hours on the B2C, traveler-facing product, this is the pattern: largely predictable, without too many bursts. The traffic grows gradually and organically, stays there for a while, gradually dips, and then goes up again. Some of the insights we had: most travel planning happens after 8 p.m. at night — this is data that told the same story for almost two and a half years while I worked on these products — and most content, blog, and research consumption happens between 8 a.m. and 11 or 12 in the morning, at least for India and the Southeast Asian belt. Most likely people are reading this content while going to the office, in the metro or the bus, something of that sort. It's fairly predictable in terms of the scale we're going to see.

Similarly, on the B2B product: most of the users are travel agents, and their working hours are fixed. Even if you span it from the Middle East — agents sitting in the UAE and the Egyptian belt — to the Southeast Asian belt of Malaysia, Singapore, and Thailand, you'll see traffic organically increase after 8, peak at about 10, dip during lunchtime, peak again, and then die down. Even during the night we still used to see some traffic coming in, because a lot of agents would leave their dashboards open and the Ajax requests from those dashboards keep polling the server in the backend.
Logic says the request count should be zero at night once the customers have gone home, but in practice they would just leave the dashboards open, and all the Ajax polling keeps hitting the backend servers. Similarly, on the operations product, this is the sort of pattern we used to see: a massive spike at around 9 a.m. when the operations team starts work, a dip around lunchtime — also the customers go into a DND mode, saying call me back after two hours, I'm in a meeting — and so on. Fairly predictable.

So, the elasticity factor, given the traffic pattern: do I need to plan for elasticity at a per-second granularity? If so, maybe I should consider AWS Lambda or Cloud Functions — which is clearly not the case given the pattern we have. Do I need a granularity of minutes? If so, maybe I can evaluate containers — but again, given the pattern, that seems like overkill. Probably an hour, or buckets of 30 minutes, is enough, and in that case I can look at plain virtual machines. Cost savings: do I need to save costs for every extra second or minute? Or is it okay to save cost at the granularity of every extra half hour to an hour where my capacity is running ahead of the traffic?

So we cooked up a very, very simple solution — we used to call it poor man's scaling. Nothing fancy: just a Jenkins cron job running every 15 minutes. Jenkins we were already using heavily as a build and job scheduling system. In AWS SimpleDB we created simple JSON documents for every service, maintaining a mapping of which hours need what server count under each load balancer. The reason we went with SimpleDB was that it gave a very simple GUI to manipulate the JSON documents, and that GUI could be used by junior engineers in the team as well — I didn't want to restrict this to only the senior folks. We wrote a very simple Ruby script which the Jenkins cron executes: it checks how many servers are currently running by calling the EC2 listing APIs, looks at the state I want to be in — say, between 8 a.m. and 12 p.m. I want eight servers for one particular application, and currently three are running — and it goes and spawns another five. Capistrano was already being used to deploy the latest code build; register the new servers under the load balancer and you're up and running. Similarly, if the servers currently running in production are more than what you need in that window, it starts scaling down the servers one by one. Very simple: bringing up, deploying, and shutting down servers.

How much did it cost me? Three days to move to pre-production and two days to move to production, with one engineer. I don't think I could have done it any cheaper than that.
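(The actual implementation was a Ruby script driven by a Jenkins cron, with the hour-to-server-count mapping stored in SimpleDB. Purely as an illustration of that reconcile logic, here is a rough sketch in Python with boto3 — the schedule values, tag names, AMI, and instance type are all assumptions:)

```python
# Illustration of the "poor man's scheduled scaling" logic (the real version was Ruby + Jenkins).
# Every 15 minutes: look up how many servers this service should have right now,
# compare with how many are running, and reconcile.
import datetime
import boto3

ec2 = boto3.client("ec2")

# In the real setup this mapping lived in AWS SimpleDB as a small JSON document per service.
SCHEDULE = {
    "web": [
        {"from": 8,  "to": 12, "count": 8},   # hour ranges -> desired server count (assumed)
        {"from": 12, "to": 20, "count": 5},
        {"from": 20, "to": 8,  "count": 3},   # overnight window
    ]
}

def desired_count(service, hour):
    for w in SCHEDULE[service]:
        if w["from"] <= hour < w["to"]:
            return w["count"]
        if w["from"] > w["to"] and (hour >= w["from"] or hour < w["to"]):  # wraps past midnight
            return w["count"]
    return 1

def running_instances(service):
    # Servers are tagged with the service name; list the ones currently running.
    resp = ec2.describe_instances(Filters=[
        {"Name": "tag:service", "Values": [service]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ])
    return [i["InstanceId"] for r in resp["Reservations"] for i in r["Instances"]]

def reconcile(service):
    want = desired_count(service, datetime.datetime.now().hour)
    have = running_instances(service)
    if len(have) < want:
        # Spawn the missing servers; the deploy step (Capistrano in the original) would follow.
        ec2.run_instances(
            ImageId="ami-xxxxxxxx", InstanceType="t2.medium",   # placeholders
            MinCount=want - len(have), MaxCount=want - len(have),
            TagSpecifications=[{"ResourceType": "instance",
                                "Tags": [{"Key": "service", "Value": service}]}],
        )
    elif len(have) > want:
        # Remove the extras.
        ec2.terminate_instances(InstanceIds=have[: len(have) - want])

if __name__ == "__main__":
    reconcile("web")
```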
Auto scaling groups could have been a good option, but given that it was a new technology for us, the time spent figuring out the configuration would definitely have been higher than what I managed with this. What was the ROI of this poor man's scaling? We scaled not just to 5x — we actually scaled to almost 7x of what we were seeing in December 2016. The solution worked perfectly fine for 18 months from that point, before we actually explored containers. Rough calculations showed we were saving almost 40% cost over on-demand instances. If we had moved to containers, probably an additional 10 to 15% of cost savings would have come in — but at what cost would that 10 to 15% saving have come? That's again a big black box, because a lot of unknown unknowns are there. We templated the solution, our QA environments also moved to this model, regression and sanity suites started invoking the scaling job as a prerequisite, and each QA group could spawn its own environment using this simple setup. That's all I had from my side. Open for questions — you can type in the chat window or unmute and speak.

Thanks for the presentation. One question: you said that you explored containers after one and a half years. Did you move to containers, and what was the motive, and how did the container solution work out? The container move was because we were launching more and more products and the engineering team's size had started growing. That was when we wanted to move to a model where you ship what you build — that's one advantage of containers. That was the problem for which we explored containers, not scaling. The existing model — which, by the way, is not really auto scaling, it's scheduled scaling — could have survived for some more time. But we moved to containers because we wanted a ship-what-you-build kind of model, since the engineering team was growing quite a bit.

In your presentation you mentioned that you used a Jenkins job, and every hour you adjust the capacity as per the requirement? Yeah, every 15 minutes. Every 15 minutes this job does nothing but an EC2 listing of the servers running — the servers are tagged using labels — checks how many servers are currently live and how many are supposed to be live in this time window, and then either brings up more servers or shuts down the extra ones. Very simple logic, nothing fancy. But with a 15-minute gap, did you ever feel the load went very high within those 15 minutes? No — and that's again the first principle: from the pattern I showed, I know that on the B2C product a spike is coming somewhere between seven and eight, so I can spawn an extra server at 6:45 a.m. Similarly, I know my traffic is going to dip somewhere between two and three p.m.
So I can remove one server at around 3:15 or 3:30 p.m. Given that the predictability of the traffic pattern has been established, these calls are fairly simple to take. And obviously it's not like once the call has been taken you have to live with it — that's the reason we stored these configurations in SimpleDB. If you see that the current scaling strategy is not working, you simply go and update the value in SimpleDB, trigger the Jenkins job again, and you're up and running again. So in that case you are using some kind of cron job that runs? Yeah, Jenkins has a cron module, so you can configure your jobs on a cron schedule. Okay, okay. Yeah, go ahead.

One question: what was the operational overhead? Was it that you set up the system and it ran beautifully, or was there a period of figuring things out and fine-tuning? There is hardly any overhead for us, because the Ruby SDK for AWS we were already using in the team anyway, Jenkins we were already using, and the only new thing we introduced was AWS SimpleDB — and even there we had hardly 15 to 20 documents, each not more than 250-300 bytes at any point in time. All tried and tested technologies for us.

For the load balancer, did you use a single load balancer or a group of load balancers? We had a top-level load balancer which the top-level domain pointed to, and it forwarded requests to another set of three ELBs. Each ELB served the requests for one of the three products we had — as I mentioned, a separate service deployment for each product. Every deployment is essentially a collection of Passenger servers, and those Passenger servers sit behind a load balancer dedicated to that particular service. Okay, thank you. That looks really cool.

And since we've started discussing it, I'll just call this out for the audience: currently at Capillary, all of our infra is completely on Docker and Kubernetes — imagine a cluster of 1,000 servers running on Docker and Kubernetes — and every month we run into some esoteric Kubernetes issue. Just 10 minutes before we started this call, we had an outage where we ran into a bug in EKS which crept in through the AMI they launched this Monday. We had an outage on three of our microservices because the Docker containers got stuck in a terminating state. Completely black box, no clue what's happening; you reach out to AWS support and they tell you, okay, upgrade your AMI and move on. So you may be on the latest cutting-edge technology, but the amount of unknowns and uncharted territory it brings in kind of scares me at night sometimes.

Yeah. I generally like to think about this with an analogy — I don't remember who told me about it — you can think of software as either a speedboat or a ship. A speedboat carries a small amount of load but can be handled by a single person and moves very fast.
Yeah. I generally like to think about this with an analogy; I don't remember who first told me about it. You can think of software as either a speedboat or a ship. A speedboat takes a small load, can be handled by a single person, and moves very fast. A ship, on the other hand, really takes a crew to manage. It takes a long time to get started and to stop, but it can carry very large loads. Now, as an organization you may need a ship or you may need a speedboat, but when you look at the scaling solutions built by the big organizations, they are usually building ships, because they have that much load to manage. If a small organization tries to build a ship, it's going to be a disaster: you first need a trained crew, and you may have just two or three engineers working with you. You can't really operate the ship, and it takes a long time to even move an inch, when what you really need at that point is a speedboat. So asking whether you are building a speedboat or a ship is an important way to look at it. I find that analogy really appealing.

That's a really good one, Anand. You need to figure out whether you need a speedboat or a ship, and make sure you don't end up choosing a ship where a speedboat will fit.

Any more questions? There were also some questions on the previous case study that we didn't get to, so let me pick them up from there. One question was: how can someone de-risk the product from decisions similar to this? In other words, if you are a team leader, how do you de-risk your team from making such decisions?

I have personally learned this by burning my hands, Anand, and by hearing enough stories where people say don't over-engineer, the simpler solution is the best solution. At least now I am able to curb my enthusiasm to pick something fancy or to solve problems that don't exist. This is what I try to instill in my team as well. Whenever they come to me with a design, I start asking very simple questions. In fact, just this afternoon I was having a conversation with one of my team members who was saying that the existing message queue is already overloaded, so did we really want to add more messages onto it? We did a rough calculation and figured that all we are adding is not more than 150 MB of extra traffic onto that pipe. And he said, okay, we can use the existing infra; we don't need another dedicated Kafka stream for this. These are simple, simple questions, and if you ask them, people realize they don't need to think of fancy things.

Another question: did you consider SNMP at the time?

No, we did not consider SNMP. That could have been a good option.

And what pointers do you follow now, apart from the team's familiarity with a known tech versus an unknown tech?

As I said, the pointers are very simple. When you're looking at any problem, do the kind of calculations Anand did: how many bytes do you require? That's a very fundamental thing. Essentially, building any software is nothing but an interplay of network, compute and memory. You have to figure out whether the problem you're trying to solve is compute heavy, and if so, how many compute cycles you're looking at. Is it IO heavy? Then how much IO are you looking at? If it's memory heavy, how much RAM do you require? Just look at these basic fundamentals and you can work out which direction to choose. Ask very simple, fundamental questions about CPU, memory and IO, and 80 or 90% of the analysis gets done just by looking at them.
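As a toy illustration of that kind of back-of-envelope arithmetic, the message-queue question from a moment ago boils down to a couple of multiplications. The actual message counts and sizes were not given in the talk, so the numbers below are hypothetical; only the shape of the reasoning matters.

```python
# Toy back-of-envelope check in the spirit of the message-queue anecdote above.
# All numbers are hypothetical; only the shape of the reasoning matters.

extra_messages_per_day = 500_000   # assumed volume of the new messages
avg_message_bytes = 300            # assumed average payload size

extra_mb_per_day = extra_messages_per_day * avg_message_bytes / 1e6
print(f"Extra traffic: ~{extra_mb_per_day:.0f} MB/day")  # ~150 MB/day

# Spread over a day, that is a trivial throughput for an existing queue:
extra_kbit_per_s = extra_messages_per_day * avg_message_bytes * 8 / 86_400 / 1e3
print(f"Average extra throughput: ~{extra_kbit_per_s:.1f} kbit/s")  # ~14 kbit/s
```

The same style of question applies to compute (how many cycles per request?) and memory (how big is the working set?); as Piyush says, that usually settles 80 to 90% of the decision.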
Yeah. One other thing I want to bring up here: a lot of times when we build systems, we get lost in the many layers. We can't see through all the layers down to what's really happening; we only look at the top layer and make decisions there. What's important is to develop a sense of seeing through the layers and understanding what's actually happening on the ground. I really like an analogy from a talk by Paul Buchheit, the creator of Gmail, at one of the YC sessions. He talks about how, when designing Gmail, he starts from the speed at which the hard disk is actually spinning, so he can see how much load it can handle and relate everything back to that. That may be an extreme for many of us, but that's the kind of sense you need to develop. When you see a number of requests, look at how much data is going over the wire; ask whether it's CPU bound or IO bound; ask all those kinds of questions and see what's actually happening on the ground, rather than just looking at the top layer. That's one important thing to keep in mind.

One question from Akka: the consistent theme across the case studies has been the team's familiarity with the tech. What would be the ideal path for a team to start using new tech?

Akka, the way I see it is this: I'm a techie at heart and I love to explore new tech, but you have to evaluate both the pros and the cons of the new technology, and also ask for what problem, and in what context of that problem, the technology is supposed to be used. Coming back to the Kubernetes example I was giving: we adopted Kubernetes about two years ago, sometime in 2018, when our infrastructure was going above roughly 400 to 500 servers across regions. At this point we're at about 1,000 servers. We had also reached a point where our infrastructure expenditure was extremely high; we could not afford to run on virtual machines anymore. So we realized that was the ideal time to move to containers. It's not that we didn't know how to use containers; we had been dabbling with them and experimenting way back, starting in 2015-16. But is your application, or your problem domain, at that level? Would you actually be able to realize 100% of the benefit of the technology you're picking?
Or would you only get 10% of the benefit by using it? It's just like the classical analogy: do you need a big hammer for every tiny nail you have to drive? You have to look at it from that perspective.

Yeah, I think that's a very nice example. Anything that new tech does for you, it also does to you; that's something I heard somewhere, and it's absolutely true.

Absolutely.

The other thing is, given a choice, pick a boring technology.

Absolutely, absolutely.

So I think we're slightly over time, and I don't see any more questions, but one thing I want to ask is for quick feedback: how did you find the session, and what would be good to have going forward? Any suggestions about the structure of the session, or what kinds of things you would like more of? Any inputs would be great, and we can close after that.

Okay, I have two inputs. First, I think we should stick to one hour. Second, instead of three talks we could have two, so that we have enough time for questions and answers without overshooting the time.

Sure, that's a good point. Thanks. Anything else, Usha?

No, the topic is good. You had good points and good insights; keep doing this.

Thanks, Anand and Piyush, it was very good. A few things I can think of: maybe some war stories that cut across teams. For example, purely from the developer perspective, what operational challenges did you face? Things work well, then you deploy, and the debugging turns out to be harder, that kind of thing. And maybe not just AWS; stories from somewhere running a private cloud would also be nice. Thanks.

Yeah, thanks. Anyone else? I don't see anything more, so thanks, everyone. The Scaling from First Principles page on Hasgeek has a place where you can add comments, so if you have any ideas or want to discuss anything, you can add it there. Both Piyush and I will share our slides on the page, so you should be able to find them there. We're planning to have a session once a month, so the next one will probably be on the third Thursday of next month, December 17th; we'll update you by email once it's confirmed. Thanks, everyone, for joining. It was a pleasure to have this session, and I hope we'll have more enjoyable sessions in the future.

Thanks. Thanks so much. Thanks, everyone.