I like your background, Swannan, it really says "hi, I'm Swannan," so it's good for video calls. Yeah, thanks to the design team here. There was a joke, right? You print a pointer on a t-shirt and people don't even realize whose pointer it is. It looks like we are live. I think we should give a few more minutes for people to trickle in, and we'll see how it goes on YouTube as well. Mostly familiar names are coming in. So let me know, Swannan, when we should start. Yeah, I think maybe let's just get started. We are around 15 to 17 people. Alright, cool.

Hey everyone, welcome to Papers We Love for August. Again, on popular demand, we're going back to the classics; we have a penchant for going back to the classic papers. Personally, I'm a big fan of a lot of the older work. One pattern that has stayed with me is that the early-90s work has aged really well. If you look at Grahamerton or the Quicksand paper or some of the other older papers, those have aged really well; some of these concepts are still quite relevant and actively used every day by small and big companies alike. Quicksand is one such paper. It's actually quite deep and opens up a lot of interesting discussion, and that's what I like about it. One way to think about it is that a lot of things are open to interpretation; another way is that it's quite dense and tries to say a lot in fewer words. And that's something I like about it.

And again, Sipath has a lot of interesting things to share; from time to time he has shared some of these things, so I figured I'd finally get him to talk about this to everyone. Sipath is an engineering manager at Deserve, where he runs the backend and platform teams. If you've been around on Twitter or other places, you'll know him as somebody who shares small nuggets of wisdom from here and there.
He has a habit of sharing one little thing and then talking about it. So I'm going to let him take it over. Since we are only a few people, I was thinking we should do a round of intros, if everyone feels like it. Also, this whole session is meant to be interactive, and I would appreciate it if you can turn your videos on, stay unmuted, and talk. The point of a PWL meetup is that all of us come together, speak, have discussions, and share our experiences; this is not a lecture. So if anyone wants to quickly do a round of intros, that would be great, if you're up for it; I don't want to force anybody. Maybe we can start with Piyush.

Hey, hi everyone. My name is Piyush. I'm the head of engineering at Capri Technology. I've been working with Sipath to get the Papers We Love meetup active again. Glad to be here and looking forward to Sipath's take on this classic.

Awesome. Yogi, would you also like to introduce yourself? Yogi is here.

Hey, hi guys. I'm Yogi. I'm at Sahaj right now, after a year-long break where I was mostly just relaxing with the kids, I guess. I spent a while at ThoughtWorks, where I was with Sipath, then at Flipkart, and then a couple of other companies after that. Happy to be here, finally, to see some interesting discussions. Thanks.

Hey Aditya, introduce yourself.

Hi everyone. I'm Aditya Gorbule. I'm in Pune, and I've been doing stuff around software and the software world for 16, 17 odd years, but in the last couple of years I've started working on data science and machine learning kinds of projects. So yeah, excited to be here. Thanks.

Hi, this is Varghese Pholm. I'm a software engineer at JP Mobile. I got to know about this meetup from one of the other meetups that I attended. Reading papers has been pretty interesting for me; I've found pretty interesting solutions from reading papers.
So I'm glad to know there's a group like this. Welcome.

Hey, hi everyone. My name is Kashyap. I'm still at work late today, so I'm sort of listening in. I've been reading papers like this for a long time, so glad to be here and looking forward to the talk and our discussion.

Okay, all right, cool. The other members are feeling a bit shy; no worries. Sipath, it's your show, please take it away.

Okay, thanks. Can you all see my full screen, or do I need to minimize the video? Yeah, I can see it quite clearly. Cool. Awesome.

So let me start with the title itself: Building on Quicksand, by Pat Helland. What is the meaning of quicksand? If you step on quicksand, it doesn't actually sink you immediately. But if you move, it sinks you. If you try to do weird things on quicksand, it takes you down, and it's very difficult to get out of it. So the paper is named quite aptly, I would say. The paper doesn't state the obvious, that two is better than one; it starts with the unstated assumption that two is always better than one, and asks what trade-offs we need to make to actually make two better than one. For example, if you look at reliability or availability, it's actually a steep curve. Going from 99% to five nines means dropping from roughly 15 minutes of downtime per day to under a second per day. That's the kind of drop we have to achieve when we say our systems are 99.999% available. It's a steep curve, and to achieve that, the cost also goes up exponentially. As you push toward 99.999%, the cost starts creeping up, and beyond a point it simply becomes uneconomical to make any single component more reliable beyond what has already been done.
And that's where some math comes to the rescue. If I compute the availability of two components together, their combined unavailability is the product of their individual unavailabilities, assuming independent failures. So if I take two unreliable components and couple them together in the right way, I can increase the availability characteristics of the pair. That's the idea behind clustering; that's how two can be better than one.

Just one note: if at any point anybody wants to pitch in, please do, because this is meant to be interactive.

So that is the basic thought behind all of this. It's the unstated assumption, as I said, and that's where the paper starts. I'll go through the sections of the paper I like, give my comments on them, and if you want to discuss anything, please stop me at any point. I have the paper open side by side as well, so I can project it and we can highlight some of these things and talk about them.

The abstract of the paper itself is quite interesting. In the first sentence he says that fault-tolerant algorithms comprise a set of idempotent sub-algorithms, and that between these sub-algorithms, state is sent across failure boundaries, between unreliable components, so that together we can achieve reliability. I was actually wondering why it has to be like this. Why sub-algorithms? Why is Pat Helland explicitly saying sub-algorithms? That's a tricky question, and that is where it becomes interesting. He does answer those questions, of course.
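The redundancy math mentioned above can be written out as a quick calculation. This is a sketch that assumes the two components fail independently, which real systems rarely guarantee; the function names are illustrative.

```python
# Availability of a redundant pair, assuming independent failures:
# the pair is unavailable only when BOTH components are down at once.

def downtime_per_day_s(availability: float) -> float:
    """Seconds of downtime per day at a given availability level."""
    return (1.0 - availability) * 86_400

def paired_availability(a1: float, a2: float) -> float:
    """Combined unavailability is the product of the individual ones."""
    return 1.0 - (1.0 - a1) * (1.0 - a2)

single = 0.99                                  # one 99% component
pair = paired_availability(single, single)     # two of them: 99.99%

assert round(downtime_per_day_s(single)) == 864   # ~14.4 minutes down/day
assert round(pair, 6) == 0.9999
assert downtime_per_day_s(pair) < 10              # under 10 seconds/day
```

The jump from 99% to 99.99% just by pairing is the "two is better than one" intuition; the paper's question is what you trade away to actually get it.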
The second point: the primary system will acknowledge a work request and act on it without first sending a copy to the backup system, and it will continue to work. But is it always like that? Because there is a limit beyond which you cannot accept lag: you may want to stop if your backup has not received certain updates. That aspect is never touched on in this paper, but what I realized on this critical reading is that your primary system's availability is then very tightly coupled with the secondary, or backup, system. If the backup is down and you're not able to send data or operations to it for recovery, you can operate for a while, but you cannot operate endlessly; you have to either shut down or bring the backup system back up. There are a couple of lines in there that are a very interesting way to think about it.

Third: what happens when you actually start making trade-offs about timing, that is, the availability of information at one place versus the availability of the same information at a second place? Asynchronicity puts that trade-off in your face; you have to make it. Dr. Eric Brewer has famously said that the advantage of doing things in a single, synchronous context is that some of the invariants that hold, you don't even know about until you start doing things asynchronously. So the realization is that everything the primary is doing is probabilistic from the view of the secondary. Say the primary server shuts down or dies; the secondary has only probabilistic knowledge of what the primary had done: 99% or whatever percent, but not 100%, is the point.
And then you start getting into designs which rely more on properties like commutativity and associativity of the data that is transferred from primary to secondary, so that this partial knowledge doesn't hit you that hard. That, I think, is what the abstract says, in a very succinct, very dense way. Any thoughts on the abstract, or shall I go on?

One side comment about this initial part: I felt it was detailed and talks about the whole Tandem story, the reliable systems they built there, but the core ideas are fairly simple, and he expounds on them later, so I think this is fine. The other thing I'd add is that a lot of these ideas apply quite squarely to homogeneous systems. I tried to make sense of them for heterogeneous systems and did not get very deep. For example, if you're building a distributed database like Cassandra, then some of these mathematical models hold. But if you have, let's say, a web app with a worker kind of model, then this doesn't completely apply. That's something I couldn't connect: that kind of architecture to what he's saying here. Does that make sense?

Maybe and maybe not. Even with a single database it's the same thing, depending on the consistency model you use for database reads and writes. A dirty read is always probabilistic, even in the same thread. So what you're saying, Swannan, I don't actually agree with, because even within the same monolithic model you have multiple requests running in parallel, and in that sense their knowledge of each other is also probabilistic. And I guess that's something we already see, just amplified here.
Yeah, one level up of the same thing. Correct. And you can't avoid it: the moment you have two contexts running in parallel, there's no global unique knowledge. Each thread has its own view of the world, based on when it read data from Postgres and on the isolation levels you might have chosen. Yeah, that makes sense.

And it does feel like the paper is talking about using a functional approach in disguise: if everything is functional and there is minimal sharing of state, then those functional components can be replicated or substituted as backups.

Well, no. It talks about idempotency, but it's about algorithms, so how do you express those? In a way, yes, everything is an operation, and that is where it comes from, but the implementation could be in any paradigm; it's not bound to be functional. You could write it in C; the implementation of these algorithms need not always be functional.

Cool, I'll go to the next section. I'll mostly dwell on the parts of the paper I found interesting, because there's a lot in there; if anybody wants to talk about anything else as well, you're welcome to.

In section two he starts by modeling the system, how we should look at it as a black box, and so on. And in the middle he says: in practice, systems evolve to be idempotent either as a preconceived notion of the designers, or through bug fixes thereafter. I found that very snarky and at the same time very true. In my opinion, the challenge in that sentence is: how do you decide the idempotency of a request? Is it going to be an MD5 hash?
And what do you mean by idempotency? Is it that a create request will not do a duplicate creation? Or that you will not do duplicate updates? Then how do you differentiate between two updates of the state? Or deletes: a delete can also be idempotent, so how do you deal with an idempotent delete? These are the thinking points, the pointers that light up when you start reading this paper. We take idempotency at face value: I pass some key, and the request should not have a duplicated side effect. That is easier to do on create, but very hard on updates, partial updates, and deletes, because with an update you have to decide what to do: the same key can be updated twice, and in certain domains that may have different repercussions. And what is update idempotency, even? I'd like to hear what people think.

Is there a mechanism to raise a hand in Zoom? I'm not seeing it. I think we can keep it informal, Yogi; just speak up whenever you want. You can go into the actions, there's a button you can click. Oh, is it? Okay, thanks.

So I think one of the things that surprised me is this line which says the ID is the MD5 hash of the entire incoming request.
That seems extremely error-prone; actually, it's the wrong way to do it. The ID should either be completely random, generated over a large state space, or something very business-specific, where you know the deduplication is within a bound that is acceptable for your application. Doing it this way seems to be a mistake: the moment you compute the MD5 over a portion of the message which can change, maybe a header, you can introduce very odd behavior, where somebody depends on the property that this is the MD5 of, say, the body, which is probably what he means, not including the header. I'm just giving an example.

Yeah. But fundamentally my question is: who decides idempotency? Is it external or intrinsic? For example, does the service determine that the request has to be idempotent, or, with the mechanism you're describing, a header key, is it actually the external party?

As far as I understand, it's always external, because idempotency is what allows the caller to retry, and you cannot do that with a server-side generated ID. It has to be generated by the client so that the deduplication can be handled at the server; the client has to generate the key and send it, always, otherwise there's no way to deduplicate. It's very closely related to retryability.

But then another way to look at this: assume you have a client invoking a service or an API, and the client is passing an entity on which the service performs an operation. That entity could be a well-defined entity, or a derived entity, or an entity in motion.
As long as I'm asking a service to perform an operation on the same entity, that entity ID is what defines my idempotency. I hope I'm making sense.

Actually, that's not the case. The problem is that that will break very easily. In fact, idempotency should never... again, I'm making very strong statements, but they only hold weakly, so bear with me. The moment you do that, there's no way to differentiate between, as Sipath was saying, an update and a delete on the same entity ID. The way I think about this is that the idempotency key is at the level of a request: "update entity" or "delete entity". That request needs an ID so that in case there are failures and the client retries, the operation is safe and the duplicate can be eliminated by the server. You should never use an entity ID for this, because then all the operations on that entity become indistinguishable.

Yeah, I brought this point up to understand whether idempotency is externally induced or internally induced; I was trying to pull the discussion there.

So one way of thinking about this, to make it easier: just think about making a POST request. You're not using any API abstraction; you're doing a POST call to create /entity, and your connection breaks. Now what do you do? Do you make the POST call again? You can't even make a GET call to check, because you haven't received any ID. Maybe the server has created the resource, maybe it hasn't; your connection broke and there's no way to tell. But if you have an idempotency ID, as you've been saying, created on your side, you can make the same POST request again using the same ID, and the server can discard the duplicate. In this case, the ID would be in the POST body: the entity you're trying to create would carry some key to uniquely identify the request.
For instance, if I'm a client trying to create a resource keyed by my name, then my name kind of tries to enforce the idempotency, right?

Yeah, but the point I wanted to highlight is that even a simple concept like idempotency is a field full of rabbit holes. We can't even decide whether it has to be internal or external; we'll all have opinions. Because in this case, internal is actually more idiot-proof: it doesn't matter what the client says, I'm always going to do the deduplication on my side and be consistent and correct.

On the other hand, I strongly disagree with that. You're on the other end and the connection breaks: how do you ensure consistency? My personal opinion is that the client decides whether it's a retry or not. Otherwise, what might end up happening is something the client doesn't intend: if the client makes two POST calls, you create two duplicate entities on the server. The client decides whether it's the same call or not; the identity of the call itself is asserted by the client saying "this is a retry". That's what Yogi is saying: the client should decide whether it's a retry, not the server.

The way I understood what Yogi is saying: instead of encoding the identity in the domain, you generate an arbitrary key over a large state space and depend on that. If you try to push that identity onto the domain, that is fraught with errors, connection breaks being just one of those problems. Also, I think there is a draft to standardize an Idempotency-Key header for HTTP requests. One place where this is very common: if you look at the APIs of payment systems, if you look at UPI, they will always have a transaction ID, which is essentially this sort of key; they'll always have it.
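The client-generated idempotency key pattern being discussed can be sketched in a few lines. The names here (`IdempotentServer`, `create_order`) are illustrative, not from the paper or any real API; the point is only that the key is minted by the client and deduplicated by the server.

```python
import uuid

class IdempotentServer:
    """Toy server that dedupes create requests by a client-supplied key."""

    def __init__(self):
        self.seen = {}      # idempotency key -> previously returned response
        self.orders = []    # actual created resources

    def create_order(self, idempotency_key: str, payload: dict) -> dict:
        # A retry with the same key returns the stored response
        # instead of creating a duplicate resource.
        if idempotency_key in self.seen:
            return self.seen[idempotency_key]
        order = {"id": len(self.orders) + 1, "payload": payload}
        self.orders.append(order)
        self.seen[idempotency_key] = order
        return order

server = IdempotentServer()
key = str(uuid.uuid4())          # generated on the CLIENT, not the server
first = server.create_order(key, {"item": "book"})
# Connection broke, client retries with the same key:
retry = server.create_order(key, {"item": "book"})
assert retry == first and len(server.orders) == 1   # no duplicate created
```

Note that the server cannot do this alone: with a fresh key on every call, both calls would create an order, which is exactly the "two duplicate entities" failure described above.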
One real example: we are working with a WhatsApp Business API provider, and one of the challenges we've seen is that they were not sending that ID, and that caused all kinds of problems. The moment we asked them to send a globally unique identifier, like Yogi said, it enabled a lot of tracing and other things on our side. So I agree with the idea that you encode it on the client side and send it.

That's how we do it too. We integrate with a lot of payment gateways, especially in Southeast Asia. Whenever you make a payment request, you have to mandatorily send your own ID, and if you don't get a confirmation on a payment that you want to retry, it is the client's job to ensure you're sending the same ID again. That's how they dedupe payments.

So let me flip that around; this is an interesting topic, and that's why I want to extend it a little. This is a system-boundary issue, right? We're talking about adding policy at the system boundary. If we move a little more internal, do we expect the same behavior, or are internal systems going to work on shared knowledge, where they can have a different definition of identity?

I was about to talk about that. You have to have pretty much the same knowledge, but there are tricky cases; he refers to one later, about how an order could end up creating two shipments. This is something I've seen at Flipkart. We used to have this problem where, at the front end of the order management system, there might be an ID for an order. But the amount of workflow that one request generates internally, across different systems like payments, shipping, logistics, procurement, et cetera, is so vast that we ended up actually having two IDs, almost.
There's one external ID, the outermost ID at the whole supply-chain level, almost. And then there's an inter-service key for when, say, order management is sending something to warehousing. Different transactions would deal with different data and look at the different keys differently.

And to relate this back, Yogi: it's the first thing he mentions, that every large algorithm has to be divided into sub-algorithms, and each of them can be given a notion of fault tolerance.

Cool. I think we're still on the first page; in the interest of time, let me move on a bit.

This is what I wanted to come to: he talks about transparent fault tolerance, which is a very interesting topic. I've read this paper at least three times, in different ways, before doing this critical reading again. He presents it as a standard, basic, fundamental idea, and it was a deep learning for me once I realized what he meant. As I said, he writes: we have observed the pattern in which a fault-tolerance algorithm is broken into idempotent sub-algorithms, and by capturing sufficient information at each step and sending it across the boundaries, you can give an illusion of transparent fault tolerance. And why? What's the need for sub-algorithms? Why is this so important for transparent fault tolerance?

This is, I think, one of Pat Helland's favorite sentences: accountants do not use erasers. They always make a compensating entry; they never erase an entry. And that is the crux of it. For example, he points out later in the paper that reads and writes are not commutative operations, but business transactions can be.
So if you look at a business transaction, you can do one plus one minus one minus one, and the end result is still zero. Whether a transaction succeeds or fails can be a combination of trying a particular request and, if it fails, rolling back by applying compensating actions. Because it's encoded in the algorithm: in the algorithm I can encode an action together with its compensation. Together, I can replay them with confidence, with immutability, across the fault boundaries of the system, primary and secondary. That is why you need to break your algorithms into sub-algorithms: for each step, you then define a compensating action if need be. And then the whole system becomes very easy to make fault tolerant, because if I can write a log saying A, B, C, D, compensate, reverse something, I can replicate it with confidence, serially, without loss of information and without needing to know anything about the business application's semantics. At the infrastructure layer I just copy that log as-is and replay it, because the fault tolerance is encoded in the algorithm, not in the infrastructure operations. That is why this line is so important, even for how we think about the whole point of fault tolerance. That is the crux of the paper, if I put it in a single line, because I never thought about it like this: algorithms made of sub-algorithms, compensating actions, and the role they play in how you recover.

Yeah, this made me think about information loss, basically: how do you prevent it, with parity and duplication? All of that ties back to the same principle. Any comments? I think Goel had an interesting point to make, similar to this.
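The "accountants do not use erasers" idea above can be sketched as an append-only log where every step carries its own compensation, and a failed business transaction is undone by applying compensations in reverse rather than erasing history. The ledger and step names here are illustrative, not from the paper.

```python
balance = 0
log = []   # append-only: (description, apply, compensate); never erased

def run_step(desc, apply, compensate):
    """Record a step with its compensation, then apply it."""
    global balance
    log.append((desc, apply, compensate))
    balance = apply(balance)

# A two-step business transaction; each step knows how to undo itself.
run_step("debit 100",  lambda b: b - 100, lambda b: b + 100)
run_step("credit 30",  lambda b: b + 30,  lambda b: b - 30)

# Something fails mid-transaction: compensate in reverse order.
# Note we append nothing destructive and erase nothing.
for desc, _, compensate in reversed(log):
    balance = compensate(balance)

assert balance == 0    # net effect is zero, like +1 +1 -1 -1
assert len(log) == 2   # the history of what happened is fully preserved
```

Because every entry is immutable and self-describing, this log can be shipped to a backup and replayed serially without the infrastructure knowing any business semantics, which is the point being made above.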
Yeah, exactly. When I read this section, and just now how you were presenting compensatory actions, it reminded me of the Saga pattern that people use when, across multiple microservices, you want an all-or-nothing kind of implementation. The way you presented compensatory actions is really the whole Saga pattern.

Yeah, though the Saga pattern distinctly talks about coordination across different components. It's not exactly the same, but it is very similar.

Then he talks about scaling and idempotent algorithms. The description he gives in that section looks almost like an actor model: you have one entity, it keeps updating, it keeps state and operations together, and that can then be replicated. I won't go into the details, since that may not be one of the more interesting sections, and we can move on.

In the next sections he starts talking about Tandem NonStop. It's not talked about much, but it was a phenomenal system, built in the 80s and 90s, operating at 99.999% availability, and there are a lot of very interesting concepts here. The difference between the 1984 and 1986 systems is that in 1986 they chose to checkpoint all the data at the transaction boundary, not at the individual operation boundary. In the 1984 model, individual operations were checkpointed to the backup at every write: whatever happens, it is checkpointed. In 1986, it is done at the transaction boundary; it's clearly shown here. The disk process interacts with the app essentially instantaneously, without waiting for the information to travel to the backup, and once the client says, okay, I'm done, commit, then all of that is handed to the disk process to log to disk.
I don't know how they arrived at it, but the idea is simple; the repercussions, though, needed validating, because now you are loosening your consistency guarantees. He talks about this, and I think the design, implementation, and flushing out of all the edge cases might have taken them a couple of years to get through. Because a transaction can be aborted, and the manual says it can be aborted at any time, the guarantee is at the level of transactions, not individual reads and writes. I think it worked out well in increasing throughput and improving request latency. One point I observed is that the disk, the file system, is actually a process; you can't just open it. That gave them a nice abstraction to replicate across. Maybe reads would be handled similarly, but those are still synchronous.

One more thing, again going back to the earlier point: if the backup process is down, then effectively the primary is down too, eventually. That is where the backup starts impacting the availability of your primary system. It's unsaid, but it's very critical: the primary cannot keep working without the backup being online at the same time. And that is tricky. I didn't like it, but it's a repercussion worth thinking about: even the backup can impact you.

Recently we had an incident where we had enabled RDS replication across a primary and a read replica. The read replica went down for whatever reason, and that started backing up log files on the primary, which caused some issues for us. I was reminded of that here. Even backup processes, the secondary's availability, have a correlation with the primary system. That's one of the questions Swannan had tweeted, and a lot of people got confused by it. Any other thoughts on that?
So what was the question? The question is: can your secondary system's availability impact your primary system's availability? Are they correlated? Indirectly, yes.

Yeah, and he has a reference to this; this is later the whole distributed-transaction problem. It's essentially saying that with synchronous replication, your availability is definitely going to be affected by the state of the replica: whether it's caught up or not, whether it's up or not. This is even more relevant in microservices and service-oriented architectures, because it's something that's often forgotten, and it's much harder to deal with there. It's very tempting to go for distributed transactions, to say: if we just do distributed transactions, then we're all good. And there have been a lot of missteps down this path. It starts with the fact that you end up breaking encapsulation, because most really good DTCs, distributed transaction coordinators, work at the resource level, which is the database, and if you expose that, you're basically breaking the service encapsulation. Beyond that, there were some standards efforts, the WS-* XML kind of protocols, to build a DTC protocol at the XML level, and they never really took off. It's always very tempting, but the problem is that it definitely affects scalability. And the reason is very simple, actually: it comes down to the fact that the locks on the primary are held for a much longer time, because that time now includes replicating the change and writing to disk on the backup. Even on a really good local network, a high-bandwidth 10 Gbps network, that can add so much latency over time that, even when the backup system is perfectly healthy, your performance degrades significantly. So this is like a pattern for failure, basically.
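The lock-holding point above can be made concrete with back-of-the-envelope arithmetic. The millisecond figures here are purely assumed for illustration, not measurements of any particular system.

```python
# With synchronous replication, a commit holds its locks for the local
# write PLUS a network round trip PLUS the remote write, so throughput
# on any contended lock drops even when everything is healthy.

local_write_ms = 0.5       # assumed local disk/WAL write
network_rtt_ms = 0.3       # even a fast local network adds this
remote_write_ms = 0.5      # assumed write on the backup

async_hold_ms = local_write_ms
sync_hold_ms = local_write_ms + network_rtt_ms + remote_write_ms

# Maximum serial transactions per second on one hot lock:
async_tps = 1000 / async_hold_ms
sync_tps = 1000 / sync_hold_ms

assert async_tps == 2000.0
assert sync_tps < async_tps / 2   # more than halved under these assumptions
```

The numbers are made up, but the structure of the cost is not: the replication round trip sits inside the lock-hold time, which is why synchronous replication couples the primary's performance and availability to the secondary.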
Yeah, and that's what he actually talks about — there's a subtle change between '84 and '86. They moved from a read-write abstraction to a transaction abstraction, and the trade-off they made is that for better fault tolerance, availability, and latency, you have to move a layer up in abstraction. Here they went from read-write to transaction semantics: the system now guarantees transactions, not individual reads and writes. That might have taken them a long time to even sell internally, to flush out the edge cases and ask, is this even right? In the '84 world everything works fine — everything is ordered, and they had operational experience with it. But the moment you move to a transaction model, where the order and durability of individual reads and writes is no longer guaranteed, how do you deal with it? Is it still consistent, correct, et cetera? So the move to that abstraction was tricky too, I think, but it's what it comes down to eventually. And this part is very dense — when he starts talking about log shipping, section four is very dense. That one paragraph packs in so many points. As Yogi was saying, the backup system does influence the primary directly: a commit request on the primary is held until the primary knows it has been received and acknowledged by the secondary. And that's where it starts. So Yogi, I'd say it's not only a distributed-systems problem — even a single database with one replica has this problem. Postgres, for example, has the synchronous_commit option, which is also dangerous in this way. But the nice thing about it is that it has statement-level control: in Postgres you can say that for this particular statement, I want a synchronous commit.
I think there's a later reference where he talks about deciding that at the application level, but I've never used that pattern. So — I mean, Swannan is here — the redo log buffer, right? That is where this hits you hard: you have to size your redo logs really, really carefully. If you're running a batch with a very large commit boundary, this sort of database also starts breaking down, some operations become irreversible, and there is information loss. Although the one point that got me thinking is that relational databases sort of flip this a little. What they're saying is: give the ownership to us. In the whole paper, Pat Helland is saying that reliability is built on retryability — that's where the idempotency and the retries come in. Whereas Postgres or Oracle or whoever are saying: fine, we'll own the transaction, and we own the reliability. Maybe that research came later — MVCC probably wasn't around at the time this paper was written, I think. One interesting point: he clarified in a very recent post, in 2021, that the true meaning of consistency in the original ACID model is to give the application a chance to control the transaction boundaries — not consistency as enforced by the database. The whole ACID model is basically: the application decides the boundaries, which the database will honor and enforce. But yeah, 2PC we only touched on slightly. The reason I brought it up is that this paper talks about pushing the responsibilities to the client — to the point that clients may not be in sync with the server — whereas relational databases took the radically opposite approach of pushing the responsibility to the server. That's what I found somewhat contradictory.
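Helland's "reliability is built on retryability" point above is easiest to see with an idempotency key: the client retries blindly after a timeout, and the server deduplicates so the effect is applied at most once. A minimal sketch — names like `apply_payment` are made up for illustration:

```python
class Account:
    """Dedupes retried requests by idempotency key so retries are safe."""
    def __init__(self):
        self.balance = 0
        self.processed = set()   # idempotency keys already applied

    def apply_payment(self, key, amount):
        if key in self.processed:
            return self.balance          # retry: no double-charge
        self.processed.add(key)
        self.balance += amount
        return self.balance


acct = Account()
# The caller can retry as many times as it likes after a timeout:
for _ in range(3):
    acct.apply_payment("req-42", 100)
assert acct.balance == 100  # applied exactly once despite three attempts
```

With this in place, "did my request go through?" stops being a question the client has to answer — it just retries, which is exactly the property the paper builds reliability on.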
The way I was thinking about reliability is actually very different from how Pat talks about it here, in this section. Actually, I have a point, right? This is a real, practical thing: if you have a clustered database and some transactions get hung mid-handshake across systems, recovering from those transactions becomes really hard. If you deal with this on a day-to-day basis, it's better to roll out your own framework with application-level semantics — because at the application level, as I was mentioning, you know what the compensating and reverse transactions are. Even if you have to resolve things manually, that's easier than untangling hung transactions held by your transaction coordinator or resource managers. Yeah. I want to go back to Spanner and read it again — they cover some of this. So this thing moves further still: he says, now that we've moved from read-write to transaction, the even better thing is to move from the transaction boundary of the database to the transaction boundary of the business semantics. At the level of business operations, you can define operations to be commutative. I can say add, add, subtract, add — because there's a compensating action for every action, I can define the semantics so that the operations commute. I cannot make raw reads and writes commutative or associative; I cannot reorder them. And that's the semantics he talks about going forward. So, realistically: probabilistic business rules, right? It's a very interesting way to think about it. How do you deal with uncertainty? How do you deal with the loss of synchronous knowledge?
Like if I write on A and replicate to B, there's some gap. If A crashes and B takes over as primary, there is going to be a delay, or maybe a loss of knowledge, between the two. What A committed may or may not be honored by B, and so on. That is where he says business rules are now probabilistic. I've seen developers talk almost deterministically about the state of applications, and I think they need to be humbled a bit: what they know is a history of the state of that entity, not the reality. And that is where he says writes to the database are not commutative — they're infrastructural things; I cannot do an un-write. — Sorry, can you just go back to the previous slide? Can you elaborate a bit on the second point you mentioned? — Webhooks are almost always going to be probabilistic. By the time a webhook reaches you, depending on its life cycle, the entity state might already have moved ahead. — Got it. So it's more of a callback than a webhook, then. I had a different connotation of a webhook. — Yeah, it's really a notification of some entity state change. Those are always in the past, and they tend to get lost as well — there's no guarantee they'll arrive, or arrive in order, et cetera. Fundamentally they are things in the past: they replay history, not the current state. — Got it. Actually, Sipath, you mentioned that people need to be more humble about whether a thing is synchronous or not. I think that's not even limited to distributed systems — with replicas it's definitely true, but it holds even for a single-node server, right? — Yeah, there are a lot of gotchas there, right?
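Since webhooks, as discussed above, can arrive late, duplicated, or out of order, a consumer typically keeps a version per entity and discards anything stale. A hedged sketch, assuming the producer stamps each event with a monotonically increasing version:

```python
def apply_event(state, event):
    """Keep only the newest version of each entity; drop stale/duplicate events.

    `state` maps entity_id -> (version, payload). The version is assumed to be
    assigned monotonically by the producer (an assumption of this sketch).
    """
    eid, version, payload = event["id"], event["version"], event["payload"]
    current = state.get(eid)
    if current is not None and current[0] >= version:
        return state                  # old news: it replays history, not state
    state[eid] = (version, payload)
    return state


state = {}
events = [                            # delivered out of order, with a duplicate
    {"id": "order-1", "version": 2, "payload": "shipped"},
    {"id": "order-1", "version": 1, "payload": "created"},
    {"id": "order-1", "version": 2, "payload": "shipped"},
]
for e in events:
    apply_event(state, e)
assert state["order-1"] == (2, "shipped")
```

The consumer ends up with the right final state even though delivery order and delivery count were both wrong — which is the honest contract a webhook can actually offer.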
Like when you think about the fsync setting on a given server — there are a lot of things there. MySQL, for example, has a one-second loss window by default, I think, and Postgres also has some setting around this. So even there you can actually lose data that was committed. — Yeah, yeah. Indeed. And he talks about it; there's a reference to it in the paper. So again: writes are not commutative in general — he's asking people, even encouraging people, to think in business semantics for commutativity and associativity, not in raw reads and writes. And he has a very snarky way of putting things: he says that if any transaction dares to read in between, that does not commute — and it's actually annoying and stops the concurrent work. A read of information generally introduces an implicit lock, because a read by definition has to be a synchronous operation, unless you're explicitly asking for stale reads. That is where you need to choose what kind of information you want to read, and that is where it becomes annoying. — Yeah, that's a good point. Aleksey Shipilëv, who has spoken a lot about the JMM, the Java memory model, has a very simple example he talked about for three hours: i++ is not a deterministic operation. He spoke about that single line for three hours — going into the CPU semantics, the L1, L2, L3 caches, how it all works, and why memory models matter for these cases. The problem is that the order in which you read determines what you see, so the read effectively acts as an implicit lock. — Yeah, a lot of the stuff you see at the application and database level is a re-run of the same problems at the CPU and core level, right?
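The i++ point can be made concrete without three hours: it's a read-modify-write, and two interleaved executions can lose an update. Simulating the unlucky interleaving deterministically:

```python
def interleaved_increments(i):
    """Simulate two 'threads' each doing i++ with the worst interleaving:
    both read the same value before either writes back."""
    read_a = i          # thread A reads i
    read_b = i          # thread B reads i (before A writes back)
    i = read_a + 1      # thread A writes its result
    i = read_b + 1      # thread B writes, clobbering A's update
    return i


# Two increments happened, but one was lost:
assert interleaved_increments(0) == 1
```

Real threads only hit this interleaving sometimes, which is exactly why the behavior looks non-deterministic from the outside; the simulation just pins down the bad schedule.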
I mean, you have cache coherence issues, which are fundamentally the same thing, and you've got something like the MESI protocol, which deals with it at the core level, plus memory read barriers. So yes, it's the same problem at different levels. — The kind of locking, by the way — the way he explains it is pretty nice: very pessimistic locking. I won't go into it; there's a separate paper on it. A very nice way of doing it. And now he starts talking about what we actually do about all this. He says: either detect the problem and have a human resolve it, or, if that's too expensive, define merge functions — and you have to define violation functions as well, as first-class entities of your domain. If this rule is violated, what should the action be, et cetera. It's a nice thinking tool. Again, in his view of the world, it's a "memories, guesses, and apologies" kind of model. What you've seen is your memory; what you do based on that memory — partial knowledge — is a guess; and because your guesses may be wrong, you have to be ready to apologize for your, in lack of a better word, incomplete knowledge of the physical world as well as the replicated world. And I think that's a real shift. When we start writing software, if you are aware of this, a lot of other things follow automatically. When you look at a scenario as memories and guesses, what you do is a probabilistic operation — a promise, not a deterministic outcome, and a promise that can go wrong.
So this line, I think, was one of the most liberating things — for me at least, personally, when I saw it here. I think he has another paper about this whole thing, right? And to hear from somebody like him that this is a problem he's pretty much given up on — I think it's both liberating and extremely depressing. That at the end of the day, you can't have a notion of truth in systems. What we depend on so much — the consistent knowledge inside an RDBMS — is, the moment you have a replica, just an illusion. So, a lot of emotions tied to this line. It's a fabulous line. — I think this also drives home the point that it's all about dealing with probabilities, right? You can only improve the probability of a system being consistent; you can never reach 100%. — Yeah. Sipath has an interesting bit where he says the world is good when you only have to write things; it's when you start reading things that things get interesting. — Yes, yes. A write can by definition be asynchronous; a read by definition has to be synchronous. I think Eric Brewer has a reference to this — I was trying to find it and couldn't in time — that a read is by definition synchronous: you cannot do an asynchronous read, because you have to read from some shared cache or whatever. And Aditya would remember — there's one more very interesting one; I'll not disclose it yet, but it's covered in a very different way. After reading this, I was looking at how the state of the art has moved forward — this is a 2009 paper — and I think the whole idea of CRDTs touches on exactly this: the raw read-and-write is avoided, and you instead distribute and define...
...the operations that you want — the increment-counter example is a nice one. That is where the CRDT-based approach works at a higher, business-level abstraction: you define addition itself as an operation. He talks about operation-based semantics rather than instruction-level semantics. I think that's also a difference between MySQL and Postgres in the way they record things in their logs, right? One records that the change has been applied; the other records the intention of the action — I think it's Postgres that does the latter. And recording intentions is where reordering WAL entries actually becomes easier in some cases, if you take an operation-based approach. That's where he keeps harping: don't read — always define your eventual consistency in terms of business operations. "Accountants do not use erasers" is his favorite line: they use compensating transactions and never change anything in place. — That idea of doing stuff at the application level reminded me of another paper which has been quite influential for me: the end-to-end argument. It's a very old paper, from the late '70s or early '80s, published when the TCP/IP protocol was being developed and there were questions about which layer of the stack a given piece of functionality should live in. The conclusion the paper reaches is: have dumb pipes and smart endpoints. That's such a powerful idea, it applies in so many cases, and it helps you make decisions. And I think this is another instance of it: you can't really have smart databases — that's effectively what it says. No matter what you do, you have to have the smarts at the application level to deal with it. Sometimes that's tough, but it's the only way to do it.
But yeah, this is an interesting point: the pipe is dumb, but it should be non-leaky. Just make sure your pipes are non-leaky — in the case of a database, it just writes the right data — but don't depend on it being the smart layer. He makes the same point with Amazon's shopping cart: the add-to-cart and remove-from-cart semantics are implemented by the application, not by Dynamo. Dynamo just stores and returns versions. And now he moves on to mechanisms for dealing with uncertainty at the application level — the section on how you say sorry, or how you avoid having to say sorry. One way is over-provisioning. It's expensive, but you always provision more — say 20% extra — in terms of quotas for a particular operation, so that even if you're wrong and certain allocations are exhausted, you still have capacity to compensate. It's like reserve quantities that you keep over-provisioned; you only start denying after a certain point, and that gives you a way to scale as well as stay correct. It has an economic cost, of course. Or take overbooking — airlines typically do it: probabilistically they assume, say, 10% no-shows, sell maybe 110% of the seats, and if something goes wrong it may still work out, because they compensate the bumped passengers on the spot. And that is where terms of service come into play, because everything cannot be controlled by the system. If a booking is lost, or records are destroyed in a fire, et cetera, your terms of service should cover it — and invariably they will.
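The overbooking strategy above reduces to a simple policy: sell up to capacity plus a margin, then settle with compensation for anyone bumped. A sketch — the 10% margin and the compensation amount are just placeholders from the airline example:

```python
def settle_flight(capacity, show_ups, compensation=300):
    """Overbooked-flight settlement: anyone showing up beyond capacity is
    bumped and compensated. Returns (boarded, bumped, payout)."""
    boarded = min(show_ups, capacity)
    bumped = max(show_ups - capacity, 0)
    return boarded, bumped, bumped * compensation


CAPACITY = 100
MAX_SELL = int(CAPACITY * 1.10)      # assume ~10% no-shows, so sell 110%
assert MAX_SELL == 110

# Usual day: enough no-shows, nobody is bumped, no apology needed.
assert settle_flight(CAPACITY, show_ups=99) == (99, 0, 0)

# Unlucky day: 105 of the 110 show up -> 5 bumped, apology paid in cash.
assert settle_flight(CAPACITY, show_ups=105) == (100, 5, 1500)
```

The business accepts bounded, priced apologies (the payout) in exchange for higher utilization — "guess, then apologize" turned into arithmetic.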
Whatever cannot be handled by the system is covered by the lawyers — lawyers actually tend to use far more exact language across all the scenarios than programmers do. If there's a probability that something fails very rarely, those cases get covered by legalese rather than software. And that's where, in certain cases, we actually resorted to saying: okay, we cannot solve this problem systematically, so we had a line in the terms of service that covers pretty much everything else — nobody can sue us; we'll do best-effort, but we cannot guarantee. It's a very, very common technique in inventory management systems. Again, this quest for fungibility is such an eye-opener. The theme of going to higher and higher abstractions to achieve fault tolerance continues here. For example, instead of booking a specific room, you now provide a "book a room" functionality. What the generalization means is that the details are lost: I cannot guarantee you which room, but you'll get a room of a particular disposition — and that way I don't have to lock. I just need to know, probabilistically, what's available. Again, provisioning and compensation all come into it, but it's a very nice way of thinking: what's the level of generality I can take this to, to make it more reliable, resilient, and latency-friendly? You can't reserve room 31, but you can book a standard room with such-and-such attributes. That's what enables fungibility of resources, and he talks about it.
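The fungibility idea above — book "a standard room", not room 31 — means a reservation only decrements a pool count; binding to a physical room can be deferred to check-in. A toy sketch:

```python
class RoomPool:
    """Reservations are against a pool count per room type, not a room number,
    so no per-room lock is ever taken at booking time."""
    def __init__(self, rooms_by_type):
        self.available = dict(rooms_by_type)   # e.g. {"standard": 5}

    def reserve(self, room_type):
        if self.available.get(room_type, 0) <= 0:
            return False                       # this class is sold out
        self.available[room_type] -= 1         # decrement the pool, nothing else
        return True


pool = RoomPool({"standard": 2, "suite": 1})
assert pool.reserve("standard")
assert pool.reserve("standard")
assert not pool.reserve("standard")   # pool of that class exhausted
assert pool.available == {"standard": 0, "suite": 1}
```

Contention now concentrates on one counter per class instead of one lock per room, and the system stays free to assign any concrete room later — the generalization is what buys the concurrency.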
There's a very nice paper on eventually consistent business transactions, where he says you can actually compose very different business offerings if you start doing this, because then I can always pick from a pool and give back to a pool — the unit becomes the pool, not an individual room. And that enables a lot of other optimizations in the system itself. At the same time, the generalization trades away the original details. At least in the US, some airlines don't actually assign seats: they sell you a particular class of seat, and you occupy one of that class rather than seat number one, two, or three. That eases a lot of the locking, because now you're not locking a particular seat; it's first-come-first-served, so you can delay or minimize the locking on any particular resource unless it's very pricey or too important. It's a very interesting way of doing it. I think that's pretty much it. Everything else — the ACID 2.0 part — I didn't like that much; it's nothing fancy. For me, the learnings were: design your operations so that you can do compensation, which is tied to business semantics; generalization makes operations easier; and loosening correctness to deal with uncertainty allows you to scale — and then you handle the uncertainty, or provision for it, either in legal language or in finance or whatever, and still make money. These are proven strategies, because that's the only way people do it. And it's also a shift in mindset: for example, we take overdraft fees from banks as a given. That's also a very probabilistic thing banks do, and because it's been done so long, we take it as given — this is how business works.
But businesses also need to reflect on this and leverage it. Yeah, so that's my reading of the paper — others are joining in now. — Awesome. Thanks, Sipath. Any comments? Any other thoughts? — I was just wondering: in a microservices system, a transaction essentially gets broken down into multiple steps, right? Or think of it in terms of asynchronous steps if that makes it easier. — You mean the saga pattern, right? It's exactly that. — What is the saga pattern, exactly? — So, Sipath, do you want to go? — No, no, go ahead. — A saga is basically — you define it almost like a workflow. You define the correctness between steps yourself rather than relying on the underlying infrastructure. You define what constitutes a single logical transaction, a single unit: what the steps are, what the compensating actions are, how to revert, how to move forward, how to retry. All of that is encoded in the saga. It's a long-running transaction, basically. But finish your question — there are multiple ways to address this, and I want to hear it fully. — Yeah, so it seems to me that, especially at high scale, what you'd end up doing is keep a separate data store for in-flight, in-transition transactions until all their parts complete, and then take the whole transaction and atomically persist it in a master database. Is that what ends up happening? How do you maintain consistency across multiple data stores when you want to enforce atomicity at the distributed-transaction level? — Yeah, so this is a really hard problem, and there are a few ways to approach it, none of them ideal. One is obviously the whole distributed-transactions thing, which is really not an option.
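A minimal sketch of the saga shape just described: run steps in order, and if one fails, run the compensations of the already-completed steps in reverse. (As the discussion notes later, real compensations mitigate rather than truly roll back; this sketch only shows the control flow.)

```python
def run_saga(steps):
    """steps: list of (action, compensation) callables.
    Returns (succeeded, number_of_compensations_run)."""
    done = []
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            # Undo what we can, newest first; compensations should be idempotent.
            for comp in reversed(done):
                comp()
            return False, len(done)
    return True, 0


log = []
steps = [
    (lambda: log.append("order created"),  lambda: log.append("order cancelled")),
    (lambda: log.append("inventory held"), lambda: log.append("inventory released")),
    (lambda: 1 / 0,                        lambda: None),   # payment step fails
]
ok, _ = run_saga(steps)
assert not ok
assert log == ["order created", "inventory held",
               "inventory released", "order cancelled"]     # reverse order
```

Note the compensation for the failed step itself never runs — you only undo what actually committed — and "order cancelled" is a new forward action, not an erase, in keeping with "accountants do not use erasers".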
So that goes out of the window straight off. The other option is to depend on synchronous calls. Take the example you gave: one service — say the order management system — receives a create-order request, and as a result you want multiple actions in different systems. The warehousing system should allocate inventory, the procurement system should get a note, then the payment system and the accounting system. So there's that kind of fan-out of sub-transactions under the one logical atomic transaction of create-order — how do you handle it? If you use synchronous calls, the problem is that, because of the nature of partial failure, the systems go completely out of sync very easily, and it's very hard to reconcile. It's an easy way to implement, but you end up with problems that have to be handled manually. Beyond that, there are two broad categories in the literature: orchestration and choreography. In the orchestration approach — one I've used in the past — you rely on an essentially local commit. The order management system, when the create-order happens, writes to its own database, and it also has to send a message or make a call to the warehousing system. How do you make sure that's atomic? That's the first problem to solve. If you make a synchronous call and the original transaction in the order management system rolls back, you're left with a warehouse entry that isn't correlated to any order. So there are two ways to solve that.
In the traditional way, IBM MQSeries, for example, would use a distributed transaction coordinator: MQSeries and DB2 would be part of a distributed transaction, so you could write to the DB and commit a message to MQSeries in a single transaction. That works for enterprise applications; it does not work for web-scale applications. So instead, you write to an outbound-messages table in the same local transaction in the order management system. Along with your domain tables — orders, order items, et cetera — you also have an outbound-messages table, which records the message as part of the same transaction scope. Once that commits, you know for sure that both things have happened. Then you have an asynchronous relay mechanism that sends the message to the target system. At least you've made one leg of that multi-step hop atomic and correct. Then you go on to manage the flows that emerge out of this, and there are many ways to do that. Now, the problem with this approach is that most databases are not built to support tables used as queues — and this outbound-messages table is essentially a queue. MySQL, for example, does not have the primitives to deal with that. Postgres does: it has SKIP LOCKED, and that is extremely powerful. Once you have it, you can build queues on top of the database; without it, that system will die — MySQL will die if you use a table as a queue at scale. So yeah, there are many ways to do this, and it's all hard, essentially. After dealing with all this, I just like monolithic systems, man.
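The transactional-outbox pattern described above is easy to demonstrate with SQLite standing in for the order management system's database: the order row and the outbox row commit or roll back together, and a separate relay loop drains the outbox. A sketch — the table and column names are made up:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT,
                         sent INTEGER DEFAULT 0);
""")

def create_order(item, fail=False):
    """Write the order AND its outbound message in one local transaction."""
    try:
        with db:  # sqlite3 connection as context manager = one transaction
            db.execute("INSERT INTO orders (item) VALUES (?)", (item,))
            db.execute("INSERT INTO outbox (payload) VALUES (?)",
                       (f"order-created:{item}",))
            if fail:
                raise RuntimeError("simulated crash before commit")
    except RuntimeError:
        pass  # the whole transaction rolled back: no orphan outbox row

def relay(publish):
    """Async relay: drain unsent outbox rows to the 'broker'."""
    rows = db.execute("SELECT id, payload FROM outbox WHERE sent = 0").fetchall()
    for rid, payload in rows:
        publish(payload)                      # would be a real broker call
        db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (rid,))
    db.commit()


published = []
create_order("book")
create_order("pen", fail=True)     # rolls back both inserts atomically
relay(published.append)
assert published == ["order-created:book"]
assert db.execute("SELECT COUNT(*) FROM orders").fetchone()[0] == 1
```

In production, the relay's publish-then-mark-sent step is the part that needs SKIP LOCKED (or equivalent) so multiple relay workers don't grab the same rows; SQLite here just illustrates the atomicity of the first leg.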
Yeah, but if one thing just crashes, you're left with orphan transactions, right? How do you clean those up? — Say that again? — If one of your downstream processes crashes irrecoverably, you're left with orphan transactions that have gone midway and are stuck nowhere. Do you know what to roll back in that case? — No, you don't — and usually you never roll back, right? Compensating transactions in sagas are a bit of a myth, I think; in practice it's extremely hard to implement at scale. It's never a 100% rollback — you're just trying to mitigate the impact of what happened while you were proceeding with the transaction. We have a similar implementation in our systems, and it's actually not a rollback-rollback: it's mitigating what happened during the progress of the transaction, essentially. — Hey, a follow-up question on this: how do you introduce business-level observability in this kind of pattern? Like a watermark kind of thing — "I have progressed to here" — a watermark in the log, just to see that everything is done up to that point, in order. — Yeah. So here I think the idea of domain events is very powerful, and that's what we used extensively. Essentially, any state transition should be emitted as a domain event — order created, inventory booked — and you publish that onto a pub/sub channel. Then you can have some listener track it and build a complete history of where each transaction is, or where it's stuck, and you can debug it across the different systems. That's very powerful. The whole idea — which Helland also mentions in this paper — of making operations a first-class entity is extremely powerful. And if you look at the growth of Kafka in that sense, right?
A write-ahead log, reified as a first-class concept — Kafka basically uses the same idea. I think the idea of messaging is rather overplayed in that sense; the real idea is... — Operations. — Yeah, you make operations a first-class entity. That's the most powerful insight in this, I think. — So again, the classic way, right? If you have to reconcile — I mean, the word "reconcile" — banks, when they balance their books, actually pull data from multiple sources and arrive at a consensus on what the balance should be, and whether it's right. Similarly, every business will have some sort of reporting — they call it reporting or whatever — that visits each system and tries to match records across systems. You need some correlation ID to trace it, so that I can say A, B, C, D happened, I have a record of it, and then I move forward. Then I can consistently say, after the fact, that this operation either completed successfully or didn't, whether I need to act, whether there's any violation. That detection — violation detection — becomes a first-class entity. You really have to think about the domain, emit those domain events, and then reactively decide whether something will complete or not, whether there's a violation, et cetera. Pretty much every microservices setup I've seen ends up doing this: they do reconciliation. You have to. Even with the webhook events I was mentioning: we send out events on a reactive basis as and when things happen — that gives you speed — but when clients ask, "do you guarantee delivery?", I say no: you download a file at the end of the day and check whether all the events reached you or not.
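The end-of-day reconciliation just described boils down to comparing the two systems' records by correlation ID and flagging the differences in both directions. A small sketch:

```python
def reconcile(sent, received):
    """Compare the event IDs we emitted against what the client saw.
    Returns (missing_at_client, unknown_to_us), each sorted."""
    sent_ids, received_ids = set(sent), set(received)
    return sorted(sent_ids - received_ids), sorted(received_ids - sent_ids)


our_log = ["evt-1", "evt-2", "evt-3", "evt-4"]
client_log = ["evt-1", "evt-3", "evt-4", "evt-9"]   # evt-2 lost, evt-9 bogus
missing, unknown = reconcile(our_log, client_log)
assert missing == ["evt-2"]   # needs redelivery / investigation
assert unknown == ["evt-9"]   # a violation: client has something we never sent
```

Each flagged ID is then a work item for the violation-handling machinery — human resolution, redelivery, or a defined compensating action, exactly as the discussion lays out.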
Because there's always a possibility that something happened — either you were down or we were down — and events may have reached you out of order, or not at all; we can't guarantee it. So that happens across boundaries: people will ask you for a second source of data for validating what has happened. Cool. I think we're pretty much at time — it's a very interesting paper to read. — Hey, so Piyush, Yogi — is there a follow-up paper we should consider? — I was about to ask; I think Spanner looks like the perfect follow-up to this. — Yeah, but I don't think 60 minutes will do justice to that one, given how intense the discussion was today. Spanner, yeah — one and a half hours is good. — I think we can give it a shot, at least a first pass. That's one mysterious paper — I've never understood it. I've not really looked at it carefully, but I'm still trying to get my head around that concept of TrueTime. I've read it three or four times and I'm not able to understand it. — So actually, at ThoughtWorks, for TMT they implemented something similar to TrueTime — not exactly TrueTime, but PTP. It works within a data center, not between data centers. For TMT they need nanosecond accuracy, because they actually detect the phase shift of incoming electromagnetic radiation based on time, and they treat time as the ticker, the arbiter, really. — Yeah, but in Spanner — if I don't have the atomic clocks, will Spanner work for me? That's the question. — No, it will not, right? That's why CockroachDB works only in one data center, not across data centers. — Oh, is it? Is that true? — Yes. — But that completely defeats the point of it, right? I thought it had some mechanism to do it across data centers.
They might have, but when I asked — so actually, Shriram Srinivasan, right? He wrote the JTA specification, or maybe implemented JTA for WebLogic. We should invite him for all of these discussions. In fact, I know one person who has actually used CockroachDB in anger, at Rubrik: Janmijay Singh. He was at ThoughtWorks and then Flipkart. Maybe we can see if we can get him. Yeah, he would be good. He also has a lot of other nuggets to share. Can you check with him whether he's willing to read Spanner? Yeah. Awesome. So Swannan, I wanted to bounce off an idea: do you want to start a Telegram group for Papers We Love Bangalore, add the folks here, and have the discussions on paper choices and so on there? We already have an observability group and all; it would be good to have a Telegram group. I just want to hear from the other folks — is it okay if we start it, and would you be willing to join? You guys and Telegram, man. No — let's stick with Slack or something. Or an email group — a good old-fashioned email group. Slack is also good. Discord, or whatever. Telegram. Yeah. But it's not just about Slack, right? Any other questions? And if you had questions during the session and felt like it was going over your head, feel free to ask honestly. All right. Great. I'm assuming there aren't a lot of other questions. So just to conclude: we'll try to read Spanner next time. It will go significantly better if you read it and come, because it's not an easy paper to discuss if you haven't read it — some of its concepts I've not seen expressed anywhere else.
Except maybe the Dynamo paper, which also talks about some similar things. It will go really well. But before that — as he was saying — should we first cover the paper by Gary Neville, or whoever it was who actually implemented NTP and PTP? Because the moment you understand the timing issues, that becomes critical. We could do it as a two-part series: the first part just to understand the concept of time, and then, as we were saying, the second part builds Spanner on top of it. Because even in Linux there are, I think, two clock calls — one to get accurate time and one to get approximate time. And the physics of it — there are a lot of issues there. Once you understand that, Spanner probably becomes easier to digest. Do you think we should first cover Leslie Lamport's paper, the one on logical time? Yeah, that could be the first introduction. Sorry, I didn't mean to talk over you — I couldn't hear you. I think we could do that: we can tackle it in an hour, and then build on it and discuss Spanner. The related idea — I forget the name of the paper — is vector clocks. Vector clocks, yes. The reason I said Spanner is that we can discuss it assuming we have all that background, and then go back and read the prerequisites — that's one idea, but I'm okay either way, your choice. I think let's dive in. Let's take on the ambitious stuff. If we get stumped, we'll go back and dive deeper. But if we keep diving deeper first, we'll never get there — there are a hundred papers you'd need to read to understand Spanner. So: Spanner next time, everybody please read it and come. How frequently do you plan to do this? Once a month, I think. Hard to get more time, considering everything. Once a month is healthy. Let's do it once a month. It was much less frequent earlier because Sipath and I couldn't find time. Let's try and do it.
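Since vector clocks came up as the warm-up topic above, here is a minimal sketch of the mechanism (Lamport's 1978 paper introduces scalar logical clocks; vector clocks extend them to one counter per process, merged with an elementwise max). This is an illustrative toy, not any library's API:

```python
# Minimal vector clock sketch: each process keeps a counter per process id.
# Local events tick the process's own entry; receiving a message merges the
# sender's clock (elementwise max) and then ticks.

def tick(clock, pid):
    """Local event at process pid: increment its own entry."""
    c = dict(clock)
    c[pid] = c.get(pid, 0) + 1
    return c

def merge(local, received, pid):
    """On message receipt: elementwise max of both clocks, then tick."""
    merged = {p: max(local.get(p, 0), received.get(p, 0))
              for p in set(local) | set(received)}
    return tick(merged, pid)

def happened_before(a, b):
    """True iff a -> b: a is elementwise <= b, and a != b."""
    keys = set(a) | set(b)
    return all(a.get(k, 0) <= b.get(k, 0) for k in keys) and a != b

# Process A does two events and sends its clock to B; B had one local event.
a = tick(tick({}, "A"), "A")        # {"A": 2}
b = tick({}, "B")                   # {"B": 1}
b_after = merge(b, a, "B")          # {"A": 2, "B": 2}
print(happened_before(a, b_after))  # the send precedes the receive
print(happened_before(a, b))        # neither ordered: concurrent events
```

The payoff over scalar Lamport clocks is the last call: when neither clock dominates the other, the events are provably concurrent, which is exactly the judgment systems like Dynamo need when detecting conflicting replicas.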
More people are interested now. Yeah. We'll certainly need help to spread the word, Yogi — help make the Twitter hashtags popular. Also follow the Papers We Love Twitter handle and help us spread the word. Get more speakers — I think that's always going to be a problem. As long as there's a small core group that's enthusiastic, I'm happy sitting with my drink and having ten people to discuss with. Making this popular is too hard a problem; if it happens, it happens. If there are good conversations, the word will spread. I'll just let people know Yogi will be at the next meetup. Yeah. I enjoyed it. Thank you. Thank you. Awesome. Bye. See you next time.