So, this talk is about the evolution of the tech stack at Goibibo. My name is Jyoti. I'm the chief architect there. As Vikalp said, let's keep it interactive. You can interrupt me anytime and we can discuss anything that you guys want. So, Goibibo by most metrics is the number one OTA in India. Every year, we do five times more business than the last year. So, you can say we're growing. We have an average of about 400,000 requests per hour. But sometimes our super aggressive marketing team sends out push notifications, and what we effectively have is a DDoS, right? So traffic is very spiky, and our stack has evolved to catch up with those spikes. And a significant percentage of our transactions happens on mobile, and that percentage is ever-growing. Goibibo is part of the Ibibo Group. The other things in the company are redBus, which we acquired a couple of years back. We have something called Ryde, which is basically a car-sharing app. You guys should check it out. And we also have seller-side inventory. So if you have a hotel and you want to advertise your inventory, we have mobile apps as well as desktop apps for that too. So what is the problem we are trying to solve, right? Like most e-commerce companies, we are a marketplace. We have customers who want to buy things, and we have sellers who want to sell things, and we try to connect the two. But all of them have different requirements. The customers want something which is consistent. If they search for a price on the search results page, go for a coffee, come back, and start booking, they don't want the price to have changed. Otherwise they go to Facebook and say people are cheating us, this and that, right? So they want the price to be consistent. They want it to be performant: if we try too hard to be consistent and we don't respond in time, people are just going to get turned off and go somewhere else. And they want options, right?
So some people want to optimize for money when they are traveling: they want the cheapest way to go from A to B. Some people want to optimize for time, right? And they want these options to be sensible, so we can't give generic options to people. For example, if a layover is 18 hours, most people say, this is a bogus flight, I don't want to look at it, and they may go to a competitor, right? But take someone going to the US who's saving 50,000 rupees with that 18-hour stopover. He might say, okay, I'll save the 50,000 rupees, I'll book a hotel in Hong Kong and roam around a bit, it's all good for me. So basically, customers are looking for sensible options. On the other side, the sellers have a multitude of formats. Some of them still give us SOAP, some give JSON, some are RPCs. And they have something called a look-to-book ratio: if we do 100 searches and we don't do a booking with them, they charge us a lot of money. So we can't go to all these sellers every time we get a search. They don't have any SLAs, right? They can go down anytime, and we don't want to define our SLAs by theirs. Typically, when you search for hotels, it's not that we search with one vendor; we pretty much have to search with everyone. And we don't want one vendor who's having a bad day to slow down our searches. What is an SLA? Service level agreement: basically, how fast you want the response back. Good question. And they also give us duplicates. Everyone understands duplicates? So, for example, IndiGo will give us Bombay-Delhi, SpiceJet will give us Bombay-Delhi, right? And sometimes there are guys who are aggregators, like Galileo, so they give us IndiGo inventory as well as SpiceJet inventory. So we have to dedupe that, and we have to dedupe it in real time. As we get a request, we go hit all these guys, they give us responses in their own sweet time, and we dedupe.
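The real-time dedup just described can be sketched very simply. This is an illustration, not Goibibo's actual code: assume each vendor result is a dict, key duplicates by (carrier, flight number, departure time), and keep the cheapest copy.

```python
def dedupe(results):
    """Merge duplicate flight results from many vendors, keeping the lowest price."""
    best = {}
    for r in results:
        # Two results are "the same flight" if carrier, number, and departure match.
        key = (r["carrier"], r["flight_no"], r["departs"])
        if key not in best or r["price"] < best[key]["price"]:
            best[key] = r
    return list(best.values())

# Direct inventory and an aggregator (e.g. Galileo) both return flight 6E-123:
merged = dedupe([
    {"carrier": "6E", "flight_no": "123", "departs": "09:00", "price": 3000, "source": "indigo"},
    {"carrier": "6E", "flight_no": "123", "departs": "09:00", "price": 2900, "source": "galileo"},
    {"carrier": "SG", "flight_no": "456", "departs": "10:30", "price": 3100, "source": "spicejet"},
])
```

In production this merge also has to tolerate vendors answering at different times, which is why the talk describes waiting a bit for slow vendors before merging.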
And we give the search results back to the customer. And there are other stakeholders too. For example, we publish hotels to Google, and they have very stringent SLAs for us. If we don't give responses which are correct and performant, they blacklist us. And we don't want that, because a significant percentage of our traffic starts with someone typing "hotel in Goa" on Google. So we want to keep Google happy, right? So basically, for Google, we push data. And what Google does is it also crawls the price. So for example, if we push saying the price is 2,000 and Google crawls at some point and finds it's at 1,800, they say, you guys are giving me false data, and they blacklist us. Other guys like TripAdvisor pull from us. And what TripAdvisor says is, okay, I'll call your API, and you need to give me all 10,000 hotels in Bangalore within one second. If you don't, I'll blacklist you; I'll never call you till you make me happy again. [Audience question, partly inaudible:] You have so many GDSs to call, right? Say you are getting a flight from Delhi to Bombay at 3,000, and you get that response very fast. But from Galileo you get the response slightly later, and the price is lower, 2,900. So which one will you show? So yeah, that's a good question, and we have algorithms which handle all of that. We wait a bit for the slower vendors to respond before we merge. And we also merge results: for Bombay-Delhi, you might find a very cheap combination of Bombay-Jaipur plus Jaipur-Delhi, so we try to combine results too. Most players can't do that.
[Audience:] So you are saying that you will wait for the other result? Right, till you get the result. So yeah, it's an evolving algorithm, right? What we try to do is, if we are not getting results, we show what we have and then refine the search results. So we stream results to the front end. If you watch your search results page, it's continuously getting updated. And in fact, we are building even cooler features where we are going to show personalized SRPs. So for example, Vikalp has 2,000 GoCash in his account and we are going to give him a discount of 1,000; we'll show results considering the discount he's going to get, right on the search results page. Because very often people don't know, okay, I have to apply this promo code and that promo code, and I don't know what's the final price I'm going to pay. So we're doing all these things. So yeah, this algorithm of merging results and giving them out is very complicated. Absolutely, that's a trade-off, right? That's the trade-off: we don't make calls every time. That's why we have caching, and that's why sometimes you see prices being updated. [Audience:] The question that I'm trying to formulate is, do you have an algorithm in place which runs at a specific interval and keeps the data fresh, or is it like... It's not like a cron job. Basically, it's not like a cron job. We're looking at that, but because of look-to-book, we can't just run a crawl service against all the vendors, because we'll simply end up paying them a lot of money. So we have intelligent caching. When we get results, we store them until we know that things have changed. Some of these GDSs do give us feeds, and that makes our life much simpler: we can just listen to a feed and update our cache. But otherwise, every time you go to the booking page, there's a real-time reprice which happens.
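The "reprice refreshes the cache" idea can be sketched like this. Names and keys here are hypothetical; the point is only the flow: search results are served from cache, and the real-time reprice done on the booking page writes any fresh price back.

```python
class PriceCache:
    """A stand-in for the real cache; keys and values are illustrative."""
    def __init__(self):
        self._prices = {}

    def get(self, key):
        return self._prices.get(key)

    def put(self, key, price):
        self._prices[key] = price


def reprice(cache, key, fetch_live_price):
    """Booking-page flow: always hit the vendor, then sync the cache."""
    live = fetch_live_price(key)
    if cache.get(key) != live:
        cache.put(key, live)   # the cache learns the price changed
    return live


cache = PriceCache()
cache.put("hotel:leela:2024-01-01", 2000)           # stale price from a search
fresh = reprice(cache, "hotel:leela:2024-01-01", lambda k: 1800)
```

This avoids crawling vendors (and paying for look-to-book) while still converging the cache toward reality every time someone reaches the booking page.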
So when we do a reprice and we see, okay, things have changed, then we update our caches. That's the basic algorithm. Sorry, I couldn't hear. No, that's not what they think, right? They want to show consistent data, because otherwise people will not trust Google; they'll say, okay, I'll not go to Google. [Audience question about device-specific prices.] Yeah, it's possible. I mean, people want to give promotions based upon mobile or whatnot. [Audience:] No, no, if I use two different mobiles with two different providers at the same time, will I get two different prices? As I said, I'm not sure whether that is intended; the prices keep on updating, so I'm not sure that's intended. But prices for mobile do vary from desktop searches. So, we started out with a simple stack. Django was our mainstay, and we used Redis for things like sessions, MySQL for the DB, and Memcached for caching DB lookups, right? And this was working well. I think the initial team had like two weeks to build it, and they did build it, and the amount of traction it got was way more than expected. But with increased traffic, transactions, and our feature velocity, things got really complicated. Code kept piling up in the same Django application, and we started building monoliths, right? The problem with that is the code quality started getting worse at a much faster pace. So I'm doing a feature. We release three times a day, so I've got a feature, I have to release it fast, I see a function with 10 if conditions, I just go add a couple more, right? And very soon you've got a function which is like 4,000 lines of code and you don't know what's happening there. So technical debt was accruing at a much faster pace. The thing is you can't refactor. I may want to refactor it, but then I think, okay, what if I break something, right? What if I regress my performance?
So I can't take risks in refactoring the code. Same thing with ownership: certain key components have a few devs who know that component, but if that person is not in the office and I need some change in that code, there's no way I can find my way around it, right? So you need tribal knowledge. You need that one special guy for that one piece of code who will help you out when you're building that feature. And basically, this is the main point: developers are unable to experiment. All of us as devs, we hate being bored, right? We don't want to just do bug fixes or that one small feature. We want to refactor, throw everything out, build things again. And that is really good for us. But because we have a monolith, you don't know where your code's entry points are, what it's all touching. Basically, you're unable to experiment. And this is a direct impact on developer productivity, and not good. It also has an impact on performance, right? In Django, if you have 100 URLs in your URL dispatcher, every request will be matched against all those 100 patterns before it gets to your handler. Basically, it's an O(n) lookup, right? Similarly for middleware: if you have tons of middleware, your request will go through all of it before hitting your handler. So even though we have New Relic, and you can always run cProfile and figure out hotspots in your code, it's not easy. It's not easy to steer your request through the code path which you think is a problem area. And how are we going to do that on a laptop? So it's not easy to identify performance bottlenecks, let alone fix them. And scaling is basically all or nothing. Say you have a payment service. If your payment service is running hot but everything else is fine, you can't take that one piece of code and scale it out on its own. You either scale the whole app or you don't scale anything at all. So that is very inefficient.
Yeah, and this is a little bit Python-specific. So the SRP algorithm we talked about previously, right? That is a lot of number crunching. You have data from a lot of vendors, you apply discounts based upon the user agent, this, that and all. So even if you have a four-core machine, in Django you're just utilizing one core. So how do you utilize all the rest of the hardware that we have? A monolith means we can't do that. There are ways around it, which I'll talk about, which we tried initially, and then we did something better. And you end up using the same tool for every job. For example, some of our guys wrote an optimized JSON library, very fast, written in C for parsing JSON. But the problem is, if I'm writing a new service or doing something different, I have to use that code. The other option is I pull in some other library, and then you have code bloat. So what ends up happening is you use whatever is already being used for everything in your app. And lastly, this is a big problem: Django is good if you are serving HTML, but then we have the mobile apps, and we want the mobile apps to have the same or even higher feature velocity than the web. This is difficult because those guys know Java, they don't know Python, we don't know Java, and things break. Yeah, hold that thought, I'll come back to it. I mean, there are times where it's a bad idea, that's true. Let's say I'm working in my branch, right? I make this change. How do I know I have not broken something else? What do you test it with? I mean, realistically, most QA organizations don't have a comprehensive test suite which gives you 100% coverage. Yeah, so I'm coming to that. Basically, once code has been built this way, it's difficult to write unit tests. We want to get to what you're saying, right?
I mean, we want to get to a place where I can change everything inside, but I run automated tests and I know nothing is broken. So I'm just coming to that. Our initial attempts: caching, and this is still the cornerstone of our performance. All static assets are at Akamai. For dynamic things we utilize Varnish, and if you guys are interested, I can talk about Varnish at the end. We use it quite a lot. Then we separated out these heavy computational tasks into separate processes. And this was the start of it: we started refactoring our code a bit, and we use Celery and Celery Beat for asynchronous tasks. And the last thing is something which happened over time: we started serving more JSON than HTML from Django. Sorry? What resources? Right, you're asking about now or before. So now we've migrated to OpenStack and we want to go towards Docker. We've not gone to Docker yet because, the thing is, with Docker your kernel version affects how the code behaves. Even though you run the same container on a laptop, and theoretically it should run the same in prod, the kernel version has its say. So we've not done Docker fully yet. We are experimenting with it, but we have done OpenStack and we are very happy with it. Size in the sense of initial deployment? So basically we have discounts which are mobile-specific. We look at the flavor and then we have variable discounts, yes. Guys, we have a time limit, I think, so let's take one question. Just to mention, we are all at our stall, so any other questions you can bring to us over there. So what have we done over the last six months, right? Everyone knows SOA, service-oriented architecture? Sweet, I'll stop.
So basically we want an architecture of discrete, independent components, right? Where you can change things around, run an automated test, and then you know nothing broke. You see a nice framework your friend is using somewhere else? Go ahead, refactor, run the tests, and make it live, right? You need to have well-defined interfaces. And the first interface, obviously, is the API you expose, right? If you have simple model CRUD access, you can have a REST API. But let's say you have something like a recommendation service; maybe it makes sense for you to give async updates, where it's going to take time and you just stream results as you get them. You need to decide about serialization, right? You have JSON, MessagePack, protocol buffers. People familiar with protocol buffers? And pickle is not a serialization format for us, right? Because we want to be heterogeneous. We want people to come in with their skill set, not force them to use something. So we want a serialization format which is standardized and can be used across stacks. And you also need to decide how you're going to send this data out. For example, if you use Thrift, the decision is made for you; Thrift comes with its own RPC bindings. If you use JSON, you can use HTTP. So all those decisions are key when you're defining a service. And that is not your only interface. You also interface with your other services, your peers, right? For example, we have a wallet service which we call GoCash; you interface with that, you interface with your DB. And what we tell our developers is to mock every one of those interfaces. And those interfaces, if you're interfacing with Redis, your interface is not GET, SET, or MGET or all those things. Your interface is in terms of the data which you want from Redis: get flights, or get hotels, which in turn maps to what Redis is providing you.
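The advice above — mock the domain interface, not individual Redis commands — might look something like this in plain Python. All the names here are illustrative, not Goibibo's actual code: production code depends on a `get_flights()` method, so tests swap in a stub instead of faking MGET calls.

```python
class FlightStore:
    """Domain-level interface; a Redis-backed implementation would live elsewhere."""
    def get_flights(self, origin, destination):
        raise NotImplementedError


class StubFlightStore(FlightStore):
    """Test double: same interface, data held in a plain dict."""
    def __init__(self, flights):
        self._flights = flights

    def get_flights(self, origin, destination):
        return self._flights.get((origin, destination), [])


def cheapest_fare(store, origin, destination):
    """Business logic only ever sees the domain interface."""
    flights = store.get_flights(origin, destination)
    return min((f["price"] for f in flights), default=None)


store = StubFlightStore({("BOM", "DEL"): [{"price": 3000}, {"price": 2900}]})
fare = cheapest_fare(store, "BOM", "DEL")
```

Because the test never mentions Redis, the storage layer can be swapped (Redis, Memcached, a new service) without touching a single test.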
And so how much do you decompose? I think that was the question somebody asked me: how small do you make your microservices? The thing is, there is such a thing as way too much decomposition, because these services need to talk to each other and there is latency involved. So if you're not careful, you could very easily lose out on performance. If you're using HTTP, you know you're importing httplib, you know you're making a request. But if you're using an ORM, or something like Thrift, all that is hidden from you, right? I reviewed some code a while back where the code was making 50 network requests for just one web request, which could easily have been optimized. The problem is you don't really know when the network call is happening if you use an ORM or Thrift. So you need to make sure you define your system boundaries well, allow a developer to develop and test on a laptop, and allow for automated tests. So basically, this is our blueprint for any service which we want to build. Does that mean we don't use Django? I mean, we love Django; it allowed us to scale pretty fast. It has a very strong community, and people know how to use it, right? So we didn't want to let it go. To be fair, not all of our microservices are Python or Django, but we start off by asking, can I do this in Django? So let's look at what I mean. This is a typical view in Django. It's toy code; I'm not using the render shortcut, it's just for demo purposes. Basically your view takes in an HTTP request and gives out a response, standard Django, right? What we want to move to is a world where it gives you JSON, right? This doesn't look simple, but if you look at this class, it's doing a lot of things. Do you guys recognize what framework this is? Cool. So we love DRF, right? So this is a class-based view.
So basically, this allows you to do create and list. And most of the time, you don't have to write any of this code. You just say my model is this, my serializer is this, and paginate by 10 results or 100 results at a time, right? When we looked at Tastypie versus DRF, where DRF wins is when you don't have a plain CRUD API. If you want to do something smarter, it allows you to go one level below the abstraction and change things very easily. Like this queryset, for example. This is again a toy, but you can do a lot of things here. In the worst case, you can just subclass APIView, and then you have methods like get, post, create, et cetera. But at a very high level, if you don't want to write all that code yourself, you just subclass a generic view or a viewset and get all that for free. And people ask, why are you using a framework like Django REST framework, right? Well, things like serialization, validation, content negotiation: some clients may ask for the same results in JSON, some may want XML. Those things are already built in, and they're patterns which are well defined, right? So that is something which we didn't want to build again. And one more thing which we really love about DRF is the Swagger plugin, right? So, most people, where do you document your APIs? Basically, your code should generate the documentation. So if I have an API at gocache.goibobo.com, just go to /docs and it'll actually give you a Postman-like interface showing exactly what the URLs are, what the parameters are, whether something is an integer or a float, and you can try it out. And this comes pretty much built in with Django REST framework. Now, I saw some heads nodding, saying, oh, if we use a framework, there's a lot of overhead. So, this profiling is by Tom Christie, the author of DRF, right?
So if you look at it, this is the time spent by a web request as it traverses Django REST framework and comes back. And you can do this yourself by just subclassing three or four methods, right? If you look at it, most of your web request is spent in database lookups, right? And the sweet spot is column three, where you are using a cache framework and getting results from Redis. All the other things have diminishing returns. DRF has a Response class which gives a context-aware response; you can remove that and return a plain HttpResponse, but it's not going to give you much more benefit. So the point of this slide is to show that the framework rarely gets in your way, especially DRF. What gets in your way is code which does a lot of network lookups or database lookups. If you really, really want to optimize those eight milliseconds of your request, don't use Python, as simple as that. Don't try to over-optimize Django. Django is there for a reason: it's there for maintainability. And those eight milliseconds, even if you give them back to the user, are not going to make that much of a difference compared to the lack of maintainability of your code, right? So don't over-optimize. Be really sure: for example, if you don't want to use a generic view, make sure you really understand why you don't want to do that. So this is what we ended up with: a bunch of services, mostly on Python, many on DRF. Some of them, where we had to do really heavy number crunching, we have migrated to Golang. And actually we have an interesting pattern in one service where the main front end is Python and all the async tasks are done by a Go worker. It's like Celery, but unlike Celery, you can actually utilize all the cores which are available, and you can use all the sweet things which Golang gives you, like channels and concurrency and things like that.
So basically we are moving to a very heterogeneous environment here, but these things are not free, right? You have complexity. Initially you had just one front end, and all the deployment headache was the DevOps guy's, right? How is it going to be load balanced? What happens if it goes down? The DevOps guy takes care of it. But here you need to build things in. And you run into interesting things like the CAP theorem. Anybody know about the CAP theorem? Sweet. So there is complexity involved here which you don't feel when you're a monolith. Network latency, right? You can very easily degrade performance. And implicit interfaces. Let's say I tell Vikalp, okay, let's add one more field saying is_budget on a hotel, right? Now is_budget may mean the string "TRUE" for me, it may mean 1 for him, or it may mean a boolean true for someone else, right? So if you are not explicit about the interface, you will have implicit coupling. And you will realize this on the day your code is going live, and then you'll have to spend a night fixing things, right? So how do you get around this? Any ideas? One thing which we did was use protobufs. With protobufs, you have a file, very much like a C structure, which says this message has this integer, this float, and this string. And from that file, code gets auto-generated for all your apps. We use this in mobile, and we'll come back to it a little later. So how do these guys talk, right? This is the standard model: you have a client making a request to a service. It delegates to some other service, gets a response, and gives it back. Any problems with this? Nothing is reliable, right? Right, so we'll come to that. We have built a pipeline for that. So there are a lot of problems here, right? Here you're interacting with one service.
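To make the is_budget example concrete: protobuf solves this with generated code, but the same "explicit contract" idea can be shown in plain Python with a validating dataclass. The class and field names here are just for illustration; the point is that a wrong type fails loudly at the boundary instead of on launch night.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HotelMessage:
    hotel_id: int
    is_budget: bool   # must be a real bool, not "TRUE" or 1

    def __post_init__(self):
        # Python dataclasses do not enforce annotations, so check explicitly.
        if not isinstance(self.is_budget, bool):
            raise TypeError("is_budget must be bool, got %r" % (self.is_budget,))


ok = HotelMessage(hotel_id=42, is_budget=True)

try:
    # A peer that serializes booleans as strings gets rejected immediately.
    HotelMessage(hotel_id=42, is_budget="TRUE")
    rejected = False
except TypeError:
    rejected = True
```

With protobuf the equivalent guarantee comes for free: the `.proto` file declares `bool is_budget`, and every generated client in every language agrees on the wire type.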
Imagine 10 of them. The whole idea of microservices was to build things using higher levels of abstraction, which is not really happening here. The other thing is nothing is reliable, right? What if that service goes down? Let's say, for example, it's the wallet. In the wallet, I want to deduct 10 rupees from your wallet and then update your statement. Now, the statement update need not be real time; it can be eventually consistent, right? The statement can be eventually consistent with what your wallet actually is. In that case, you can go back to the client saying done, and let the other guy retry. But if it's a transaction over multiple services, you can't do that. So there's a lot of complexity which each service needs to handle on its own, about its peers going down: basically, how do you roll up failures, et cetera? I think that's what you're asking about. So what you want to do is something like this, right? The client gives a request; the main service, the one driving the transaction, responds immediately, saying your request ID is ABC, right? Your response time has gone down from five seconds to like 50 milliseconds. Service X then does its work and updates the request, saying, okay, for request ABC, X is done. It then tells some other service, do Y; we'll come to how it does that. In between, your client may ask, okay, what's the status of this request ABC? And you can actually tell it, X is done. So you're not coupled, right? Your client always knows what's happening, and your client can make decisions. The client can say, okay, X is done, good enough for me, move on. Or some other client may say, no, not really, I want Y to be done too.
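The acknowledge-then-poll pattern just described can be sketched in a few lines. The class and method names are hypothetical; in reality the status store would be a shared DB or cache, not an in-process dict.

```python
import uuid


class RequestTracker:
    """Toy status store: accept a request instantly, record step completions."""
    def __init__(self):
        self._status = {}

    def accept(self):
        """Return immediately with a request id; the real work happens later."""
        req_id = uuid.uuid4().hex
        self._status[req_id] = []
        return req_id

    def mark_done(self, req_id, step):
        """A downstream service reports that its step finished."""
        self._status[req_id].append(step)

    def status(self, req_id):
        """What the client polls instead of blocking on the whole chain."""
        return list(self._status[req_id])


tracker = RequestTracker()
req = tracker.accept()          # client gets its ack in milliseconds, not seconds
tracker.mark_done(req, "X")     # service X reports completion later
tracker.mark_done(req, "Y")     # then service Y
```

The client decides its own policy from the status list: one caller may be satisfied once "X" appears, another may wait for "Y" as well.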
So how do the services tell each other about things, right? And that's where pipelines come in. This is one example. InGoibibo is our hotelier-side app. So let's say a hotel manager says, I'm going to give a 50% discount, right? That goes as a message to a topic called reprice. Everyone familiar with pub/sub? So you have the SRP service, the guy who's actually driving the SRP, listening on reprice, and it says, okay, I'm going to change certain hotels by this much. It then publishes to another topic, saying, okay, invalidate all SRP results. And Hotels, the front-end service which drives the desktop web, knows, okay, I will purge my internal cache and call Cyclone again for the results, right? And Roof is something which we built on our own. It's currently built on top of Kafka, and right now it's optimized for throughput, for situations like this producer is too fast and that consumer is too slow, right? We want to add recipes into Roof for when you care about latency, where you want to give an SLA saying, when I make this push, the hotel should be done within X milliseconds. And that's a trade-off: you can't have the highest throughput with the smallest latency, so you need something different. We are building those recipes into Roof. It's on GitHub; have a look at it. So what's the advantage, right? Let's say we want to do fraud detection on this. We don't want that admin in the hotel to change prices more than 10 times a day. So you build your sweet fraud-detection algorithm, right? It can just subscribe to these logs and run its own algorithm, and it can generate a DB for you saying, hotel Leelavati somewhere is doing a lot of fraud, so please block it.
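The flow above — hotelier app publishes to reprice, SRP reacts and publishes an invalidation, the front end purges its cache — can be mimicked with a toy in-memory broker. The topic names come from the talk; the broker itself is a stand-in for Kafka/Roof, so delivery here is synchronous rather than persisted.

```python
from collections import defaultdict


class Broker:
    """In-memory pub/sub: handlers subscribe to topics, publish fans out."""
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subs[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subs[topic]:
            handler(message)


broker = Broker()
purged = []

# SRP service: a repriced hotel means its cached search results are stale.
broker.subscribe("reprice",
                 lambda m: broker.publish("invalidate-srp", m["hotel_id"]))
# Front-end service: purge the local cache for any invalidated hotel.
broker.subscribe("invalidate-srp", purged.append)

# Hotel manager gives a 50% discount via the hotelier app.
broker.publish("reprice", {"hotel_id": "H1", "discount": 0.5})
```

The decoupling benefit shows up immediately: a fraud-detection service could call `broker.subscribe("reprice", ...)` tomorrow and neither the publisher nor the existing consumers would change at all.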
Pretty much like how Uber blocks users, things like that. And what have you changed? Nothing. Your InGoibibo doesn't really know that some other guy is consuming its data; it doesn't care. Same way, let's say we have dynamic pricing which looks at our competitors and says, you know, the price there is lower, let's do a reprice. That can push to the same topic, right? And the rest of the pipeline just stays the same. You don't really need to be bothered about who's consuming or who's producing what. The other cool thing is, let's say someone is deploying Cyclone and someone publishes a message. Before this, that request would have failed, right? Oh, Cyclone is down, what do I do? Raise an alarm, call the Cyclone dev, and figure out what's happening. Here you don't really bother, because these messages are stored, persisted at the topic. Sorry? So our default TTL is one week, but for topics which are very heavy in IO, we can override that TTL per topic. Some topics have very short TTLs; the default is one week. We are not LinkedIn, so we don't have all the disks that we wish we had. Another thing here is, you know, which city a hotel is in. Seems like an easy problem, but the thing is, hoteliers want to change that. Like Shimla and Manali: some days they want to say, I am in Shimla, and some days they want to say, I am in Manali, right? So let's say you search for hotels in Manali. Earlier, Cyclone had to call this guy Voyager, and Voyager knew all about which hotel is in which city. But that's not really a good idea, because this data doesn't change that often, yet you are incurring that overhead for every request. So what we did is let Cyclone cache all the mappings of which hotel is in which city, and Voyager sends invalidations whenever there's a change, so Cyclone can purge its cache and call Voyager again to get the latest mapping.
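The Cyclone/Voyager arrangement is an invalidation-driven cache. The service names are from the talk, but this code is only a sketch: a dict stands in for Cyclone's cache, and a callback stands in for the call to Voyager.

```python
class MappingCache:
    """Cache hotel-to-city mappings; refetch only after an invalidation."""
    def __init__(self, fetch):
        self._fetch = fetch      # stands in for the RPC to Voyager
        self._cache = {}
        self.fetches = 0         # count upstream calls, to show the savings

    def city_of(self, hotel_id):
        if hotel_id not in self._cache:
            self._cache[hotel_id] = self._fetch(hotel_id)
            self.fetches += 1
        return self._cache[hotel_id]

    def invalidate(self, hotel_id):
        """Called when Voyager pushes an invalidation for this hotel."""
        self._cache.pop(hotel_id, None)


source = {"H1": "Shimla"}                      # Voyager's source of truth
cache = MappingCache(lambda h: source[h])

first = cache.city_of("H1")                    # cache miss: calls Voyager
second = cache.city_of("H1")                   # served from the cache
source["H1"] = "Manali"                        # the hotelier moves the hotel
cache.invalidate("H1")                         # Voyager pushes an invalidation
third = cache.city_of("H1")                    # refetches the new mapping
```

Two upstream calls for three lookups here; in production the ratio is far better, since mappings change rarely but are read on every search.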
So how do the clients talk to these services? Here's an app. Now you've broken down your one Django app into multiple fine-grained services; is it a good idea for the app to call them all directly? It's not just business logic, right? You'd have too much processing happening in the client on the mobile device. And these calls happen over very high-latency networks, so even small things like DNS lookups are very costly, right? And then there are SSL handshakes and all those things. Plus, all these services speak different things. Imagine your app size if you build every serializer in the world into it; it's going to take forever for the app to load, right? So, short story: we don't want all the services exposed to the client. And one major thing here, right? You can't release your app as often as you release code on the web. Apple takes its own sweet time doing QA, and you can't force people to update your apps every time. So effectively, you have multiple versions of your clients out there, and if you make a mistake, you are basically screwed, right? So what we want to do is not do this. We built a mobile gateway, which aggregates all these different services and then spews out protocol buffers to the client, right? For Android, this is pretty much built in with Google protocol buffers, but there is very good support even in Objective-C or Swift, et cetera. So basically, there is one IDL file which is shared by the mobile gateway, and from the same file, code is generated for your client as well as for your gateway service. And if you make a mistake and you released it and you find out, there's only one place you need to make the change. And the other design decision we had to make was how to transport the protobufs.
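The gateway's job can be sketched as a simple fan-out-and-aggregate function. The service names and payload shapes below are made up for illustration; the real gateway speaks protobuf to the app and calls many more services.

```python
def flights_service(query):
    """Stand-in for the fine-grained flights search service."""
    return [{"flight": "6E-123", "price": 2900}]


def wallet_service(user_id):
    """Stand-in for the GoCash wallet service."""
    return {"gocash": 2000}


def mobile_gateway(user_id, query):
    """One endpoint for the app: aggregate internal services into a single
    payload so the client makes one network round trip, not many."""
    return {
        "results": flights_service(query),
        "wallet": wallet_service(user_id),
    }


payload = mobile_gateway("u1", {"from": "BOM", "to": "DEL"})
```

Because the app only ever sees the gateway's schema, the internal services can be split, merged, or rewritten without shipping a new app version, which matters when old clients linger in the field.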
If you compile a protobuf, it actually gives you RPC stubs, but we decided to go with HTTP, simply because it's much more firewall-friendly and it's just easier that way. So another pattern, which is not new but which we employed very successfully: what happens if one external service goes down, right? An external service, as we discussed, doesn't guarantee an SLA, and that one service can quickly bring down your whole architecture. If you're using gevent or greenlets, you're not blocking threads, but you're blocking other things: memory, connections to your DB, et cetera. So even if you have greenlets and all those nice things, this is still a major problem. So we built something called PyStrix. People know about Hystrix? It's basically a thing which Netflix invented, and they built it in Java. We built this in Python. So basically it's a bulkhead. Every request to an external service goes through PyStrix, and PyStrix is like a circuit breaker. If it feels the other guy is down, it can give you a default response. The default response could be a cached response or anything your service sees fit. And multiple instances gossip with each other: if one knows the external service is down, you want all your instances to know, and likewise when it comes back up again. This is code for how you use Hystrix, PyStrix rather. You subclass Command, which is a PyStrix class, and you just define two methods, run and fallback. Run is what talks to your external service and gives you data. Fallback takes the same arguments and gives you a value if run doesn't finish. And you can run it synchronously, you can run it with a timeout, or you can run it async and it gives you a future. I mean, we are working on that, but basically it will give you what is typically called an observable in Java.
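The command interface described above can be sketched as follows. This is not the actual PyStrix API, just an illustrative circuit breaker with the same shape: subclass a `Command`, implement `run` and `fallback`, and the breaker trips after repeated failures. The threshold, timeout, and the `HotelSearch` example are all invented for the sketch.

```python
import time

class Command:
    """Minimal Hystrix-style command: run() talks to the external service,
    fallback() supplies a default response when the service is down or the
    circuit is open."""
    failure_threshold = 3    # consecutive failures before the circuit opens
    reset_timeout = 30.0     # seconds before a half-open retry is allowed
    _failures = 0
    _opened_at = None

    def execute(self, *args, **kwargs):
        cls = type(self)
        if cls._opened_at is not None:                  # circuit is open
            if time.time() - cls._opened_at < cls.reset_timeout:
                return self.fallback(*args, **kwargs)   # fail fast
            cls._opened_at = None                       # half-open: retry once
        try:
            result = self.run(*args, **kwargs)
        except Exception:
            cls._failures += 1
            if cls._failures >= cls.failure_threshold:
                cls._opened_at = time.time()            # trip the breaker
            return self.fallback(*args, **kwargs)
        cls._failures = 0                               # success resets it
        return result

    def run(self, *args, **kwargs):
        raise NotImplementedError

    def fallback(self, *args, **kwargs):
        raise NotImplementedError

class HotelSearch(Command):
    def run(self, city):
        raise ConnectionError("vendor is having a bad day")  # simulated outage
    def fallback(self, city):
        return {"city": city, "results": [], "stale": True}  # e.g. cached data

cmd = HotelSearch()
for _ in range(4):
    print(cmd.execute("Manali"))  # fallback each time; circuit opens after 3
```

Once the circuit is open, callers get the fallback immediately instead of tying up greenlets, memory, and DB connections waiting on a dead vendor, which is exactly the bulkhead effect described above.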
So you can actually cancel it or do something with it. Any time you subclass the class, a new worker is forked, and all the communication to the external service happens through that worker. Yes, Jitav. So, transactions over multiple services. I think I'm losing time, so I'll not go over it in detail; basically this is how we do transactions over multiple services. We have MongoDB; I can go over it if there are questions. Sorry? Sure, so basically this is the way we do it. Each service has a doc in MongoDB. Whenever you have a transaction with three services, A, B, C, you go to each of them, and each has its own doc. So you say: as part of this transaction, I'm going to touch doc A, doc B, doc C. Then you go and mutate each of those docs. And at the end, you delete the transaction doc. That's the sweet normal case. The thing is, what happens if something goes down? So in every transaction, someone is driving the transaction; the guy who's talking to the client is driving it. If any one of the services fails, he knows which other services are involved and he can roll back those changes. What happens if the driver himself dies? So basically we say every transaction has an ETA. A transaction can't go on for over two hours, for example; if a transaction has been going on for two hours, something is wrong. So next time some other transaction comes along which involves B and C, it sees that they are in a transaction; they have a pointer to the transaction doc, right? It sees that the transaction has not really been moving for the last two hours, and then it can actually revert and roll back. I can talk about that more, but I'm losing time; catch me at the booth. So lastly, I think the question was about Cassandra, I mean, why we use so many different things. So this is where we use Cassandra, just for a very specific use case. So basically we store hotel search results per check-in day as well as per length of stay.
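The transaction scheme just described can be sketched with plain dicts standing in for the MongoDB docs. This is an illustrative saga-style sketch, not Goibibo's code; the field names and the `balance` payload are invented.

```python
import time
import uuid

# In-memory stand-ins for the per-service MongoDB docs.
services = {"A": {"balance": 10}, "B": {"balance": 10}, "C": {"balance": 10}}
transactions = {}          # txn_id -> transaction doc

ETA = 2 * 60 * 60          # a transaction may not run longer than two hours

def begin(docs):
    """Record up front which docs this transaction is going to touch."""
    txn_id = str(uuid.uuid4())
    transactions[txn_id] = {"docs": docs, "started": time.time(), "undo": []}
    return txn_id

def mutate(txn_id, doc, delta):
    services[doc]["balance"] += delta
    transactions[txn_id]["undo"].append((doc, -delta))  # how to roll it back

def commit(txn_id):
    del transactions[txn_id]   # the sweet normal case: drop the txn doc

def rollback(txn_id):
    for doc, delta in reversed(transactions[txn_id]["undo"]):
        services[doc]["balance"] += delta
    del transactions[txn_id]

def maybe_recover(txn_id):
    """A later transaction touching the same docs finds this one stalled
    past its ETA and rolls it back on the dead driver's behalf."""
    txn = transactions.get(txn_id)
    if txn and time.time() - txn["started"] > ETA:
        rollback(txn_id)

txn = begin(["A", "B"])            # happy path: mutate both, then commit
mutate(txn, "A", -5)
mutate(txn, "B", +5)
commit(txn)
print(services["A"], services["B"])   # {'balance': 5} {'balance': 15}

txn2 = begin(["B", "C"])           # failure path: driver rolls back
mutate(txn2, "B", -7)
rollback(txn2)                     # e.g. service C was down
print(services["B"])               # {'balance': 15} — unchanged
```

The ETA is what makes the scheme self-healing: no transaction holds docs hostage forever, because any later participant can detect the stall and undo it.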
So if you check in on day two and you're going to stay for two days, it's 200 rupees; day three, 300 rupees. We do this to return the results very fast; it's basically standard denormalization. The problem is write amplification. Let's say D3 changes and we're caching for three days. D3 changing means every cached result whose stay covers D3 has to be invalidated: for a check-in on D3 itself that's three entries, for D2 it's two, and for D1 it's one. So the progression is three plus two plus one. And we store for 15 days, so a single write amplifies to 120 (15 + 14 + ... + 1), right? So all these things I talked about are open source: github.com/goibibo. And if you're intrigued, we are hiring. And we have a hackathon, by the way, in two weeks; 5,000 developers have signed up. You can check out our API, and if you have something interesting, do get in touch. Any questions? Okay, there are two questions. But essentially what we want to do is, for every ORM, have only one service interfacing with the ORM, and all the other services interface with that one service. So we want to avoid multiple guys talking to the ORM. Okay guys, let's thank Jyoti for the lovely talk that he's given us. Please, a big round of applause for him. And from the PyCon India team, I would like to give him a small gift for giving a wonderful talk. This is small. Thank you so much. Thank you.