Hi, my talk is about the journey we took when we moved from a monolithic architecture to a microservices architecture. How many of you recognize this? This is Krishna's Butterball in Mahabalipuram, a single granite rock. Think of it as the monolith, and that is probably our team trying to support it. You don't have to worry about the rock; it is held up by physics, friction, all those things. But our monolith wasn't like that. What we ended up with was supporting multiple "microliths", if you want to call them that. The journey and the lessons we learned along the way are the crux of this story.

I head the product engineering team at Mad Street Den. We are an artificial intelligence and computer vision startup, and we build a suite of products for the retail industry: everything from recognition engines for e-commerce websites, to marketing tools for video and social media marketing, to products that automate a lot of retail operations work. When people hear that Mad Street Den is a computer vision and AI startup, they assume we only do the deep learning side: building models, neural networks, those kinds of things. But the other half of the story is the engineering, and that is what I am going to talk about. I prefer to call myself a software engineer, but I am also interested in general productivity, marketing, and psychology, and I write about those things on my blog at cnu.com. You can also follow me on Twitter at @cnu.

Coming back to the product Mad Street Den builds: it is called Vue.ai, and it is our retail automation product. It is really a platform on top of which we build multiple products. To give you a very brief overview of how the data flows: our customers are retail customers, fashion and e-commerce websites. They send us their catalog and the user events generated on their site, we do a lot of processing on that data, image processing, data science, all those things, and finally we give them back recommendations and other kinds of products and widgets.

When I joined Mad Street Den as a solutions architect two years back, I was handed a monolithic architecture that did basically this. There was an image processing piece, a Python script, which took all of a client's image data, processed it, and stored the results as individual files. Then there was an image searcher which indexed everything we had processed and stored as vector information; each image becomes a high-dimensional vector of about 21,000 elements. The image searcher indexes those vectors, does a very fast search, and returns a response within its SLA. So this is what we had.
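To make the shape of that monolith concrete, here is a minimal sketch; this is not our actual code, and the featurizer stub, file layout, and function names are all assumptions. The two pieces match what I described: a script that writes one vector file per product image, and a searcher that loads everything into RAM and brute-forces similarity.

    import numpy as np
    from pathlib import Path

    VECTOR_DIR = Path("/data/vectors")  # hypothetical per-client vector directory

    def extract_features(image_path):
        # Stand-in for the real CNN featurizer: a deterministic ~21,000-dim vector.
        rng = np.random.default_rng(abs(hash(image_path)) % 2**32)
        vec = rng.random(21000).astype(np.float32)
        return vec / np.linalg.norm(vec)

    def process_catalog(image_paths):
        # Image processing piece: one vector file per product image.
        VECTOR_DIR.mkdir(parents=True, exist_ok=True)
        for path in image_paths:
            np.save(VECTOR_DIR / (Path(path).stem + ".npy"), extract_features(path))

    def build_index():
        # Image searcher piece: load every vector file into one in-memory matrix.
        files = sorted(VECTOR_DIR.glob("*.npy"))
        ids = [f.stem for f in files]
        return ids, np.vstack([np.load(f) for f in files])

    def search(query_vec, ids, matrix, k=10):
        # Brute-force similarity over the whole catalog, all in RAM.
        scores = matrix @ query_vec
        return [ids[i] for i in np.argsort(scores)[::-1][:k]]

Both halves, plus the serving layer, lived on the same box, which is exactly the coupling we later broke apart.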
And the first problem was that this entire thing was contained in one server, a very compute-intensive server. We used c4.4xlarge instances, because the image search needed to handle millions of requests per day. Naturally you need a minimum of two machines behind a load balancer, and for some of our very big clients we had multiple machines, so the cost for each client kept running into a few thousand dollars per month. And the way we were using the servers, there was practically no difference between using a bare-metal server and a cloud server. We could have replicated the same thing on bare metal and it would have worked fine. We were on AWS but losing out on all the cool features that AWS managed services give you, and we were spending a lot of money on top of it. That was something we wanted to change, because, like I said, thousands of dollars per client really runs up your bill.

Beyond the expense, the problem was that our product needed to grow. We wanted a real-time pipeline through which all the product information and all the user event metadata flows in, and with the monolithic architecture that was not possible. The key difference between our solution and other visual e-commerce or visual recommendation solutions is that we personalize at the individual user level. If user A and user B look at the same product, the recommendations we give them will be completely different, based on each user's past history, their attributes, their buying history, all those things. That level of personalization cannot be achieved with the old-style architecture. We wanted to break it down so that the few pieces that can be cached are cached, and the many pieces that can't be cached are made more efficient. These were all the things we wanted to do.

If I just showed you this diagram, it would look like yet another module-based or microservices-based architecture. This is, of course, the actual structure of what we built. But when we started building it, we didn't want to think about it that way. We wanted a code name for it, and it became Project Gotham. I am going to show you a lot of characters from Gotham; a few of the names are internal jokes, but you will get the gist of what each piece is about.

Coming to the first piece: ingestion. This is where the client gives us their product catalog. They send it in various formats, XML, JSON, CSV, and so on, and through different protocols: some send it over HTTP, a few send it to SQS, and a few drop a dump on an FTP server. Ingestion is the most important piece, because only if the data coming in is clean will all the other pieces work. It was also the major source of chaos, so naturally we had to call it the Joker. The Joker microservice takes the feed a customer gives us and converts it into our internal format, which we call the MAN format. Joker's job is simply that: take the feed, convert it to MAN format, and push it to the next microservice. It has to handle both batch and real-time data, so there were a lot of other requirements on Joker too. And it changed pretty often, because customers would mess up fields or drop fields, and a lot of problems internal to their own feed-generation pipelines propagated down to us. So Joker was the first piece.

Second is Gordon. He is Commissioner Gordon, and he polices the data, routing it to the correct microservices based on different rules. If a new product comes in, he sends it to microservices A, B, and C. If it is an update to an existing product, he sends it to B and C. If it is a specific update, say a price change or stock availability information, he might send it to just microservice B, and so on. So Gordon is a very rule-based router.
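A rough sketch of what a rule-based router like Gordon might look like; the event types, queue names, and routing table here are illustrative assumptions, not our actual rules.

    import json
    import boto3

    sqs = boto3.client("sqs")

    # Hypothetical routing table: event type -> downstream queues.
    ROUTES = {
        "new_product":    ["nightwing-in", "batman-in", "superman-in"],
        "product_update": ["batman-in", "superman-in"],
        "price_change":   ["batman-in"],
    }

    def route(message):
        # 'message' is already in the internal format, with client ID
        # and event type attached upstream by Joker.
        for queue_name in ROUTES.get(message["event_type"], []):
            url = sqs.get_queue_url(QueueName=queue_name)["QueueUrl"]
            sqs.send_message(QueueUrl=url, MessageBody=json.dumps(message))

Because the routing is pure configuration, adding a new downstream microservice is just another entry in the table rather than a code change.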
Gordon was also multi-tenanted: it cared about which client's data it was handling. Data flowed into it through Amazon SQS, and the output went out through SQS and some HTTP calls as well.

Then there is Wonder Woman. This is technically not a microservice; it is a tool we use to define the different categories of products a customer sends. If you have used Photoshop, you know the lasso tool, with which you can select a region. Wonder Woman has a lasso and does almost the same thing. We take a sample dataset and, not draw exactly, but define rules that trace that kind of lasso around the product categories, and we pass the result on to the image processing microservice further down the line.

Then there is Watchtower. This is also technically not an ingestion microservice; it is a data store, and it is the single source of truth where we keep all the metadata. Gordon sends the first copy of the data to Watchtower, which stores it in Postgres, and from there we can access the data over HTTP or take a dump of all of it, all those access patterns.

So within ingestion alone, these are all the different pieces. There are multiple Jokers; each client gets one Joker. Then there is Gordon, which routes everything to the next stages, sends one copy to Watchtower, and also sends sample data to Wonder Woman. That is how the ingestion module itself was split up.

The second module is image processing, and the microservice that does it is called Nightwing. Nightwing is a DC Comics character; technically it's Robin who grows up to become Nightwing. Nightwing does all the deep learning, all the cool neural network work. What it basically does, from our point of view, is take the images, convert each one into that high-dimensional vector of about 21,000 features, and send it off to the next service. That's all Nightwing is. The job is simple, but it is a very, very compute-intensive microservice. We prefer compute-optimized instances over GPU instances for most Nightwing tasks, but for cases where a few million products need to be processed within a few hours, we do have a GPU version as well.

Then there is the image search microservice, which is Batman. This is the most heavyweight microservice we have. It does two things: it indexes the feature vectors for millions of products, and it does a very fast visual search over those features. Every microservice has its own SLA, but Batman needs to return in a few tens of milliseconds. Even if you have tens of millions of products in your catalog, the visually similar products Batman returns come back in about 20 or 30 milliseconds at most. To do that, all the data needs to reside in memory, so this is a very, very memory-intensive application. Of course there is a disk-level backup too; we use DynamoDB to store all the vector information in the backend, as a backup of sorts. So Batman is the image searcher, the heavyweight piece.
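The part of Batman worth sketching is how a memory-resident index meets a tens-of-milliseconds budget: keep every vector in one contiguous matrix and use a linear-time top-k selection instead of a full sort. This is only an illustration of the idea, not Batman's actual implementation, and the names are made up.

    import numpy as np

    class InMemoryIndex:
        def __init__(self, ids, matrix):
            # ids: list of product IDs; matrix: (n_products, 21000) float32,
            # rows L2-normalized so a dot product behaves like cosine similarity.
            self.ids = ids
            self.matrix = matrix

        def search(self, query_vec, k=10):
            scores = self.matrix @ query_vec          # one BLAS call over all products
            top = np.argpartition(scores, -k)[-k:]    # O(n) selection of the top k
            top = top[np.argsort(scores[top])[::-1]]  # sort only those k by score
            return [(self.ids[i], float(scores[i])) for i in top]

np.argpartition avoids the O(n log n) full sort, which starts to matter when n is in the tens of millions; the rest of the latency budget is one matrix-vector product, which is exactly why the whole index has to live in RAM.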
So far I have been talking about the product side of things; now for the user side. The data here is user events: a user clicks on a particular product, interacts with the recommendation widgets, adds to wishlist, adds to cart, buys something. There are about 36 to 40 such events that we track on the customer's website or mobile app. This data flows in through CloudFront, and there is a separate data warehousing pipeline which stores all of this information.

The personalization engine takes all this user data and creates different data models out of it. There are two pieces on the personalization engine side, and the first one is Superman. Superman handles the user behavior-based data, the history of each individual user, all those things. These are the widgets that say "people who bought this also bought these products", or "people who viewed these products ended up buying these products", the kind of widgets you see on Amazon. Superman is the microservice that produces that output, and it runs things like collaborative filtering and cross-product analysis, those kinds of data analysis workloads. Broadly, it segments users into groups and then says these kinds of users do these kinds of things.

But like I said before, our unique selling proposition is individual user-level personalization. We needed a microservice that could do very fast personalization and send out a specific set of products or recommendations for each individual user. Two-Face did that. Initially Two-Face was designed and built as an A/B testing framework. Then we realized personalization ties closely into it, and we built personalization in as well. We call this the dynamic personalization framework, and Two-Face is the microservice that runs it.

Moving quickly to the other data stores. Like I said before, there is Watchtower, the actual source of truth. Then there is Flash, a very fast data structure server, basically a Redis instance. We store things like user event history, session-related data, and product availability information in Flash. It is very fast, it is a single Redis instance, and it works well for us. The important thing is that the data in Flash shouldn't expire; we don't set any expiry on those keys.

Then there is GCPD. In the comics that is the Gotham City Police Department, but for us the acronym stands for Global Cache for Product Data. This is the rough first-level cache that we have. Batman, Superman, all these microservices put their data into GCPD, to be used by the next layer, the API layer, which I will explain shortly. This data can expire, but it shouldn't vanish just because the expiry logic kicked in: even when a cache entry has expired, we still want its content sitting in the GCPD server so we can fall back to it. The standard expiry behaviour doesn't give you that, so we had to build a custom library with that logic in it.
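A minimal sketch of that serve-stale-on-expiry idea, assuming Redis underneath; the key scheme and the redis-py calls are mine, not the actual library. The trick is to store the value without a Redis TTL and keep the logical expiry alongside it, so an expired entry is still readable as a fallback.

    import json
    import time
    import redis

    r = redis.Redis()  # hypothetical GCPD Redis endpoint

    def cache_put(key, value, ttl_seconds):
        # No Redis-level TTL: the entry stays readable after it logically expires.
        r.set(key, json.dumps({"value": value,
                               "expires_at": time.time() + ttl_seconds}))

    def cache_get(key):
        # Returns (value, fresh). Callers can serve a stale value while a
        # background job recomputes it, instead of taking a hard cache miss.
        raw = r.get(key)
        if raw is None:
            return None, False
        entry = json.loads(raw)
        return entry["value"], time.time() < entry["expires_at"]

So a widget request that hits an expired entry can still render something sensible while fresh data is being computed.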
Coming to the API layer: Robin handles the API. Robin is very lightweight, and if you remember the old architecture, where the API and the image searcher were welded together, we have now decoupled Batman and Robin. We split them apart, and Robin serves very fast HTTP API responses. Say, for example, user ABC is viewing product 123, and the client asks: give me these recommendation widgets, give me visually similar products, give me cross-sell products, all those things. What Robin does is make a very fast Batman call and a very fast Superman call, collate all the results, and present a neat JSON response. It does use GCPD in part, but because personalization has to be taken into account, the result from GCPD is mixed with the personalization results; Robin does a very quick calculation, re-sorts the results, and then sends them out. And there is a second level of complexity here, because our product catalog, like I said, arrives over a real-time stream, and availability, stock information, and pricing change at a very fast pace, every minute. So Robin also needs to make a very quick Watchtower call and say: of this set of products, these are the ones actually available right now, so that we don't send out-of-stock products; and then it does a lot of filtering based on the custom fields that we have. That is what Robin does.
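A sketch of that fan-out-and-collate pattern; the service stubs below stand in for the real HTTP calls to Batman, Superman, and Watchtower, and the merge logic is deliberately simplified.

    from concurrent.futures import ThreadPoolExecutor

    # Stand-ins for HTTP calls to the actual microservices.
    def batman_similar(product_id):
        return ["p1", "p2", "p3", "p4"]               # visually similar products

    def superman_scores(user_id):
        return {"p2": 0.9, "p4": 0.7}                 # per-user personalization scores

    def watchtower_in_stock(product_ids):
        return {p for p in product_ids if p != "p3"}  # drop out-of-stock items

    def recommend(user_id, product_id):
        # Fan out the independent calls in parallel instead of serially,
        # so Robin's latency is the max of the calls, not the sum.
        with ThreadPoolExecutor() as pool:
            similar_f = pool.submit(batman_similar, product_id)
            scores_f = pool.submit(superman_scores, user_id)
            candidates = similar_f.result()
            scores = scores_f.result()
        in_stock = watchtower_in_stock(candidates)
        # Re-sort the cached/global results by this user's personalization scores.
        ranked = sorted((p for p in candidates if p in in_stock),
                        key=lambda p: scores.get(p, 0.0), reverse=True)
        return {"user": user_id, "product": product_id, "widgets": ranked}

    print(recommend("user-abc", "123"))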
So this is how it all ties together. Of course, I haven't explained every microservice we have, but these are the main ones, and they explain what the product does. There are Joker, Gordon, and Wonder Woman as part of ingestion; Nightwing and Batman for image processing and image search; Superman and Two-Face on the personalization side; and Robin in front, the API that our clients use. After two years of different iterations, this is what it became.

And the lessons we learned were the more interesting part. The first one is a given; this is lesson zero: you can't start with a microservices architecture in mind and then build it. You have to start with the monolith and break away small pieces, identifying that this service needs to reside on its own, those services make sense together, and so on. Then there is Conway's law; you may have heard of it: organizations which design systems end up building systems that mimic the communication structures of the organization. Initially, before I joined, there was one backend engineer at Mad Street Den. Then I joined, and then Narendra, who is the next speaker, joined. Once we joined, we split the monolith into microservices and started building this. But the corollary is not true: just because we were three people, you can't expect exactly three microservices. Each person has to manage multiple microservices and balance them. So that is a very important lesson.

Lesson one: we had to deploy heterogeneous microservices on a single server. This slide explains it better. There is a compute-optimized server which is used primarily by Nightwing. But there are a few microservices which don't need that much CPU; Joker and Gordon, for example, are on the ingestion side and are more IO-intensive, so you can cluster them together on a single server. Of course, you need to leave a little headroom so each one can grow, but you pack them, bin them together, on a separate instance. And on the other side there is Batman, which, like I said, is a memory-heavy service that doesn't need much CPU, while Robin does a few CPU-intensive tasks. So we packed Batman and Robin together, and since they are both on the recommendation-serving side of things, it makes sense to move them together. That is one approach. Of course, these days you can just put them all in Docker and expect an orchestrator like Kubernetes to handle the packing automatically. But, as other speakers have said, not every microservice can be containerized. There are a few things we haven't Dockerized yet. Nightwing, for example, does a lot of GPU-intensive work, so we don't Dockerize that; Batman also runs on its own VM. So there are a few things we don't Dockerize.

Lesson two: create immutable microservices. This is something we learned later, but we were already doing it without knowing it was a rule. It is famous from the Netflix side of engineering: each version of a microservice keeps running until all the data flowing to it has stopped; you don't shut it down even as newer versions come up. To give an example of how we were doing this without knowing the rule: Gordon. Gordon is the commissioner, but he didn't start off as a commissioner. We had different versions of Gordon: first he was a constable, then an inspector, then an assistant commissioner, then the commissioner. Each version got the next rank as its name. So even while the assistant commissioner version of Gordon was running, the inspector version was not decommissioned; they ran in parallel, because a few clients were still on the older version. Similarly for Batman and Robin; they each have different versions running simultaneously.

Lesson three: asynchronous is better than synchronous. Not always, but in the majority of cases. We moved to SQS queues and message passing instead of direct REST or HTTP calls wherever we could. Basically, queues are good: start using queues and message passing. To give an example, we use SQS for all the ingestion-related queues, and all the user events are handled through RabbitMQ, which for us is faster and easier. So there are different kinds of queues handling different flows, and that was a lesson we learned. There are a few services that do need to be synchronous. Robin has to be synchronous, and Nightwing also has one piece that is synchronous, because if we do an image search on user-generated content, it has to happen in real time; you can't wait for a queue to be consumed. So a few pieces stay synchronous, but use asynchronous as much as you can.
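Since the producing side showed up in the Gordon sketch earlier, here is the consuming side: a minimal SQS worker loop of the kind that sits in front of a queue-fed microservice. The queue name and handler are placeholders.

    import json
    import boto3

    sqs = boto3.client("sqs")
    queue_url = sqs.get_queue_url(QueueName="nightwing-in")["QueueUrl"]  # hypothetical

    def handle(message):
        print("processing", message["event_type"])  # real work goes here

    while True:
        # Long polling: wait up to 20s for work instead of hammering the API.
        resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10,
                                   WaitTimeSeconds=20)
        for m in resp.get("Messages", []):
            handle(json.loads(m["Body"]))
            # Delete only after successful processing, so a crash just
            # makes the message visible again for a retry.
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=m["ReceiptHandle"])

The nice property you get is backpressure for free: if the consumer falls behind, the queue absorbs the burst instead of the service falling over.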
Next lesson: not all microservices need to be in containers or on servers. There was the one I mentioned, Wonder Woman, which is actually a tool running on a laptop, on someone's computer. Whenever a new client comes in, a configuration person looks at the sample data, checks that the rules make sense, and signs off. That is Wonder Woman's job; it is not deployed on a server. That is one lesson.

And you have to build tooling to automate your microservices. Like I said with the first two images: balancing a single rock is problematic, but balancing tens of smaller rocks is even more painful. So we later had to build tools to automate most of the routine tasks we used to do by hand, and that helped a lot.

This next one is very important. Of course, I haven't mentioned a lot of the microservices; there is one that handles everything logging-related, and Narendra, my colleague, is going to explain that logging microservice in detail. But when you are logging, make sure you attach a transaction ID or request ID to every log message. It is very important, because every microservice does its processing based on data it received from somewhere: it could be an image coming in, or metadata coming in, and it passes through multiple microservices. When that ingestion happens, attach a transaction ID and propagate it to every microservice downstream. That helps enormously in debugging when you have a problem. Initially we built the tools and only later realized this needed to be done. Similarly, on the other side, any HTTP API call that gets passed along needs to carry a request ID with it. People forget this when they first start writing microservices, so make sure you have a request ID or a transaction ID.
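A small sketch of one way to do that ID propagation in Python, using contextvars and a logging filter so every log line in a request's path carries the same ID; the header name and setup here are assumptions, not our actual tooling.

    import logging
    import uuid
    from contextvars import ContextVar

    request_id: ContextVar[str] = ContextVar("request_id", default="-")

    class RequestIdFilter(logging.Filter):
        # Injects the current request ID into every log record.
        def filter(self, record):
            record.request_id = request_id.get()
            return True

    handler = logging.StreamHandler()
    handler.addFilter(RequestIdFilter())
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(request_id)s %(name)s: %(message)s"))
    logging.basicConfig(level=logging.INFO, handlers=[handler])
    log = logging.getLogger("joker")

    def handle_request(headers):
        # Reuse the upstream ID if present (e.g. an X-Request-ID header),
        # otherwise mint one at the edge; pass it to every downstream call.
        rid = headers.get("X-Request-ID") or uuid.uuid4().hex
        request_id.set(rid)
        log.info("ingesting feed")
        return {"X-Request-ID": rid}  # attach to outgoing HTTP calls / queue messages

    handle_request({})

With something like this in every service, grepping the aggregated logs for one ID gives you the whole journey of a single feed item or API call across microservices.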
The last lesson is the most important one, at least for me: give a character to your microservices, because the team rallies around the characters. People work on specific microservices, and if we just said "image processing microservice", "image searcher microservice", and so on, it would not be much fun to work on. We have spent countless nights, not days, nights, working on these microservices, and we spent a lot of brainstorming sessions on what to name each one; a lot of fun was had while building this architecture. There are many more characters to add; we started with DC, and we still have Marvel and plenty of other comics to draw from. It also helps with on-call: when someone says "I'm not getting any results", you can say "Batman is down" or something like that. It is funnier, and these kinds of things keep it cool. If there is one takeaway from this talk, I would say it is this one.

That's all I have. The slides are available at this link, on my personal blog, if you want them, and you can follow me on Twitter at @cnu. Any questions?

[Audience] Obviously everybody is talking about serverless architectures, like Lambda and so on. Have you explored some of those?

Yes, we have explored serverless in multiple places. For example, there is the logging piece I mentioned: it was initially built on servers, and then we started exploring serverless for it. We have started using Lambda there, and we have been using Lambda for a few other things as well; there is Athena and all those pieces, and Narendra is going to explain them during the logging talk. Apart from that, there are multiple products we have built on top of this platform. One product we have been building is a video processing product. E-commerce companies have videos, with models wearing the products and so on. We detect the key frames where the products are visible, match them against the client's product catalog, and then serve recommendations. All of that video processing is done serverless: we use AWS Lambda to kick off the transcoding service, and the output is passed along to the next stages. So there are smaller pieces that we have moved to serverless, and logging is something we have started to slowly migrate there as well.

Any more questions? Thank you. If there is anything else, please catch us in the...