Thanks, Pushpesh. A wonderful talk. I'll pick up from where you left off. I think we've had a similar journey when we started, but I'll focus this talk more on what we did as a V2 of how we started earlier. I'm going to talk about data platform patterns at scale.

A bit about me: I'm Jayesh, a director of engineering at Hotstar. Previously I've led data engineering teams at Grofers and TinyOwl. I'm quite active on Twitter, reach out to me at @jayeshsidhwani, and I'm always available at jayesh@hotstar.com.

A bit about Hotstar, to set the context. I'm guessing most of you have either heard of Hotstar or have used it, but a quick recap: Hotstar is the number one OTT platform in India. We have over 350 million downloads across our iOS and Android apps and the living room devices. At peak we've been able to do 10.3 million concurrency, that is, 10.3 million users active on our platform at the same point in time. To put it in context, that's a little more than the entire population of the United Arab Emirates, or of Portugal, and all of those people were on the platform at the same time. We set a world record with that concurrency; this happened in the IPL 2018 finals. At peak, we've consumed approximately 75% of the total internet bandwidth in India. We're available in more than 15 different languages today, and we'll soon be available in nine different countries. We started off with India, but we're going to the US, Canada, UK, Indonesia, Vietnam, Australia, Singapore, UAE.

What makes us a little different, and imposes different technological challenges, is the variety of content we have: we do live content, we do on-demand content, and then there's sports, news, TV, movies, all of it available in regional catalogs too. So there's a very diversified understanding of our users and our demographics, and we build for that.

This talk is more about why we chose to build a platform, not just what we built. Let's quickly start with what we have, and then we'll drill down into the details of each component. A quick thousand-foot overview: we have a fairly simple, traditional data platform. It starts with clients on the left. We have around 13 different clients, like I said earlier, all the mobile and living room devices, which constantly send data points to our platform. These data points are mostly clickstream events: what did you view, which page did you open, which button did you click. We've instrumented around 120 different events this way. All of these events are ingested into a central message bus built on top of Kafka, and Kafka is the only gateway for any external data coming into our system.
Behind the message bus sits a bunch of processing compute units. These processing units continuously listen to the incoming data, massage it, transform it, and push it downstream for consumption and storage. We tried to keep the storage layer very simple, and I'll talk about the storage components later, but we have a data lake primarily built on S3 and a data warehouse which is primarily HBase. We'll dig into the details.

Moving to data products: at Hotstar, data is ubiquitous. It's not just used for analytics; engineers build a bunch of data-driven products on top of it as well. In fact, the entire app is personalized for your experience, and the whole personalization engine runs off the data that's ingested here. So content personalization, ad personalization, traffic modeling, and of course analytics: those are the consumers of our data platform.

With that said, here's what I'm going to talk about: the different patterns we came across along the way, and a few anti-patterns, given the context of where we were. We'll break it down into ingestion patterns, storage patterns, and consumption patterns.

A quick brief on ingestion: we've built a platform that has ingested a peak load of 1 million events per second, and it sustained that for a good period of 30 to 40 minutes. Throughout that period there were 1 million events per second coming at us, and we were able to ingest them with high durability. Like I said, it's built entirely on Apache Kafka: highly available, highly durable, because those are our primary expectations of this tool. We'll talk about it later in detail. 1 million events per second is what we did last year, at IPL 2018. We've grown tremendously from that point, and this year we have a projection of 50 billion events per day during IPL 2019.

On storage: on a regular day we end up producing about 3.5 terabytes of data. The data is primarily stored in S3; we're also actively looking at HDFS, now that our engineering teams are more mature. The warehouse, like I said, is entirely HBase. In line with the 50-billion-events projection, we're looking at 14 terabytes of data ingested per day this IPL 2019.

Our consumption patterns are also quite unique. Like I said, the entire company uses data to build their products, and on a regular day we end up supporting around 300 terabytes of data scans. That includes everything: all our analytics and the data products built in house. We wanted to make consumption very simple, so we have one single interface to talk to data (again, more on that in detail later), to bring parity between streaming and stationary data. And we didn't want anyone to be limited in the amount of data they can scan.

Starting with data ingestion: the broad vision was to build a scalable HTTP ingestion proxy. Sure, there are multiple other protocols, but a simple HTTP proxy is something everybody in the company can understand. And something that's unique for us is the data spikes we get: we handle about 3x spikes in less than a minute. And 3x here means millions; we go from 3 million to 9 million events per second in less than a minute, so the platform needs to know how to handle that scale.
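To make that concrete, here is a minimal sketch of what such an HTTP-to-Kafka ingestion proxy could look like. Flask and kafka-python are illustrative choices here, not necessarily what Hotstar runs, and the endpoint, header, and broker names are hypothetical:

```python
# Minimal sketch of an HTTP -> Kafka ingestion proxy (illustrative, not
# Hotstar's actual service). The proxy stays deliberately dumb: accept the
# raw payload, hand it to Kafka, let downstream processors massage it.
from flask import Flask, jsonify, request
from kafka import KafkaProducer

app = Flask(__name__)

# acks="all" favours durability: the write is acknowledged only after all
# in-sync replicas have it, so a single broker failure loses nothing.
producer = KafkaProducer(
    bootstrap_servers=["kafka-1:9092", "kafka-2:9092"],  # hypothetical brokers
    acks="all",
    retries=5,
    linger_ms=20,              # tiny batching window helps absorb 3x spikes
    compression_type="gzip",   # leaner on the wire
)

@app.route("/v1/events", methods=["POST"])
def ingest():
    event_name = request.headers.get("X-Event-Name", "unknown")
    producer.send(topic=event_name, value=request.get_data())
    return jsonify({"status": "accepted"}), 202
```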
Ingestion pattern number one: allow everyone to ingest any data. That's important, because one of the biggest success factors of a data platform, we realized, is that you can't stop anyone from ingesting any data. People really don't know in advance what data they want to ingest; they only know things in English. So let them ingest JSON, Avro, Protobuf, whatever they want. The platform should understand binary, and that's what we wanted to do. For us, a bunch of clients, like we spoke about, send data. A lot of microservices send their data via their databases: we replicate the entire databases of our different services. We have a content service, a user management service, and each of these services has its own database, which could be MySQL, Postgres, Dynamo, and so on. All of that data is ingested from the write-ahead logs (binlogs in the case of MySQL) into our common interface, and from there on it's available for query.

Like I said, we wanted to make ingestion simple: HTTP for people who don't want to understand anything beyond it, and low-level TCP libraries for our power users who want to extract more juice out of the system. Kumar spoke a lot about how client SDKs should be built, and we pretty much follow the same principles, so I'll skip over it. But in brief: the client SDKs are super simple. All that the client needs to worry about is just one function, track, and that's about it. The SDK transparently takes care of batching, retrying, ensuring high deliverability, and so on. This allowed our consumers to just go wild and instrument any data they thought could remotely be useful, and that gave us a lot of power.
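As a rough sketch of that idea, one public track() call with batching and bounded retries hidden inside, here is what such an SDK could look like. Everything in it (class name, endpoint, batching thresholds) is illustrative, not Hotstar's actual SDK:

```python
# Rough sketch of a track-only client SDK: one public function, with
# batching and bounded retries handled transparently.
import json
import threading
import time
import urllib.request

class Tracker:
    def __init__(self, endpoint, batch_size=50, flush_secs=5):
        self.endpoint = endpoint
        self.batch_size = batch_size
        self.buffer = []
        self.lock = threading.Lock()
        threading.Thread(target=self._flush_loop, args=(flush_secs,),
                         daemon=True).start()

    def track(self, event_name, properties):
        # The only call a client ever makes: queue the event and return.
        with self.lock:
            self.buffer.append({"event": event_name,
                                "properties": properties,
                                "ts": time.time()})
            if len(self.buffer) >= self.batch_size:
                self._flush()

    def _flush_loop(self, interval):
        while True:
            time.sleep(interval)
            with self.lock:
                self._flush()

    def _flush(self):
        # Caller holds the lock. Ship the batch with exponential backoff.
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        body = json.dumps(batch).encode()
        for attempt in range(3):
            try:
                req = urllib.request.Request(
                    self.endpoint, data=body,
                    headers={"Content-Type": "application/json"})
                urllib.request.urlopen(req, timeout=2)
                return
            except OSError:
                time.sleep(2 ** attempt)

tracker = Tracker("https://events.example.com/v1/events")  # hypothetical URL
tracker.track("watched_video", {"content_id": "C1", "watch_time_mins": 12})
```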
The second pattern: quality of data is super important. I tend to draw parallels with how we would build a database system. Imagine you're starting a new service, a new application, and you have to choose a database. Say I pick MySQL. The first thing you do is model the data; the first thing you do is design the schema of that table. But what happened with us, and what's common across a lot of companies handling data, is that events and data are not treated as first-class citizens. There's not enough thought on what should be instrumented and how. That hit us hard. For a good one year we were suffering with clients not thinking about what the data format should be, and with two different clients thinking about the same data differently. A simple example: watched video is one of the most important metrics at Hotstar, and it has a property called watch time. The clients would go to the extent where Android would send watch time in minutes as an integer, while another client would send watch time in minutes as a string. These seemingly trivial things hurt really badly when the end users consume the data. And think about it: we have around 150 engineers and 45 data analysts, and every one of them needs to have that insight: "hey, iOS is sending a string, let me cast it." All of those things hamper us big time.

So we brought in a schema registry layer right in front of our data ingestion system, and we mandated that everyone define an event before it could be ingested. Arrest the quality right at the beginning, not at the end. We ended up doing fancier things later. First, we mandated that everyone send us data in Avro, which also made it leaner on the wire. Second, we started doing strict checks on the data. Checks like: if you're sending a value, say subscription status, it can only have a fixed set of values; it has to be an enum. So if you misspell "cancelled", or you introduce your own subscription status for God knows what reason, the data won't be ingested.

We have two flavors of not letting you ingest. One: we just cut it off and say, you sent me bad data, I'm not letting it through, and I'll raise an alert. I'll raise a PagerDuty alert if you've been doing it for a very long time, or I'll just send an email to the owner of that event. In that case the event goes to a dead letter queue, and folks come in, manually scan it, see what's wrong and what's right, fix it, and replay it. The other mode is pass-through mode, in which the producer takes the responsibility: "look, I know for certain that the downstream consumers won't be impacted; this is just an event for me." You can optionally choose for the event to just pass through with normal alerts. We don't casually allow that in practice, but the option exists. The image on the right shows how our monitoring system catches these alerts: you can see a not-in-enum alert, a type mismatch alert, a missing field alert. All of this has allowed us to keep strict control on quality. Garbage in is garbage out, and we don't want that.
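Here is a deliberately simplified illustration of those checks. The real pipeline validates Avro against a schema registry, whereas this sketch hard-codes one toy schema; the event and field names are hypothetical:

```python
# Simplified illustration of schema checks at the ingestion edge: strict
# events that fail validation go to a dead-letter topic, pass-through
# events are let through with an alert.
VALID_SUBSCRIPTION_STATUS = {"active", "cancelled", "expired"}

REGISTERED_SCHEMAS = {
    "watched_video": {
        "watch_time_mins": int,        # an int, never a string -- the old bug
        "subscription_status": str,
    },
}

def alert(event_name, errors):
    # Stand-in for real alerting: email the event owner, page if chronic.
    print(f"[alert] {event_name}: {errors}")

def validate(event_name, payload):
    schema = REGISTERED_SCHEMAS.get(event_name)
    if schema is None:
        return ["event_not_registered"]   # define the event before ingesting
    errors = []
    for field, ftype in schema.items():
        if field not in payload:
            errors.append(f"missing_field:{field}")
        elif not isinstance(payload[field], ftype):
            errors.append(f"type_mismatch:{field}")
    status = payload.get("subscription_status")
    if status is not None and status not in VALID_SUBSCRIPTION_STATUS:
        errors.append("not_in_enum:subscription_status")  # e.g. a misspelling
    return errors

def route(event_name, payload, mode="strict"):
    errors = validate(event_name, payload)
    if not errors:
        return "main_topic"
    alert(event_name, errors)
    if mode == "strict":
        return "dead_letter_topic"   # held for manual fix-and-replay
    return "main_topic"              # pass-through mode: let it through
```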
The third pattern for us is very common: you cannot drop my data. If I had to pick just one thing that's crucial, it's being very durable. Hotstar has abnormal peaks. The graph on the right is an actual traffic pattern during the IPL finals, a match between Hyderabad and CSK, and the peaks are crazy. The data comes unannounced. Dhoni comes to the crease, he hits a six: boom, we have three million more users on the platform. Nothing happens for a few overs: four million users just leave the platform. These are events you simply cannot predict, but you still have to build for them.

We've learned things the hard way. For scaling to these abnormal peaks, we started off by proactively provisioning boxes. If we knew there was a Bombay versus Chennai match, we'd say: let me provision 60 boxes, because I know that at least for one second during the match I'll have eight million concurrency, and I'm building for that for the entire six hours. Of course, not useful. We ended up burning money like anything. From there we've matured quite well. The second thing we did was scale on the number of requests: we always plot the incoming request rate, and a simple derivative of that curve can tell us, hey, we're expecting a peak in the next three seconds, or in the next minute. That worked well, but now we have much richer data. This is my third IPL at the company, and with these three IPLs we've got a good sense of how the traffic behaves. Now we're in a position to do a lot of predictive traffic modeling: we can figure out that if Dhoni is coming in to bat, we can expect an X percent increase in traffic. That's helped us be more proactive with scaling.

The other important thing we've realized is that systems will fail, no matter what, no matter how. One very good example, again, is IPL. There was this match; IPL usually starts at 8 p.m., and at 7:10 our Kafka brokers go down and we don't know why. We're struggling to figure it out, we're looking at logs, we're doing everything one could possibly do. Panic kicks in, of course: there's an 8 p.m. match, and the whole app rides on what we've built. Luckily, for that match, at around 7:53, 7:55-ish, the brokers suddenly started working again. We don't know why, we don't know how, but we could serve the match. And the one thing we realized from that: you are not always lucky. We did an RCA, our CTO lashed out at us, and we took all of that in. But we learned one thing: if tomorrow, say during the Asia Cup, the systems fail, we can't be sure they'll be back on time.

So we've spent a lot of time ensuring resiliency in the system. We have a lot of circuit breakers now. But beyond that, we defined a degraded mode for the system. One thing is non-negotiable: like I said, we cannot lose data. But we have degraded SLAs. SLA one says: I'll give you the data, and I'll give it to you in real time. SLA two says: I'll give you the data, but delayed. So now, if at 7:10 I realize my systems are not up to the mark, my circuit breaker opens and the data starts going to a secondary destination, which in our case is mostly S3. The data lands in S3, and a backup channel picks it up from S3 and puts it back into the main flow. That way, at the very least, we don't panic when the match starts, and we still don't lose the data.
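A stripped-down sketch of that degraded-SLA path might look like this: produce to Kafka while the circuit is closed, spill to S3 once it opens, and let a backfill channel replay later. The thresholds, topic, and bucket names are all illustrative:

```python
# Sketch of the degraded-SLA path: Kafka while the circuit is closed
# (SLA 1, real time); spill to S3 once it opens (SLA 2, delayed), where a
# backfill channel later replays events into the main flow.
import json
import time
import uuid

import boto3
from kafka import KafkaProducer
from kafka.errors import KafkaError

s3 = boto3.client("s3")
producer = KafkaProducer(bootstrap_servers=["kafka-1:9092"], acks="all")

failures = 0
OPEN_AFTER = 5        # consecutive failures before the breaker opens
open_until = 0.0

def send(topic, event):
    global failures, open_until
    if time.time() < open_until:        # breaker open: take the delayed path
        spill_to_s3(topic, event)
        return
    try:
        producer.send(topic, json.dumps(event).encode()).get(timeout=2)
        failures = 0                    # delivered in real time
    except KafkaError:
        failures += 1
        if failures >= OPEN_AFTER:
            open_until = time.time() + 60   # stay open for a minute
        spill_to_s3(topic, event)       # the data is never dropped

def spill_to_s3(topic, event):
    s3.put_object(Bucket="ingestion-spill",   # hypothetical bucket
                  Key=f"{topic}/{time.time():.0f}-{uuid.uuid4()}.json",
                  Body=json.dumps(event).encode())
```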
So we survived that scare, and then we got another one. The other scare was when we hit a peak of 1 to 1.2 million events per second and our load balancer started throwing up. For us it was the first time a load balancer couldn't survive the scale. After a lot of research, we ended up sharding the load balancer, so now we have LB1, LB2, and so on. That worked for a while, but we've matured again and become a little better at it. Now we do smart routing on the edge: we run compute on the CDN provider, and that has allowed us to do smarter things. Say I'm browsing Hotstar from Kanpur. I'm not connecting to the Mumbai data center or the Singapore data center first; I'm connecting to, maybe, a UP point of presence of my CDN provider. We run compute there to figure out from the headers what kind of event this is: is this a premium user, is this a free user? Those decisions help us figure out how critical that event is. So a watched-video event gets a fast lane: all the other events step aside, this is the one I'm letting through. Versus, say, a downloaded-video event: you wait over here, and I'll get to analyzing you in an hour or two. Those smarter decisions have helped us scale better, and should let us absorb whatever scale comes at the platform tomorrow.

I'll quickly move on to data storage. We've just wrapped up the first part, ingestion, and storage is the middle part. Data storage is simple, no frills: I just want to store my data and I want to query my data, no other asks. But our data size has grown dramatically. I spoke about this earlier: from last year to this year we've grown 200%, and my sense is that next year we'll grow by a triple-digit, sorry, a four-digit percentage. Until now we were only in India, but now we have India, Indonesia, Vietnam: countries as populated and as crazily entertainment-driven as India is. So whatever we've built today will of course be useless tomorrow; we have to continuously iterate and make it better.

One of the most important things we wanted from storage is for the data to be available in near real time, and I'll give an example. Say there's an RCB versus CSK match happening. OK, I'm a Dhoni fan, so I keep bringing Dhoni up; so it's Kohli's RCB against CSK. And say for a brief period there's a lull: three, four overs where nothing is happening in the match, a dead phase. Our users drop off, and they do. Then suddenly Kohli starts hitting sixes; he goes berserk like he does whenever he gets in the mood. We want the ability to call back the users who dropped off our app in the last five minutes, because those are the users most likely to come back to the platform. And you want to build a platform that lets our analysts do that themselves; you can't expect an engineer to be pulled in to run scripts at that moment. The analyst should just write a SQL query: give me the users who dropped in the last five minutes, I want to send them a push notification. With that in mind, we've built a system that makes data available for query in less than 10 seconds.
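The analyst-facing version of that is plain SQL. Here is a sketch of what the flow could look like, using PyHive's Presto client as a stand-in for the query gateway; the host, table, and column names are hypothetical:

```python
# Sketch of the "call back dropped users" flow: plain SQL over the
# real-time (Kafka-backed) table, then a push notification per user.
from pyhive import presto

QUERY = """
SELECT DISTINCT user_id
FROM realtime.app_events
WHERE event_name = 'app_closed'
  AND event_ts > now() - INTERVAL '5' MINUTE
"""

def send_push(user_id, message):
    # Stand-in for the real push-notification service.
    print(f"push -> {user_id}: {message}")

cursor = presto.connect(host="query-gateway.internal", port=8080).cursor()
cursor.execute(QUERY)
for (user_id,) in cursor.fetchall():
    send_push(user_id, "Kohli is on fire! Come back to the match.")
```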
Data storage pattern one: that noisy analyst will always be noisy. I don't know how many of you relate to this, but it's a very common thing I've noticed. There's this one person who will run queries for eight, ten hours, and nobody else in the company can use the data system. However much you tell that guy, he won't listen. So we did the other thing. We said: we won't tell you anything, but we'll build systems that let you do whatever you want to do.

Hotstar is very seasonal. There's IPL every year; Game of Thrones is happening after two years now. And while it's seasonal, the other important thing for us is that every month there's effectively a new season. This month we're launching Hotstar Specials, which is our version of originals, so every month there will be a new original season coming, a new season that our analysts get busy analyzing data for. And for each of those seasons they end up querying two or three years' worth of data, all on my cluster. When we started, it was very difficult: folks would tap a button, run the query, go for coffee, maybe a smoke break, have lunch, come back and still see it running.

We've done better since. We realized one super important thing: decouple storage and compute. Storage is not elastic; it only ever grows. Compute, though, is very elastic. If you're running a query over three years' worth of data, maybe I can provision 10 boxes for you for those 15 minutes, but I should be able to scale them down immediately after. When we started off we were mostly on Amazon Redshift, and that didn't let us decouple things well, and I'm talking about just last year. Since then we've put a lot of thought into decoupling, and we've been able to do it. With that, we now get very clean resource isolation: the retention team has its own compute instances, which scale up and down however they want; our sales team has its own cluster; the insights teams have their own clusters. Everything scales independently, but the data is the same: in the end, all of these compute clusters talk to our data stored in S3.

Data storage pattern two: find patterns and optimize the query further. One thing I've realized is that however much you optimize a query, there's still room for more optimization. To that end, one thing we're very particular and alert about is logging everything that happens in a query: what happens in the query planner, how the planner runs the query, how many scans the query does. All the while, we're interested in one question: is there data that is frequently queried by multiple queries? And all the queries run through this lineage analysis.
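As an illustrative reduction of that analysis, assuming a toy nested query-plan format (the real planner output is richer and isn't shown in the talk), you can count how often each scanned input appears across the day's queries; the frequent common inputs are candidates for pre-aggregation:

```python
# Count how often each scanned input appears across the day's query plans.
from collections import Counter

def scanned_inputs(plan):
    # Walk the plan tree and collect every table scan the planner recorded.
    if plan["operator"] == "TableScan":
        yield plan["table"]
    for child in plan.get("children", []):
        yield from scanned_inputs(child)

def hot_inputs(query_plans, top_n=10):
    counts = Counter()
    for plan in query_plans:
        counts.update(set(scanned_inputs(plan)))  # one vote per query
    return counts.most_common(top_n)

# If, say, raw_clickstream joined with content_metadata shows up in most
# queries, materialize that join once and point analysts at the result.
```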
And what we've done is, after all of this analysis, periodically figure out: look, if I break a query down into multiple lineages, lineage one and lineage two are common to most queries. So let me create an aggregated, consolidated dataset of lineages one and two, and ask the analysts to start from lineage three. That has allowed us to make the queries much more performant.

And when we do that, our analysts, who are of course much smarter than us, come up with new definitions of data that break the whole lineage pattern. One such example: last IPL, IPL 2018, our analysts ran out of all the existing definitions of things they wanted to measure, and came up with a new one called IPL reach. That's anything on the platform that could have a first-degree, second-degree, or third-degree association with IPL, which could also mean an ad run on the platform that is related to IPL. They wanted to analyze all of that content and figure out its impact on the users. So again our engineers jumped into the picture: hello, please, please don't run that query on the raw data; I'll create a new definition for you, IPL reach. We did that. But it also meant our engineers spent a lot of time doing this. So progressively we built something we loosely call ETL as a service, which lets the analysts just write SQL, while all the heavy lifting of aggregation, scheduling it on a Spark cluster, running it as MapReduce jobs, and so on, is taken care of by the platform. So the analyst still just writes a query and creates a derived data source, so to speak. This has worked beautifully for us, and we're now investing more resources into making it better.
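A skeletal version of that "ETL as a service" idea, sketched in PySpark: the analyst supplies only a SQL query and a dataset name, and the platform owns scheduling and writing the derived data. The paths and names are illustrative:

```python
# Skeletal "ETL as a service": the analyst supplies only SQL plus a name;
# the platform does the rest.
from pyspark.sql import SparkSession

def materialize(dataset_name: str, analyst_sql: str):
    spark = (SparkSession.builder
             .appName(f"etl-as-a-service:{dataset_name}")
             .getOrCreate())
    df = spark.sql(analyst_sql)              # the heavy lifting
    df.write.mode("overwrite").parquet(
        f"s3a://derived-datasets/{dataset_name}/")   # hypothetical bucket

# The analyst-facing part is just this:
materialize(
    "ipl_reach",
    """
    SELECT user_id, content_id, event_ts
    FROM warehouse.events
    WHERE content_tag LIKE '%ipl%'
    """,
)
```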
My last chunk is on data consumption, which is super critical for us. Just as we didn't want a producer to think twice before ingesting any data, we didn't want our consumers to think before consuming it. Talk is cheap, show me the data: sure, it's clichéd, but we swear by it, and it's something that runs through the organization top-down. Our CEO wouldn't entertain a business meeting without looking at the data on our BI tool. And that helps us: it gets us more work and keeps us more relevant in the organization. It's built into the DNA, and we stick to it.

One thing we wanted to do was give a single interface to query, and we wanted that single interface tightly controlled by engineers, because we realized one thing: the more control we have, the more quality checks and the more goodness we can extract from the system. In the end it stops being a matter of convincing an analyst to work in a certain way; we just do it automatically, on the fly.

Data consumption pattern one: don't make me think. The data stack is very complex. When we started we had Kafka, we had Hive, S3, Redshift; for a very long time we were considering Ignite. A bunch of technologies behind the scenes. But our analysts need not worry about any of it. They shouldn't even know what sits beneath, or between, their query and the data. We've spent a lot of time building platforms and libraries that abstract all of this away from the analyst, and we have one single SQL interface to the data platform. An analyst can come and write a SQL query, and that query can talk to data coming in from Kafka, talk to S3 via Hive, talk to HBase, or talk to any other data source for that matter, even a relational store like MySQL or a document store like MongoDB. Again: super simple, you just write a SQL query, and we build the systems that resolve where the query needs to go and how.

While doing that, we've given a lot of thought to treating streaming data and stationary data alike. Kumar mentioned a point earlier: the data that lands in the data lake and the data warehouse is enriched, it needs to be full of information, and if you want to support that, the data ends up arriving late. In our case, data arrives in the data lake, or rather the data warehouse, one hour after it's been ingested, and we want it that way. Yet we wanted folks to query data in real time. So we let people consume data from Kafka. Our Kafka ingestion system, like I said earlier, has the data within 10 seconds of the moment it's generated. You click a button, and within 10 seconds I know that person XYZ clicked a button, and I can run my analysis and aggregations on that 10-second window. So now the analyst just writes a SQL query, and underneath we figure out: this one needs data within the one-hour window, so I talk to Kafka; this one goes beyond that, so I talk to the data warehouse or the data lake. That's what we've been doing.

We're always enhancing what we've built, so I'll spend some time on future work. One of the biggest things we're working on is something I like to call local data, global view. Like I said, we'll be in 15 countries, so we need a data lake and a data warehouse local to each country: the Vietnam Hotstar CEO should be able to mine Vietnam's data efficiently, and a global CEO should get visibility into how all 15 countries are performing. And while we do that, we have to comply with GDPR in Europe, with whatever the new privacy law in India turns out to be, and likewise in other countries. A bulk of this year and next year will be spent building systems that can scale globally for us. Our vision is a global query layer: I come, write a SQL query, it figures out where the query needs to go, and the routing happens transparently.

The other thing: we want to be leaner on the wire, with more in-flight enrichment. The client should just tell us, look, I'm user ID U1 and I'm watching content ID C1, and that's just about it. Our in-flight enrichment should do streaming joins, pull data from multiple places, and make the event richer.
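One way that in-flight enrichment could look, sketched as a plain Kafka consume-join-produce loop; in production this would be a proper streaming-join job in a stream processor, and the topics and lookup tables here are illustrative:

```python
# Consume the thin event (user_id + content_id), join it against user and
# content metadata, and re-emit the enriched event.
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer("raw_events", bootstrap_servers=["kafka-1:9092"])
producer = KafkaProducer(bootstrap_servers=["kafka-1:9092"])

# Hypothetical lookup tables; in reality these would be backed by changelog
# topics or a fast store so the join fits the 10-second budget.
USERS = {"U1": {"tier": "premium", "country": "IN"}}
CONTENT = {"C1": {"title": "IPL Final", "type": "live_sport"}}

for message in consumer:
    event = json.loads(message.value)
    enriched = {
        **event,
        "user": USERS.get(event["user_id"], {}),
        "content": CONTENT.get(event["content_id"], {}),
    }
    producer.send("enriched_events", json.dumps(enriched).encode())
```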
And our expectation is that the enrichment we currently take one hour to do before the data is available via the warehouse, we want that same rich data available in the 10-second system. So within 10 seconds you should have everything at your fingertips: stop talking to your warehouse, start talking to Kafka, which has the rich data in real time.

The third thing we want to do is build in a lot of anomaly detection models, and that's super important for Hotstar. I'll give you an example. How many of you know the show Yeh Rishta Kya Kehlata Hai? Only one, Zainab? It's the most-watched show on Hotstar, and one stat I'd like to share: for this particular show, Hotstar, the OTT platform, does about 25% of the total watch time the show gets. That's remarkable, because the OTT industry is just a year and a half old, and within that year and a half the internet has been able to do 25% of what TV has been doing for 20 to 30 years. Two more years, and I don't know if cable will still be there; I hope it is.

With that said, a lot of anomalies happen in the system. In one such instance, sometime in August, watch time on the platform suddenly dipped, and it dipped by a double-digit margin. Everyone panicked, from the product managers right up to the CEO. People were not sleeping at night; people were in war rooms, forming hypotheses, running queries, analyzing data to figure out why it had dipped. And you know what the reason turned out to be, a day and a half later? The two lead characters were planning a divorce, and people didn't support that storyline, so they stopped watching the show. We wasted 48 hours figuring out why people stopped watching. This is very common for us, so we're now putting a lot of energy into building anomaly detection models that, again on that 10-second window, figure out whether there's an anomaly, and if there is, alert the person responsible so we can act on it immediately.

Lastly, we are heavy consumers of open source tools. We started off with hosted solutions, but over the last year we've moved entirely to our own in-house platforms, and everything is built using open source systems, from Apache Kafka to Hive. We will be open sourcing our ingestion system, a highly available, durable system that I'm sure a lot of folks would benefit from; it's something we're working on. We write about everything I just spoke about at tech.hotstar.com, and we have monthly workshops across our Bangalore, Bombay, and Gurgaon offices. Do check them out if you want to know more about how we work. That's all. Open for questions.