I hope everyone is going to filter back from lunch over the next couple of minutes. My talk title is "Enabling SRE with Adaptive Feedback Loops" — a bit of a buzzword title, but I hope it becomes clearer as we go. To give you a bit of context, I've floated between being a production engineer, being an SRE, and working on software products for most of my career. The thing I'm really passionate about building is real-time data pipelines: pipelines that guarantee low latency and reliability for the data being sourced. In my guise as an SRE, I've also taken on the on-call and monitoring strategies for how you actually maintain these real-time systems and keep them up. This talk is focused mainly on case studies and learnings from the latter: how do you create systems that can remedy themselves, with positive feedback loops that help them recover from wrong or saturated states? To start off — and this is a bit of a tautology — we all know that building real-time software is hard. The nature of our world right now, and the way consumers have gotten used to real-time software, means that we have very, very high expectations. From consumer apps like WhatsApp or Google Maps or your favorite charting tool, through to B2B cases, we tend to have three stringent expectations of our real-time software. The first is correctness: we want all delivered messages to be accurate, so that what comes out of one side of the pipe is the same as what we put into the other side. The second is completeness: we want to see the entire window of history for our messages, and to know that when we're absorbing a certain view of the data, that view is complete and has no missing gaps. The third thing we care about is immediacy: we want the guarantee that messages arrive in seconds, not minutes. We architect our systems and our assumptions around the fact that whatever message we put into one side of the pipe will arrive at the other side in the order of seconds, hopefully even milliseconds, because that's how stringent our requirements have become. At the same time, these very high expectations come coupled with high operating complexity. Real-time systems tend to have very heterogeneous load, simply because of the speed at which data is produced and absorbed; it's very easy to go from a steady state to a peak in a short period of time. Think about something like Flipkart's Big Billion sale, where a lot of the traffic is human-driven but there's also a lot of bot traffic happening. When users and bots are flooding into the Flipkart site, every single interaction with the site compounds in terms of the data points it produces — one piece of telemetry capturing a login, one piece capturing a checkout, and various other data points — and this compounds quickly enough that you can go from steady state to peak load very, very fast. The other thing with this kind of system is that even though load is heterogeneous and we can provision for high peak loads, whatever elasticity systems do have tends to be within a certain limit.
The limits might be prosaic, in terms of cost: you can't spend on an infinitely elastic cloud, and you can't autoscale from 50 to 1,000 instances because you just don't have the money for it. But the limit on elasticity might also exist in the application layer. Maybe you have a requirement for stickiness in your HTTP sessions that means you can't simply put a bunch of new servers into the pool, or a requirement for consistent sharding of your data such that adding more servers to a cluster requires resharding the data. All of these create limits on your elasticity, so you have to think about ways to work around those limits and avoid hitting them. The last aspect that makes operating these real-time systems so complex is that the mean time to impact — the time between a degradation occurring and some end user feeling the impact — is far, far less than the mean time to action. The mean time to action here is the time between an on-call engineer becoming aware of the problem and actually being able to put a fix in to remedy it. Because humans tend to be high-latency creatures, there's always lag in how long a human takes to respond to a problem, so you're almost always assured that if there's a production incident in your real-time system, the impact is being felt somewhere before a human being can come in and address it. All of this paints a rather gloomy picture for operating these systems, but as engineers we look for solutions to problems rather than getting bogged down in the particulars. The key component of relying on real-time systems is trust: they're built on trust. Customers of WhatsApp trust that their messages are going to get delivered. Customers of cloud monitoring or cloud event-streaming solutions trust that if an alert is triggered, it gets fired and delivered to a user within a certain period of time. But when it comes to remediation, humans are not inherently trustworthy. I say this because, in spite of the fact that we're all interested in being better engineers and in being as trustworthy and consistent as we possibly can, human beings are just inherently inconsistent in the way they behave. There's one really good quote about this that often gets attributed to Einstein, but I actually looked into it and it's not an Einstein quote; it's by a guy called Stuart Wallish, whose main claim to fame is this quote, along with some other writing. But I really like it. The quote is: "The computer is incredibly fast, accurate and stupid. Man is unbelievably slow, inaccurate and brilliant. And the marriage of the two is a challenging opportunity beyond imagination." Which is to say that even though human beings are high-latency creatures, slow to react and inconsistent in how they react, they still have the capacity to architect systems and automation that take advantage of a computer's speed, accuracy, and consistency. So what we're going to do in this session is walk through a couple of case studies that go through this process of building virtuous cycles into your software systems. By virtuous cycle, in this scenario, we mean a positive feedback loop: a feedback loop that allows a system to recover from a state of degradation.
To give this analogy a little more color, we want to distinguish between vicious cycles and virtuous cycles. You see an example of the former on the left: if you're low on energy, maybe you were a little tired just now and gave yourself a shot of espresso at Flying Squirrel — it's quite strong coffee — so maybe it's going to keep you up at night, and you'll wake up tired tomorrow morning and need more coffee, which keeps you up again. That creates the kind of negative feedback loop, a vicious cycle, that we all know quite well. Contrast that with a virtuous cycle, or positive feedback loop, where the system is able to recover from a stimulus and actually benefit from it. A really good example of this is a social media site: the more visitors to the site, the more content is generated on the site, the more ideas and topics are created there, and this influences the search ranking so that the site gets even more visitors in the future. Software actually has a lot of examples of negative feedback loops, and we're going to go through a couple in this talk. But one of the cool things about what we do for a living is that we're often able to transition these negative feedback loops into positive ones, just by identifying the key stimuli affecting the system and figuring out how to tweak them. I'm going to go through two case studies today. They deal with infrastructure and software issues we dealt with at Datadog when building a real-time monitoring system for the cloud. One is a synchronous case, a rather simple client-server feedback loop; the other is a more asynchronous case, a distributed feedback loop across a variety of machines and services. The first case study is about using the concept of jitter to overcome the thundering herd problem. How many of you are familiar with the thundering herd problem, just by a show of hands? There's a good amount, and a few who are not familiar with it, but I'm hopeful this example will give it some color. I'm going to walk through a really simple piece of infrastructure for a real-time analytics use case. Let's assume we're building a monitoring system — I spoke to people yesterday who were building monitoring systems, and I'm hopeful their architectures might resemble something like this and it might resonate with them. On the left side of your screen, you have a number of clients. They may be laptops, mobile devices, watches, tablets, whatever you have. You have software running on these clients to determine, on a loop, how much free disk space is available. You want this at a 15-second interval so that you can guarantee the immediacy of the data: you don't want to know what the disk space was five minutes ago as much as you want to know what it is right now. So you're running software on a loop to collect this data and submitting it to a load-balanced service, and the goal of that service is essentially to ingest the data and make it available for graphing. The same expectations I mentioned earlier in the talk apply to this particular case: you want correctness of the data, you want completeness of the data, and you want immediacy. If any of those tenets is compromised, you're going to see it in the end graph.
The graph is not going to look the way it should, and you're not going to be able to trust it. So this is the infrastructure we're working with, and I'm going to walk through a rather common failure scenario. The scenario is that one server errors out: it serves 500s to our clients. Let's say we have clients numbered one through M and servers numbered one through N. With load-balanced services we expect errors to occur fairly frequently, and the idea with health checks and load balancing — I hope some of you were at Puget's talk earlier, where he covered this really adeptly — is that our load balancers should respond to these situations by doing something sensible: either evicting the server from the pool entirely or somehow routing requests away from it. In this particular case, let's assume we're doing simple round-robin load balancing. We have one server that's erroring out and serving 500s, and as a result, on the next iteration of the loop, a certain number of clients retry their requests because they received a 500 the first time around. The reason we need to retry is to guarantee completeness of the data: we want to make sure we're not dropping data on the floor, and just because a transient failure of a single server happens, we don't want to discard the data points we've collected entirely. So we have a subset of clients — roughly M/N of them, because one particular server errored — retrying simultaneously, and we have one less server in the pool, because our load balancer has, smartly enough, ejected the server that's been erroring out and failing its health check. What this creates is a scenario where, if you assume steady-state load was a certain amount, by losing this one server our request load is a little higher than average and our server capacity is a little lower than average. We have a certain number of clients retrying simultaneously, along with all the new data that's coming in, which creates a slightly higher request load on this load-balanced server pool. The end result of all this is that, if you consider a baseline amount of time that requests take to service, every individual request now takes a little longer: you have a smaller server pool than before and more request load to handle than before, and as a result you're a little closer to your pool being saturated than in the original case where you were running at full capacity. So the health of the system really depends on having way more servers than you actually need to handle these deviations in throughput, and it leads us, as engineers, to constantly think about how to provision our architecture for peak load rather than steady-state load, so that we're not scrambling to add more machines or capacity due to these sudden deviations. But over-provisioning itself, to me, is a bit of an anti-pattern — and this is a somewhat controversial take, because it's a very popular way to solve these problems. The distinct anti-pattern I've seen is people using over-provisioning as the sole means of guaranteeing system recovery from these instantaneous deviations in throughput or load.
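Before getting into why, it helps to see the arithmetic that over-provisioning is really papering over. This is a tiny sketch with hypothetical numbers, not figures from the talk, just to show how losing part of the pool squeezes the surviving servers from both sides:

```go
package main

import "fmt"

// effectiveLoad returns how much hotter each surviving server runs when a
// fraction f of the pool fails and that same fraction of clients retries on
// top of the normal stream of new data.
func effectiveLoad(f float64) float64 {
	requestLoad := 1 + f // retries stack on top of steady-state load
	capacity := 1 - f    // the load balancer has ejected the failed servers
	return requestLoad / capacity
}

func main() {
	fmt.Printf("1 of 20 servers down:  %.2fx per-server load\n", effectiveLoad(0.05)) // ~1.11x
	fmt.Printf("25%% of the pool down: %.2fx per-server load\n", effectiveLoad(0.25)) // ~1.67x
}
```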
And the fact is, relying on over-provisioning alone has really dangerous consequences once you reach a certain limit, because as we mentioned at the start, no system is infinitely elastic — there are cost constraints and application constraints. By relying solely on over-provisioning to recover from these kinds of situations, you're actually hiding the weaknesses in your system. You're not seeing how your systems perform when they're saturated, and you're not learning anything about them that will help them be more efficient and more optimal; you're just solving the problem by throwing more machines at it. So this isn't a real path towards making your system more efficient. To highlight that case, let's assume far more servers error out and 25% of the server pool goes down. As a result, 25% of the clients have to retry at the exact same time: your request load is 25% above average, but your server pool is 25% below average. This creates a really extreme scenario. You expect to have a certain degree of throughput, and you expect your requests to still get serviced, but the fact is that in this scenario far fewer requests will actually get serviced than you think. The reason is that the clients are retrying simultaneously: they create an instantaneous spike in peak load on the server pool at exactly the moment the pool is a whole 25% below capacity. This is a prime example of a vicious cycle. Simultaneous retries, as a result of server unavailability, lead to a saturated server pool; because the pool is saturated, several requests go unserviced, triggering further retries. The fact that we're retrying simultaneously — the example uses a five-second retry interval, but it could really be any interval — creates a very high peak load, which causes these requests to time out and makes it very, very hard for the system to recover from this kind of scenario. The solution we came up with was actually quite simple. We examined the possibility of some kind of back-off mechanism, some kind of exponential backoff. The reality is that in real-time systems, where you're expected to guarantee a certain latency in delivering the data, you can't back off indefinitely to 15 or 30 seconds. So exponential backoff was a far less appealing option than just finding a way to submit the data as fast as possible while sacrificing simultaneity — sacrificing the idea that all these requests should happen at the same time. What we added was the concept of jitter. Jitter is just a piece of randomization applied to a client-server request, such that when several requests are piling up in a backlog, you prevent them from occurring at the same time by spacing them out by a small randomized interval. The slide shows a little piece of Go code that, even if you're not familiar with Go, should be quite legible. The goal is to figure out the retry delay for a single request by taking a base retry time and applying a small degree of randomization to it, so that each distinct client — remember, 25% of our clients need to retry — is not retrying at the exact same moment.
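The slide itself isn't reproduced in this transcript, so here is a minimal reconstruction of that idea rather than the actual Datadog code — the function name, the base delay, and the 20% jitter fraction are all illustrative assumptions:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// retryDelay returns the base retry interval plus a small random offset, so
// clients that failed at the same moment do not all retry at the same moment.
// The "up to 20% of base" jitter fraction is an arbitrary illustrative choice.
func retryDelay(base time.Duration) time.Duration {
	jitter := time.Duration(rand.Int63n(int64(base / 5)))
	return base + jitter
}

func main() {
	base := 5 * time.Second // the talk's example retry interval
	for i := 0; i < 3; i++ {
		// Each client computes its own delay, e.g. ~5.4s, ~5.1s, ~5.9s —
		// spread out rather than simultaneous.
		fmt.Println(retryDelay(base))
	}
}
```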
What this creates is a scenario where, as opposed to hitting our peak server throughput, we stay consistently below it, simply by evenly spacing out the requests — by applying this small jitter to the various clients that need to retry. And this ends up having a really positive impact on the feedback loop we saw earlier, because instead of retries stacking up and hitting a saturated server pool, client retries are efficiently spaced out, our server load stays sufficiently below the maximum, and even though some requests time out, we have the assurance that over time enough requests will get serviced that we can recover back to normal operation. Requests get serviced consistently enough that we can guarantee system recovery. This is just a small example of a single client-server feedback loop being optimized by adding a degree of randomness to it, to ensure the system can recover from degraded states. For us this was a far more efficient solution than having to page an engineer and ask them to log on to our system and increase the capacity of our ingest tier. That was case study number one. The second case study is a more asynchronous use case, and it relies on the notion of adaptive state machines. Are people familiar with the concept of state machines, just by a show of hands? Okay, that's a good percentage of the room. Just to clarify exactly what we mean by a state machine, I'm going to use an example from the world of monitoring that will hopefully help you internalize the idea. If you think about monitoring a system — be it an application, be it a server — as a state machine, we can identify a set of distinct states that a particular alert can be in. The alert in question here uses a query language that checks the error rate on a particular service and does a time aggregation. It says: if the error rate for this service over the last 10 minutes is less than 5%, then my state is OK. The goal of the state machine is to constantly respond to external stimuli, such that if at any point the error rate goes past 5%, it transitions into an ALERT state, and it can flip between OK and ALERT depending on what the most recent window of data says. There's also a rather interesting state here: despite having some amount of data, there may not be a full window of data to evaluate whether the error rate is actually below 5% or not, in which case this particular alert definition would transition to a NO DATA state. So we have OK, we have ALERT, we have NO DATA — the three states the state machine transitions between, all in response to external stimulus; I'll show a small sketch of this in a moment. So how do we operationalize these state machine transitions across a large number of alerts, and a large number of target metrics that those alerts are examining? The architecture we came up with at Datadog was essentially to decouple data ingestion from the evaluation of alerts. On the left side of your screen, you see the metrics being collected — errors being counted and submitted to the data ingest pipeline we discussed in the previous case study — and we decouple that ingestion from the actual evaluation of the alert.
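Before going on with the pipeline, here is the promised rough sketch of one of those state machines. The state names match the talk, but the types, the "average over the window" aggregation, and the missing-data cutoff are my own illustrative assumptions — the talk only says the query does a time aggregation:

```go
package main

import (
	"fmt"
	"math"
)

type State int

const (
	OK State = iota
	Alert
	NoData
)

// evaluate inspects one 10-minute window of per-minute error rates.
// NaN entries stand for minutes with no ingested data; if too many minutes
// are missing we cannot judge the threshold, so we report NoData instead.
func evaluate(window []float64, threshold float64) State {
	sum, n := 0.0, 0
	for _, v := range window {
		if math.IsNaN(v) {
			continue
		}
		sum += v
		n++
	}
	if n < len(window)/2 { // arbitrary cutoff for "not enough data"
		return NoData
	}
	if sum/float64(n) >= threshold {
		return Alert
	}
	return OK
}

func main() {
	nan := math.NaN()
	healthy := []float64{0.01, 0.02, 0.01, 0.03, 0.02, 0.01, 0.02, 0.01, 0.02, 0.01}
	spiking := []float64{0.08, 0.09, 0.12, 0.07, 0.06, 0.08, 0.05, 0.06, 0.07, 0.09}
	delayed := []float64{0.01, 0.02, nan, nan, nan, nan, nan, nan, nan, nan}

	fmt.Println(evaluate(healthy, 0.05)) // 0 -> OK
	fmt.Println(evaluate(spiking, 0.05)) // 1 -> Alert
	fmt.Println(evaluate(delayed, 0.05)) // 2 -> NoData
}
```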
We would write things to a Kafka queue, and as a result of the buffering Kafka provided, we could read from the queue and do both the materialization of the data in the UI — that's the graph you see at the bottom of the screen — and the evaluation of the individual state machines, to determine the state of all the alerts configured on the system. Now that we had this decoupled system, where data was being ingested and used for visualization as well as alerting, we were able to evaluate several thousand alerts in real time, to tell our customers, our users, when things were going wrong. But this particular design exposed us to lag, to systemic delay in the ingestion of data. Let's assume the same server pool we were talking about earlier is responsible for ingesting this data from the variety of clients, and that this pool is delayed by a certain amount of time: it isn't writing data fast enough to our Kafka queue, and hence our real-time state machines aren't being evaluated at the rate they should be. As a result, we end up with an incomplete window of data. Even though we might have data for the first few minutes of the last 10-minute window, the NaNs you see represent the fact that we don't have data for any of the recent time periods we're evaluating the window over. And this can create a really terrible situation for our state machines, because when we experience systemic delay in the pipeline, we trigger a vicious cycle: several state machines transition to the NO DATA state at the same time, which means they don't have the data to make a judgment on whether the alert is okay or not, and as those state machine transitions stack up, we see several false alerts being triggered. The false alerts are triggered not because the metric has actually passed a critical threshold, but because the alerting system just doesn't have the data to make that judgment. So when we have delayed data, we experience several false alerts and several state machine transitions, and that in turn bottlenecks the system itself, because it interferes with our ability to evaluate new data and respond to these alerts in real time. This is a vicious cycle because it feeds on itself in a negative fashion: several false alerts get triggered, and there's no good way for the system to recover from this scenario as-is. This was a really tough anti-pattern exposed by the way we had set the system up, because we were treating delayed data the same way as absent data. If you work with any kind of real-time analytics tool, or any system that relies on a real-time feed of data, this is a very dangerous thing to do, because data sources can fail independently — my laptop or my server may suddenly cease to submit a particular metric, and that's that server's problem, scoped to that particular entity and that domain. But when there's systemic delay in your actual pipeline, all data sources are affected.
And if you can't differentiate between these scenarios, and you treat delayed data the same way as absent data, you're going to enter this vicious cycle all the time, triggering transitions and feedback loops that don't help your system at all and give it a very hard time recovering. So what was the solution? Because we had this stringent need to differentiate between data that was delayed and data that was simply absent, we set up synthetic metric submission so that we could look at our pipeline and understand where there was lag. The goal was not to insert probes that slowed down the pipeline; we wanted it to be a very transparent, unintrusive thing. What we basically did was capture a dial tone in all the middleware throughout the pipeline. The idea of the dial tone is similar to picking up a landline back in the day and hearing the dial tone that tells you, hey, you do have a connection to the network. We wanted that same kind of feedback for this vast distributed system. So — and this is another piece of Go code on the slide; I'll show a rough reconstruction in a moment — every time we passed a payload, a piece of data, from one system to another, we looked at the timestamp of the data and compared it to the time right now. If I'm processing data from five seconds ago, we can assume the lag in the system at this moment is five seconds. Once we captured that dial tone lag, we submitted it to a metrics source over UDP, and we scoped it by API key, so that every customer submitting data to our API had a dial tone measured. That let us handle the fact that customers get put into different shards — they might get load balanced and distributed to different places based on their API key. This ended up being a rather complicated system to implement, but it gave us a lot of really good feedback. We had a synthetic time series generated per API key, emitted at predetermined intervals, and because we were submitting this data over UDP and running in Kubernetes, we could write it out to a DaemonSet collecting the dial tone and make it available to all our systems. This was a really key improvement, because it meant that any given system within our infrastructure was aware of the inherent lag in the other systems. When you have this kind of feedback loop in your system, you can ensure that something like a state machine transition, or any program that relies on data being recent and real-time, knows exactly how far behind the rest of the system is. So we were able to rewrite our alerting engine to damp out the systemic state changes whenever there is systemic delay. Because our alerting system was aware of the upstream lag, we could determine the dial tone for a particular API key based on what we'd submitted, and then, based on the time window the monitor was being evaluated over, apply a function to damp out the state changes. If upstream lag was too high relative to the monitor's threshold and window, we would make sure we didn't transition the state — a state change that might have happened if we thought the data was absent didn't end up happening, because our dial tone collection told us the data was merely delayed.
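Since the slide code isn't in the transcript, here is a minimal sketch of both halves of that idea — measuring the dial tone as a payload passes through a stage, and using the measured lag to suppress a NO DATA transition. The UDP address, the statsd-style metric format, and the suppression rule are all hypothetical, not the actual Datadog implementation:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// recordDialTone computes how far behind real time a payload is and emits the
// measurement over UDP, scoped by the customer's API key.
func recordDialTone(conn *net.UDPConn, apiKey string, sampledAt time.Time) time.Duration {
	lag := time.Since(sampledAt)
	fmt.Fprintf(conn, "dialtone.lag:%d|ms|#api_key:%s\n", lag.Milliseconds(), apiKey)
	return lag
}

// shouldSuppressNoData damps out a NO DATA transition when the measured
// upstream lag is large enough to explain the missing part of the monitor's
// evaluation window — i.e. the data is late, not absent.
func shouldSuppressNoData(upstreamLag, missingWindow time.Duration) bool {
	return upstreamLag >= missingWindow
}

func main() {
	addr, _ := net.ResolveUDPAddr("udp", "127.0.0.1:8125") // hypothetical local collector
	conn, err := net.DialUDP("udp", nil, addr)
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Pretend the payload we are handling was sampled five seconds ago.
	lag := recordDialTone(conn, "customer-api-key", time.Now().Add(-5*time.Second))

	// A 5s lag does not explain 3 missing minutes, so the transition proceeds.
	fmt.Println(shouldSuppressNoData(lag, 3*time.Minute)) // false
	// A 6-minute lag does: this is delay, not absence, so we suppress.
	fmt.Println(shouldSuppressNoData(6*time.Minute, 3*time.Minute)) // true
}
```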
This was a really significant improvement, because it turned a vicious cycle into a virtuous one. With the dial tone input alongside the delayed data, we were able to suppress our alerts instead of letting them trigger the state machine transitions that bogged down the system. Because we suppressed the alerts, the system got a chance to recover, data ingest was eventually able to catch up, and the stimulus that triggered the whole problem — the delayed data — no longer existed. We eventually transitioned back to normal operations as a result of implementing this really tight feedback loop, distributed across a vast variety of our systems. And that's not the only thing this setup gave us. Because we now had this precedent of submitting data for distinct requests and measuring it in various parts of our system, we could use the same semantics to generate distributed traces, by attaching individual IDs to each request that went through the system. Now, as opposed to looking at each piece of lag independently, we could generate these cool visualizations that broke down the processing path of a payload into its various steps — I'll sketch the mechanics of that in a moment: HTTP ingest, whatever preprocessing and database lookups need to happen, the path the payload takes through Kafka, serialization, deserialization, and finally the actual alert being evaluated. Getting this view into our system was super critical for having the visibility to know where systemic lag is and where we, as engineers, can direct our optimization efforts. Ultimately, as a result of collecting all this data, we hit upon another anti-pattern of ours, one we were leaning on a bit too often and realized we could change. We had architected this really fast, low-latency monitoring system, but the people responding to this low-latency data were human beings, who are by nature high-latency. Even if you do your best to absorb the data and alert on it in a matter of seconds or milliseconds, it's still going to take at least a minute for a human being to be roused out of their sleep, pull up their laptop, log into the system, and address the problem. So whatever gains we were making in the latency of the system, we were sacrificing by still relying on a human being to remediate on the basis of the data we collected. This led us to the idea that instrumentation should always be leveraged. By leverage, I mean that any data point you collect should be usable not just in one place but in a variety of different places, and the way you leverage instrumentation is by making it work for both human beings and other systems. If you can provide this instrumentation as feedback to, say, your autoscaling service, or your service for traffic shaping or applying quotas, that's a far better and more scalable way to move forward with instrumentation, because it creates gains at various parts of your pipeline. And although human beings have the capacity to interpret, correlate, and hypothesize by visually looking at these monitoring systems, machines still have the capacity to act faster.
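To come back to the tracing point for a second: the mechanics are roughly that each payload carries an ID and every stage records how long it held the payload. This is a hand-wavy sketch of that idea under my own assumptions (the stage names and structs are mine), not Datadog's actual tracing code:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Span records how long one stage of the pipeline spent handling a payload.
type Span struct {
	Stage    string
	Duration time.Duration
}

// Trace ties the spans for a single payload together via a shared request ID.
type Trace struct {
	RequestID string
	Spans     []Span
}

// timeStage runs one processing stage and records its duration on the trace.
func (t *Trace) timeStage(stage string, fn func()) {
	start := time.Now()
	fn()
	t.Spans = append(t.Spans, Span{Stage: stage, Duration: time.Since(start)})
}

func main() {
	trace := &Trace{RequestID: fmt.Sprintf("req-%06d", rand.Intn(1000000))}

	// Stand-ins for the stages mentioned in the talk: HTTP ingest,
	// preprocessing and database lookups, the Kafka hop, and evaluation.
	for _, stage := range []string{"http-ingest", "preprocess", "kafka", "evaluate"} {
		trace.timeStage(stage, func() { time.Sleep(time.Duration(rand.Intn(20)) * time.Millisecond) })
	}

	for _, s := range trace.Spans {
		fmt.Printf("%s %-12s %v\n", trace.RequestID, s.Stage, s.Duration)
	}
}
```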
Autoscaling is the prime example of machines acting faster, because we know that if we're plugged into AWS autoscaling or Google Cloud autoscaling, relying on that system to increase capacity is far lower latency and a far tighter feedback loop than relying on a human being to be paged and add capacity manually via the console. So that's the case for leveraging instrumentation and actually using telemetry to generate these very positive feedback loops in your infrastructure. But the challenge we had — and the challenge I think anyone trying to implement this sort of solution will have — is actually being able to experiment on it, to take this idea, this feedback loop, from a developer's machine and put it into production. So what I want to close this talk out with is a few strategies we came up with for experimenting at scale and actually being able to implement system-wide changes like this. The real answer to how you can start implementing is controlled chaos. Are people familiar with the term chaos engineering, is it something that's well known, by a show of hands? Okay, good, glad to see that many people are familiar with it. With controlled chaos, the idea is that you split your chaos engineering efforts into three distinct pillars. The first pillar is generating synthetic load: load that captures user traffic to a system and simulates it in a way that gives you a really good chance of catching any particular use case a user might hit. The second is a playbook for actually running chaos experiments and figuring out how you're going to introduce these degradations and errors into your system. And the final piece, of course, is central monitoring, so you can observe what's happening during the course of these experiments and react to it, either to roll the experiment back or to push it forward. One of the key things you want with synthetic load is to generate it dynamically, as opposed to relying on static samples of data, because the fact is that schemas change, usage patterns change, APIs change. If your load generators are maintained as code, they can be changed as part of the same review cycle as the APIs they rely on. And of course, if you include randomness and fuzzing in this scenario as well, you're able to represent user data across a wide spectrum of use cases, instead of relying purely on the steady-state data you might be collecting from your HTTP logs — I'll show a small sketch of that in a moment. We're running a bit short on time, so I'm going to skip ahead to this one slide, which I think captures the kinds of metrics you should be collecting from your system in order to analyze these experiments correctly. On the left side, you have your work metrics: anything that captures the business case for a particular service. That could be checkouts, it could be the size of carts being added in a particular e-commerce portal, it could be various measures of error rate or performance. In addition, you have resource metrics, which capture the utilization of the underlying resources: your load balancers, your servers, the various pieces of infrastructure that underlie the actual application. And you correlate these with instantaneous changes — code changes, alerts, any deploy that's happened in your system — so you can line those events up against changes in your time series metrics.
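Here is the promised sketch of a dynamic, fuzzing load generator. It's purely illustrative — the endpoint, payload shape, and fuzzing choices are assumptions, and a real generator would live alongside the API code it exercises so both change in the same review:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"math/rand"
	"net/http"
	"time"
)

// metricPayload mirrors whatever the ingest API currently accepts; because the
// generator is code, this struct changes in the same review as the API does.
type metricPayload struct {
	Host      string  `json:"host"`
	Metric    string  `json:"metric"`
	Value     float64 `json:"value"`
	Timestamp int64   `json:"timestamp"`
}

// fuzzedPayload produces a randomized payload rather than replaying a fixed
// sample, so the load covers more of the input space than steady-state logs.
func fuzzedPayload() metricPayload {
	hosts := []string{"web-1", "web-2", "db-1", ""}                            // include an empty host on purpose
	metrics := []string{"system.disk.free", "errors", "metric with spaces"}    // and an awkward metric name
	return metricPayload{
		Host:      hosts[rand.Intn(len(hosts))],
		Metric:    metrics[rand.Intn(len(metrics))],
		Value:     rand.ExpFloat64() * 1000,                  // occasionally very large values
		Timestamp: time.Now().Unix() - int64(rand.Intn(600)), // sometimes delayed timestamps
	}
}

func main() {
	target := "http://localhost:8080/api/v1/series" // hypothetical ingest endpoint
	for i := 0; i < 100; i++ {
		body, _ := json.Marshal(fuzzedPayload())
		resp, err := http.Post(target, "application/json", bytes.NewReader(body))
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		resp.Body.Close()
	}
}
```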
Between these categories of metrics and the synthetic load, you have a really good set of tools to experiment at scale and to respond to changes in traffic based on your experiments. So the final parting words I'll leave you with are about why automation is necessary and when you can automate in a really effective fashion. These are the three things that I think represent good use cases for automation. The first is if you can use the automation as leverage. Imagine the jitter module we wrote for handling the client-server communication when submitting metrics: it was useful across all the client-server communication loops happening in our infrastructure. We could take that piece of learning, compartmentalize it into a library, and push it out to various other systems that could potentially experience the same problem. That was a really good piece of leverage, because we could write the code once and improve a bunch of different systems with it. The second is if the automation brings you consistency. Because human operators might respond to the same situation differently and a machine won't, you can rely on your automation to damp out that inconsistency and make sure a particular procedure gets applied in a consistent manner. And the final one, of course, is urgency. An autonomous feedback loop will always be faster to remedy a problem than a human, and recognizing this makes it easy to figure out which cases need a really urgent remediation, versus which cases can wait for a human being to apply their intelligence to correlate and analyze the problem. If you have a problem that can be solved under these parameters — where your code change or configuration change gives you leverage, enables consistency, and lets you respond with more urgency — that's a really good use case for automation, and that's something you should be thinking about in your infrastructure. Yep, and that's it. Happy to answer any questions. Thank you. Thank you, Aditya.