Hi everyone, my name is Rohit and today I'll talk about Sparklens. This is an open source tool that we have developed at Qubole, and it is designed to answer two related questions: given a Spark application, how many executors do you really need? And if you add more executors, will your Spark job run faster, or are you just wasting compute without getting any value out of it?

The agenda for the talk: first I'll cover performance tuning principles in general, some applicable to any system and some very specific to Spark. Then I'll spend a fair amount of time on the theory behind Sparklens, which is mostly about how scheduling works and what constraints scheduling puts on the scalability of Spark applications. And then I'll go through an example where I'll show how we can use Sparklens to identify the areas where there could be problems, and how we end up solving them.

All right, so performance tuning principles. If you look at performance tuning, there are a few simple things we typically end up doing. The first is: make some part of your computation faster. This is the classic approach. We profile the application, find the area where most of the time is being spent, and see if we can make it faster. Maybe you're using an O(n²) algorithm and you can switch to an O(n log n) one, and that makes life easy. You can also make the hardware faster; if you're on the cloud, you can use a better instance type with more compute. If you make some part of the computation faster, the total time will obviously come down.

The second principle is: don't do what you don't need to do. For example, say you're using Spark, all your data is stored in files, and you need to scan them, perhaps to compute your orders over the last year. If your data is not partitioned, you end up scanning all of it. But if you partition the table, say by day or by month, you can read only the partitions you need and reduce the amount of work required. The same idea applies even deeper. Say you store your data in CSV files with 100 columns, and your query typically needs only 10 of them. The 90 columns you read for every record are a complete waste. If you use columnar file formats like Parquet or ORC, you don't have to read those columns at all, and you save that computation.

The third principle is: don't do again what you have already done. That refers to caching. In SQL it's a little hard to spot these things, but when you write programs in Scala it's pretty easy to, for example, put in a for loop and not notice that some computation is happening again and again. There can be real advantage in stepping back, looking at the code, and seeing if you can cache a result and reuse it during the rest of the computation.
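Here is a minimal sketch of these first three ideas in Spark; the dataset, path, and column names are hypothetical, not from the application we'll look at later.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("tuning-principles").getOrCreate()
import spark.implicits._

// Don't do what you don't need to do:
// (a) Partition pruning: filtering on the partition column means Spark
//     reads only the matching directories, not the whole table.
val orders = spark.read.parquet("s3://bucket/orders") // partitioned by order_date
val lastYear = orders.filter($"order_date" >= "2018-01-01")

// (b) Column pruning: with columnar formats like Parquet or ORC,
//     selecting 10 of 100 columns means the other 90 are never read.
val slim = lastYear.select("order_id", "city", "amount")

// Don't do again what you have already done: cache a result that
// several later computations reuse instead of recomputing it.
slim.cache()
val byCity = slim.groupBy("city").sum("amount")
val topTen = slim.orderBy($"amount".desc).limit(10)
```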
The last principle is: use more resources, parallelize and distribute. And that is where Spark comes into the picture, because Spark is essentially a platform where you can hand over code and it gets distributed over a set of executors, executed, and the results come back to you. Spark makes it very, very easy to parallelize and distribute. But depending on how the data is partitioned, what the constraints are, and how stages interact with each other, the scalability of your application is affected. So even though parallelism is easy, ensuring that the work is actually well distributed is the difficult part, and that is what I will be discussing today.

So let's think about a Spark application. A Spark application has two distinct parts: the work which is done in the driver, and the work which is done on the executors. The difference is that the driver does its work alone; by alone, I mean work that is restricted to the driver. When the driver is doing work, nothing is happening on the executors; they are completely free. And once execution reaches a point where it moves to the executors, then only the executors are working and the driver is doing essentially nothing. This structure makes it easy to understand what's going on. One way to think about it: imagine the underlying hardware and ask, where is my program counter right now? The answer would be either in the driver, or executing in parallel somewhere across all the executors. That is what I mean by the application being split into two parts. On top of this structure there are other constraints, for example that some stages must finish before other stages can begin: all tasks of a stage should finish before the next stage can start. All of these play a role in determining the scalability and performance of your application. To really tune your application, what you need to understand is its structure.

Earlier, when I called Sparklens a profiler, I was a little bit wrong. It is actually the inverse of a profiler. Typically, when you profile an application, the profiler tells you where the CPU is being used and says, do something about it. Sparklens works the other way around: it tells you where nothing is being done. You have a resource and you're doing nothing with it; how can you make use of this CPU, which you have allocated and paid for, but whose compute you are not using? So one way to improve the efficiency of any Spark application is to look at where your executors are doing nothing, and then work backwards and ask: how can I make them do something? If you can make sure that all of your executors are doing some work all the time, you get a very efficient Spark application. In that sense, optimizing Spark is basically a manager's job: just be a good manager.
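To make the driver/executor split concrete before we look at the timeline slide, here is a toy sketch; the path is hypothetical.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("driver-vs-executors").getOrCreate()

// Driver-only work: building the plan, listing files, and so on.
// The executors are idle while this happens.
val df = spark.read.parquet("s3://bucket/events")

// Executor work: the action triggers distributed tasks, and the
// driver mostly just waits for the results.
val n = df.count()

// Back on the driver: whatever you do with the result is
// single-machine work again.
println(s"rows = $n")
```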
Okay, so this is, I believe, one of the most important slides, and I'll keep referring to it throughout the presentation. Doing nothing is what we are going to focus on as part of understanding how Sparklens works, and how Spark itself works. On the y-axis we have the driver and the cores: core one, two, three, and four. On the x-axis we have time. If you have looked at the Spark UI, this is probably very familiar to you; if not, just take a look. These rows are the resources, and the green boxes are the tasks scheduled on different cores. For the purpose of this discussion I won't distinguish between cores and executors: ten executors with four cores each, or four executors with ten cores each, are the same thing here. The difference has implications, but we can ignore them for now. You can also see stage one, stage two, and stage three, which is where these tasks are being scheduled. Now, all the gray area you see is compute time that is being wasted. We'll try to break that gray area apart, understand what is going on, and see what strategies can be used to minimize it. When you do that, you get an excellent, fully tuned Spark application.

All right, so the first cause is driver-side computation. As I said earlier, the structure of a Spark application is such that when the driver is running, no stages are running. All the orange area you see here is covered because the driver is doing some work. For example, say you have 100 executors running and your job takes 10 minutes, and for five of those minutes some computation ran on the driver. Those five minutes effectively get multiplied by all 100 executors: 500 of your 1,000 compute-minutes are wasted, because there is no work for the executors to do. So the first principle here is: if you can minimize your driver-side computation, things will become much, much faster.

So what do we do in the driver? One thing is file listing. Especially with large tables, which we typically partition by date, we have seen thousands, tens of thousands, even hundreds of thousands of files being listed in S3. This is not much of a problem on Hadoop, because name node operations are pretty fast. But on S3, the listing can take a lot of time, and at Qubole we have invested heavily in making file listing on S3 better.

The second place I've seen a lot of driver time go is loading Hive tables. If you're writing to Hive tables from Spark, Spark tends to write to a temporary directory first, and once the whole computation is over, it copies the files from the temporary location to the final location where the table sits. On Hadoop this movement happens through a file system API called rename, and rename on HDFS is a metadata operation. But a rename on S3 is a physical operation: it copies the file over to the new location and deletes the original. So instead of a constant-time operation that just updates some metadata in memory, it becomes an operation whose cost depends on the size of the data. We have done some work at Qubole to do these writes in a parallel way, using multiple threads for the copies so that some of the latency is hidden. A couple of generic knobs in stock Spark and Hadoop address the same pain points, as sketched below.
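These are illustrative settings from open-source Spark and Hadoop, not the Qubole-specific improvements just described; evaluate them for your own setup.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Push partition discovery (file listing) down to the executors once a
// table has more than this many paths, instead of listing serially from
// the driver. The default is 32; lower it to parallelize sooner.
spark.conf.set("spark.sql.sources.parallelPartitionDiscovery.threshold", "10")

// FileOutputCommitter v2 moves task output to the destination at task
// commit, reducing the job-commit renames that are so expensive on S3.
// Trade-off: weaker failure semantics, so test before adopting.
spark.sparkContext.hadoopConfiguration
  .set("mapreduce.fileoutputcommitter.algorithm.version", "2")
```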
At Qubole we have also invested in writing directly to the table instead of going through all this temporary-location business, while ensuring that in case of failures things are cleaned up properly.

The third place I've seen driver time go is something many people do innocently: they take a DataFrame, collect() it, and start a foreach loop. What that does is pull all the data from all the executors into the driver. Usually this fails, and you'll see an out-of-memory error, but even when it doesn't, all the computation now happens on the driver while the executors sit completely free. I have seen people go as far as calling a REST API from the driver this way: for each record, call a REST API and update some external system. Another variant, typical with PySpark, is converting the DataFrame with toPandas. A Spark DataFrame is naturally a distributed abstraction, but the moment you convert it to pandas, you get a data frame that lives only on the driver. This again leads to computation happening only on the driver, and you're not using the resources available to you.

The second reason we see compute wasted is not having enough tasks. Look at stage two on the slide: core four doesn't have any work. If you have four cores available and you only give Spark three tasks, one of the cores is not going to get any work. In stage three, similarly, cores one and four have no tasks. If you don't have tasks, there is no way that compute can be used. So be very, very sensitive to whether there are enough tasks available for Spark to execute; otherwise that compute is not going to work for you.

So how do we control the number of tasks? There are several knobs, sketched below. If you're running on HDFS, the block size is one parameter: the smaller the blocks, the more tasks. Similarly, on S3 you can use the min and max split sizes, which define the granularity of each task. Then there is spark.default.parallelism; it's a property, and if your code refers to it at runtime, you basically get the current number of cores available to the Spark application, so its value can vary over the lifetime of the application. If whatever you're doing requires it to be high, set it high. The fourth parameter, which task counts are very sensitive to, is spark.sql.shuffle.partitions. Any time you do a shuffle, this parameter is used, and the default value is 200. If you're using a lot more cores than 200, you might want to revisit it and increase it at least to the number of cores you've given the application. And the last one is repartition, a function available on DataFrame, which converts any DataFrame into the number of partitions you need. Again, be careful: maybe you developed the application on a staging cluster and picked a value that was right there, but when you run it on a larger cluster, that value might bite you later on.
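Here is a sketch of both fixes: moving per-record work off the driver, and the task-count knobs just listed. The dataset, the REST call, and the value 800 are all hypothetical.

```scala
import org.apache.spark.sql.{Row, SparkSession}

val spark = SparkSession.builder().getOrCreate()
val df = spark.read.parquet("s3://bucket/events")

def callRestApi(row: Row): Unit = { /* hypothetical external call */ }

// Anti-pattern: collect() drags every row to the driver, and the loop
// (plus any REST call inside it) runs on one machine while executors idle.
// df.collect().foreach(callRestApi)   // don't do this

// Better: keep the per-record work on the executors.
df.foreachPartition { (rows: Iterator[Row]) =>
  rows.foreach(callRestApi)
}

// Task-count knobs:
spark.conf.set("spark.sql.shuffle.partitions", "800") // default 200; aim for at least your core count
val moreTasks = df.repartition(800)                   // explicit, but beware hard-coding cluster-specific values
```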
So understand the context in which your application is running, and tune it for that context.

The third reason we see wastage on the executor side is skew. Skew means that some tasks take a lot longer than other tasks. Look at stage one: core one has a task that takes, say, two units of time, while cores two, three, and four work for only one unit. The way Spark is structured, until a parent stage finishes, the child stage does not get scheduled. So it is not the average task time in a stage that matters; it is the worst-case task time that drives the total runtime of your application. If you can reduce skew, you get something even better.

Now, skew exists in the data itself: it happens because some keys or some partitions have a lot more data. For example, say you have some sales data and you're trying to figure out the sales by city. Obviously, the sales in Delhi or Bangalore are going to be a lot more than in, say, Kanyakumari. As the data piles up in one partition, it is the runtime of the one executor processing that partition's records which determines how much time your application takes. So handling skew becomes very, very important. One way to handle it could be that instead of doing the join on city, you do the join on PIN code, which is far more uniform, and once you have data at the PIN code level, you do a second-level aggregation up to, say, the city level. So there are ways to deal with skew, but they depend on the nature of the data you're working with.
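Here is a sketch of that two-level idea, using a random salt as the finer key since not every dataset has a natural PIN code column; the dataset and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

val sales = spark.read.parquet("s3://bucket/sales") // columns: city, amount

// Level 1: aggregate on (city, salt) so no single task owns all of
// Delhi or Bangalore; the salt spreads a hot key over ~32 tasks.
val salted = sales.withColumn("salt", (rand() * 32).cast("int"))
val partial = salted
  .groupBy($"city", $"salt")
  .agg(sum($"amount").as("partial_amount"))

// Level 2: only ~32 small rows per city remain, so the final
// aggregation is cheap and well balanced.
val byCity = partial
  .groupBy($"city")
  .agg(sum($"partial_amount").as("total_amount"))
```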
All right, that brings me to the notion of the critical path. If you follow the arrows on the slide, they define the critical path. The definition is: all the time spent in the driver, plus the time spent on the largest task in each of the stages. This is slightly wrong, in the sense that some stages can run in parallel, so the actual computation requires taking the max across parallel stages, but the general idea is the same. So what is the critical path? It tells you the least amount of time in which your application can finish, irrespective of the number of executors. Even if you give infinite executors to an application, there is no way it finishes in less than the critical path time. The good part is that from a single run of an application, it is possible to compute this number. And once you know this number: if you're close to it, you essentially need to go back and work on your application; if you're still far away from it, you have scope, you can add more executors and still see some improvement. But if you're close to it, adding executors is not going to help you at all.

The logic for why this works is as follows. Adding more executors will not change the time I spend in the driver: if anything, with more executors the driver will probably have more work, but it cannot have less. So driver time is one part of the bound. The second part is that unless you change the distribution of the tasks, the largest task of a stage doesn't gain anything from adding more executors; it cannot be made smaller. Adding more executors only helps if you have more tasks than cores, so that the waiting tasks can also run in parallel instead of queuing for their turn. Putting these together, we can say a Spark application can never run faster than its critical path.

Now, what have we learned so far? A Spark application cannot run faster than its critical path, no matter how many executors you give it. And the way to make a Spark application efficient is to look at three areas: reduce driver-side computation, have enough tasks for all the cores, and reduce task skew. If you cannot do any of these, one way to reduce the wastage is to reduce the number of executors: with fewer executors you get much tighter packing of tasks, and you'll probably get much more bang for the buck.

That concludes the theory behind Sparklens. In the next few slides, I'll talk about Sparklens itself and how it can be used. So what is Sparklens? Sparklens is an open source Spark profiling tool. It's written in Scala, and it can be used with any Spark application. By any, I mean it doesn't matter whether you're using Cloudera, Hortonworks, or EMR, or even developing your application on your laptop: you can use Sparklens and understand how your application is performing. It helps you tune your applications by making it easy to spot opportunities for optimization, and these opportunities are exactly the ones we discussed: driver-side computation, lack of parallelism, and skew. Apart from being a profiler, it also has some prediction capabilities. It has a built-in scheduler simulator which lets you simulate what happens if you increase or decrease the number of cores: how will your application's runtime change, and how will your cluster utilization change? That is quite valuable, because experimenting directly is expensive. If your job takes one or two hours to run with 100 executors, it will be very hard for you to run experiments at 500 nodes and at 20 nodes; it just takes time and money. Being able to predict this from one run of the application is typically very useful.
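Before we get to the example, here is the arithmetic behind the two headline numbers Sparklens reports, in deliberately simplified form: it ignores stages that run in parallel and all scheduling overhead, and the inputs here are made up (Sparklens derives the real ones from Spark's event listener).

```scala
// Per-stage task durations, in seconds.
case class Stage(taskDurationsSec: Seq[Double])

// Lower bound on wall-clock time for any executor count: driver time
// never shrinks, and neither does the largest task of each stage.
def criticalPathSec(driverSec: Double, stages: Seq[Stage]): Double =
  driverSec + stages.map(_.taskDurationsSec.max).sum

// Best case for the current code: tasks perfectly uniform and plentiful,
// so each stage takes (total task time / total cores).
def idealTimeSec(driverSec: Double, stages: Seq[Stage], totalCores: Int): Double =
  driverSec + stages.map(_.taskDurationsSec.sum / totalCores).sum

val stages = Seq(
  Stage(Seq(120.0, 60, 60, 60)),  // one straggler task
  Stage(Seq(300.0, 280, 290))
)
println(criticalPathSec(300, stages))  // 300 + 120 + 300 = 720s, with any executor count
println(idealTimeSec(300, stages, 8))  // 300 + 300/8 + 870/8 = 446.25s
```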
All right, now for the example. This happened sometime last year: there was a customer POC, we got 603 lines of Scala code, and somebody said, this is taking a lot of time, can you optimize it? The point to note is that we knew nothing about this code. We knew nothing about the schema or what the author was trying to do; there is too much context in that code for us to understand all its details. So what we'll do here is walk through how the tuning actually happened, and see how we can tune without really knowing too much about the application.

This is the first pass: we ran Sparklens on the application, and this is what it reported. First, it takes 158 minutes to run this application: 41 minutes are spent on the driver and 117 minutes on the executor side. It also shows the critical path, which is 127 minutes here. What that tells me is that if I add more executors, the latency will still go down, because we're not really close to the critical path yet. The last thing it reports is the ideal application time. The ideal application is defined as: if there were no skew, all tasks were uniform, and there were enough tasks for every executor, how much time would the application take? In other words, if ours were the best possible version of this application, scaling linearly on the executor side, how long would it run? That number is 43 minutes. So it's telling us there is skew, or at least a lack of tasks, which is making this application slow, and we should figure it out and make changes appropriately.

So we started looking at the results, and one thing we found was that this application had too many stages: almost 700. Typically when we profile applications, I have seen that 30, 40, 50 is a usual number; 700 is very large. One thought that came to mind was that there could be some sort of loop going on, which is where the time was being spent. We started looking at the code, and we found a write happening to a Hive table. Instead of writing in parallel, the code was filtering by each partition and doing the write one partition at a time. We thought this was probably wrong: Spark is designed to write to all partitions in parallel, so why was this code doing this? We changed a couple of lines of code so that a normal Spark write would work, writing to all the partitions in parallel.

What we saw in the second pass was that instead of 158 minutes, the application took only 26 minutes. That was a good improvement, and the driver time came down drastically, from about 40 minutes to about two minutes. So from 158 minutes, we came down to 26 minutes. But what is interesting to note is that the critical path time is now 25 minutes, which means that if I were to add more executors at this point, I really cannot expect this application to perform much better than 26 minutes. Yet if I look at the ideal application time, it is about five minutes (it shows four minutes and 48 seconds), so there is still scope for improvement: there is some skew, and probably some lack of tasks, which we need to look at. If we can fix those things, instead of spending 25 minutes, we could probably bring it down to five or ten minutes.

Before we debug this further: Sparklens also reports how much wastage is happening on the driver side versus the executor side. Here we can see that 91% of the total executor time is actually wasted. We don't yet know why, but there's huge wastage there, and we should look at it and try to minimize it. As I told you earlier, Sparklens also has a simulation component which simulates the results for you. Here, the yellow row, 100 executors and 26 minutes, is the real, measured number.
All the other rows you see are simulated. You can easily see that adding executors, going from 100 to 200 to 500, never takes the simulated time below 25 minutes. Similarly, if we go the other way and start reducing the number of executors, we see that even at 50 executors the total time is still 28 minutes, just two minutes more than the current time. So you get a nice trade-off between how much compute you want to use and how much latency you are willing to tolerate. On the other side, you also have a utilization metric, which tells you how much of the cluster is actually getting used. If you look here, we are only using 8% of the cluster with 100 executors, and we should probably target a lot more. Now, one caution: when I say utilization, I don't mean CPU utilization per se. All it means is that a task was scheduled on that particular core or that particular machine, that's all. It's possible that the task being scheduled is actually IO-bound, not CPU-bound, so CPU utilization is not correlated with this utilization; it only means something was scheduled.

Sparklens provides a lot of other metrics beyond the top-level ones we talked about, the driver and executor utilization; it also provides metrics per stage. For every stage we have, for example, the wall clock percentage, core compute hours, task count, and PRatio; I'll explain all these numbers. What's important to notice here is that if you look at stage 33, 85% of the time is being spent in this one stage. So instead of looking at all the stages, we can very easily narrow down to the few stages where most of the time is being spent, and then focus on what's going wrong in those stages. In this case, for example, we see that the number of tasks running in this stage is only 10, whereas the number of cores we have is 800. That's a huge wastage. If we could somehow figure out why there are only 10 tasks, we could probably fix it and increase our utilization.

In terms of metrics, the key ones printed per stage are these (see the sketch below). The wall clock percentage is the total time spent in the stage relative to the time spent in all the stages; it helps you narrow down to a few stages to investigate instead of looking at everything. The PRatio essentially captures parallelism: it is the number of tasks divided by the total number of cores available to the stage. Typically you want it to be 1 or above; if it is lower than 1, that should ring a bell that you're not using all the cores. TaskSkew, again, tells you the ratio between the largest task and the median task. That gives you a sense of how much skew there is in the data, and whether it's enough for you to go back and start investigating what to do about it. The OIRatio is the output bytes over the input bytes of each stage. Typically you expect that as you move from one stage to the next, the amount of data flowing forward should slowly reduce.
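Here is how those ratios can be computed from the definitions just given; the case class and field names are made up for illustration, not Sparklens's internal types.

```scala
// Hypothetical per-stage inputs.
case class StageMetrics(taskDurationsSec: Seq[Double], inputBytes: Long, outputBytes: Long)

// PRatio: tasks per core; below 1 means some cores can never get work.
def pRatio(m: StageMetrics, totalCores: Int): Double =
  m.taskDurationsSec.size.toDouble / totalCores

// TaskSkew: largest task over the median task; large values mean the
// stage's runtime is dominated by a few stragglers.
def taskSkew(m: StageMetrics): Double = {
  val sorted = m.taskDurationsSec.sorted
  sorted.last / sorted(sorted.size / 2)
}

// OIRatio: output over input bytes; if this isn't shrinking stage
// over stage, the logic deserves a second look.
def oiRatio(m: StageMetrics): Double =
  m.outputBytes.toDouble / m.inputBytes
```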
If that's not happening, there may be something wrong with the logic, and you should look at it. So, coming back to where we were in tuning the job: what we found is that 85% of the time is spent in a single stage, and it has a very low number of tasks. When we looked into the code, we found that repartition(10) was called somewhere, and that is what was resulting in 10 tasks. Most likely it was set sometime on a staging environment or somewhere similar, and the same code came to production. So we changed it, and we also changed spark.sql.shuffle.partitions, as I mentioned earlier, from the default 200 to 800. When we ran it after these changes, the application finished in about 10 minutes. So we started with 158 minutes, made a couple of changes, and brought this application down to about 10 minutes.

Two things are interesting to note. The critical path at this point is about seven minutes, and the ideal application time is also about seven minutes. What that tells us is that adding more executors is not going to help, and we are performing with hardly any skew. This is essentially the definition of a highly optimized application, of what a good application looks like: if your critical path time and your ideal application time are the same, you have achieved nirvana in the Spark world. They don't have to match exactly, but this is how you know your application is truly scalable.

With that, I want to come to the limitations. Essentially, Sparklens is a model, built by looking at one run of a job, that predicts the behavior of the same job when run with a different number of cores. But it is still a model, and there are second-order effects it doesn't capture. A few of them: first, executor or driver GC. If GC happens in some executor during the profiled run, it can skew the model's inputs, and if you run with a different number of cores, the behavior you see will be a little different. Second, shuffle service performance varies with the size of the cluster: a 10-node cluster has different characteristics than a 100-node cluster, so that can be a bottleneck or a limiting factor. Again, when working with S3, there is throttling, network bandwidth, CPU contention; a lot of factors beyond just the availability of tasks can limit the scalability of the application, and those still play a role. But in general, if you're not jumping from 10 executors to 10,000 and you stay within a reasonable range, you will find that the numbers Sparklens predicts are pretty useful. One good part is that when you run Sparklens, it also reports its own error, saying: this is the actual time, and this is what I predicted it should be. If that error looks fairly large, take the results with a pinch of salt. One more thing: if you use large executors, and by large I mean executors larger than the ones in the run Sparklens profiled, the plan itself can change, in the sense that broadcast joins become feasible. Spark might change the plan to use broadcast joins, in which case the structure is no longer comparable,
and the performance Sparklens predicts will probably not match. And the last one is spark.default.parallelism. Since this number depends on the actual number of cores, if your application reads this property, its value will change as you change the core count, and hence the prediction will change as well. So these are some of the ways the output of Sparklens could be wrong. But that's fine; I generally don't think of Sparklens output in terms of right and wrong. Mostly in terms of: is it useful? Does it help guide you? Instead of doing trial and error, do you have some idea, some direction, of what is more important, what to look at, what to focus on, and how to navigate your way around tuning your Spark application?

In summary: a Spark application cannot run faster than its critical path. A Spark application can be made efficient by reducing driver-side computation, having enough tasks for the cores, and reducing task skew; and, if completion time is not an issue, by reducing the number of executors. Sparklens is open source, and there are only two parameters to add: when you run your spark-submit, add --packages (to pull in the Sparklens package) and the spark.extraListeners configuration, and when the job completes you will see all this output that comes from Sparklens. The code is available, so if you want to try it out, contribute, or change it, everything is there. That's all. Thank you. We have time for questions.

Q: Hello. This is useful for a normal Spark application, but how do we do it for streaming applications?
A: I think someone from Intuit asked for that, and they are working on a PR, so hopefully there will be something; not right now. One way to do it today is to manually stop the application after one batch and see what kind of behavior you get. Even from stopping after one run, it should be able to produce predictions, and assuming the application and data characteristics are not changing, the prediction of how things change as you add cores should still be useful information, even from that single batch.

(Announcement: Those leaving the auditorium, please drop your feedback forms in the box kept for them.)

Q: Hi, can you please share some best practices for optimizing a Spark app that does IO? I have to connect to a REST API, and there are a limited number of outgoing connections in the connection pool. What are the best practices to reduce this queuing in those scenarios?
A: If you're talking to a REST API: look at S3, it is itself a REST API, so there is no inherent problem with calling REST APIs. The question essentially is, are you doing it from the driver or from the executors? Executors are scalable, and if your service is scalable, technically you should be able to scale it to the point that you want. So as long as your REST API is being called from the executor side, the limitation is not on the Spark side; it is a limitation of the API itself.

Q: Hello. Hi. How do you calculate the critical path number and the ideal application time number?
A: Sure.
The critical path number is essentially the time spent in the driver, because if you add more executors, your driver will not magically run faster, so that time is not going to change. The second thing we add to it is, for every stage, the duration of its largest task, because that task is not going to run faster if there are more executors. There is one more caveat there, in the sense that some stages run in parallel, so you actually have to know the DAG and compute it that way, but that's what it is. The ideal application time is essentially all the time spent in all the tasks of each stage, divided by the total number of cores. What you get is a uniform, average number, and ideally the average is what happens: every executor gets enough tasks and they run all the time.

Q: Hi. In a Spark job, the speed can also depend on the size of the files: there may be too many small files, or very big files. Does Sparklens also tell you that your files are too small, and that you should make bigger files so there is less disk I/O, something like that?
A: It depends on whether the file format is splittable. If the format is splittable, then even if it's a big file, it's not that one Spark executor is going to churn through the whole file on its own. And second, yes: if you're spending a lot of time in one of the tasks on one of the executors and there is skew, it will get reported, not at a file level, but as some tasks being very, very large. Hopefully, if you know the stage and you look at the code, you will understand that this stage is actually about file I/O; that correlation you have to make yourself. For Sparklens, it's just the tasks and their durations that it looks at.
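For reference, the spark-submit invocation mentioned near the end of the talk looks roughly like this; the package version, application class, and jar name here are illustrative, so check the Sparklens README for the current coordinates.

```sh
spark-submit \
  --packages qubole:sparklens:0.3.2-s_2.11 \
  --conf spark.extraListeners=com.qubole.sparklens.QuboleJobListener \
  --class com.example.MyApp my-app.jar
```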