Today, I am going to talk about how we can estimate the energy consumption and carbon emissions of embedded platforms. I will quickly introduce myself. My name is Aditya, and I am a graduate student at ETH Zurich. I do research at the intersection of computer architecture and operating systems. With that out of the way, let's get some brief background on the problem we are talking about. I hope all of us are familiar with embedded systems, and embedded systems can have multiple energy sources. You can have a direct input source, such as a USB-powered or Ethernet-powered system, right? You can have battery-powered embedded systems, for instance a lithium-ion cell. And we can also have energy-harvesting devices, for instance a solar-panel-powered system. Now, one thing that is common to all of these systems is that we want to use the minimum amount of energy to perform the computation. Okay, I hope all of us are on the same page that our objective is to minimize the energy consumption. Why is that? We want to minimize energy consumption because battery capacity is a major constraint for design and for user experience. And we see it reflected in many devices. We see it reflected in cell phones, in these kinds of devices, and in upcoming technologies like AR and VR headsets. So we have to solve this challenge, or at least be smart about it. Now, with that, I hope all of us are on the same page, and let's dive into the problem. If I ask you, how would you calculate the energy consumption of a system? I guess all of you would have an answer to that, right? All of us are aware of the simple equation which says energy consumption is equal to power times latency. Power is determined by the hardware, and latency is to a great extent determined by your software. Programmers often optimize latency using well-established tools, for example Linux perf, and well-established metrics.
For instance, CPU clock cycles. But think about it for a second. If I ask you, can you give me tools, can you give me metrics, which you would use to optimize the energy consumption? Does something come immediately to your mind? That's a slightly harder question. So I'm guessing many of you would say, yeah, I can do that using things that already exist on my system. How? This is, let's say, a naive model of how people would do this right now. Your power figure would come from the CPU. You would say, oh, I'm going to get the power consumption via an interface; one example is RAPL. And let's say my CPU says, oh, I'm using 15 watts right now. I get this value, I write it down. Great. Then I profile my application to determine the latency, and my profiler says, oh, I'm using 5 milliseconds to finish the job. And you say, great, life is simple. I multiply these two values using this formula, and I get 75 millijoules as the energy consumption. Good job. That is a very traditional but simplistic model of the world. Let's dive deeper and see why this model will not scale and does not reflect the ground reality. The first assumption that we overlooked in this model is that it assumes a constant power draw. You measured power at one point in time, and you assumed that value holds for the entire latency. But if you dive deeper, the CPU is constantly switching states, between packages, between cores, and it exhibits a very high-variation curve. And we need to be very careful about where we are measuring power. If your measurement came at this point, then you would essentially miss a huge amount of data. So the first limitation this model overlooked is that power is not constant over time. Up next: we saw power came from the CPU. But what about devices such as the sensors, the radio, the display? I'm sure we would agree that sensors and the radio are critical in embedded platforms, right?
They are the bread and butter of embedded platforms. Power is often not dominated by the CPU. We have experimental data which shows that very often these sensors will dominate the CPU's power draw for most of the program's runtime. In fact, these sensors are often running constantly, polling every few milliseconds, as compared to the CPU, which spikes up and then goes down. So the second limitation we missed is that we need interfaces to talk to these devices, and these interfaces are not available. Finally, let's keep going deeper. Even for the CPU, we have platform-specific interfaces. We talked about RAPL; RAPL is available only on Intel. I tried to use Linux perf on an ARM system, and perf said, I'm sorry, I cannot calculate energy on this system because I don't know this CPU. So we have to build platform-agnostic interfaces. I would summarize these three limitations with this statement: we are inaccurately calculating the system's energy consumption, and we are looking only at a fraction of it. We are not even seeing the entire picture. We don't know where the energy is going. And if we want to minimize it, we have to measure it correctly. This is one of my favorite statements: we cannot improve what we cannot measure. So the goal of this project is to understand how we can measure better, how we can understand better. This work aims to develop a framework which can accurately and reliably measure the energy consumption of an application on a platform. Once we have this data, we also want it to be useful, because data without utility does not get used. So we want to report these statistics to the end users in an easy-to-understand format, and we want to report them to the programmers so that they can take action on these statistics.
Programmers should be able to optimize, and we want to report to the system designers, the platform designers, to enable them to iterate, quickly explore the design space, and figure out the sweet spots for performance and energy efficiency. So this is the goal, but it's a bird's-eye view. Let's dive deeper. What do I mean by a framework? In my thought process, a framework comprises two units: the models and the tools. The models precede the tools. The models basically tell us how we reason about a device's power draw over time. When we saw the CPU power draw graph earlier, we saw there were valleys and peaks, and I told you that the CPU is switching states. That is the mental model I have: the CPU is capable of switching states, and each state has a different power draw. So that's the model. And these models are often not available, or they are very poorly understood, for other devices. For example, DRAM power models are very often, let's say, not so useful. All right, so once we are able to understand and reason about these devices, how do we make this actionable? That's where the tools come in. The tools can be built to accurately calculate power on the basis of the model's insight. One such tool would be NVIDIA's power measurement tool, the nvidia-smi utility. So the takeaway from this section is that our goal is to build a reliable and accurate framework, and to build this framework, we need accurate models and reliable tools that enable us to calculate the energy consumption of our platform. Now, embedded systems have been around for decades, right? And people have been constrained by energy for almost as long. So why didn't anyone think of this before? Why is there no such tool? It turns out there are tools, but they're not good enough. We want to build better tools, and I would like to show you why the existing ones fall short. One such tool is PowerTOP by Intel.
And here's a screenshot of PowerTOP in action. You can clearly see the leftmost column shows the power estimate, and the rightmost column shows a description of which device or process the estimate is for. Now, it is possible to use PowerTOP to obtain the power estimate of a process, a device, an interrupt, or a timer on the Linux kernel. Great. But what is the catch here? What are we missing? I spent quite some time reverse engineering PowerTOP. The first challenge I observe is that power is an instantaneous quantity, while energy consumption is a continuous process, the integral of power over time. Energy consumption has a higher correlation with your battery drain than power does. You can have a power spike and then do nothing for the rest of the time, and that's not going to affect your battery much at all. But if you have a tiny power draw for almost all of the runtime, that's going to affect you much more severely. Second, PowerTOP has significant vendor-specific implementation details. It comes from Intel, and that's all I would like to say. And third, what is the actionability of the data that we just saw? If I give you an example, a process consumes 1.45 watts. Great, what do I do with this? If this process consumes 1.45 watts, how do I reduce it? Where is this 1.45 going? I don't know, and this tool does not tell me. And I want to know, because unless I know, I cannot optimize. Even in performance profiling, we find hotspots, functions which dominate the runtime, and then we optimize those functions. That kind of hotspot analysis is missing here. Okay, so what is the system design that we are working with? Here I would like to show you a cartoon, a very elementary diagram of the system design that I would like to propose. This system design has three components. The first component is device-specific measurements. The second component is the kernel's process accounting infrastructure. And the third component is a multivariate regression model.
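Putting those three components together, here is a minimal sketch of the proposed pipeline in Python. The device names and parameter values are purely illustrative, not measured numbers from any real platform.

```python
# Hypothetical per-device power parameters in watts, obtained from the
# device-specific measurement step described below.
PARAMS = {"cpu": 1.8, "screen": 0.9, "radio": 0.4}

def estimate_energy_mj(usage_ms):
    """Multivariate linear model: E = sum_i beta_i * t_i.

    usage_ms maps a device name to the milliseconds the process held
    that device, as reported by the kernel's process accounting
    infrastructure. Watts times milliseconds gives millijoules.
    """
    return sum(PARAMS[dev] * t for dev, t in usage_ms.items())

# A process that ran 5 ms on the CPU and kept the radio up for 20 ms:
print(estimate_energy_mj({"cpu": 5.0, "radio": 20.0}))  # about 17.0 mJ
```

The point of the structure is that the model is linear in the per-device usage times, so learning it reduces to estimating one coefficient per device.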
Think of the blue boxes as inputs, the brown box as the model, and the green box as the output. Now, let's try to understand what these inputs are and how we get them. First, device-specific measurements. We need to understand how to model these devices, and modeling these devices essentially means understanding how much they influence my power draw, how much they influence my overall energy. That means we need to determine the parameters in the regression equation. How would we do that? It's a very simple method, I guess all of us studied it: a process of deduction by elimination. We have an algorithm for this, and I will quickly go through it because I want you to be engaged. First of all, we identify the base rate of the system. We turn off as many devices as we can and identify the base power draw needed to sustain the system for a reasonable amount of time. Once we understand the base, we turn on a single device. What do I mean by turning on a single device? For instance, in this device, one component would be the screen. The base rate would be the screen at minimum brightness, and the turned-on state would be maximum brightness. And I believe we can understand how sweeping the device's functionality should help us understand its power draw. For a screen, brightness directly correlates with the amount of power it draws. So we turn on this device and we sweep the parameters. We sweep the brightness from, let's say, 10 to 100 and then back, and we keep taking measurements of how much my overall drain has changed. Now, what do I mean by drain? This is one thing that I should have clarified. It is very interesting to observe that we have data from the CPU via RAPL, and the second data point that we have is from the battery itself. Your battery typically tells you how much charge is in the system and how much is left.
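On Linux, that battery data is exposed through sysfs. Below is a small sketch of how one might read it and turn two samples into a drain rate; the supply name (BAT0) and the exact attribute (energy_now versus charge_now) vary between machines, so treat this as illustrative.

```python
from pathlib import Path

def read_energy_now_uwh(supply="BAT0"):
    """Read the battery's remaining energy, in microwatt-hours, from the
    Linux power-supply sysfs interface. Some batteries report charge_now
    (microampere-hours) instead of energy_now."""
    path = Path("/sys/class/power_supply") / supply / "energy_now"
    return int(path.read_text())

def drain_rate_uw(e0_uwh, e1_uwh, interval_s):
    """Turn two energy_now samples taken interval_s seconds apart into
    an average discharge power in microwatts (uWh/s * 3600 = uW)."""
    return (e0_uwh - e1_uwh) * 3600.0 / interval_s

# If the battery dropped 10,000 uWh over 36 seconds, the system was
# draining about 1 watt on average:
print(drain_rate_uw(50_000_000, 49_990_000, 36))  # 1000000.0 uW
```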
And what we're trying to do is, when I turn on the screen, see that, okay, the battery was going down at a constant rate before, but now that I've turned the screen up to maximum, it is going down faster. That difference in slope is going to give me the parameter for the screen. Finally, we eliminate this device from the equation, and we repeat this step for as many devices in the system as we can. Great. So we saw how we get the parameters via this step. Now, let's take a look at the other input: the process accounting infrastructure. This is a big phrase, but it's actually quite simple. We have the parameters, and now we need the inputs, and the inputs are the amount of time that each device was allocated to the process. What is the method for this? We poll the accounting infrastructure to determine the CPU time, the network activity, file handles, memory usage, disk usage, sensor usage, as many data points as I can collect. Once I have these data points, I feed them to the model to finally predict my energy consumption. Now, this sounds simple in theory, but in practice it is not that simple. Let's take a look at the challenges. First, in an ideal world, the world in which I would love to live, all the hardware devices would accurately tell me how much power each of them is using. It turns out that manufacturers often do not give you this value, simply because there is no underlying hardware which supports it. This screen does not have a register that I can read to figure out how much current it is drawing. And because that register does not exist, we don't know the value. Essentially, we're trying to predict that value using this entire system that I just proposed. So, at the end of the day, it is an estimated value.
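That deduction-by-elimination step can be sketched in a few lines. The numbers here are made up for illustration; a real sweep would average many drain-rate samples per device.

```python
def fit_parameters(base_rate_w, sweeps_w):
    """Deduction by elimination: each device's parameter is the extra
    drain observed while only that device is driven to its maximum
    and everything else stays at the base load.

    base_rate_w: system drain with every device off or at minimum.
    sweeps_w: device name -> average drain measured during its sweep.
    """
    return {dev: drain - base_rate_w for dev, drain in sweeps_w.items()}

# Illustrative numbers: base load of 2.0 W; drain rose to 2.9 W with
# the screen at maximum brightness, and to 2.4 W while the radio was
# transmitting. The differences become the regression parameters:
params = fit_parameters(2.0, {"screen": 2.9, "radio": 2.4})
print(params)  # screen ~0.9 W, radio ~0.4 W
```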
But as the statistician George Box famously said, all models are wrong, but some are useful. And my goal is to build a useful model: wrong, but a bit less wrong than what we have right now. Second, the accuracy and bias trade-off, and this is a tricky one. If you want to be really accurate, if you want to collect a huge amount of data, your system is going to be loaded by the model itself. If you have a large number of parameters, the model has to run at a certain frequency, and the more computation the model itself does, the more it is going to skew your measurements. So accurate models generate a larger systemic load that biases your observations, and we need to be very careful about finding the sweet spot between performance and accuracy. Third, and this is a bit tricky again: thanks to the fragmentation in the Linux ecosystem, we have a huge variety of devices with billions of ICs, and these ICs can have estimates that range across two to three orders of magnitude. Some device might use a few microjoules, another a few millijoules, and this is within the same class of device. So we need models which can reason across that much variation in magnitude. And finally, this is the difficult question, please come and talk to me about this: do you think, if this model were on your system, you would allow it to send its data back? Because, to build a good model, you need more and more data, and data comes from users. I cannot buy one million devices to collect this data, right? So I'm very curious what you think: should users share this data? Obviously it would be anonymized, but it stirs up heated discussions in the community. Great. So that was the part about energy consumption. That is the first factor in this new equation that we see right now.
And now for the second component of this presentation: carbon emissions. Okay, so we talked about how to calculate energy consumption in a more reliable way. How do we link it to the carbon emissions of the software that we are running? It turns out that you can link it via this phrase, energy composition. What does it mean? Energy composition depends on multiple factors. It basically indicates the carbon intensity of the electricity you consumed. So if you bought your electricity from, let's say, a green source, say a solar-based provider, or if you bought it from a coal-fired power plant, this number would change significantly. It also changes across countries, because different countries have different mixes of electricity providers. It changes based on the time at which you drew power. For instance, if you charge your device in the afternoon, that is when the solar panels are at full power, right? If you charge the device at night, then most likely it is being charged from a fossil-fuel-based power plant. So there are frameworks out there which allow you to reasonably determine the energy composition at a particular geolocation, at a particular time instant, and retrieve that data via reasonable APIs to obtain the carbon emissions of your application. And there's also a significant amount of related work: if it's influenced by time, I can shift my workload to a different time to reduce the carbon emissions. A lot of data center providers do this; they actively schedule their batch workloads at times when green energy is more available. Great. We understand what the carbon emission is, but how do we put this number to use? I want to make sense of this number. There's a framework that is being established right now.
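Before moving to that accounting framework, the arithmetic of this section can be summed up in a few lines: operational carbon is simply energy consumed times the grid's carbon intensity at that place and time. The intensity values below are made up, standing in for what a grid-data API would return.

```python
def carbon_grams(energy_kwh, intensity_g_per_kwh):
    """Operational carbon emissions: energy consumed times the grid's
    carbon intensity (grams of CO2-equivalent per kWh) at that
    location and time."""
    return energy_kwh * intensity_g_per_kwh

# The same 0.05 kWh workload on two hypothetical grids:
run_kwh = 0.05
print(carbon_grams(run_kwh, 30))   # mostly-solar afternoon grid: ~1.5 g
print(carbon_grams(run_kwh, 800))  # coal-heavy night grid: ~40 g
```

The spread between the two results is exactly why time-shifting batch workloads toward greener hours pays off.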
I'm not an expert in accounting, but I will describe this framework to you. There are three types of emissions. The first type is Scope 1 emissions. Scope 1 emissions are direct emissions from assets that a company itself owns and operates; for instance, the fuel burned by a company's own vehicles or on-site generators would be a Scope 1 emission. Second, Scope 2 emissions are emissions that a company causes indirectly through the energy that it purchases. For instance, if your company is located in Norway, it would purchase energy from a very green grid, whereas if it is located in a country with a higher share of fossil-fuel-based plants, its Scope 2 emissions would look significantly different. And finally, Scope 3 emissions are emissions that come from the value chain. As an example, TSMC is a supplier of Apple, so the carbon emissions that TSMC creates on the way to supplying chips to Apple would be classified as Apple's Scope 3 emissions. Similarly, if your company produces software that is used by other people, your company would classify the emissions of that use in Scope 3. That's how you would account for these carbon emissions and make sense of them. Okay, all said and done, let's take a look at how we are going to put all of this into practice. For the end users, I have built this utility. It's very primitive, but it is there, and it tells them, in an easy-to-understand and useful format, how much energy their applications are using on Linux. You can see here there are different components and there's an application, so it tries to distinguish where your battery is going. For programmers, we want to expose an API that enables them to take action: we want to indicate the devices with the highest energy consumption so that programmers can optimize for those.
An example use case would be directly providing energy-efficiency-based code optimizations in the programming platform. If I can see, oh, this for loop is going to cost one kilogram of carbon, let's break it down into a while loop, or into something else, into a function, and just be aware of how my programming choices create different emissions, that would be great. And finally, the third stakeholder after end users and programmers is the system designers. For system designers, we want to provide tools that enable them to iterate much faster. Designers need tools to explore the design space, and the design space can have different axes. One axis can be performance, another axis can be energy efficiency, a third can be carbon efficiency. But even being aware of these axes, it is very hard to actually explore different points without manufacturing the device itself, and that incurs a huge cost. So we want to reduce the cost of experimentation and exploration in the design space and help the designers with better tools. Great. That brings me to the conclusion of this talk. If there are two things that I would love for you to take away, it is these two key takeaways. Forget everything else, just keep these two with you. First, we cannot improve what we cannot measure, and we must measure correctly, not just produce an arbitrary number. And second, non-CPU system components can also dominate your energy consumption. We need to break out of the mindset that the CPU is everything. That thought process has its uses, but I would argue that we also need to think outside the box. Thank you very much. Could you kindly go back to the slide which basically says carbon usage is energy consumption times something? Thank you very much.
So I wonder about the relative magnitudes of energy consumption and energy composition. From a user's point of view, I think if I switch my energy sources to green energy, it doesn't really matter whether my laptop consumes 20 watts or 15 watts, because my maybe naive assumption is that the composition of where the energy comes from has a much bigger impact on the carbon footprint than the consumption of a laptop. Am I right or am I wrong? What is the magnitude between those two today? So, just to repeat the question: we need to understand how much energy consumption influences my carbon emissions as compared to the composition. This is a great question. I would say both of them matter, but think about which one you can change. If I live and work in a particular region, as an end consumer it is much harder for me to buy green electricity from another source. I tend to think of electricity as something that comes out of the plug, not something where, to reduce my software's impact, I choose a particular solar farm. So it is much harder as an end consumer to control the energy composition, and that is why consumers can have more impact by pulling the levers that they do have, and that lever is their application. Does that answer the question? Yeah, well, I tend to disagree, but I see your point. I would love to talk to you more. Let's take it offline. Thank you for the question. Hi, thank you for your talk. My question is that I really like the saying that we cannot improve what we cannot measure, and it seems like even the measuring side is very difficult. For example, if my program does a lot of network processing and sends out a lot of packets, what I can actually see is only the device's total consumption on the network card. I cannot tell how much of that energy consumption this part of my code actually caused. Do you see a way to improve this?
So, just to repeat the question the way I understand it: my application sends X number of packets on the network, but the network card reports only its overall power, and we need to delineate, to distinguish, how much power each process consumes on the network card. Yes. That's a great question, and it's a hard problem. If I talk specifically about network cards, it's a very hard problem, because network cards don't distinguish how much each process sends. A CPU, on the other hand, does: the performance counters in the CPU are process-specific. But that is not the case with network cards. I am in touch with people who are working on this, but I don't have an answer right now. Thank you for the question. Hello, thanks for your talk. Earlier you had a diagram of four blocks; can you go back to that page? Yes. Oh, yes. So here there is a regression model. How do you validate this regression model, and what is the validation result? That's also a great question. We need to validate the outputs that we get. I validate using actual power monitors at the wall: there is a power monitor that measures the entire system's power, and I use that data to validate. But it's a harder problem when you don't have the power monitor, for instance if the data comes from another device; then I don't know. My thought process is that this validation does not need to be done many times. Once we are in a reasonable ballpark, we are pretty sure that it should stay in that ballpark. But yes, validation is a challenge, and we need more devices and more interfaces to be sure that this is indeed the truth. Thank you. So do you monitor the whole system or each subcomponent? I should be more clear: it is one time, and we monitor the whole system while we are calculating the parameters.
But once the parameters are calculated, I don't need to monitor again; at that point I am good to go. Okay. So if you monitor the whole system, how do you know the percentage of each subcomponent? Using the elimination process. Okay, thank you. I have a follow-up on that elimination process. In embedded devices like cell phones, or even a car for that matter, you can't turn on the display without the CPU being on. So how do you switch off the CPU and just turn on the display to eliminate? So we don't switch off the CPU. We have a base load, and we tell the CPU to increase the brightness, which essentially means it writes a particular value and then goes back. It is not a perfectly constant load, but it just jumps up briefly and comes back. Okay, so the CPU is at idle power consumption? Yes, and we expect that it is as low as possible. Okay. Thank you very much for your talk. I'm wondering, I've seen your slides, not the full talk, and I understand where you want to go. What are the sort of intermediate steps that you believe are necessary? What would you need the community to help you with to get to the point that you want to reach? Okay, so just to repeat the question: how do we get to the end goal, what are the steps to our destination? That's a great question, thank you for asking. I need more data. My research is about understanding these devices and building the models. But once we have the models, we also need the tools, and we also need validated data that supports those models. So if you have any data, or if you are working on this problem, please come and talk to me. I would love to work with you. I believe the more we work together to solve this problem and the more data we have, the more useful and the more accurate this entire system becomes. Please come and ask.
And I'm already in touch with a few startups that have the data and are trying to build those models. So it's a process: the more data we have, the better we get, and the better we get, the more data we attract. Okay, thanks. Thank you. Thank you for the presentation. I have a question about the API part, where you said you are exposing an API to the system designer. What do you have in mind as the method for a user to query certain values from the system or from your model? So, just to repeat the question: what is the API for system designers, and how will we query it? The API for programmers is significantly different from the API for system designers. The input for programmers is code, an application. The input for system designers is different components; it's a completely different exploration space. So how we would query that API: we have a design space. Let's say I can use four gigabytes of RAM versus eight gigabytes of RAM, and I can use processor X versus processor Y. We can already see there are four points in this design space, and that is the design space that system designers need to explore. So they provide the devices they are looking at, and based on the carbon emissions of those devices, both operational and embodied carbon emissions, we can calculate where we lie in the design space. That would be the API for the designers. Does that answer your question? Okay, so let's say I'm developing some application. How do I know how my application will impact the energy consumption? Is that covered by the API? So the question is: if I'm writing an application, how would I know the energy consumption of that application? Yes. In this case, we would need to provide the application as well as the platform which will be used to execute it.
And once we have both of these inputs, we can calculate the energy consumption of the application, and we can also calculate its carbon emissions. System designers typically have a much larger design space. They understand the class of application, even if they don't have the detailed application itself, and they have the tools that they will use to build the platform, and all of this needs to be input to understand how we would look at that design space. That's that. So my question was more towards the programming aspect, the Linux or low-level system part. Are the energy consumption values tied to, let's say, a system call being made by the application? So that if I make a system call to read some packets, is there a value I get back, or can I query that this call will take this much energy, something like that? Since the measurements are done one time, maybe we can extrapolate those values. That is what I meant. I would love to talk more. I think we have run out of time, but let's talk more. I'm sorry. Thank you.