Hello all. Today's topic is the improvisation and demonstration of the Linux thermal framework for multiple temperature sensors. I am Aditya K.V. and my co-author is Tausif Nomani. We work at Samsung Semiconductor India R&D.

This is the agenda for today's presentation. First, we will try to understand why a thermal management unit (TMU) is needed. Then we will have an overview and an internal look at the TMU, followed by an overview of the Linux thermal framework. Then we will see how thermal management is done in the Linux kernel, and look at some of the important structures and functions in the Linux thermal framework. We will walk through the pseudocode of a conventional TMU driver and have a demo of the thermal framework's interface from user space. After that, we will see what a TMU in a complex SoC looks like, understand the limitation of the conventional TMU driver for TMUs in complex SoCs, and walk through the pseudocode of a complex SoC TMU driver. Finally, we will see what scope there is for improvement in the thermal framework and how it can be implemented.

Let us see why a thermal management unit is needed. An SoC can have different blocks. When high computation or high frequency operation is going on in a certain block, the temperature of that block increases. For example, in this picture we can see that the memory blocks are cooler compared to the GPU blocks, and the GPU blocks are cooler compared to the CPU blocks. The temperature is high on the computational blocks. What happens if the temperature of the SoC increases? The SoC can perform poorly if the temperature is high. The SoC can malfunction, or, if the limit is crossed, the SoC can get permanently damaged.

So what is the solution for high temperature in an SoC? In the Linux thermal framework we have terminologies like thermal throttling, thermal cooling and thermal tripping. Let us look at each of these solutions. Thermal throttling means reducing the clock speed of that block.
So the block operates at limited performance, and heat doesn't build up in that block. What is thermal cooling? Any device which cools the block is called a cooling device. For example, a fan can be a cooling device. Thermal tripping: when the temperature of the SoC crosses its limit, it can cause permanent damage. To avoid this, we can switch off the power supply to the SoC by informing the PMU or by controlling the voltage regulator. So thermal throttling, thermal cooling and thermal tripping are the three solutions that can be provided by the thermal management unit.

Let's have a look at the internals of the TMU. The TMU has a controller and a temperature sensor integrated within it. A TMU can be placed in any block of the SoC where temperature needs to be monitored. The TMU controller configures the temperature sensor and initiates the temperature sensor to sense the temperature. There are different threshold levels in the controller. When the sensed temperature crosses a threshold level, the controller generates an interrupt. There are separate threshold levels as well as separate interrupts for thermal throttling and thermal tripping.

Most vendors nowadays also provide support for an emulation mode. When the user inputs an emulated temperature, the controller takes the emulated temperature instead of taking the temperature from the temperature sensor. This emulated temperature is mainly used for debugging as well as to verify the features of the TMU. If we cannot produce a certain temperature in a real environment, we can emulate that temperature using emulation mode.

Let's look at the Linux thermal framework. Any TMU or temperature sensor is called a thermal zone device, and any device related to cooling, such as a fan, is called a cooling device. The Linux thermal framework exposes these cooling devices and thermal zone devices to user space. In order to be exposed to user space, these devices have to be registered with the thermal framework.
Once these devices are registered with the thermal framework, they become part of thermal management. Once these devices are exposed to user space, a user space application can take decisions based on the current temperature read from the devices as well as on the threshold temperatures.

Now let's see how thermal management is done in the Linux kernel. Temperatures are read from the temperature sensors and compared with the threshold. If the temperature is less than the threshold, the temperature is read again after some delay. If the temperature is greater than the threshold, it means the SoC temperature has increased, so the cooling devices are switched on. Once the cooling devices are switched on, the current temperature is read again and compared with the desired temperature. This is to see whether the SoC has cooled down or not. Once the current temperature has reached the desired temperature, the cooling devices are switched off, and this cycle repeats. The switching on and off of cooling devices is done basically to save power. This is how thermal management works.

Now let's look at some of the important structures and functions used in the thermal framework. First, struct thermal_zone_device_ops. This structure needs to be initialized before the thermal zone register functions are called. Let's have a look at the important function pointers inside this structure. Bind: this binds thermal zone devices and thermal cooling devices. Unbind: this unbinds thermal zone devices and thermal cooling devices. Get temperature: this function reads the temperature from the temperature sensor and displays it to user space. Set trips: this sets the trip window for the current temperature. Change mode: thermal management can be done from the kernel or from some user space application, and change mode is used to switch thermal management between the kernel and user space. Get trip type: this tells what kind of trip temperature it is. Trip temperature, get and set:
these set the trip temperature and read the trip temperature. Trip hysteresis, get and set: the trip hysteresis is used to calculate the minimum change required to trigger the trip interrupt. And set emulation temperature: this enables emulation and sets the emulated temperature.

Next, struct thermal_zone_of_device_ops. This OF device ops structure is used when we fetch the thermal zone from the device tree, based on device nodes. It is very similar to the structure we saw in the previous slide. Let's have a look at the function pointers inside this device ops. Get temperature reads the temperature from the sensor. Get trend: this function calculates the rate of change of temperature. Set trips: this sets the trip temperature window for the current temperature. Set emulation temperature (set_emul_temp) sets the emulation temperature and enables the emulation mode. Set trip temperature: this changes the trip temperature threshold.

Let's have a look at struct thermal_cooling_device_ops. This cooling device ops structure needs to be initialized before cooling devices are registered. Let's see some of the important function pointers inside this structure. Suppose a cooling device is a fan and has three states, low, mid and high; then get maximum state gives the maximum state possible. Get current state gives the current state of the cooling device. Set current state sets the current state of the device. Get requested power: this calculates the power requested by the cooling device. State to power: this function calculates the power consumption based on the cooling device state. Power to state: this calculates the state of the cooling device based on the power consumption.

Now let us see some of the important functions used in thermal management. Thermal zone device register (thermal_zone_device_register): this function adds a new thermal zone in the folder /sys/class/thermal. When this function is called, it also binds cooling devices which were already registered or which are registered at the same time.
Thermal zone device unregister needs to be called if the particular thermal zone is not needed anymore. Let's see some of the parameters which need to be passed when we call the thermal zone device register function. Type: this is actually the name of the thermal zone, and it tells what type of thermal zone it is. Trips: this indicates how many trip points there are for the thermal zone. Mask: this represents whether the trip points are writable or not. Devdata: this is the device private data pointer. Ops: this is the thermal zone device ops structure, which holds the device callbacks; we have already discussed it in the previous slide. Thermal zone parameters (tzp): this pointer holds the platform parameters of the thermal zone. Passive delay and polling delay: passive delay is the delay to wait between polls when performing the cooling action, and polling delay is the delay to wait between polls when checking whether the current temperature has crossed the threshold temperature or not.

Similar to the last function, the thermal zone OF sensor register function (thermal_zone_of_sensor_register) registers the thermal zone based on the device tree. Upon calling it, it searches for the thermal zone in the device tree and adds the new sensor to the thermal zone. Let's see the parameters of this function. Dev: this is the sensor's device, which carries its device tree node. Sensor ID: this is the sensor identifier. When an IP has more than one sensor, we need to pass this sensor ID. Data: this is a private pointer that will be passed back when the temperature is read. Thermal zone OF device ops: we have already discussed this structure in the previous slides.

Thermal cooling device register (thermal_cooling_device_register): this function registers the cooling device and exposes it to user space in the folder /sys/class/thermal as a cooling device. This function also checks for and binds the thermal zone. The function parameters are as follows. Type: this is the name of the cooling device. Devdata: this is the device private data. Thermal cooling device ops:
we have discussed this structure in the previous slides.

Now let us have a look at how thermal zones are exposed to user space. Upon system boot, the TMU driver is probed. The device ops structures need to be initialized with the respective functions of the respective drivers. Once these device ops are initialized, they need to be passed to the thermal zone register function. Upon calling the register function, thermal zones are created in the file system. The user can give a command to read the temperature from a thermal zone, and once the user gives the command, the temperature is displayed to the user.

Now let us have a look at the pseudocode of a conventional TMU driver. In a conventional TMU driver, we need to define the thermal zone device ops and initialize the get temperature function pointer to the driver specific function which reads the temperature from the sensor. During the driver probe function, we call the thermal zone device register function and pass it the device ops structure which is already initialized. Upon doing this, the thermal zone is registered and exposed to user space. The driver might also have a function to initialize the thermal zones. In this kind of approach, we can see that probing of the TMU driver happens multiple times. The probe happens separately for each instance of the TMU, and multiple instances of the TMU driver exist. With this kind of approach, a little additional memory and probe time is consumed.

Now let's see how get temperature and set emulation actually work. Inside the folder /sys/class/thermal, we can see all the thermal zones which are already registered. If we look inside a thermal zone, we see a number of files. Here temp is the actual representation of the get temperature function. If we try to read temp, the temperature is displayed to the user in millidegrees Celsius. If we give a question mark instead of the numerical value of the thermal zone, the temperature of all thermal zones is displayed to the user.
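The read of all thermal zones shown in the demo can also be sketched from a user space program. This is a minimal sketch, assuming the usual sysfs layout of thermal_zone<N>/temp under /sys/class/thermal; the function name read_zone_temps and its root parameter are hypothetical and exist only for illustration.

```python
from pathlib import Path


def read_zone_temps(root="/sys/class/thermal"):
    """Return {zone_name: temperature in millidegrees Celsius}.

    `root` is a parameter (an assumption for illustration) so the sketch
    can also be pointed at a test directory instead of the real sysfs.
    """
    temps = {}
    # Equivalent in spirit to reading thermal_zone?/temp from a shell.
    for temp_file in sorted(Path(root).glob("thermal_zone*/temp")):
        try:
            temps[temp_file.parent.name] = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            # A zone may be unreadable or report no valid value; skip it.
            continue
    return temps


if __name__ == "__main__":
    for zone, millicelsius in read_zone_temps().items():
        print(f"{zone}: {millicelsius / 1000:.1f} C")
```

Each value is the raw content of the temp file, so dividing by 1000 gives degrees Celsius.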
Now we know how to read the temperature from all thermal zones. Let's see how the set emulation function actually works. Inside a thermal zone folder under /sys/class/thermal, emul_temp is one of the files. Here, I am writing 1000 millidegrees Celsius to the file emul_temp. This internally calls the set emulation function and enables the emulation mode. So here I am writing the value 1000 millidegrees Celsius to thermal zones 0, 1 and 2. After that, I am reading all thermal zones. We have five thermal zones in this demo. Upon reading all thermal zones, I see 1000 millidegrees Celsius for thermal zones 0, 1 and 2. This is because I have set the emulation temperature to 1000 millidegrees Celsius. For the remaining thermal zones, thermal zone 3 and thermal zone 4, the temperature is actually from the sensors.

Let's see what policies are. If we look inside a thermal zone folder under /sys/class/thermal, we can see policy and available_policies. What are these policies? These policies are actually thermal governors, which manage the overall thermal functionality. If we look at the available policies, we can see power_allocator and step_wise. The power allocator policy is actually closed loop control. It is based on the power budget, the temperature and the current power consumption. The power allocator governor implements a PID controller with temperature as the control input and power as the controlled output. We can also see k_d, k_i, k_po and k_pu. These are actually the constants for the PID controller. We can also see integral_cutoff. When the cooling device can't bring the temperature to the exact value the governor has requested, the maximum allowed offset is represented by integral_cutoff.

Now let's see what the step-wise thermal governor is. This is actually open loop control. It is based on the temperature threshold and the rate of change of temperature. This kind of thermal governor actually walks through each cooling state of each cooling device.
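To make the role of the PID constants more concrete, here is a simplified sketch of a PID step that turns a temperature error into a power budget. It is only loosely modeled on the power allocator idea described above and is not the kernel's fixed-point implementation. The names PidState and pid_power_budget, the sustainable_power and max_power parameters, and the choice to accumulate the integral only while the error is below integral_cutoff are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PidState:
    """State carried between PID steps (hypothetical helper)."""
    err_integral: float = 0.0
    prev_err: float = 0.0


def pid_power_budget(control_temp, current_temp, state, *, k_po, k_pu, k_i,
                     k_d, integral_cutoff, sustainable_power, max_power):
    """Return a power budget from a temperature error (simplified sketch).

    Temperatures are in millidegrees Celsius. The error is positive while
    the zone is cooler than the control temperature.
    """
    err = control_temp - current_temp

    # Proportional term: k_po applies on overshoot, k_pu on undershoot.
    p = (k_po if err < 0 else k_pu) * err

    # Integral term: accumulate only while the error is below the cutoff
    # (an assumption of this sketch, to limit integral wind-up).
    if err < integral_cutoff:
        state.err_integral += err
    i = k_i * state.err_integral

    # Derivative term: change of the error since the previous step.
    d = k_d * (err - state.prev_err)
    state.prev_err = err

    # Budget around the sustainable power, clamped to [0, max_power].
    budget = sustainable_power + p + i + d
    return max(0.0, min(max_power, budget))
```

When the zone is hotter than the control temperature the error is negative, so the budget drops below the sustainable power and the cooling devices are asked to consume less.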
We can also see what kind of thermal governor is currently being used. If we read policy, we see step_wise. So in this example, the thermal governor in use is step-wise.

Now let us see what trip points are. Trip points are actually the threshold levels which we have already discussed in earlier slides. If we look inside a thermal zone folder under /sys/class/thermal, we can see the trip point hysteresis, trip point temperature and trip point type files. There are threshold levels zero to seven in this example. Based on these threshold levels, or trip points, the governor switches on the cooling device and operates it in different states. If we look at the type of each trip point, we can see passive and critical. Zero to six are passive threshold levels or trip points, and the seventh is the critical trip point, which means that when the current temperature reaches this critical temperature, power to the SoC should be shut off. We can also see the current threshold level set for each trip point. It is displayed to the user in millidegrees Celsius. We can also see the hysteresis value for each threshold level or trip point. Hysteresis is the minimum change needed for the thermal governor to take the next action. We can see that for trip points zero to six, the hysteresis value is 1000 millidegrees Celsius, whereas for the critical trip point the hysteresis value is zero, which means that when the temperature reaches the critical temperature, the thermal governor should not wait for any change in the temperature. The action to shut off the SoC should be taken immediately.

Now, let us see what mode and type are for each thermal zone. Mode is the current mode of the thermal zone. If we read the mode of any thermal zone, we can see that it is either enabled or disabled. Enabled means thermal management is done by the kernel. Disabled means thermal management is not done by the kernel, but by a user space application. We can also see what type of thermal zone it is.
Type is actually the name of the thermal zone. In this demo, we can see that thermal zones zero, one, two and three are CPU block temperatures, and thermal zone four is the GPU block temperature. With the mode and type parameters, we can also understand that thermal management for the CPU blocks is enabled, which means it is handled by the kernel, whereas for the GPU block it is disabled, which means thermal management for the GPU block is handled by a user space application.

Now, let's have a look at how a TMU looks in complex SoCs. A TMU in a complex SoC has a remote sensor interface in addition to the conventional TMU. The entire SoC can have a single TMU controller, and if the temperature of any other block needs to be monitored, only a remote sensor needs to be placed in that block. These remote sensors are connected to the main TMU via the remote sensor interface. With this kind of approach, size and cost can be reduced.

Now, let us see what the limitation of the conventional TMU driver is. Why can't we use the same conventional TMU driver for complex TMUs? In complex TMUs, the TMU main sensor and all remote sensors are represented as a single unit. Remote sensors are connected to the controller only via the remote sensor interface. If we try to register using the conventional TMU driver, only the main sensor is exposed to user space. The remote sensors cannot be registered with the conventional approach, so they cannot be exposed to user space, and the user cannot read the temperature from the remote sensors.

So what is the solution to register TMUs in complex SoCs? The approach we can take is to define thermal zone device ops structures separately for all sensors and initialize each get temperature function pointer to the respective function, each of which reads its own temperature sensor. In the probe function, we call thermal zone device register multiple times and pass these device ops structures respectively.
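The single-probe, multiple-registration idea can be modeled outside the kernel with a toy sketch. This is not driver code: ToyThermalCore, ToyTmu, probe and the zone names are all hypothetical stand-ins, used only to show one probe registering one zone per sensor, each with its own temperature reader.

```python
class ToyThermalCore:
    """Stand-in for the thermal framework: keeps registered zones by name."""

    def __init__(self):
        self.zones = {}

    def register_zone(self, zone_type, get_temp):
        # Plays the role of one thermal zone registration call.
        self.zones[zone_type] = get_temp

    def read_all(self):
        return {name: get_temp() for name, get_temp in self.zones.items()}


class ToyTmu:
    """One controller with a main sensor (index 0) and remote sensors."""

    def __init__(self, raw_millicelsius):
        self.raw = list(raw_millicelsius)

    def read_sensor(self, index):
        return self.raw[index]


def probe(core, tmu, name="tmu"):
    """Single probe: register one zone per sensor, each with its own reader."""
    for index in range(len(tmu.raw)):
        # Bind the index as a default argument so every callback keeps
        # its own sensor index instead of the loop's final value.
        core.register_zone(f"{name}-sensor{index}",
                           lambda i=index: tmu.read_sensor(i))
```

Calling probe once with a three-sensor ToyTmu leaves three zones in the core, and read_all returns one temperature per sensor.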
By doing this, all the remote sensors become part of the thermal framework and are exposed to user space. With this approach, the TMU driver probe happens only once.

Now we know that the thermal zone device register function registers a thermal zone. Let's see how it is done. When this function is called, it internally sets the passive and polling delays. Then it sets trip parameters like type, temperature and hysteresis. Then it binds the cooling devices with the respective thermal zones. Then it initializes the thermal governor. And finally, the thermal zone is created.

The existing thermal zone device register function can register only one thermal zone. There is scope for improvement in this function so that it can register the main sensor and all remote sensors in a single call. If we have this kind of improvement, this function needs to be called only once. There is also scope to have a map which shows the relation between the main sensor and all connected remote sensors. If we have that kind of map, it will be easy to understand which remote sensor is controlled by which main sensor.

With this we have reached the end of the slides. I hope this presentation was useful for all those who use the Linux thermal framework. If you have any questions, please feel free to put them in the text chat. Thank you.