Hello, my name is Chanwoo Choi, and I work at Samsung Electronics. Today, I will talk about non-CPU device power management with the devfreq (device frequency) framework in the Linux kernel. As you know, there are many non-CPU devices, such as the GPU, memory controller, storage, and so on. This talk covers how to develop and optimize a devfreq driver.

The talk has the following contents. First, what devfreq is. Then I will explain how to add and optimize a driver and a governor, and how devfreq collaborates with other frameworks. I will also explain how to profile and optimize a device using only kernel features, and look at the use cases in the mainline kernel. Lastly, I will mention the weaknesses and the to-do list of the devfreq framework.

Recent embedded devices require high-quality imaging for high-resolution displays and pictures, fast data transfer for large videos, and low latency for storage access. To meet this, there are many devices related to performance and power management, such as the GPU, memory controller, memory data bus, and storage. These devices need a lot of power to support the requirements. Also, people want to use a device for a long time, even though it has limited battery capacity. It means that you need to support DVFS for non-CPU device power management. So the devfreq framework provides a power management mechanism to keep the balance between performance and power.

Traditionally, we mostly focused on the CPU for performance and power. As I said previously, we need to consider non-CPU devices too. On the left side, there are the frameworks for the CPU. The right side shows the frameworks for non-CPU devices. The runtime PM and generic power domain frameworks support dynamic power control to turn devices on and off, and devfreq supports DVFS on the same layer as cpufreq.
Devfreq is very similar to cpufreq, but it has unique features, which I will explain in this talk. First of all, let's check the internal modules of devfreq. There are three main modules: core, governor, and sysfs.

The core module contains three sub-modules. The timer monitors the device periodically in order to check the next frequency. The core also decides the final frequency among multiple inputs, such as the governor, user input through sysfs, and the interconnect and thermal frameworks. In addition, the core supports devfreq suspend. The governor module has two sub-modules: getting the device status and managing the life cycle of the governor. The governor needs the device status, which its algorithm uses to decide the next frequency. You also need to control the governor life cycle and the governor list.

This page shows the external relationships of devfreq. First, devfreq must control clock and power through the CCF and the regulator framework. The CCF is the Common Clock Framework in the mainline Linux kernel. Alternatively, you can use the OPP (operating performance point) interface to control clock and power. It provides very helpful functions for that. If you use them, you can reduce your code and reduce your mistakes. There are also some specific devfreq drivers that work with firmware or special devices. In this case, you can use SMCCC or direct register access for the specific device. The SMC is the Secure Monitor Call used to talk to the secure OS or firmware. PM QoS (power management quality of service) also provides an interface to control the minimum and maximum frequency of each device. External drivers and external users can change the minimum and maximum frequency for devfreq.
When it receives multiple inputs from various frameworks, PM QoS passes the final minimum or maximum frequency to devfreq. It works like an arbiter. So the interconnect and thermal frameworks and the sysfs interface use PM QoS to pass on their requirements. In later chapters, I will explain the detailed operation of each module and how it collaborates with other frameworks.

In this chapter, I will explain how to add a devfreq driver and governor. A devfreq driver supports DVFS by controlling clock and voltage according to the device status, together with a governor. The mainline kernel already contains several kinds of devfreq drivers: GPU for rendering, memory bus for data transfer, memory controller, and storage for UFS (Universal Flash Storage). Recently, an L2 cache driver was posted to the mainline kernel, but it has not been merged yet; it is still under review.

For a driver, the developer needs to implement the devfreq_dev_profile structure. It is mandatory. You need to implement the target and get_dev_status functions; the structure also includes other functions and variables. The target function changes the clock and voltage, or executes an SMC call to ask the firmware to change hardware resources like clock and voltage. The get_dev_status function returns the current device status, and the result is used by the governor. Then you need to get the hardware information from the device tree and choose a governor, like simple_ondemand, performance, or userspace. Lastly, you just add your devfreq driver to the system. Then it is finished. It is very simple. One more thing: if you want to control the frequency through the OPP interface, you need to register the OPPs with the devfreq helper functions.

This page explains the devfreq_dev_profile structure in detail. The user can initialize five variables when implementing the driver. There are also four functions; target and get_dev_status are mandatory, as I mentioned.
The polling_ms and timer variables affect the monitoring timer of the simple_ondemand governor, and upthreshold and downdifferential affect the frequency scaling speed of the governor. These four variables are the tuning points for governor behavior.

The governor decides the next frequency with its algorithm, and each governor has its own algorithm. Generally, we use already-defined governors, as in the cpufreq and cpuidle frameworks, but the devfreq framework also lets you add your device's own governor. This is the key point. As you know, non-CPU devices have no standard hardware specification, so it is very difficult to support every non-CPU device with the already-defined governors. For instance, the mainline kernel already has a reference for a device's own governor: the Tegra ACTMON driver for NVIDIA Tegra SoCs. This is an advantage of devfreq, and it makes devfreq more flexible and extensible.

Let's check the steps to add a governor. First, you need to initialize the devfreq_governor structure. The mandatory members are the name and the get_target_freq and event_handler functions. The get_target_freq function has to include the governor algorithm that decides the next frequency. It is the key function you have to develop. The event_handler function handles the life cycle of the governor. There are governor events like start, stop, suspend, and resume. As I said, the two functions are mandatory. If you want to add your governor, you have to develop these two functions and then just add the governor. After that, any devfreq driver can use it.

The devfreq_governor structure contains the name, the attribute flags, the feature flags, and the two functions. The name indicates the governor name. Next are the governor sysfs attribute flags. Basically, the common sysfs attributes are provided for all devices. In addition, some sysfs attributes have to be set when adding the governor, according to the characteristics and features of the governor.
Lastly, the governor feature flags. For example, if a governor has the immutable flag, it can never be changed to another governor through the sysfs interface. The passive and Tegra ACTMON governors, for example, are immutable.

This is an example of adding a device's own governor, the Tegra ACTMON governor. First, as its attribute, it uses the polling_interval sysfs interface to change the period of the interrupt. Even though it uses polling_interval, it does not use the timer. Next, the flags include immutable and interrupt-driven. It means that it uses the interrupt method instead of the timer method. Also, get_target_freq, as I said, includes the key algorithm of Tegra's own governor, which decides the next frequency according to the current device status. You need to add a device's own governor like this for your non-CPU device. Lastly, the event handler controls the life cycle of this governor. In summary, it is immutable, it works with the interrupt method instead of the timer, and it is the device's own governor. I also recommend that a device governor support the life cycle events of a devfreq governor, such as start, stop, and the others, and handle them.

Basically, if you use the simple monitoring timer, it runs periodically. The timer runs, and when it expires, the core asks the governor for the target frequency. Then the governor module gets the current device status from the devfreq driver, decides the next frequency using the governor algorithm and the current device status, and returns the target frequency to the core. Lastly, the core adjusts the frequency against multiple inputs, such as the governor result, PM QoS, and OPP, and sets the final frequency in the devfreq driver. There are also some sysfs interfaces as tuning points for the governor and the timer method. If an external user wants to change the minimum and maximum frequency, the user can use a PM QoS request and OPP. I will explain this in more detail later.
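The monitoring flow just described (timer expires, the governor reads the device status and proposes a frequency, the core applies the min/max limits and the enabled OPPs) can be sketched in a few lines of Python. This is a minimal userspace model, not kernel code. The names `DevStatus`, `ondemand_target`, and `core_adjust` are hypothetical, the threshold defaults of 90 and 5 are assumptions chosen for illustration, and the load rule is only modeled on the upthreshold and downdifferential behavior mentioned in the talk.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DevStatus:
    """What a driver's status callback would report (hypothetical model)."""
    busy_time: int          # time the device was busy in the last period
    total_time: int         # length of the last monitoring period
    current_frequency: int  # Hz


def ondemand_target(stat: DevStatus, max_freq: int,
                    upthreshold: int = 90, downdifferential: int = 5) -> int:
    """Governor step: derive a raw target frequency from the device load."""
    if stat.total_time == 0 or stat.current_frequency == 0:
        return max_freq  # no usable sample yet, so stay on the safe side
    if stat.busy_time * 100 > stat.total_time * upthreshold:
        return max_freq  # load above upthreshold: jump to the maximum
    if stat.busy_time * 100 > stat.total_time * (upthreshold - downdifferential):
        return stat.current_frequency  # inside the hysteresis band: keep
    # Otherwise scale so the projected load lands inside the band.
    scaled = stat.busy_time * stat.current_frequency // stat.total_time
    return scaled * 100 // (upthreshold - downdifferential // 2)


def core_adjust(raw: int, opp_freqs: Sequence[int],
                qos_min: int, qos_max: int) -> int:
    """Core step: apply the min/max limits, then pick an enabled OPP."""
    allowed = sorted(f for f in opp_freqs if qos_min <= f <= qos_max)
    if not allowed:
        raise ValueError("no enabled OPP inside the min/max limits")
    clamped = max(qos_min, min(raw, qos_max))
    # Lowest allowed frequency that still covers the request, else the highest.
    return next((f for f in allowed if f >= clamped), allowed[-1])


if __name__ == "__main__":
    opps = [100_000_000, 200_000_000, 400_000_000, 800_000_000]
    stat = DevStatus(busy_time=40, total_time=100, current_frequency=400_000_000)
    raw = ondemand_target(stat, max_freq=opps[-1])
    print(raw, core_adjust(raw, opps, qos_min=0, qos_max=opps[-1]))
```

Lowering `upthreshold` in this model makes the jump to the maximum frequency happen at a lower load, which is the "faster response" trade-off; a disabled OPP is modeled simply by leaving it out of `opp_freqs`.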
As I said, you can add your own device governor. This is a very key point. In this diagram of the simple_ondemand governor, the simple_ondemand handler handles the governor events for the life cycle, and the simple_ondemand function has the key algorithm of this governor. It decides the proper frequency using the current device status. Lastly, each governor can choose its sysfs interfaces. Of course, the common sysfs interfaces are provided for all governors. As you know, the performance, powersave, and userspace governors are not special; they are general governors.

This page shows the passive governor. The passive governor depends on the behavior of a parent device, such as another devfreq device or the CPU. If a devfreq device uses the passive governor, it cannot decide the next frequency by itself; it always requires a parent device instance. Let me show the sequence between the passive device and the parent device. First, device A is the parent device and changes its frequency. Then it sends a notification to the passive device. Lastly, the passive device receives the notification and decides its next frequency using the parent device's result. In this way, the passive device depends on the behavior of the parent device.

I added an example of a passive governor use case on the Samsung Exynos5422 SoC, which is used in the Odroid-XU3 board. It has various AMBA AXI buses, which are data buses that transfer data between memory and devices. It has 15 AXI data buses, but they share one power line, even though each bus works with a separate clock. Because of this hardware constraint, it requires the passive governor. As a result, only one bus device controls the power, and the remaining buses control their frequency according to the decision of the parent device. In a case like this, where the passive device depends on a parent device, we can use the passive governor.

This chapter explains the sysfs interface. This page shows the common sysfs interface for all devices.
The governor node indicates the current governor name, and the available governors node shows the list of available governors. But if the governor is immutable, it shows only the current governor name, because an immutable governor cannot be changed. The available frequencies node shows the list of available frequencies, and the current and minimum frequency nodes are very useful for checking frequency changes and the current frequency. The user can also change the minimum and maximum frequency according to their requirement. Lastly, the trans_stat node shows the frequency transition statistics and the time in each state. It is very useful for simple profiling of a device.

This page shows the non-common sysfs interface. Each governor can choose these sysfs nodes. The timer node selects one of two timer types, deferrable and delayed. The polling_interval node changes the timer interval. If you want to check the current status more frequently, you just lower the interval value. The upthreshold and downdifferential values are related to the speed of scaling the frequency up and down. If you want to raise the frequency faster for a quicker response, just lower the upthreshold value. Like this, you can tune these values for your circumstances and environment.

This table shows a summary of all governors with their attributes and features. When you add your device governor, you can choose the attributes and features, such as immutable.

This chapter explains the collaboration with other frameworks, like OPP, PM QoS, interconnect, and thermal. Why do we need the other frameworks? There are simply two reasons: first, boosting for high performance, and second, limiting the frequency to prevent high temperature or reduce wasted power.
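As a small illustration of the common sysfs nodes listed earlier, the following Python sketch reads the governor and the frequency nodes of each devfreq device. It assumes the nodes live under `/sys/class/devfreq/<device>/` with the file names `governor`, `cur_freq`, `min_freq`, `max_freq`, and `available_frequencies`; the helper names `list_devices`, `read_node`, and `snapshot` are hypothetical, and a given kernel or device may not expose every node.

```python
from pathlib import Path
from typing import Dict, List, Union

# Assumed location of the devfreq class directory in sysfs.
DEVFREQ_ROOT = Path("/sys/class/devfreq")
PathLike = Union[str, Path]


def list_devices(root: PathLike = DEVFREQ_ROOT) -> List[str]:
    """Names of all devfreq devices, or an empty list if the class is absent."""
    root = Path(root)
    return sorted(p.name for p in root.iterdir()) if root.is_dir() else []


def read_node(device: str, node: str, root: PathLike = DEVFREQ_ROOT) -> str:
    """Read one sysfs node of a device as stripped text."""
    return (Path(root) / device / node).read_text().strip()


def snapshot(device: str, root: PathLike = DEVFREQ_ROOT) -> Dict[str, object]:
    """Collect the governor name and the frequency nodes (in Hz) of a device."""
    snap: Dict[str, object] = {"governor": read_node(device, "governor", root)}
    for node in ("cur_freq", "min_freq", "max_freq"):
        snap[node] = int(read_node(device, node, root))
    freqs = read_node(device, "available_frequencies", root).split()
    snap["available_frequencies"] = [int(f) for f in freqs]
    return snap


if __name__ == "__main__":
    for name in list_devices():
        try:
            print(name, snapshot(name))
        except (OSError, ValueError) as err:
            # A node may be missing or unreadable on some devices.
            print(name, "skipped:", err)
```

Calling `snapshot` in a loop with a short sleep gives the kind of periodic frequency print that is useful for early-stage checks; passing a different `root` lets the helpers run against a copied or fake directory tree.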
Through PM QoS, you can pass your minimum and maximum frequency requirements to devfreq, and through OPP, you can disable and enable specific frequencies. Like this, PM QoS limits the minimum and maximum frequency, OPP disables some specific frequencies, and then devfreq can use only the remaining available frequencies. So at the final stage, devfreq adjusts the final frequency and sets it in the devfreq driver. This mechanism gives external users easy control of the hardware resources, so it is very easy to guarantee their requirements.

OPP is mandatory for the devfreq framework. It provides functions to get the clock and regulator information from the device tree, and helper functions to control them. It is simpler and easier to use. These functions also have good exception handling, so I recommend using the OPP functions to control clock and voltage. This page also shows a simple example of how to enable and disable an OPP entry. You just call the OPP disable function with the specific frequency, and later you can call the OPP enable function with the specific frequency, and then that frequency is available again in the devfreq core. You can use OPP according to your requirements.

The PM QoS (power management quality of service) framework provides an interface to control hardware resources. As I mentioned, users can apply their minimum and maximum requirements. PM QoS also has other types of resource control, like latency, but devfreq uses only the minimum and maximum frequency control. Here is a simple example of how to use PM QoS. First, if you want to update a PM QoS request, you add the request, check whether the request is active, and then update the request with your specific frequency. On the devfreq side, the devfreq core receives a notification from PM QoS about the update, reads the new value, and then
sets the frequency. With these steps, devfreq can satisfy the user requirement. When you want to release the PM QoS request, just update it with zero and then remove the request. It is very easy, but it is very strong and powerful. I recommend it to guarantee and keep your performance or power requirements.

The interconnect framework controls interconnect nodes, like the memory data bus and memory controller in the SoC. It is very similar to devfreq, but interconnect does not have any governor mechanism like devfreq does. Instead, interconnect makes a path between interconnect nodes, so a consumer can set a requirement on the path and the interconnect nodes. It also affects performance and power management, so in some cases it is used together with a devfreq driver. I added two cases, the NXP i.MX and Samsung Exynos memory bus examples. The i.MX devfreq driver registers a platform device for the interconnect platform driver. Then the interconnect driver is probed and creates the interconnect nodes and the paths between the nodes. Devfreq and interconnect actually work separately, but in this case, when a user requests a frequency change on an interconnect node or path, at the final stage interconnect requests the frequency change from the devfreq driver through the PM QoS interface. As I mentioned, there are two examples, but they are not merged yet; they are still under review.

Thermal support to prevent high or dangerous temperatures is very important on high-performance embedded devices like smartphones. As you know, the GPU runs at a high frequency, which means it can make the target hot if you use it for a long time. In this case, the thermal framework is necessary for the GPU. The thermal framework provides a mechanism for this: we can register a devfreq device as a cooling device. Simply put, when the device reaches a fixed temperature, it just drops the frequency. It is
simple, but it is not smart. Recently, the thermal framework has provided a smarter governor. Almost all high-performance targets use Intelligent Power Allocation (IPA). It is based on the energy model framework, and it collaborates with the devfreq framework. IPA considers the estimated power of the device and the allocated power budget. As a result, it sustains stable performance without a sudden performance drop like the one from the simple thermal governor I mentioned. Also, when changing the frequency, thermal uses the PM QoS interface. We already register the devfreq device as a cooling device. It is a very important case for sustained performance without high temperature. This diagram just shows the relationship between IPA, the energy model, and devfreq; you can check it.

Recently, there are a lot of rich profiling tools, but they actually need environment setup and extra effort. So in my case, I prefer simple profiling using only kernel features before using any rich profiling tool. If I catch a problem with simple profiling, then I try a rich profiling tool. In this chapter, I will explain simple profiling and how to optimize the behavior of a devfreq driver.

First, check all supported devices on the target through debugfs. The debugfs summary shows all supported devfreq devices. Next, sysfs allows very simple profiling. You can just make a simple script that prints the current, minimum, and maximum frequency periodically, and then run your workload. You can simply check the frequency variation and history, and it is also possible to check how your modifications and tuning affect performance. I usually use this method. I know it is not exact, but in the early stage it is very useful for catching what the problem is and why the frequency is not changing. If you want to check in more detail, for example the frequency during operations such as vblank and storage
access, you had better use the tracepoints. Lastly, in some cases we need to check the power status of the device. If the device is suspended, monitoring is not working.

So you can check the information of all devfreq devices through the debugfs node. It is useful when you need to check them all at once. This example is the Odroid-XU3 with a Samsung SoC. This target includes 17 non-CPU devices: one GPU, one memory controller, and 15 memory data buses. One memory bus is the parent device, and the remaining memory buses use the passive governor. You can check all of this information through this node. It is very useful.

In the case of sysfs, there are the minimum, maximum, and current frequency nodes. As I said, to check the current device status, you just make a simple script that prints the frequency periodically. As I said, I know it is not exact, but it is very simple profiling for the early stage.

The trans_stat node shows the transition statistics and the time in state for each frequency. In my case, this node is very useful when I want to know how much my tuning and code changes affect performance. Here is how to use it for profiling. With the original code, reset trans_stat, read the time in state, and execute your tool. After the program finishes, read the time in state again, and you can calculate the difference in time in state between before and after. Then, after tuning and optimizing your code, do the same thing again. After finishing all measurements, you can compare the two results to check how much your changes affect performance. We cannot get exact trace information this way, but we can catch how much a code change affects performance, and it is very useful.

After this, if you want more detailed profiling, you can use tracepoints. To profile a device in more detail, there are tracepoints in the device
frequency (devfreq), thermal, and PM QoS frameworks. You can check the detailed frequency changes and the monitoring points depending on the timer type. Also, when the requested frequency is high enough but there is still a performance drop, you might suspect thermal throttling or too many operations from other requests. You can check for thermal throttling with the thermal tracepoints. Lastly, you can catch who changed the frequency and the exact time of the change.

This page explains the tracepoints in more detail. The first one shows when device monitoring is executed by the timer. It is useful for checking how often it happens. The second, the devfreq frequency tracepoint, shows the frequency change points. It is useful for determining whether the frequency changed at the proper time and for checking the history of frequency changes. In the thermal case, the thermal tracepoint shows the temperature, and if throttling happens, the thermal zone trip tracepoint is printed. Then, as I said, we can see the QoS request points and the requested values with the device PM QoS update request tracepoint.

In my case, I usually use these tracepoints for profiling devfreq, to see how much a change affects performance. You just enable the tracepoints of devfreq, PM QoS, and thermal, and then enable the tracepoints of performance-sensitive devices, like DRM vblank and storage. For example, if 60 FPS is required on the display, each vblank interrupt must happen within about 16 ms. You can check the vblank latency together with the device frequency, so you might catch a devfreq problem. First, check the interval between two vblanks. If the interval is over 16 ms at 60 FPS, check the frequency of the non-CPU device before the vblank interrupt. If the frequency is low, increase the frequency through PM QoS and then check the vblank interval again. Also,
you can check whether the QoS request is made at the proper time. After checking the QoS timing, you might try the QoS request earlier or later than before. It is also possible to check the frequency changes against the thermal temperature and throttling. This example is a very simple case, but I just wanted to share how to use the tracepoints for your circumstances.

For the previous example, I actually captured the result on the Odroid-XU3. As I said, there is a GPU, but we cannot see any GPU monitoring in the trace, so we need to check the power status of the GPU. In this case, I used the runtime PM status and the generic power domain summary on the Odroid-XU3. As shown here, the GPU is suspended, so GPU monitoring is not working, even though it uses simple_ondemand with the timer-based method. For this, the devfreq framework provides helper functions to control the governor status of a device. By using them, you can stop the monitoring while a device like the GPU is in the suspended state, and restart your governor in the active state. It is very helpful for power management.

Also, there are two types of timer. The deferrable timer does not expire while the CPU is idle, and the delayed timer does not care about the CPU status. The polling_interval changes the timer interval. If it is short, the response is very fast. If it is long, it saves power by preventing frequent CPU wakeups. The upthreshold and downdifferential affect the frequency scaling speed, so you can change these values for further tuning.

Let me explain the difference between the deferrable and delayed timers. First, the deferrable timer does not expire in the CPU idle state. The postponed timer fires at the next CPU wakeup, so it does not wake up the CPU by itself, and so it saves power. But it has a slow response if the CPU is in the idle state. Second, the delayed timer does not care about the CPU idle state. It runs at the fixed timer interval and wakes the CPU up from the idle state, but it provides a faster response than the deferrable timer. Actually,
on a flagship model like a smartphone, I recommend the delayed timer with a short period, because the wakeup cost is not critical. But on the IoT side, I recommend the deferrable timer, because even a small wakeup cost is critical there: the device has a very low-capacity battery. You can choose the timer type according to your environment.

As an example, here is a bad case for the deferrable timer with DMA operations. The simple_ondemand governor with the deferrable timer changes the frequency according to the amount of DMA operations. But when the DMA is busy and the CPU is idle, the deferrable timer does not expire, so the timer cannot check the DMA operations. As a result, a performance drop happens, because the timer has not checked the current status. This is just an example to explain the deferrable timer. So, as I said, you can choose the timer type according to your environment, like the IoT side or a flagship model like a smartphone.

This table shows how the device behavior is affected when you change the values through sysfs, so you can refer to it when you try tuning. The table contains some guidance on how to tune the values for each scenario, but it is not always true; it is just my recommendation. Like this, you can make a set of values according to your environment. If you use these variables for tuning, I think it is very useful for keeping the balance between performance and power of non-CPU devices.

This chapter explains the devfreq drivers in the mainline kernel. There are five categories of device drivers, such as GPU, dynamic memory controller, and storage (Universal Flash Storage), although one is still under review. These are all the devfreq drivers and the five types of non-CPU devices in the mainline kernel. You can refer to these drivers, and you can also add your own driver for a non-CPU device. But for almost all of them, except for the Tegra ACTMON device, the governor is too old. So I think we need to improve the subsystem and propose a new governor. If you have any idea, please send patches to the mainline kernel.

Lastly, I
will just quickly mention the weaknesses and the to-do list. As I said, devfreq has governors that are too old. To optimize performance and power, we already provide some tuning points, like upthreshold and downdifferential, but I think it needs a new governor for faster response, like the schedutil governor in the cpufreq subsystem. Also, it simply checks the device status at that moment without any history, so it cannot predict the future device status, and some performance drops happen. So I think the devfreq framework needs a load tracking methodology, like the per-entity load tracking of the CPU scheduler or the governors of the cpuidle framework.

This page shows the further to-do list. First, support the required OPP property for the passive governor. Also, until now the passive governor has only supported depending on another devfreq driver, but I will extend the passive governor to depend on cpufreq. In the mainline kernel, there are some requests for this, but it is not completed yet. Also, for a more immediate response, I want to support a kthread-based timer. Until now, devfreq only supports the deferrable or delayed workqueue, so I will support a kthread for each device. Lastly, I think we need kernel tests for the framework.

Thank you. I have finished my presentation. Thanks for coming to my session. If you have any questions, please let me know through email or on the mailing list. Thank you.