OK, so let's start with the introduction. My name is Piotr Ziemcik and I work at Nordic Semiconductor. I am a code owner at Nordic, which means that every patch from Nordic [unclear] has to be approved by me.

Today we are going to talk about what is going on in Zephyr. We will start with the story of what timeouts in the Zephyr kernel are and where you can find them. Then I will give you one real-life example of why these timeouts have to be precise. After that we will get to the part of the presentation where I show you how timeouts in Zephyr behave. And then I will show some proposals and future work on the problems that our benchmark revealed.

Let's start with the story. If you have ever tried to use Zephyr in an application, you will have noticed that there are several API calls which take some kind of time as an argument. You can sleep for several milliseconds. You can schedule a timer with some duration or period. You can put a timeout on an operation. All these API calls schedule a timeout inside the Zephyr kernel. The kernel itself tracks all the individual timeouts and programs the system timer to raise an interrupt when a given timeout is expiring. That interrupt allows you to do something regardless of the state of the system. The system may be idle, or it may be executing something, but when the timeout expires, something happens. A thread may be scheduled, [unclear], or something else.

In general, a timeout should expire exactly at the time it was scheduled for, but this is not possible in practice. Our CPUs have finite speed, and handling the interrupt takes some time [unclear], so we cannot be exactly on time. What we can do is be as fast as possible. That means that when the interrupt fires, the handler does only the minimal work needed for whatever is associated with the timeout.
What we cannot do is expire prematurely. A timeout must not expire before the time it was scheduled for, because then it is not a real timeout.

So the question is, how do you measure the quality of timeouts? We can run some benchmarks. We can try to hit some real target. We get some results, represented by the dots on the picture, and we can define different properties of the system.

The first property is accuracy. Accuracy says how close you are to the real target. It is defined by the hardware, or to be more specific, by the accuracy of the crystal or oscillator [unclear].

The next one is resolution. Resolution is limited by the hardware. If we have a 1 MHz system timer, the resolution is limited by that 1 MHz, because you cannot distinguish events that fall within a single cycle, which is 1 microsecond. The operating system has only one job here. It should not degrade what the hardware offers. It should use everything the hardware provides and not waste it.

The last property is precision. Precision tells you how close the results are to each other when you repeat the same thing. Hardware is always precise. If you program the hardware for 4 cycles, it will always be 4 cycles. The operating system should be precise too. [unclear] this takes a lot of work, and this is the area we are focusing on.

And why does precision really matter? Because I am from Nordic, the example here will be Bluetooth. As you may know, there is a protocol called Bluetooth Low Energy. It is low energy because it tries to minimize radio usage. That means the radio is used only when there is information to exchange. The central [unclear] is the transmitter. Periodically it asks whether [unclear] has something to send. When the radio is in use, our chip draws about 20 milliamps. With the radio off, it is about 4 microamps. So the idea [unclear] is simple.
However, we have to deal with some limitations. The first limitation is that the radio [unclear] for 4 seconds. The crystal used in the transmitter has some accuracy, which means that the timing kept by the transmitter has some accuracy too, so we have to [unclear] when we turn on the radio. The same happens on the receiving side. We are not synchronized [unclear], our oscillator has its own accuracy, and we have to [unclear] the radio [unclear]. So in this example we can see the correlation between timeout handling in the system and a high-level quality benchmark of the system, like power usage.

So if this is so important for us, let's go to some benchmarks and check how Zephyr behaves. We created a very simple benchmark which runs 100 tests of the k_timer API, each preceded by a short time spent in k_busy_wait. In the case of our SoC, k_busy_wait does not use the timeout infrastructure in Zephyr, and we know it is precise because it just spins waiting on the hardware, so all of the waits are close to each other and as accurate as possible. k_timer uses the Zephyr timeout subsystem, and this is the part we are going to benchmark. We also benchmarked the k_usleep API, as it allows us to use shorter periods of time because it accepts its argument in microseconds.

So if we set up the benchmark like that, this is our ideal result. Basically you see that the time spent in k_busy_wait plus the time spent in k_timer exactly matches our target. This is the ideal result, and this will not happen. What will happen is that...
What can happen in a real system is that we will be slightly beyond the target, because there is some execution between these two tests and there is some overhead.

But let's look at the results. Here is the result of this benchmark taken about half a year ago. As you see, most of the timeouts ended prematurely, before our target. The difference is very small, but we are basically violating one of the principles of a timeout API in an operating system. You have to expire on time or later, and if you expire later, all of the delay has to be explained somehow.

The second thing you can see is that from time to time, from one test to another, we have a spike. We are expiring way later than we intended to. This is also bad, because the delay is not justified by anything. The system had plenty of time to schedule the timeout and to handle the timeout, but from time to time it took way longer than expected.

The worst situation is in k_usleep. We tried to sleep for one microsecond. The measured time was more than 10 milliseconds. This was because of the 10 millisecond tick at the time, the default tick. However, you still see that it is not bounded by the tick, which is 10 milliseconds. It is bounded by a much higher value, which is more than 12 milliseconds.

There is good news. All of this is now fixed. During that investigation we identified some problems, and these problems are for the most part fixed, which means that Zephyr now allows us to set a very high tick rate, which improves the resolution from 10 milliseconds to 30.5 microseconds in the case of our chip. The next problem we identified is that by adding or removing timeouts, for example by scheduling a k_timer in one thread, you could delay an existing timeout indefinitely. This problem is also already solved.
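To make the resolution numbers above concrete, here is a minimal sketch in Python (not Zephyr code) of how a requested sleep is quantized to whole ticks when a timeout is never allowed to expire early. The function names are illustrative only, and the 32,768 ticks per second rate is an assumption inferred from the 30.5 microsecond figure. The sketch does not model the extra delay beyond one tick that the benchmark showed.

```python
from fractions import Fraction

def tick_period_us(ticks_per_sec):
    """Exact length of one kernel tick in microseconds."""
    return Fraction(1_000_000, ticks_per_sec)

def shortest_safe_sleep_us(request_us, ticks_per_sec):
    """Round the request up to whole ticks so the sleep never ends early."""
    period = tick_period_us(ticks_per_sec)
    ticks = -(-Fraction(request_us) // period)  # ceiling division
    return ticks * period

# Compare the old default tick rate with the assumed 32,768 ticks per second.
for rate in (100, 32768):
    print(rate,
          float(tick_period_us(rate)),
          float(shortest_safe_sleep_us(1, rate)))
```

With 100 ticks per second a 1 microsecond request cannot be shorter than one 10 millisecond tick, while at the assumed 32,768 ticks per second the same request rounds up to roughly 30.5 microseconds.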
Also, we found some hardware-specific bugs, where the software did not take into consideration some hardware capabilities and limitations. One of those limitations is that on various hardware you cannot schedule a timeout for the next cycle of the system timer. We found out that there is a minimal timeout you can schedule. This is now handled.

But this was the very easy part. There are some dragons in there, and in this presentation I will show you only one dragon, but with two heads. One head is called units, and the second head is rounding.

Let's consider a system you are building. Because you would like to measure time with millisecond precision, you decide that your system will use 1,000 ticks per second, which means that one tick in the Zephyr kernel equals one millisecond. However, you would like to use Nordic hardware, and for the system timer the Nordic hardware uses the lowest-power oscillator we have in the system, which uses a crystal like the one in a typical clock (32,768 Hz). Because the hardware does not use floating point, every delay has to be programmed as an integer number of hardware cycles, which means that in our case this will be 32 cycles per tick. This also means that our tick is about 0.98 milliseconds (32 cycles at 32,768 Hz). However, the system still thinks that it is one millisecond.

What is the result? All timeouts in your system will expire prematurely. If the timeout is short, this will be hidden by the software overhead. But as the timeout becomes longer and longer, the difference grows too, and we started to see that.

How do we fix that problem? First, we could eliminate ticks and use hardware cycles directly inside the kernel. But this is not possible in practice. Zephyr also supports SoCs whose system timer counts at hundreds of megahertz.
And for example, a 32-bit integer on a 200 megahertz machine spans 21 seconds, which means that if you would like to use cycles to measure one minute, this is not possible with the 32-bit representation which we currently have.

The second solution is to change the definition of the tick. At the moment we specify how many ticks we would like to have in one second. If we instead specify how many hardware cycles we would like to have in one tick, then the problem disappears. So far, because this change has not been made in Zephyr (it is being discussed), we have a workaround. We can carefully choose the CONFIG_SYS_CLOCK_TICKS_PER_SEC value to ensure that one tick is an integer number of hardware cycles.

But this will not solve all the problems. Here is another example. Let's build a system with 100 ticks per second. Every tick, 10 milliseconds of time is consumed, and all our counters in the kernel just increase by 10 every tick. Everything is okay. However, we have hardware with a fancy number of hardware cycles per second, so we would like to mitigate that. So we set the ticks per second to 128. Now the time represented inside the kernel is no longer an integer number of milliseconds. The millisecond is the unit you use to communicate with the kernel. You cannot use another unit right now. This means that the same amount of time measured using the kernel API will give you different results depending on when you start the measurement. You can see that on the picture. For example, tick one takes 7 milliseconds according to the kernel, and tick two takes 8. As a result, you basically cannot do any kind of time measurement using this kernel API, because you never know whether you have the shorter or the longer tick.

How do we fix the problem? We can make a drastic change. We can use ticks instead of any artificial units in the kernel API and leave all the time conversion to the application. This is a drastic change.
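The arithmetic behind the last few examples can be reproduced with plain integer math. The following is a rough Python sketch, not kernel code: the helper names are made up for illustration, and the 32,768 Hz system timer frequency is an assumption consistent with the 32 cycles per tick mentioned earlier.

```python
HW_HZ = 32768  # assumed system timer frequency (typical clock crystal)

def early_expiry_ms(timeout_ms, ticks_per_sec=1000, hw_hz=HW_HZ):
    """How many milliseconds too early a timeout fires when the number of
    hardware cycles per tick is truncated to an integer."""
    cycles_per_tick = hw_hz // ticks_per_sec          # 32768 // 1000 == 32
    ticks = timeout_ms * ticks_per_sec // 1000        # kernel assumes 1 tick == 1 ms
    real_ms = ticks * cycles_per_tick * 1000 / hw_hz  # time that really elapsed
    return timeout_ms - real_ms

def counter_span_s(bits, hz):
    """Seconds until a free-running cycle counter of this width wraps."""
    return 2 ** bits / hz

def uptime_ms(tick, ticks_per_sec):
    """Uptime in whole milliseconds after a number of ticks (truncating)."""
    return tick * 1000 // ticks_per_sec

def ms_steps(n_ticks, ticks_per_sec):
    """Millisecond increments an application sees from one tick to the next."""
    return [uptime_ms(t + 1, ticks_per_sec) - uptime_ms(t, ticks_per_sec)
            for t in range(n_ticks)]

print(early_expiry_ms(1000))               # 23.4375 ms early per nominal second
print(counter_span_s(32, 200_000_000))     # about 21.47 s
print(ms_steps(8, 128))                    # [7, 8, 8, 8, 8, 7, 8, 8]
print(ms_steps(4, 100))                    # [10, 10, 10, 10]
```

Under these assumptions a nominal one second timeout fires more than 23 milliseconds early, and at 128 ticks per second the same tick is reported as either 7 or 8 milliseconds depending on where the measurement starts.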
But if we don't do that, the only way to have a precise time representation in the application is to use cycles directly.

So these are just a few of the problems we have in Zephyr regarding timeouts, and these problems are currently being addressed on GitHub. What we are trying to do is to change the whole idea of dealing with time inside Zephyr, both inside the kernel and on the boundary between the kernel and the application.

The first part of that is allowing the application to choose the rounding for unit conversion. For that, we together with Intel are proposing a new API which allows you to convert between time units like ticks, microseconds, milliseconds, nanoseconds and so on, and to choose the rounding mode. Are we going to floor? Are we going to ceil? Are we going to find the nearest value?

The next part, which we have to do, is to change the internal time representation in the Zephyr kernel. At the moment, every time representation is just an integer, and the signedness and width of this integer depend on the person who wrote the code. This has some interesting side effects, because some parts use signed integers, some other parts treat them as unsigned, and so on. So what we are proposing is to create a new type intended to represent time and to use it in all the places in the kernel where time is involved. At the moment there are discussions about whether this type should be signed or not, because some people would like to represent past events and some people would like to have longer timeouts in the future.

With the change of the internal representation, we are also proposing a change of the external representation, the one you use in the kernel API. What we identified is that we would like to specify the timeout in a way that lets us use different units. For you as the user, SI units like milliseconds, seconds and so on are more convenient if you are just specifying a timeout.
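As a rough illustration of the rounding choice described above, a unit conversion with an explicit rounding mode might look like the Python sketch below. This is not the API proposed in the pull request. The function name, its parameters and the string mode names are all hypothetical, and the 32,768 Hz target frequency is an assumption.

```python
def convert(value, from_hz, to_hz, rounding):
    """Convert a count of 1/from_hz units into 1/to_hz units.

    `rounding` selects what happens to the fractional part:
    "floor" truncates, "ceil" rounds up, "nearest" rounds half up.
    """
    scaled = value * to_hz
    if rounding == "floor":
        return scaled // from_hz
    if rounding == "ceil":
        return -(-scaled // from_hz)
    if rounding == "nearest":
        return (scaled + from_hz // 2) // from_hz
    raise ValueError(f"unknown rounding mode: {rounding}")

# 10 ms expressed in cycles of an assumed 32,768 Hz timer (exactly 327.68).
print(convert(10, 1000, 32768, "floor"))    # 327
print(convert(10, 1000, 32768, "ceil"))     # 328
print(convert(10, 1000, 32768, "nearest"))  # 328
```

One plausible reading of the talk is that rounding up suits timeouts, which must never expire early, while flooring or rounding to nearest suits measurements. That mapping is an interpretation, not something the proposal is known to mandate.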
However, if you would like to measure time, ticks are better, because then you control the conversion. This means that we have to specify both the value and the unit.

The second part is the reference point. At the moment, all timeouts are scheduled from now. The problem with now is that it might not be the now you are thinking about, because you can get an interrupt which takes some time and moves your now. So what we are proposing is that we would like to specify the reference point. It can still be now, but it can also be an absolute time. With an absolute time, you can easily schedule one event relative to the time when another event happened. This helps us to solve problems like the one in Bluetooth, because we are referencing an event which happened in the past, not the time when we are executing the code.

The last thing we would like to include in that structure is the clock source. At the moment, Zephyr uses only one clock source, which is the system timer. For us, this might be problematic. We are targeting low-power devices and we would like to use the hardware best suited for the task. With the current clock source we use as little power as possible, but the resolution of the timer is not the best. We have other timers in the system. However, they require a higher-frequency clock, and a higher-frequency clock means more power. So we would like to include the type of the clock and let the application developer choose the trade-off between power, precision and other properties like, for example, accuracy.

We would like to make that change in a way that existing applications require minimal work to port, which means that the current macros will allow a smooth transition. But this is not enough if you would like to have an operating system which is both easy to use and allows us to do things like a Bluetooth stack.
So when the precision of the timers really matters, we need you. As you have seen, the work involved requires changing very basic principles of the operating system. We can handle that. But it also involves changes to the APIs and changes to the time representations. If you look at the code, drivers for your SoCs are using the kernel API with some timeouts. Your applications, network stacks, frameworks and so on are using those APIs. This means that the change we are trying to make will affect all of you. And this is why we would like to invite you to participate in the discussions about the direction of these changes.

I showed you some problems. I showed you some very specific cases we are currently discussing. But there are many more. In the presented issues on GitHub you can see others, showing similar problems in different areas of timeout handling. If you would like to talk about a better API for time conversion, there is already a pull request with some discussion, for example about signed versus unsigned values, about 64 bits and so on. And there is another pull request which shows how we would like to change the representation of the timeout in the kernel interface.

So these are the things which are currently being discussed in Zephyr. We would like to have them included in some form in Zephyr 2.1, but without your help this will not be possible, whether that help is minimal, like discussing the changes, or more demanding, like porting the various subsystems to the new API. With this message I end this presentation and would like to say thank you. Do you have any questions?

So thank you for your presentation. The new timeout specification type that you are proposing is a little more complex than a simple integer. How much overhead do you see that adding to the timeout API calls in general?
The structure we are proposing has to fit in 64 bits, because syscalls cannot pass a structure by value at the moment, but we can pass 64-bit values. This means that everything I mentioned in the presentation has to fit in 64 bits. We have already benchmarked both the size and the speed difference between the current 32-bit scheme and the proposed one. In a typical application we see less than a 1% increase in size and an almost invisible change in performance, unless your hardware has trouble with 64 bits and 64-bit operations are expensive.

Any other questions? If not, thank you very much.
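The constraint mentioned in the answer, that the value, unit, reference point and clock source must all fit in one 64-bit word, can be sketched as a bit-packing exercise. The layout below is purely hypothetical: the field widths, the field order and the enumerated names are assumptions made for illustration and are not taken from the actual proposal.

```python
# Hypothetical field widths; together with the value they fill 64 bits.
UNIT_BITS, REF_BITS, CLOCK_BITS = 3, 1, 2
VALUE_BITS = 64 - UNIT_BITS - REF_BITS - CLOCK_BITS  # 58 bits for the value

UNITS = ("ticks", "cycles", "us", "ms", "s")   # illustrative unit names
REFS = ("relative", "absolute")                # reference point: now or absolute
CLOCKS = ("system", "high_res")                # illustrative clock sources

def pack_timeout(value, unit, ref="relative", clock="system"):
    """Pack a timeout description into a single 64-bit integer."""
    if not 0 <= value < (1 << VALUE_BITS):
        raise ValueError("value does not fit in the value field")
    word = value
    word |= UNITS.index(unit) << VALUE_BITS
    word |= REFS.index(ref) << (VALUE_BITS + UNIT_BITS)
    word |= CLOCKS.index(clock) << (VALUE_BITS + UNIT_BITS + REF_BITS)
    return word

def unpack_timeout(word):
    """Recover (value, unit, ref, clock) from a packed 64-bit integer."""
    value = word & ((1 << VALUE_BITS) - 1)
    unit = UNITS[(word >> VALUE_BITS) & ((1 << UNIT_BITS) - 1)]
    ref = REFS[(word >> (VALUE_BITS + UNIT_BITS)) & ((1 << REF_BITS) - 1)]
    clock = CLOCKS[(word >> (VALUE_BITS + UNIT_BITS + REF_BITS))
                   & ((1 << CLOCK_BITS) - 1)]
    return value, unit, ref, clock

packed = pack_timeout(250, "ms", ref="absolute", clock="high_res")
print(hex(packed), unpack_timeout(packed))
```

Spending a few bits on metadata still leaves a far wider value range than the current 32-bit representation, which is one way to read the small size and speed overhead reported in the answer.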