Hello, thank you for coming to my talk. My name is Masayoshi Ishikawa from Sony. My talk is about an RTOS called NuttX, so not Linux. By the way, did you see our demonstration at the technical showcase? OK, thank you.

We started this project almost three years ago. Before this project, we released a Linux-based audio player, a Walkman, in 2007, so just 10 years ago. It had an ARM9, not a Cortex-A9, of course. Then we released an Android-based audio player in 2011, which had a dual Cortex-A9. When we started this project, we investigated existing operating systems, including uClinux. But unfortunately, the processor, which I will explain later, cannot have DRAM; it only has internal SRAM. And some products don't have SPI flash due to size and cost issues, which means that all programs have to run from internal SRAM. So we had to use a real-time OS for our project.

Here is the agenda of today's talk. First, I will introduce our products and discuss typical software development and why we chose NuttX. Then I will explain how we ported NuttX to our microcontroller and how we implemented feature enhancements such as power management and fast ELF loading. Then I will explain C++11, debugging, and testing, including ADB. Finally, I will show you some short demo videos, which are almost the same as last night's technical showcase.

These are our first three products: two IC recorders and one audio player, which we released in late 2015 and early 2016. These IC recorders support microSDHC and microSDXC, which has a larger capacity than SDHC, for example 64 gigabytes. They also support audio signal processing, such as digital pitch control. The second one, an ICD-SX model, supports high-resolution audio recording up to 96 kHz/24-bit, as well as high-resolution playback up to 192 kHz/24-bit. This model has Bluetooth, which is used for wireless control with the REC Remote app on a smartphone.
The last one is a Walkman from the WS series. This device is very small, but it supports ambient sound mode, and it can play back music for up to 12 hours. We also released one more IC recorder last year, and we will soon release one more Walkman, which has a Bluetooth headset feature, like this WS model.

This table shows the internal hardware components. As you can see, there are many differences between the products. For example, the audio part is different, audio recording is different, and the display is also different. The SX series has SPI flash, NFC, and Bluetooth. But there are two important things in this chart. The first one is the release schedule. It was very tough for us to release all of this software in just four months, so we had to consider productivity as well as quality in software development. The other one is display support without SPI flash, which consumes more internal SRAM than the other products. As I mentioned earlier, the microcontroller does not support SDRAM, so we had to consider dynamic application loading and code size reduction.

This table shows typical software development. Here I divided the product models into three categories. Perhaps most of you know the Android or Linux-based model. It has rich hardware, such as the Cortex-A series, and the memory capacity is much larger. Usually we can use open-source development tools such as GCC, GDB, and ADB, and Ubuntu may be used for actual software development. However, a typical software environment for an RTOS-based model is slightly different. Of course, the CPU is not so fast; a Cortex-M series core runs at around 100 or 200 megahertz. The memory size is not so large. And typically, the toolchain and some tools are provided by the MCU vendor. However, we have a lot of experience with Linux and Android-based software development.
So we had to think deeply about how to apply these software development approaches to our RTOS model. There are so many RTOSes in the world now, for example FreeRTOS, Contiki, and so on. So first we had to find the best RTOS for our development, and finally we decided to use NuttX.

Perhaps most of you are not familiar with it. Recently NuttX has been getting popular. For example, a drone project, PX4, uses it for its flight controller. And this week Samsung announced Tizen RT, which is based on NuttX. So NuttX is getting more popular, but when we started the project it was not so popular.

As you can see on this website, NuttX has many key features; it is scalable and portable. The most important thing is POSIX and libc support. Because we have a lot of experience in the Linux and Android world, this POSIX support is very useful for reusing existing software. In other words, we can reduce training costs and communication costs between software engineers. ELF support is also very important for us. Our products have many features, but hardware resources are very limited, so we had to divide one big application into several smaller ones. Other features, like the driver framework, Linux-like configuration, and support for many MCUs and boards, are very useful for system software engineers. And finally, the BSD license is perfect for us.

These are the technical challenges in our project. As I wrote in the abstract, we ported NuttX to the microcontroller by ourselves using open tools such as OpenOCD. Dealing with a small RAM size and reusing existing software were also big challenges for us. And finally, we applied modern software technologies, with tools and services like QEMU, C++11, GitHub, and Jenkins, to improve productivity and software quality.
This is a very rough software stack, which contains several components, including applications. For example, QEMU is a CPU emulator, and it is used for software porting. The MCU is actually not a software component, but I put it here to compare with QEMU. As for the operating system, we chose NuttX 7.5, which is a slightly older version. Regarding the tools, we decided to use GCC 4.8, which supports C++11, and a development version of OpenOCD 0.9, which supports SWD, serial wire debug. In today's talk, I will mainly focus on these yellow boxes.

The next slide shows the feature list of ON Semiconductor's LC823450, which we use in our products. The microcontroller has a dual-core Cortex-M3, but currently only a single core is used. It also has a 32-bit fixed-point dual-MAC DSP. It has internal SRAM, whose capacity is a little larger than a typical microcontroller has. It supports high-resolution audio, up to 192 kHz/32-bit. In addition, it has hardware logic such as an MP3 encoder and decoder to reduce power consumption. Of course, it supports standard peripherals like USB, SPI, and I2C. And it can operate at two different voltages, for example 1.2 volts and 1.0 volt, depending on the clock speed.

From this slide, I will explain how we ported NuttX to the microcontroller. We really appreciate that ON Semiconductor provided us with an FPGA. As you can see in this picture, the system consists of several boards. For example, this is the Xilinx FPGA board, and this is the I/O board. It includes eMMC here, an SD card here, an audio board here, a display here, and Bluetooth and SWD boards. The actual clock speed was not as fast as we expected, just 20 megahertz. But we finished porting most of these features before we received the engineering samples.
After we received the engineering samples, we tested eMMC, power management, and suspend and resume, which we could not test with the FPGA.

To start porting NuttX, we first had to set up OpenOCD, the on-chip debugger, by writing OpenOCD scripts. Without this tool, we cannot boot the microcontroller. To connect OpenOCD to the microcontroller over SWD, serial wire debug, we used a popular FTDI chip, which some of you might know. When we started this project, OpenOCD did not officially support SWD, but we were lucky that SWD support was officially merged into the master branch just before we received the FPGA. This screen shows OpenOCD detecting the microcontroller. Sorry, it might be difficult to read, but please download my presentation slides from the conference website.

Next is NuttX porting. In today's talk I will not explain the standard peripherals, but the microcontroller-specific features. eMMC/SD is one of the standard peripherals, but I have a special topic to mention here. As you can see in this chart, we implemented the driver using the block device API; however, we did not use the eMMC/SD protocol driver in NuttX. Instead, the driver calls ROM APIs, which ON Semiconductor provides. For example, identify card, read sector, and write sector are provided in the ROM API. Due to a ROM code restriction, we had to use fixed partitions. The driver also uses DMA to reduce CPU load, and it works with a hotplug driver, which was newly introduced to detect the SD card.

Next is the file system. NuttX supports several file systems such as procfs and VFAT. In our project, we use procfs for debugging and wakelocks, and VFAT file systems are used to store program files, properties, and databases.
We also added a file system called evfat, which ON Semiconductor provides, to support exFAT and IC recorder-specific APIs, such as DPI. The name evfat might be confusing, but they gave it this name. Please remember that evfat just supports IC recorder-specific features, including exFAT. This screenshot shows the result of the df and mount commands on an ICD-SX device. As you can see, these file systems are actually used on our products. For example, /db uses VFAT just for databases, /mnt/sd0 uses evfat because it is a content area, and /proc uses procfs.

The next slide shows the evfat porting. As you can see in this chart, we implemented evfat by using the virtual file system APIs. Similar to the eMMC and SD drivers, the file system calls the ROM API, for example mount, open, read, and write. Because the ROM API supports UTF-16 only, we added conversion logic to UTF-8, because UTF-8 is very easy to handle inside the operating system, including libc. So we did not modify the file system or libc; we just added this yellow box. And as you can see in this screenshot, Japanese file names are supported.

Let's move on to audio support. As I mentioned earlier, the microcontroller has dedicated audio hardware logic, such as an MP3 encoder and decoder, to reduce power consumption. Also, as you can see in this chart, it has a unique hardware feature called the audio buffer to support flexible audio signal flow. For example, by setting audio buffer registers, you can change the audio signal flow inside the microcontroller. As for the driver implementation, NuttX already has an audio subsystem, but we introduced new APIs, like Linux ALSA, to be simpler. The APIs support non-blocking mode as well. And this is an audio playback example using the DSP.
In this example, the CPU first sets up the audio blocks inside the microcontroller and the external audio codec, which is not shown in this chart, and then boots the DSP. Then the CPU reads the audio file from the storage device, and the audio data is sent to the DSP through the audio buffer. The DSP decodes the audio frames, and finally PCM data is output to I2S0. Of course, actual playback is a little more complex. For example, in the MP3 playback case the MP3 hardware decoder is actually used, but you can understand how the audio playback signal flow works inside this microcontroller.

This is an audio recording example. First, the CPU sets up the audio blocks inside the microcontroller and the external audio codec, which is not shown in this chart. Then PCM data comes in from I2S0 and is pre-processed by the DSP. Then the PCM data is encoded by the MP3 hardware encoder, and finally the CPU writes the MP3 audio data into a file. Our products have a recording monitoring feature, so the real audio signal flow is much more complex than this chart.

Let's move on to the power management part. Power management usually consists of the following techniques: clock gating, which disables the clock for unused blocks; power gating, which disables power for unused blocks; DVFS; and suspend and resume, which I will explain later. This figure shows the power domains inside the microcontroller. The microcontroller has several power domains, such as audio, USB, and the SPI flash cache, and if a block is not used, its power should be disabled to reduce leakage. DVFS stands for dynamic voltage and frequency scaling. As I mentioned earlier, the microcontroller supports two different voltages, and using the lower voltage can reduce actual power consumption.
As you can see in this chart, we define an active mode and an idle mode, and define a clock table for each. For example, the active mode clocks are 160 megahertz, 80 megahertz, and so on. Based on the CPU idle ratio, clock control is done autonomously by DVFS. As you can see in this chart, it changes the internal divider and selector in this yellow box. However, the clock can also be boosted, for example when keys are pressed or while loading an application, to get a quicker response.

The next chart shows the idle ratio calculation used in DVFS. NuttX already has CPU load monitoring, but we found that it is not accurate, because the CPU load in IRQ handlers is not considered. So we implemented a very simple algorithm. As you can see in this chart, when the CPU issues WFI, wait for interrupt, we measure the sleep time in microseconds and accumulate it to get the idle time, as in this equation. Finally, the idle ratio can be calculated by the next formula. To measure the sleep time, we use a hardware timer instead of SysTick in the Cortex-M3, because this hardware timer is not affected by the frequency change, so it is very easy to calculate the sleep time.

Next is suspend and resume. As I wrote in the abstract, we introduced the wakelock, which Android developers are very familiar with. We implemented an API very similar to the one the Android kernel provides. The mechanism is that if no wakelocks exist, the system automatically enters sleep-deep mode to reduce power. For example, during audio playback the middleware acquires a wakelock to prevent the system from entering sleep-deep mode. When playback stops, the middleware releases the wakelock, and the system then enters sleep-deep mode automatically. The next slide shows the actual implementation.
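As a rough illustration of the two bookkeeping ideas just described (this is not the code from the slides), here is a minimal C++11 sketch. It assumes the idle ratio is the accumulated sleep time divided by the length of a measurement window, and that the wakelock is a simple counter. The class names `IdleMeter` and `WakeLockCounter`, their methods, and the window length are hypothetical; a real kernel implementation would also need to protect these counters against interrupts.

```cpp
#include <cstdint>

// Hypothetical accumulator for the idle-ratio calculation.
// Assumption: each sleep interval around WFI is measured in microseconds
// by a timer that is not affected by CPU frequency changes.
class IdleMeter {
public:
    // Add one measured sleep interval to the accumulated idle time.
    void add_sleep_us(std::uint32_t us) { idle_us_ += us; }

    // Return the idle ratio in percent for a window of window_us
    // microseconds, then reset the accumulator for the next window.
    std::uint32_t take_ratio_percent(std::uint32_t window_us) {
        const std::uint64_t idle = idle_us_;
        idle_us_ = 0;
        if (window_us == 0) {
            return 0;  // avoid division by zero
        }
        const std::uint64_t pct = idle * 100 / window_us;
        return static_cast<std::uint32_t>(pct > 100 ? 100 : pct);
    }

private:
    std::uint64_t idle_us_ = 0;
};

// Hypothetical counting wakelock: suspend is allowed only when
// nobody holds a lock. Not interrupt-safe as written.
class WakeLockCounter {
public:
    void acquire() { ++held_; }
    void release() {
        if (held_ > 0) {
            --held_;
        }
    }
    bool can_suspend() const { return held_ == 0; }

private:
    unsigned held_ = 0;
};
```

With this sketch, 300 and 500 microseconds of sleep inside a 1,000 microsecond window give an idle ratio of 80 percent, and `can_suspend()` becomes true again only after every `acquire()` has been matched by a `release()`.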
When entering sleep-deep mode, the system powers down unused blocks and waits. Then the system can be woken up by interrupts, such as GPIO. Finally, before resuming user tasks, the kernel timer is resynchronized with the RTC.

Let's move on to ELF support. ELF stands for Executable and Linkable Format, and it is used for dynamic application loading. As I mentioned earlier, our system memory is not so large. It is about 1.6 megabytes, not gigabytes, of course. To overcome this limitation a memory overlay technique could be used, but we found that memory overlay is not flexible. So we decided to use dynamic loading with ELF, which NuttX supports. With this feature, we can divide one big application into several smaller ones, for example a home application, a settings application, and so on. We can also use separate debug commands simultaneously. In this screenshot, you can see that several applications are running as well as debug commands such as free and ps.

However, NuttX ELF loading was not so fast, because it was designed for small-memory systems. So we modified it to be fast. The following are our approaches. The first one is the section data cache, which allocates a big memory area to hold the section table to reduce eMMC access, because SRAM access is much, much faster than eMMC access. The second one is symbol name replacement, which shortens symbol names by hashing them and then sorts by name. As you can see in the example in this yellow box, a pthread symbol name can be replaced with this hash code. We are also using C++11 with templates for middleware and application development, so some symbol names are much longer, for example 200 or 300 bytes. This chart shows how each approach reduces loading time. The section data cache has a big impact.
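The symbol name replacement idea can be sketched roughly as follows. This is only an illustration under assumptions: the talk does not say which hash function or name length was used, so the sketch picks 32-bit FNV-1a and 8 hexadecimal digits, and the names `short_name`, `build_table`, and `lookup` are hypothetical. Hash collisions are not handled here; a real tool would have to detect them when the table is built.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// 32-bit FNV-1a hash (an assumption; the talk does not name the hash).
std::uint32_t fnv1a(const std::string& s) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Replace a symbol name of any length with a fixed 8-character hex name.
std::string short_name(const std::string& symbol) {
    char buf[9];
    std::snprintf(buf, sizeof(buf), "%08x",
                  static_cast<unsigned>(fnv1a(symbol)));
    return std::string(buf);
}

struct Symbol {
    std::string name;  // hashed name after build_table()
    const void* addr;  // address exported to loaded applications
};

// Hash every exported name, then sort so lookups can use binary search.
std::vector<Symbol> build_table(std::vector<Symbol> exported) {
    for (auto& s : exported) {
        s.name = short_name(s.name);
    }
    std::sort(exported.begin(), exported.end(),
              [](const Symbol& a, const Symbol& b) { return a.name < b.name; });
    return exported;
}

// Resolve an original symbol name against the hashed, sorted table.
const void* lookup(const std::vector<Symbol>& table,
                   const std::string& original) {
    const std::string key = short_name(original);
    auto it = std::lower_bound(
        table.begin(), table.end(), key,
        [](const Symbol& s, const std::string& k) { return s.name < k; });
    return (it != table.end() && it->name == key) ? it->addr : nullptr;
}
```

The point of the design is that a 200 or 300 byte mangled C++ name costs the same 8 characters as a short C name, so the symbol table is smaller and each comparison during loading is cheaper.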
The original loading time was more than 3,000 milliseconds, but after applying these techniques, the actual loading time was improved to less than 100 milliseconds.

So far I have focused on OS porting and enhancements. From here I will explain some tools and testing. The first one is QEMU. QEMU is an open-source CPU emulator that supports many CPU architectures, including Cortex-M3. Our motivation for using QEMU was to port software such as the Bluetooth stack and the GUI toolkit. We finished porting the Bluetooth stack on QEMU plus NuttX before we received the FPGA board, and audio streaming actually worked on QEMU plus NuttX. To get the Bluetooth stack working on QEMU, we had to change the SRAM size and fix some bugs.

Let's move on to C++11. In this project we decided to use C++11 for middleware and application development to improve productivity as well as performance. The following are the main features of C++11. For example, with the auto keyword the compiler determines the type, so the first line in this chart can be replaced with the second line. It is very simple. We can also use lambda expressions to define function objects. Among the other features, smart pointers, which are implemented in the standard library, can be used to avoid memory leaks. We can also use move semantics to improve performance. And other keywords are also useful to improve productivity.

The next slide shows the C++ standard library. As I mentioned on the previous slide, we are using standard library components such as std::unique_ptr and so on. As you can see in this table, there are several open-source C++ standard libraries. Finally, we decided to use libc++ from the LLVM project, because C++11 is fully supported, the license is BSD-like, and it is easy to port.
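To make the C++11 features mentioned above concrete (auto, lambda expressions, smart pointers, and move semantics), here is a small sketch. It is not taken from the product code; `AudioFrame`, `make_frame`, and `FrameQueue` are hypothetical names chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical audio frame holding 16-bit PCM samples.
struct AudioFrame {
    std::vector<std::int16_t> samples;
};

// Smart pointer: the frame is freed automatically when its owner goes
// away, so there is no explicit delete (std::make_unique is C++14).
std::unique_ptr<AudioFrame> make_frame(std::size_t sample_count) {
    std::unique_ptr<AudioFrame> frame(new AudioFrame());
    frame->samples.assign(sample_count, 0);
    return frame;
}

class FrameQueue {
public:
    // Move semantics: ownership is transferred, no sample data is copied.
    void push(std::unique_ptr<AudioFrame> frame) {
        frames_.push_back(std::move(frame));
    }

    // auto: the compiler deduces the element type in the range-for loop.
    std::size_t total_samples() const {
        std::size_t total = 0;
        for (const auto& frame : frames_) {
            total += frame->samples.size();
        }
        return total;
    }

    // Accepts any callable, for example a lambda expression.
    void for_each(const std::function<void(AudioFrame&)>& fn) {
        for (auto& frame : frames_) {
            fn(*frame);
        }
    }

private:
    std::vector<std::unique_ptr<AudioFrame>> frames_;
};
```

A caller could write `queue.push(make_frame(1024));` and then pass a lambda such as `[](AudioFrame& f) { f.samples[0] = 1; }` to `for_each`. After `queue.push(std::move(frame))`, the moved-from `frame` pointer is null.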
We tried STLport before, but we found that it is more difficult to support C++11 with it on our platform.

The next slide shows code size reduction. To reduce the code size, we started with -O2 optimization, then applied the -Os option for size optimization, and then applied --gc-sections, which removes unused sections at link time. Finally, we applied the symbol name replacement which I explained on a previous slide. This chart shows a code size comparison for each technique. Of course, writing small code is much more important, but you also have to consider these techniques, which are very common.

As I explained in the NuttX porting part, we have been using OpenOCD to debug software, and some OSes like Linux and FreeRTOS are officially supported by OpenOCD. This is very useful for debugging problems such as deadlocks. However, NuttX is not supported, which means that we can only debug the current thread. So to support NuttX, we implemented the code ourselves. The approach is almost the same as for other RTOSes. This screenshot shows GDB windows running on Emacs; I love Emacs very much, so I decided to use this screenshot. As you can see, all the threads are listed, and if you select one thread, you can see its call stack here. We released the code on GitHub last year, so if you are interested, please visit github.com/sony and you can find our work.

This slide shows how we analyze runtime crashes with tools. Similar to Android and Linux, we implemented a crash dump mechanism and tools to analyze the dumps. This is a typical scenario. If a crash happens on the target, the crash logs are stored in RAM and the system automatically reboots. Then the crash logs are stored on the eMMC storage. Then we can retrieve the crash logs by ADB and analyze them with debug symbols. And this screenshot shows an actual crash log.
The log has been analyzed with the debug symbols, and you can see the stack trace where the crash actually happened.

Next is ADB support. If you are familiar with Android, you might know what ADB is. Our motivation is to test the system without proprietary tools. Because ADB from Android is becoming a standard, we wanted to use this ADB tool to, for example, retrieve internal logs. Currently we support the minimum features, like push, pull, and shell with remote execution. Because the feature is just for testing purposes, it is disabled at the factory before shipping the products. The actual implementation was not so difficult. We reused the NuttX USB serial driver, modified the USB descriptor to pretend to be an ADB device, and implemented the protocol, for example the push, pull, and shell protocol, from scratch. This screenshot shows pushing a hello application to the target and executing it remotely. You can see hello world running on the target.

The next slide shows integration testing. As you can see in this figure, we constructed the build and testing system with ADB and cloud-based services such as GitHub. If a software engineer pushes code to GitHub and creates a pull request, the system automatically builds the target code, deploys the software to each actual product, and then tests the products with ADB. Finally, you can see the test results on Jenkins. In addition, PCB tests at the factory are done with ADB.

The next slide shows unit testing with Google Test. Google Test is Google's C++ testing framework, and we ported it to our platform. The motivation for using this framework is to find bugs early, refactor code safely, and so on. To execute the tests, we also use ADB. As you can see in this screenshot, we first build the HelloTest program.
Then we push the program to the target device with ADB and finally execute it remotely. HelloTest runs on the target, and all tests complete successfully.

This slide shows DSP software development. So far I have explained open-source-based software development, but unfortunately we had to use the provided IDE to develop DSP code. However, to improve productivity in DSP software development, we implemented some features to assist it, because the DSP has to work with the Cortex-M3, as I mentioned in the audio example. A typical approach is like this. First, develop the DSP code on the simulator, and then run a sample application on the Cortex-M3, for example a recording or playback application. Then load the DSP code and start the DSP. Finally, continue the application on the Cortex-M3. It is very simple, and we can easily test DSP code on our platform.

Finally, I will show you some short demo videos, which are almost the same as yesterday's technical showcase. The first video shows ADB, fast ELF loading, and DVFS. The second one is a stress testing tool, like the monkey tool.

First, this is our product, an IC recorder from the UX series. No USB ADB device is shown yet. Then I connect the device, and you can see that an ADB device is attached. Then I try to log in with adb shell. Login is complete. Then I list the applications to check their sizes, just by copy and paste. You can see that application sizes are from about 80 to 300 kilobytes. You can also see the task list. This shows the record application running. Then I go back, and the home application is running. Then the settings application: now the home application is unloaded and the settings application is running.
Then I go back to home to check this DVFS command, which was not demonstrated yesterday. You can see the CPU idle ratio as well as the current clock speed. Now recording has started, and I go back to home. The clock is boosted now, and now it is slowing down to a lower clock. And then, vmstat. I think some people saw this vmstat yesterday. Recording continues, so you can see block output happening. Then I go back and stop recording, so no block I/O happens.

Next is the monkey-like tool. It randomly generates input commands or input key sequences, so you can easily do stress testing with it. Maybe it will soon start recording. Perhaps an option key or a similar key will be sent. Oh, OK, recording started because such keys were input. We did many stress tests with this tool. For example, in an overnight test, maybe 40,000 events can be generated. With this tool, we found many critical bugs.

So that's it. Any questions regarding my presentation? Yes, please.

Your question is regarding GitHub? OK, we are using GitHub privately, not publicly, sorry. Our code is on GitHub in a private repository, but we released one piece of source code, the OpenOCD stuff. This is now available on GitHub under Sony. This is public. Is that OK? Yes, please?

Sorry? ADB? Ah, yes. Your question is whether ADB is open source? Sorry, it is not open source now, and we are considering how to merge this ADB into upstream. Yes. The protocol implementation itself is not so related to NuttX, but we will contribute to the NuttX repository. OK. Any questions? Yes, please?
As I said at some point, we released three products, last year we released one more product, and this year we will release two or three products. Currently audio recorders and audio players are supported, but in the near future we would like to support more products beyond audio products, for example network-connected devices and so on. Any questions? Ah, great. Yes?

Oh, it is very difficult to say. It depends on the CPU load. For example, in the playback case, the CPU just sends compressed data to the DSP, and the DSP runs at a very low clock. In the MP3 case we can use the hardware decoder, and in recording we can use the hardware encoder. In that case, CPU idle time is maybe around 95 or 97 percent, so most of the time is idle. Idle means that we use the idle clock, which is maybe around three or six megahertz, a very, very low clock. But in some cases, for example if we push some keys to change the application or to redraw the user interface, there is more CPU load.

Any questions so far? OK. So thank you, thank you for your time. Thank you.