Hi, welcome to Kubernetes Edge Day. I am Tomoya Fujita with Sony Corporation, and we have a co-speaker from Sony China. He is also a software engineer, named Fengao. First of all, we really appreciate this opportunity and the community effort. Thanks for having us. Today, we're going to talk about a Kubernetes robotics edge cluster system. I will be sharing most of the overview, and the rest will be taken care of by Feng. Here is today's agenda for this presentation. Starting with the introduction, we will go through the background, the problems, and the requirements, especially for edge IoT use cases. We will talk about what we want to achieve using Kubernetes as an architecture for a distributed system. Then we can go even deeper into robotics examples with Kubernetes supporting a distributed system on an edge platform. So, my name is Tomoya Fujita, with the Sony R&D Center at Sony Corporation. I am a software architect and developer, mostly working on system services and middleware such as ROS and Kubernetes in the open-source aspect. I am also a member of the Robot Operating System Technical Steering Committee. "I'm Feng, with Sony China, responsible for software such as Kubernetes and the multimedia framework." Please feel free to reach out to us any time. Besides, we will be available on the Slack channel, wg-iot-edge. Let me do a quick introduction about Sony's purpose. Sony's purpose is to fill the world with emotion through the power of creativity and technology. Our foundation is based on technology, to create new values within diverse businesses such as data centers, content, hardware, sensors, consumer devices, and medical and financial services. Okay, let's get started. First, I'd like to cover some background in a couple of slides. As we know, each device has been getting more mature in terms of computation capability, connectivity, hardware acceleration, and so on.
At the same time, each system is getting more complicated in order to support distributed and connected systems. Besides, considering perception and recognition with dynamic sensing data on the edge, the application should be dynamically adjusted to the environment. That is to say, the system should be a closed-loop system that adjusts to the dynamic environment. So what about robotics and robot use cases as a background? Robots are expected to work on and help with highly complex tasks, even with multiple robots working together. This is easy to see if we think about factory, logistics, rescue, and entertainment use cases. If it is a single robot that we are dealing with, operating it should be no problem. But what if it comes to hundreds or thousands of robots that we have to control and maintain? We surely need to consider those cases, to make development and maintenance easy, with as little manual operation as possible. Especially during development, the application developer does not want to do any operations just to check the application. It has to be easy, quick, and efficient. Importantly, on edge devices, the problem becomes complicated, and there are a lot of dependencies compared to cloud infrastructure. Think about application portability and modularity: there should be some abstraction layer to conceal the hardware devices. In the next two slides, let me quickly introduce the Robot Operating System, which is called ROS. From the Kubernetes aspect, we can just say ROS is one runtime framework and SDK. The Robot Operating System is a set of software libraries and tools that help you build your robot applications: from drivers to state-of-the-art algorithms, and with powerful developer tools, ROS has what you need for your next robotics project, and it's all open source. ROS provides not only an SDK but also simulation tools, which are really important and useful for robot application development.
Gazebo is the simulation tool, built on physics, sensors, and interfaces including the UI. You can switch between the real world and the simulation world easily, and the application is agnostic to that world. That is to say, you can develop the application with no hardware at all, only simulation, at the very first stage. Using ROS and Gazebo, you have everything you need to develop your robotics application. This ROS overview is all we need for our explanation; you can find more information about ROS on the internet. So we stop here talking about ROS and get back to the main topic, which is Kubernetes. Okay, let's move on to the topics now. To start with the problems: what is the pain? What you can see here is the current situation we have. The application needs to be integrated into the specific system every single time. Even if the application functionality is almost the same, there will be some operational cost for integration. This is because the platform and the system differ from one to another at the edge, and it takes time for application and system developers. The huge pain here is that this is just operational cost, not actual development. We do not like doing that anymore, and we think nobody does. So we are proposing this cloud and edge common architecture. It is simple and common, it can support distributed systems, and the application should be platform agnostic. In other words, against the pain from the previous slide, once we get an application developed, we can run that application wherever we like: could be cloud, could be edge, could be you don't even need to care. Besides, you can see the highlighted connections in the image at the top right: in the edge network, devices are connected to each other directly as a distributed system to keep the performance. Continuing from the previous slide, from the application engineer's perspective, there just appears to be a single entry point to manage the application.
We do not need to operate each device anymore; just accessing the dashboard front end is all we need. Sometimes we want to deploy an application only on specific device hosts, sometimes across all the nodes. By introducing a boundary between system and application, the application just requests what kind of capabilities or hardware devices it needs to run. The rest is taken care of by the framework, and the application can be agnostic to the platform and its dependencies. This is exactly one of the ecosystems we want in the edge world. That is something we want to achieve. So we have been considering Kubernetes against this situation and the problems described before, and the answer for now is: we can do that with Kubernetes. With Kubernetes, we can satisfy most of our requirements as described a few slides before, such as development and application maintenance, rolling the application up and down without any downtime so we can keep the application running, administration and device capability management, and scalability up to thousands of nodes in a cluster. Those are all things that mainline Kubernetes provides and allows us to do as is. So after all, taking advantage of Kubernetes, we can actually support this architecture. We can have a flexible cloud and edge cluster system for application development and management, providing security enclaves and capabilities dynamically attached to the runtime pod when the application starts running. This is exactly what we want to support as the cloud and edge common architecture. From the next slides, Feng will take over the presentation to explain more details about edge-specific use cases and what we have done with Kubernetes. Thanks. I will be taking over from here. We have shared enough about our views and the requirements, so let's talk about more details here.
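As one concrete way to picture this deployment model, here is a minimal sketch: the application only declares the capability it needs as an extended resource (the kind a device plugin would advertise) and optionally pins itself to specific device hosts with a node selector. The resource and label names (`sony.example/camera`, `node-role: edge`) are hypothetical illustrations, not from the talk.

```python
# Sketch of the "single entry point" idea: the application developer only
# declares what capability a pod needs; the platform picks a matching node.
# Resource/label names here are hypothetical examples.

def make_pod_manifest(name, image, capability, host_labels=None):
    """Build a minimal pod manifest requesting one unit of an
    extended resource (a device exposed by a device plugin)."""
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # The device plugin advertises the capability as an
                # extended resource; the pod just asks for one unit.
                "resources": {"limits": {capability: 1}},
            }],
        },
    }
    if host_labels:  # optionally pin to specific device hosts
        pod["spec"]["nodeSelector"] = host_labels
    return pod

manifest = make_pod_manifest(
    "face-detection", "example/face-detect:1.0",
    "sony.example/camera", {"node-role": "edge"})
```

With this shape, the same manifest runs on cloud or edge: the application never names a concrete device or host, only a capability.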
We are going to talk about four main subjects: dynamic cluster reconfiguration for edge IoT devices, distributed systems and applications with ROS, dynamic security enclave attachment with ROS, and hardware abstraction with the Kubernetes device plugin interface and our implementation. Here is a distributed system in an edge environment, as a typical use case, expecting multiple robots connected in the same LAN and working together for the user. The application is built on top of ROS, which has a publisher/subscriber architecture, as the application layer to support the distributed system. There are face detection and object detection containers running inside pods on the worker nodes, and the selector container can select images and notify the visualizer which image should be displayed on the main monitor on the primary node. This is one of the examples, but we can do this with Kubernetes. Since this is a distributed system, it fails independently, and it appears to be a single system in the user experience. The point we want to make on this slide is that we need to use the Weave Net CNI. This is needed because the application layer uses multicast: since ROS supports distributed systems, it does endpoint discovery at runtime with multicast. We have tried a few other CNIs, but Weave Net was the one that worked out of the box with ROS. Thinking about consumer devices, edge distributed systems within a LAN, and even third-party applications, there should, or must, be security certificates and keys to control authorization and access permission for each endpoint. Security tends to be considered after development, but once it comes to the production phase, that is something we cannot just ignore. Here we describe how to manage security enclaves as administrator and user, and how security enclaves are attached to a particular pod dynamically. This is something we can do with the Kubernetes resources named ConfigMap and Secret.
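To make the enclave attachment concrete, here is a minimal Python sketch, assuming enclaves are stored as Kubernetes Secrets (Secret volumes are backed by in-memory tmpfs, which matches the volatile-storage behavior the talk describes: the material vanishes with the pod). The user and enclave names are hypothetical, and the permission check that a real cluster would enforce via RBAC is simulated here with a plain dict.

```python
# Minimal sketch of dynamic security-enclave attachment via Secrets.
# In a real cluster the permission check is RBAC at the API server;
# here it is simulated with an in-memory table. All names are examples.

ALLOWED_ENCLAVES = {  # admin-registered permissions per user
    "third-party-dev": {"sros2-enclave-a"},
    "admin": {"sros2-enclave-a", "sros2-enclave-b"},
}

def attach_enclave(pod_spec, user, enclave_secret):
    """Check permission, then mount the enclave Secret into the pod
    as an in-memory (volatile) volume that disappears with the pod."""
    if enclave_secret not in ALLOWED_ENCLAVES.get(user, set()):
        raise PermissionError(f"{user} may not use {enclave_secret}")
    pod_spec.setdefault("volumes", []).append({
        "name": "enclave",
        "secret": {"secretName": enclave_secret},  # tmpfs-backed mount
    })
    for c in pod_spec["containers"]:
        c.setdefault("volumeMounts", []).append(
            {"name": "enclave", "mountPath": "/enclave", "readOnly": True})
    return pod_spec

pod = {"containers": [{"name": "talker", "image": "example/talker:1.0"}]}
attach_enclave(pod, "third-party-dev", "sros2-enclave-a")
```

The application pod stays agnostic to this binding: it just finds its keys under the mount path and uses them to join the distributed system.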
First, the administrator registers security enclaves for each endpoint and gives them appropriate access permissions via the API server, so that we control access permission in the first place. For example, a third-party application developer can only see the couple of security enclaves meant for third-party applications. Then, when we need to run the application, we can just say which security enclaves we need for this pod. The rest is taken care of by Kubernetes: it checks the permission, whether the user is allowed to use that security enclave, and then loads and attaches the required security enclave dynamically to the endpoint pod as volatile storage on the physical machine. Application pods can be agnostic to this binding by Kubernetes; they just use the security enclaves to participate in the distributed system and access the data objects. Once pods are shut down, the security enclaves are gone too. So far, we have confirmed that everything works fine with the ROS security feature like this. On edge devices, one of the biggest complications is platform dependency. There are so many devices that we cannot even count them all, but this should also be abstracted from the application's perspective. Kubernetes has an interface named device plugin, which was originally expected to be used for GPUs. As described here, a device plugin implementation is one of the Kubernetes extension points, and the interface is really simple, so it should be easy to implement. The device plugin interacts with Kubernetes to list, allocate, and find devices for application pods dynamically. This really seems to be the kind of abstraction we want to have in edge use cases, so we tried to use it against our requirements. In IoT devices, there are even more complicated devices such as FPGAs, special hardware accelerators, DSPs, and so on.
Also, we want to add virtual devices, provided as API access to the host system, instead of binding physical devices directly to the application pod. Providing an API from the host system to the application container gives us more flexibility and security to control access from the container to the host system. We already have an implementation of something like what is described on this slide. The application can request the device API, if it has the right access, when the application container starts. Internally, our device plugin interacts with the host system manager, checks the permission, and provides the APIs requested by the application; the device APIs are attached to the application container dynamically. This entire device plugin process is completely concealed from the application's perspective. So the application is agnostic to this process and just works with the devices or APIs. So do we have everything we need in the device plugin? The answer is no. There is an open issue for the device plugin which originally comes from our use case. As we mentioned before, edge devices are way more complicated. Some devices are not simple enough to manage with just open/close, but require more, and some need specific operations when releasing the resource. The device plugin currently does not have such a callback interface for releasing the device resource, so we have been working with the community to support this requirement in the mainline. If you are interested in this, please take a look at the KEP and leave some feedback and comments. Finally, we are going to talk about dynamic cluster reconfiguration. As you can imagine, in an edge environment there are some situations different from cloud infrastructure. First, robots move and vehicles move. That means robots can accidentally move out of the network, and usually a wireless network is used for edge devices, which can be unstable.
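To illustrate the gap, here is a toy Python mock of the device plugin lifecycle: `list_devices` and `allocate` stand in for the real gRPC `ListAndWatch` and `Allocate` calls, and `release` is the kind of cleanup callback the talk says is still missing upstream. This is only a sketch of the idea, not the actual kubelet interface.

```python
# Toy mock of the device plugin lifecycle, including the release
# callback that edge devices need (e.g. an FPGA reset or DSP teardown
# before the resource can be handed to the next pod).

class MockDevicePlugin:
    def __init__(self, devices, on_release=None):
        self.free = set(devices)
        self.in_use = {}              # pod name -> device id
        self.on_release = on_release  # device-specific cleanup hook

    def list_devices(self):
        """ListAndWatch analogue: report all healthy devices."""
        return sorted(self.free | set(self.in_use.values()))

    def allocate(self, pod):
        """Allocate analogue: hand one free device to a pod."""
        if not self.free:
            raise RuntimeError("no device available")
        dev = self.free.pop()
        self.in_use[pod] = dev
        return dev

    def release(self, pod):
        """The missing callback: run device-specific teardown
        before returning the resource to the free pool."""
        dev = self.in_use.pop(pod)
        if self.on_release:
            self.on_release(dev)
        self.free.add(dev)

resets = []
plugin = MockDevicePlugin(["fpga0"], on_release=resets.append)
plugin.allocate("pod-a")
plugin.release("pod-a")
# resets == ["fpga0"]: the teardown hook ran before the device was freed
```

Without the `release` hook, a device needing more than open/close semantics can be handed to the next pod in a dirty state, which is exactly the open issue the KEP addresses.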
In addition, edge devices can easily be shut down or powered off, including by misoperation; devices running on battery break down easily; and cost should be well considered. Given this situation and environment, to use a cluster system at the edge, it has to be robust and reconfigurable without manual operation. Some methods are provided, such as high availability, but we say it should be reconfigured more dynamically on edge devices. This is the requirement we expect for cluster reconfiguration on edge devices. As you can see in the picture on the left, there are some candidates for the primary. Technically, all devices can be primary: they do leader election among the candidates, and once a primary is online, that service is announced to all the worker nodes. The losing candidate nodes become worker nodes as well. If worker nodes come online in the cluster network, they detect that the service is available dynamically and participate in the cluster system. Also, namespaces should be applied, thinking about the use case of multiple cluster networks in the same LAN for factory and logistics; this is needed to support that. So far, we have been developing this framework based on the Kubernetes API to see what more is missing for our use cases. These are just ideas, but we expect they are also needed for a total solution for an edge cluster system. In the future, application redeployment will be required based on more dynamic sensing data, such as location, perception, and recognition data. For example, if the location in the physical world changes, the application will be redeployed to adjust to that location: robots in a hospital should be aware of the medical context; robots in a store should be aware of the food and drink context. And to support the distributed system, we will need a kind of architecture to detect failures in the entire system. Without that, it will be very hard to maintain the entire cluster system on edge devices.
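The reconfiguration idea above can be sketched in a few lines. This is a deliberately simplified stand-in (lowest node ID wins among reachable candidates) for a real lease- or Raft-based leader election; node names are hypothetical.

```python
# Simplified sketch of dynamic primary election among candidates:
# every candidate node may become primary; the reachable candidates
# elect a leader (lowest ID here, a placeholder for a real lease/Raft
# election), and the losing candidates demote themselves to workers.

def elect_primary(candidates, online):
    """Return (primary, demoted_workers) among reachable candidates."""
    reachable = sorted(c for c in candidates if c in online)
    if not reachable:
        return None, []
    return reachable[0], reachable[1:]

# If the current primary drops off the unstable wireless network,
# re-running the election reconfigures the cluster with no manual step.
primary, workers = elect_primary(
    ["robot-1", "robot-2", "robot-3"], online={"robot-2", "robot-3"})
# robot-1 is offline, so robot-2 becomes primary and robot-3 a worker
```

In a real system the election would be driven by leases and heartbeats rather than a sorted list, but the reconfiguration behavior, primary fails, a candidate takes over, workers rejoin automatically, is the same.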
At this point, we think we can learn and study more from the cloud experience, and probably support microcontrollers, as KubeEdge does, with a more lightweight agent compared to the kubelet. Supporting these requirements, we believe we can actually support Kubernetes clusters in edge systems. That is all of our presentation. Thank you for watching. If you are interested, please feel free to contact us. Thanks.