Okay, I'm very honored to be here in Shanghai at KubeCon China. So today I'm going to share with you edge computing. Let me start the timer. Okay: edge computing on Kubernetes, some of our best practices, and the future. This is actually a joint presentation between me and Ying Ding. Ying Ding is an engineering manager for Kubernetes at Google. He's also the LF Edge Akraino TSC chair, a KubeEdge co-founder and maintainer, and a CNCF TOC contributor. Myself, I'm a director of infrastructure ecosystem at Arm in San Jose, and I'm also the LF Edge board chair. So here is the agenda, how we'll navigate today's session. First, I'd like to give an introduction to edge computing. Then I'll talk about the need for Kubernetes in edge computing, and share some very practical, real-world best practices for implementing edge on Kubernetes. Then I'd like to discuss some future trends and considerations. Here, let me introduce how we understand edge computing. You can see here: the need for Kubernetes in edge computing, and the best practices for implementing edge on Kubernetes. So, do you know what the edge is, and where the edge is? A lot of people think the edge is just your phone or your home gateway. Our definition of edge is: anywhere between the source of the data and the destination of the data can be called the edge. Think about that. That's why the edge can be located anywhere from your phone, your home gateway, or a server in the IT room of an enterprise building, all the way to an edge cloud server or even a regional data center. We can call all of them edge, depending on where they're located. And there are some future trends to consider. If you've been attending the conference, you know the AI track draws a lot of people. Like at LF Edge, I'm sharing:
We have the LF Edge AI Edge project. We try to see — today, a lot of large language models train very well in the central cloud, but for inference on the edge, how do they adapt to resource-constrained environments, with ultra-low latency requirements, limited bandwidth, and limited memory? And I can cover some of that in the Q&A session. There are definitely benefits. If we do edge computing, we get reduced latency: faster response due to local processing. You know the 20-millisecond end-to-end target for 5G. Think about when you have a video chat or a FaceTime with your friends — how soon can they pick up the phone? That's how long it takes to go from Alice to Bob, all the way across the whole network from here in China, across the Pacific Ocean, to the States. Also, if we do a lot of processing at the edge, we get bandwidth efficiency: less data sent across the network. For example, a lot of inference you can do as close as possible to your phone, on edge containers, edge virtual machines, or edge bare metal in any edge cloud. You can use Volcano Engine, Alibaba Cloud, Baidu Cloud, Tencent Cloud — it's up to you. But the specific path for the China market is that the hyperscaler cloud has to reside in one of the telco clouds first: China Mobile, China Unicom, or China Telecom. This is a different path. That's why I think that in China, providing infrastructure as a service for edge computing is really important. In the States, an edge cloud doesn't have to be deployed three times across different operators, but here in China you do have to duplicate your edge cloud resources and deployments across China Telecom, China Mobile, and China Unicom. So that's one thing. Another thing is enhanced privacy and security.
We also see some changes. I was just discussing this at the lunch table — some people want to use a Mac: you can run your AI, your large language model, on an Apple box at home. You can already build a large language model — or what we call a small language model — in your house, and keep all the private data in your house, at or very close to where you are. For example, at my home in the States I have four cameras. Right now the footage goes to Comcast. I can see when my dog passes by, whenever my kids walk there, whenever a delivery man is there. But sometimes I worry about the privacy — if my kids run by, I don't really know who in some IT room can see that footage. So this is a problem edge computing needs to solve for privacy and security: the data remains local, which reduces exposure risk. Today, so much data is created every day and we don't know where it goes. That's why this is so important. So here I want to highlight the need for Kubernetes in edge computing — why Kubernetes is essential for the edge. If you go to the repos in Kubernetes or in the CNCF, you can see KubeEdge, OpenYurt, K3s, and many other Kubernetes distros. There are so many distros because different deployment scenarios and use cases have different asks. Like scalability: how do we handle thousands of edge devices? Maybe there are other technologies, but I think Kubernetes and its distros solve that problem and serve the purpose very well. There's also seamless deployment and management — for example KubeEdge, which I'm going to talk about later, synergizes cloud and edge very well. And there's resilience: self-healing capability for robustness, because we are based on a cloud native environment with Kubernetes and the CNCF components.
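The self-healing point above comes from Kubernetes' reconcile pattern: a controller continuously compares desired state with observed state and acts on the difference. Here is a minimal sketch in Python — the pod structure and action names are illustrative, not a real client API:

```python
def reconcile(desired_replicas, running_pods):
    """Compare desired state with observed state on an edge node and
    return the actions a controller would take to converge them.
    `running_pods` is a list of {"name": ..., "status": ...} dicts."""
    actions = []
    healthy = [p for p in running_pods if p["status"] == "Running"]
    # Self-healing: restart anything that has crashed.
    for pod in running_pods:
        if pod["status"] != "Running":
            actions.append(("restart", pod["name"]))
    # Scale up or down toward the desired replica count.
    diff = desired_replicas - len(healthy)
    if diff > 0:
        actions.extend(("create", f"pod-{i}") for i in range(diff))
    elif diff < 0:
        actions.extend(("delete", p["name"]) for p in healthy[:-diff])
    return actions
```

In a real cluster this loop runs continuously in the controller manager; the point is that the same declarative mechanism works unchanged whether the node sits in a data center or on an edge gateway.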
That's why — the methods are so similar, seamless, and consistent that it's very easy to identify whose problem it is. Not like when an issue happens and everyone points fingers at each other — "it's not my fault, it's not my people's fault." With this, it's very easy to diagnose where the problem is and then solve it segment by segment. Here are some best practices — key practices for Kubernetes edge deployments. First, modular design: break applications into smaller, independent modules to deploy at the edge. If you look at some of our new AI edge projects, we do break them down into layers — you can use Shifu, EdgeX Foundry, or even Fledge as the infrastructure-as-a-service layer. Then you provide APIs to the upper layer, what we call the AI elastic framework. Within that you have many modules for control, management, and even orchestration and scheduling. So this all requires breaking applications into small modules. The other thing is state management. It's very important, because at the edge, for any application — no matter whether it's cloud gaming or cloud phones — you need to maintain state. Otherwise you don't know where you are. For example, I'm playing Honor of Kings — I see people smiling, you probably play a lot. If you're in the middle of a game and our edge deployment stops your service, you lose your game — that's not the best experience. So we have to design stateless applications or synchronize state to avoid data inconsistency. This is very important for a lot of gaming fans. At LF Edge we published a cloud gaming white paper — I think the work was led by member companies, Genymotion (you game players know Genymotion) and Tencent, with a lot of contributions from the China community. Very good.
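The state-management practice above — checkpoint or synchronize session state so a failed edge node doesn't cost a player their game — can be sketched like this. The store interface and field names are assumptions for illustration; in production the store would be a durable, replicated service:

```python
import json
import time

class SessionCheckpointer:
    """Periodically checkpoint game-session state so another edge node
    can resume the session if this one fails. `store` is any dict-like
    durable backend (illustrative stand-in for a real state store)."""

    def __init__(self, store):
        self.store = store

    def checkpoint(self, session_id, state):
        record = {"ts": time.time(), "state": state}
        self.store[session_id] = json.dumps(record)

    def resume(self, session_id):
        raw = self.store.get(session_id)
        if raw is None:
            return None  # fresh session, nothing to restore
        return json.loads(raw)["state"]
```

A failover node simply calls `resume()` with the same session ID and picks up where the failed node left off, which is exactly the "synchronized state" option; the "stateless" option pushes all state out to such a store on every request.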
Here I'd like to share another best practice: how we refine Kubernetes edge deployments. Data optimization: we want to ensure minimal and efficient data transfer. A lot of times you transfer data without any layering — you just transfer everything across the network. It's not worth it, because some of the information is not critical. You can layer the data, put the different layers into object storage, and only transfer the key data. For example, there can be a sliding window for a car. (Oops, sorry — did I just accidentally click something? Okay, sorry.) We have eKuiper in LF Edge. It gives you a sliding window of, say, 10 minutes: if a car accident happens, we only grab the data from the 10 minutes before the accident. That's called a 10-minute sliding window. There are so many sensors in a car that you can't collect and store everything. I know some in-car storage goes up to a gigabyte, but you can't store all the data for a whole day — even with the best and biggest data center, you can't do that. You have to find a way to ensure minimal and efficient data transfer. And security — I can't emphasize this enough. We have some security experts sitting in the corner — hello, from SUSE; we do use SUSE a lot here. Secure device identity — I think they talked about zero trust this morning in the keynote — and end-to-end encryption. I think this is very important for edge computing, because a lot of applications, like retail, carry personal images and user-related data through the edge and the networks. And we want to apply regular patch management, like over the air. For example, lighting as a service — here, this is a luminaire, right? (Oh, it's shaking — is that an earthquake or something?)
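The eKuiper-style sliding window described above can be sketched as a small in-memory buffer that keeps only the most recent window of readings and hands them over when a trigger such as an accident fires. Timestamps are in seconds and all names are illustrative, not eKuiper's actual API:

```python
from collections import deque

class SlidingWindow:
    """Keep only the last `window_seconds` of sensor readings in memory;
    everything older is dropped instead of shipped over the network."""

    def __init__(self, window_seconds=600):  # 600 s = the 10-minute window
        self.window = window_seconds
        self.buffer = deque()

    def add(self, ts, reading):
        self.buffer.append((ts, reading))
        # Evict readings that have fallen out of the window.
        while self.buffer and self.buffer[0][0] < ts - self.window:
            self.buffer.popleft()

    def on_trigger(self, ts):
        """On an event (e.g. an accident), return only the readings from
        the last `window` seconds — the "key data" worth transferring."""
        return [r for t, r in self.buffer if t >= ts - self.window]
```

Only the buffer returned by `on_trigger` crosses the network; the rest of the sensor stream never leaves the car, which is the bandwidth-efficiency point made earlier.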
There are four or five sensors inside. So think about how we do patching. Do we tear it down? No — we do it over the air. We want everything over the air, not breaking into the power system and redoing it. That's what I mean by applying regular patch management. Here we go. The next best practice I want to share is elevating your Kubernetes edge deployment. Resource management: I used to work on a product called Resource Manager for a couple of years — quite a long time for me. Resource management has to optimize for CPU, memory, and storage. At the lunch table I was talking about acceleration: a lot of times you want to offload your workloads — whether through DPDK, DPI, or anything over PCIe — so they don't run on your main CPU but are offloaded to a DPU. And you want to understand how much memory you need. Also, logging and monitoring. I think everybody uses logging; I'm not sure about monitoring. Implement centralized logging and real-time monitoring for quick issue detection and resolution. We have a use case: a member company of LF Edge does PCB board inspection. Because they have all the logs, an AI algorithm over the logging can find out what the problem may be. (Am I blocking you? Sorry.) So it may find that some logs look really abnormal — they shouldn't be like that — and then it automatically pops up and says, hey, there may be a PCB defect there; you have to figure it out and find a good solution. Okay, let's come to a very concrete example: the LF Edge catalog. What does it mean? Many of you have cars, right? When you go to a 4S store to buy a car, what do you want to do? Do you want a 20-minute test drive around busy central Shanghai, or do you want to say, "Can I drive it for two days?"
With two days, I get used to how the car runs; I know how it works. That's the idea. We have this LF Edge catalog — a robust solution for managing LF Edge applications. Through it, you can really try how things work. Here on the left-hand side is the edge gateway node. It includes SMARTER — SMARTER is an open source component — the SMARTER edge agent, going through the CNI, DNS, and device manager on top. Here you can see a lot of parts: audio machine-learning applications, and video machine-learning applications supporting x264, x265, FFmpeg, AV1, and also GStreamer. Then it goes to the SMARTER cloud services through the edge control plane and the edge data plane. We used AWS EC2 instances — you could use Alibaba Cloud or others — with K3s clusters, and we send the data through Fluent Bit and InfluxDB, and finally into Grafana. Grafana does some of the monitoring and shows how much power you're using, the CPU usage, memory usage, et cetera. So the value of this catalog is in the drivers for the evolution of edge computing — things like IoT for smart cities, and emerging technologies growing in importance like autonomous driving. And the target audience: we want to reach developers, IT professionals, and individuals intrigued by edge computing. The LF Edge catalog is basically a gateway to explore and experiment — like having the car for two days: not really everything, but you get a chance to touch every aspect of it. Here's another example — I think some of the Kubernetes maintainers are here. This is a real case: the highway ETC system. Who knows what ETC stands for? It stands for electronic toll collection system.
Every day you drive on the highway, you have to pay. Here you have the on-premise toll system at the top, and the provincial highway network O&M center above it. When you go through the toll gate, you have the toll collection: the ETC gate runs on either an x86 IPC or an Arm Kunpeng server; then we put the video and vehicle-road cooperative edge there; and then you use Kubernetes and KubeEdge to connect the edge to the cloud. In the central cloud you can have the vehicle-road cooperative services and the video cloud, which can identify things like your plate number and whether a car has violated anything, and you report the tolling information to the provincial highway network O&M center. The O&M center mostly does operations and maintenance, but sometimes it's also involved in the charging system. The benefits: we have more than 50K edge nodes managed by KubeEdge, more than 500K containers in total, and 300 million data records per day. That's a lot, a lot of data. If you don't process it at the edge, think about how much data would cross the network and cause congestion. And the time to pass through a toll station dropped from about 15 seconds to 2 seconds on average per car. I don't know how many of you are going to drive for the Mid-Autumn Festival and National Day holidays — with KubeEdge everywhere, waits could shrink from 29 seconds to 3 seconds. That's a lot when you want to go home to reunite with your family. The end user is still confirming whether we can publish more, but this is a real case — it's running in many provinces in China. For future trends, we really see AI at the edge: local processing for quicker insights, mainly inference. The inference could happen on a GPU or a CPU, but it's definitely at the edge — I think this is the most efficient way. The other one is 5G and edge.
Ultra-reliable low-latency communication: we already see a lot of 5G, even 6G, wanting edge computing for the ultra-low-latency use cases. And there are service meshes for managing inter-service communication efficiently — the service mesh in Kubernetes is so efficient for edge computing that we usually use them as a pair. Considering the future, let's see how we navigate the coming challenges in Kubernetes edge computing. Interoperability: we really care about this — in our community we run a lot of interop events, demos, proofs of concept, and sometimes even trials. For example, China Mobile works together with Migu to run what's called the CFN, the computing force network, along with Jiangsu Information Harbor — a really good use case. They put edge computing devices and servers from many vendors together to ensure compatibility. I think that word, compatibility, is very, very important. I have a friend, Vint Cerf, the father of the internet. A lot of times I worried that my designs would have some bad impact on our existing network, and he said, "Tina, just remember: backward compatibility is the most important thing." A lot of things are legacy — you have to deal with them and work with them. Then sustainability: we have a lot of sustainability work to do on power-efficient devices and solar-powered edge devices. We work with Stanford University, and we see AI at the edge being used for smart grids, smart agriculture, the built environment for smart cities and smart buildings, and even EVs. And there are regulatory issues to watch — keeping abreast of evolving data privacy laws. I'd note that edge computing regulation is actually different between China, the US, and Europe, and different power consumption regulations apply in different regions.
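Vint Cerf's advice about backward compatibility can be made concrete with a tiny version-negotiation check of the kind an edge device and a cloud endpoint might run before talking. This is only a sketch, assuming semantic versioning (same major version means compatible; the cloud side should not be older than the device):

```python
def parse(version):
    """Split a 'major.minor.patch' string into an integer tuple."""
    major, minor, patch = (int(x) for x in version.split("."))
    return major, minor, patch

def compatible(device_api, cloud_api):
    """An edge device can talk to the cloud endpoint when the major
    versions match and the cloud API is not older than the device's."""
    d, c = parse(device_api), parse(cloud_api)
    return d[0] == c[0] and c >= d
```

The useful property for fleets of thousands of edge devices is that the cloud can roll forward within a major version (1.2 → 1.4) without breaking devices in the field; only a major-version bump forces a coordinated upgrade.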
And I'd like to conclude my talk — though I do have some backup slides, no worries. The main point is that Kubernetes and edge computing together drive the next wave of cloud native innovation. We really believe that — you'll have noticed the presentation goes from the edge to the cloud — and by integrating the best practices, we can ensure a smooth, scalable, and secure edge infrastructure. I do know edge computing is a very fragmented, somewhat niche market, but there are a lot of opportunities there. Here are my email, Ying Ding's email, and our LinkedIn — you can take a picture; feel free to connect with us. I have some references on my personal website and at lfedge.org, and we can answer questions about any project under the LF Edge umbrella, like Akraino. LF Edge currently has 13 or 14 very active projects; Akraino is one of them — it's for, how to say, the last mile of deployment, integrating all the edge computing technologies. Okay. So I want to invite you: the future is built on Arm. You can visit us and work with us on November 27, 29, and December 1st in Shenzhen, Beijing, and Shanghai. Come and talk to us — we have the booth numbers here, and you can also scan the Arm community WeChat official account. There are — how many? I lose count in English — thousands, millions of developers, even more, together on the Arm platform. And I'd like to share some backup slides, because this is a technical community. This is the overall architecture of the trial we did with China Unicom, I think in Guangzhou. There are some basic requirements for smart cities, and we use K3s — thank you, SUSE — plus EdgeFaaS, Parsec, which is a security mechanism, and Chitin, an open source AI module. We integrate them to run on the SONiC gateway device. You can see the SONiC gateway.
These days Microsoft is really the leading company behind SONiC as a technology, but for gateways there are many white-box switch companies building SONiC switches. Basically, we run complete computing power registration of the heterogeneous computing resources. You can see here: from the PC we run a web browser on Ubuntu — of course we could also run SUSE — and we have functions A, B, C, like system info, or Chitin functions one and two, which could be video or audio, at the SONiC gateway. We run the K3s server on top of SONiC — SONiC is more like an operating system for data center infrastructure networking, right? Then you run the edge FaaS functions — FaaS as in function as a service — and the Parsec client, and you go through to the NVIDIA Jetson Nanos, number one and number two. Their functionality is the same, but we also run the K3s agents on top of them, with the JetPack stack. This field trial was very successful and we got happy results. The other thing I want to share is the platform — how we put the software together; it's a bigger view. You can look at this: above the ASIC drivers, network drivers, and platform drivers, you run SONiC as the operating system for the white-box switches. From the security point of view, we run the Parsec service here, and we use the K3s agent and run the Chitin edge FaaS and Parsec client to communicate with the K3s servers. I'd also like to share use case number one: device application machine learning model inference offloading. I talked about offloading, right? Edge computing is really good at offloading and accelerating your heavy workloads. Here you can see you can do emotion recognition — not just face recognition. It can tell whether you're happy, sad, slightly sad, or very angry — that kind of emotion detection or recognition.
Here you use a camera app — it could be your iPhone or Android phone — and it converts the frames to a pixel array going through the devices on the left-hand side. The edge computing device does the image processing and resizing and conveys the images as this pixel array. Then it goes to the edge, which does the inference — you can see the inference engine doing the emotion recognition — and the machine learning service offloading APIs. Over HTTP/2 and protobuf, you go through the gRPC server and the model, then through the AI service engine manager and the TensorRT AI framework. Finally we reach the cloud. The cloud does the application registration, the emotion recognition model training, and the model deployment; the last step is model serving. So this is a complete end-to-end case of how you run AI on the edge. My colleague told me not to take questions, and this is my last slide — so no formal Q&A, but if there's something you want me to describe in more detail, I'm happy to. Nobody? Yes, please — maybe give her a microphone. [Audience question, translated from Chinese:] Quite a few of us in the audience have used K3s before, and you also mentioned KubeEdge. Since our theme is the edge cloud, in selecting this basic infrastructure, is there any direction or suggestion? [Answer:] I think it depends on your deployment scenario. You saw that for the ETC case we used Kubernetes — (three minutes, thank you) — with KubeEdge, and for the SONiC white boxes we used K3s. It really depends on how much you communicate between edge and cloud, and what your use case cares about. Sometimes it cares more about experience, sometimes more about accuracy. For ETC you're collecting money — toll fees on the highway — so accuracy and location information matter more.
For the SONiC K3s case, it's more about how you do function as a service, because some telcos want FaaS — then they can have their own app store in their telco cloud native networks. The purposes and main targets are different. That's it — is that enough information for you? There's another question. (I should be speaking Chinese, sorry — go ahead.) [Audience, translated from Chinese:] I understand what you mean about the vision. Another question: if there aren't that many devices — you don't need to manage thousands of them — or it's a relatively small model, is there still a case for the cloud? Do you have to use K8s? [Answer, translated:] This is why sometimes we don't use full K8s; we use a Kubernetes distro — for example K3s, or MicroK8s, or minikube — these can all be used. So it's not fixed; it depends on how big your scenario is. Do you need all the data to be collected? Is even the container necessary? It doesn't have to be containers — there's also virtualization. In cloud gaming, some of the cases are based on virtualization and some on containers. It depends on your client, or your cloud department, and their TCO — TCO is the most important point. For example, on TCO: some graphics cards on the same server can achieve the same effect, but those cards can't do virtualization. So what do we use now? Based on current conditions, the TCO — and future development will definitely be based on virtualization, so there will be a transition, a migration, step by step. (Is it time now? Am I out of time?) [Another question, translated:] Hello — just now I saw that in your architecture there were the edge, the cloud — three layers. In terms of business, what kind of workload is put at the edge, and what kind in the cloud?
Or is there an indicator that can accurately tell you which of the three layers a workload should land on? [Answer, translated from Chinese:] Yes — I only have one minute to answer. For example, it can be the same business, just divided into different sections: inference is done at the edge, but training is done in the cloud. Training requires a lot of computation, so it's done in the cloud, while inference is done close to the edge — that can run at the edge. So it's the same business; it's not that one kind of business goes to the cloud and another kind goes to the edge. [Follow-up, translated:] I understand — there's still half a minute. In practice we divide a business into different dimensions and subtleties, and in the architecture or planning process we hope there is a metric, or a few metrics, to decide whether a workload should run at the edge or in the cloud — a standard scale used in a model. Is there such a design? [Answer, translated:] I understand. This actually differs for every vendor, but most of them calculate according to TCO. The method I use: how many streams can I run? For example, for games, can I run 740 streams, or 720? Can two different methods reach the same TCO? That's the most important thing. Of course, sometimes they also consider power consumption, but it may not be the most important factor. My WeChat is TINATSOU6 — you can reach me and we can continue the discussion. Okay, thank you. The staff says it's already time — she wants to usher me off. Okay. It's been an honor to discuss in so much detail with everybody. I wish you the best Mid-Autumn Festival and National Day holiday.
I wish everyone a happy holiday. Thank you.