Good morning everyone. Welcome to our panel. Today we'll delve into a topic that has sparked great interest in the industry: generative AI at the edge. The pace of development in this field is very rapid, and it is filled with challenges and opportunities that we are eager to explore. So we are thrilled to share our insights and, more importantly, to hopefully hear from you. In this era of rapid technological advancement, the convergence of cloud-native technologies, edge computing, and AI presents unprecedented possibilities. Through today's discussion, we aim to deepen our understanding of how these technologies function in real-world applications and how they will shape our future.

First of all, let me introduce the panelists we have today. My name is Kevin Wang. I'm a long-standing contributor to the CNCF ecosystem. My contribution to Kubernetes started in 2015, and later on I initiated projects like KubeEdge, Volcano, and Karmada, all of which are currently incubation-level projects in the CNCF. I'm also now on the Technical Oversight Committee, helping the broader community incubate new technologies.

Our next speaker is Tina. Tina is the director of infrastructure ecosystem at ARM. She is a recognized leader in open-source software, cloud infrastructure, and edge computing. She chairs the Kubernetes Edge Day events under the CNCF and also serves as the board chair of LF Edge. Tina, would you like to say hi to everyone?

The next speaker is Ying Ding. Ying is an engineering manager at Google, a lead of the Kubernetes hardening team, and brings over 15 years of expertise in large-scale distributed computing. Ying is also a co-founder of the CNCF KubeEdge open-source project and the TSC chair of LF Edge Akraino, and has made significant contributions to the advancement of these platforms. Ying, would you like to say hi to everyone?

Okay, the next speaker is Hongbin Zhang. Hongbin is the chief operating officer of DaoCloud. He is a veteran in the open-source area. He founded the IBM China Linux team in 2011 and organized the team to make significant contributions to the Linux kernel, OpenStack, and Hadoop projects. Now he is focusing on the cloud-native domain and leading DaoCloud's edge computing technology, product, and business team to continue open-source contributions.

Hello, everybody. This is Hongbin from DaoCloud. In this panel session tonight we'll have some discussion about generative AI and edge computing. Thank you.

All right, so let's move on to our discussion. The first topic we're going to talk about is the trade-off between cloud and edge for generative AI: what is the best deployment path? As we know, people are currently exploring several different approaches. For example, some people choose to run a small model on the edge that collaborates with the model in the cloud to help improve the prompt, so we can get a better result. But there are also a lot of other patterns. So what do you see as the best choice, and what other patterns have you seen? Ying, would you like to go first?

Yeah. From my perspective, I don't think it's very different from the current cloud-edge deployment pattern. It should be the same: we do training in the cloud, because the cloud has a lot of compute resources, and we do inference at the edge, because the edge is close to the user. However, there is a challenge with these large language models: even for inference, the model is too big to deploy on a small edge device. So in our industry, for example, Microsoft and Google are working on small language models. Microsoft Research released Phi-2, and Google released Gemini Nano. So there could be an industry trend toward much smaller models for edge deployment, and these will collaborate between the cloud and the edge. The deployment will be similar to the current situation. Back to you, Kevin.
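To put Ying's sizing point in concrete terms, a back-of-the-envelope sketch like the one below shows why full-precision LLM inference rarely fits on a small edge device and why quantized or smaller models change the picture. The parameter counts, precisions, and the 8 GiB device figure are illustrative assumptions, not numbers from the panel.

```python
# Rough sketch: does a model's weight footprint fit in a small edge device's memory?
# All figures below are illustrative assumptions, not measurements.

def weight_memory_gib(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, ignoring KV cache and activations."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

models = {"7B LLM": 7.0, "2.7B small language model": 2.7}
precisions = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}
edge_ram_gib = 8  # assumption: a typical small edge box or high-end phone

for name, size_b in models.items():
    for prec, bytes_pp in precisions.items():
        need = weight_memory_gib(size_b, bytes_pp)
        # Leave roughly 30% headroom for the OS, runtime, and KV cache.
        verdict = "fits" if need < edge_ram_gib * 0.7 else "too big"
        print(f"{name} @ {prec}: ~{need:.1f} GiB of weights -> {verdict} on an {edge_ram_gib} GiB device")
```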
Okay. Hongbin, would you like to share your insights?

Okay. I think generative AI is very popular, and it's a key topic at this conference. At the same time, edge computing provides all kinds of scenarios and use cases, so there is a lot of practice around how to combine the large language model in the cloud with edge computing. Because part of my job is customer engagement, I can share some insights from an industry or customer perspective. From our engagements, we have found a few options for combining AI and edge computing. I agree with Ying's comments just now: one option is to tailor or customize a large language model to run on the edge side, for example by reducing its memory footprint so it fits within limited resources. But another way we see in industry is to combine the large language model in the cloud with a small model at the edge, and that small model may be transformer-based or non-transformer-based. Some industries or enterprise customers want to leverage the general capabilities of a large language model, but they also need accurate results. For example, some financial institutions have a high bar for their output. So they will use the large language model to handle the general use case, but at the same time they will use edge computing to output accurate results. We have engaged with one customer doing exactly this. It is a financial customer that leverages the large language model together with edge computing to improve their client experience. What they need is accurate output rather than general output. So for the deployment, they run the large language model in the cloud, but they also deploy a non-transformer model on the edge side, and they combine the outputs to generate a better result for their end users. So in my opinion there are two ways: one is to thin down and customize an LLM to run on the edge side; the other is to combine the LLM in the cloud with a smaller LLM or a non-transformer model at the edge. That's my opinion and my insight.

Okay. Tina, would you like to share your insight? Hi, Tina. Is it me or Hongbin? It's your turn. My turn? I cannot hear clearly. Yes, please share your insight on our first topic. Okay, my turn. Yes.

In the evolving landscape of generative AI, the synergy between cloud and edge computing plays a pivotal role. While larger models may thrive in the cloud due to their computational and data demands, edge computing opens the door to real-time, localized processing with reduced latency. A strategy we are exploring involves leveraging vector databases and domain knowledge to train smaller, more efficient models. These models can generate high-quality prompts for larger models, which can be executed in the cloud. This hybrid approach not only optimizes performance but also ensures scalability and accessibility across various edge environments. Back to you.
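The split the panelists describe, with a small model and local domain knowledge at the edge preparing an enriched prompt and a large model in the cloud completing it, could be sketched roughly as below. The endpoint, the retrieval step, and all names here are placeholders for illustration, not anything the panel specified.

```python
import requests  # assumption: the cloud LLM is reachable over a simple HTTP API

CLOUD_LLM_URL = "https://cloud.example.com/v1/generate"  # placeholder endpoint

def retrieve_domain_context(query: str, top_k: int = 3) -> list[str]:
    """Placeholder for a local vector-database lookup over domain documents kept at the edge."""
    # In a real deployment this would embed `query` with a small on-device model
    # and return the nearest documents; here we just return canned snippets.
    return ["sensor line 3 tolerances: ...", "maintenance log excerpt: ..."][:top_k]

def build_prompt_on_edge(user_request: str) -> str:
    """Small edge-side step: enrich the raw request with local, possibly private, context."""
    context = "\n".join(retrieve_domain_context(user_request))
    return f"Context from the edge site:\n{context}\n\nUser request:\n{user_request}"

def complete_in_cloud(prompt: str) -> str:
    """Send the enriched prompt to the large model running in the cloud (response shape assumed)."""
    resp = requests.post(CLOUD_LLM_URL, json={"prompt": prompt, "max_tokens": 256}, timeout=30)
    resp.raise_for_status()
    return resp.json()["text"]

if __name__ == "__main__":
    answer = complete_in_cloud(build_prompt_on_edge("Why did line 3's reject rate spike this morning?"))
    print(answer)
```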
Okay, thank you. So let's move on to the next topic: the challenges of running generative AI on the edge. We all know that generative AI models, especially LLMs, require high performance, large memory, and massive data, and these are not always available on the edge. But there are certainly more issues to be resolved. So the question is: beyond resources, what challenges do you see that we need to take care of? Ying, would you like to go first?

Sure. From my perspective, the challenges are security and privacy. Training a large language model is really expensive, so the model is precious IP, and we need to protect it from being stolen or compromised. If we deploy it on the edge, it is more prone to being stolen, especially when it is deployed on an unmonitored edge site. So that brings new challenges, even more challenges, to protecting the model and the private data. And currently, these large language model training and inference frameworks don't provide built-in security; they usually rely on the cloud provider or edge provider to provide security protection. So protecting the model is a very challenging job for us. That's my point of view. Kevin?

Okay. As we can see, if we deploy part of the AI workload on the edge, collaborating with the part in the cloud, I think the development and debugging experience is a new challenge. The benefit of this pattern is that we can keep a lot of the initial raw data at the edge, while using frameworks like federated learning or incremental learning to improve the model as it runs, and we can also do collaborative inference. But the challenge is that when the result is not accurate, we still need to debug, and we may need to refer to the raw data, the original input. So it becomes a problem: engineers may still need to go to the edge site and check the original environment. I think that's a big challenge we need to resolve. Hongbin, would you like to share your insight?

Okay. I think we face a few challenges. Besides the technical challenges we just talked about, from an industry point of view I want to raise two. Number one is energy. At KubeCon we have a lot of discussions and talks about how to improve energy saving and energy consumption. On the AI side, we know that even in the data center, AI training, and especially large language model training, needs a lot of power. If we move to the edge side, whatever the size of your model, the edge side still needs a lot of energy. And consider the constraints on the edge side, especially for mobile devices: these devices are powered by batteries. If we run a large language model, even a tailored one, on a battery-powered edge device, you can imagine the battery life will be very short. So I think energy consumption is one challenge.
If we really want to run large language models or generative AI on edge devices, solving this problem will not come from software or model development alone; it is also related to the hardware, even chip design and architecture design. So I think the number one challenge is energy consumption.

Number two, from an industry point of view, is interoperability. On the data center and cloud side, most of the devices, architectures, and frameworks are similar, so data exchange is easier. But on the edge side, we have many different architectures and operating systems, with different hardware configurations, so a large language model running on edge devices can look quite different from place to place. How they interact with each other, and how cloud and edge collaborate, is a problem. As far as I know, we currently have no open standard to facilitate that. So the number two challenge, in my view, is how we can interoperate gracefully between cloud and edge, and even between edge and edge. Okay, thank you.

Tina, would you like to share your opinion on this? Is it my turn? Yes. Embracing generative AI at the edge introduces a new set of challenges, particularly in terms of model performance and resource optimization. One area of focus is the development and debugging of smaller models tailored for edge deployment. Ensuring these models can operate effectively in diverse and sometimes resource-constrained environments, without compromising the quality of output or the user experience, is crucial. Moreover, the necessity of maintaining data privacy and security at the edge further complicates deployment. Developing standards for interoperability and for data and model synchronization across the cloud-edge continuum is essential for overcoming these obstacles and enabling seamless, efficient generative AI applications. Back to Kevin.

Thank you. Okay, we can move on to the next topic. The question is: what interesting use cases, or potential use cases, have you seen or are you exploring for enabling generative AI at the edge? Hongbin, would you like to go first?

Okay. I think edge computing has a unique advantage compared to other forms of computing: it naturally connects to various input and output devices. For example, we have a lot of AR and VR devices for immersive experiences, which can be excellent input and output channels if we connect them to the large language model in the cloud, and we have wearable devices as well. So the unique advantage of edge computing is that it can leverage this comprehensive, rich, and usable set of input and output devices, and at the same time we can combine these channels with the capabilities of the large language model and generative AI in the cloud. If we combine this chain, we can imagine a lot of useful scenarios that provide a better experience. For example, we have some manufacturing customers where a lot of devices are connected through IoT or edge computing, and all kinds of data are captured by these edge devices. If that data, and the experience gained through edge computing, can be connected with generative AI in the cloud, it will generate a lot of usages and address a lot of very useful scenarios. So basically, if we can leverage this tremendous range of input and output devices together with the very powerful generative AI capability, I can see a lot of use cases. Okay.
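One way to picture the IoT-to-GenAI chain Hongbin describes is an edge-side filter that condenses a raw sensor stream into a compact summary before handing it to a cloud GenAI service. The sketch below only shows the edge-side summarization and prompt construction; the machine name, threshold, and readings are made-up placeholders.

```python
import statistics

def summarize_readings(readings: list[float], threshold: float) -> dict:
    """Edge-side filter: keep a compact summary plus the anomalous points, not the raw stream."""
    anomalies = [r for r in readings if r > threshold]
    return {
        "count": len(readings),
        "mean": round(statistics.mean(readings), 2),
        "max": max(readings),
        "anomalies": anomalies[:10],  # cap what we ship upstream
    }

def to_genai_prompt(machine_id: str, summary: dict) -> str:
    """Turn the summary into a prompt for a cloud GenAI service (the service call itself is out of scope)."""
    return (
        f"Machine {machine_id} reported {summary['count']} vibration readings, "
        f"mean {summary['mean']}, max {summary['max']}, anomalies {summary['anomalies']}. "
        "Suggest likely causes and next inspection steps."
    )

if __name__ == "__main__":
    raw = [0.8, 0.9, 0.7, 3.2, 0.8, 2.9, 0.9]  # made-up vibration values
    print(to_genai_prompt("press-07", summarize_readings(raw, threshold=2.0)))
```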
Yeah, in my opinion, and this also seconds Hongbin's opinion, there are a lot of interesting scenarios in edge computing. One use case my colleagues and I are exploring is robotics at the edge, collaborating across different instances to help improve warehouse efficiency for moving and storing goods. In a warehouse there are a lot of complicated tasks for different robots to finish, and they need to plan their routes to avoid conflicting with each other and to save energy. There is also an administrator for the warehouse who organizes the whole thing. So there is a chance to rely on the LLM to handle the interaction with people and to dispatch clearer task descriptions to the robots, and also to use AI-powered algorithms to improve the robots' route planning, for example.

Tina, would you like to share your insights on this? Me? Yes. Generative AI can revolutionize edge computing applications by enabling more intelligent, context-aware interactions in real time. For instance, integrating LLM-based machine translation with systematic self-correction directly on devices could vastly reduce communication barriers in IoT applications. Another exciting direction is prompt engineering at the CDN edge, which can significantly enhance content delivery and personalization for users. These applications not only demonstrate the potential of generative AI to transform edge computing, but also highlight the need for innovative approaches to model deployment and management. We actually have a demo of using the Whisper model on the edge to do live translation; if you're interested, I can share it offline.

Okay. So before we conclude this panel discussion, Ying, would you like to share your insights? Yeah, I have a similar observation. When we deploy generative AI on the edge side, it is closer to the user, so we can provide more real-time, LLM-based machine translation and similar services, especially in places that are not close to any data center, because round trips to the cloud still take time. When we deploy to the edge, we can reduce the latency. That's my observation. Back to you.
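The live-translation demo Tina mentions is a good concrete reference for this low-latency, on-device pattern. A minimal sketch with the open-source whisper package might look like the following; the checkpoint name and audio file are placeholders, and the smaller checkpoints are usually what edge hardware can afford.

```python
# pip install openai-whisper  (also requires ffmpeg on the device)
import whisper

# A small checkpoint such as "base" or "small" is the usual choice for edge hardware;
# larger checkpoints give better quality but need far more memory and compute.
model = whisper.load_model("base")

# Transcribe in the original language...
result = model.transcribe("meeting_clip.wav")
print("Transcript:", result["text"])

# ...or translate speech from other languages directly into English.
translated = model.transcribe("meeting_clip.wav", task="translate")
print("English translation:", translated["text"])
```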
All right. So let's just recap what we discussed, and hopefully we can have some more time for Q&A. Today we discussed the collaboration between cloud and edge. First, we recognized that deployment is not a solitary task but a symphony of collaboration between cloud and edge; each has a role to play, complementing the other to create a cohesive environment for generative AI applications. Second, embracing the challenges and opportunities: this rapidly evolving field presents both. Security and privacy are very important when running generative AI at the edge, and so are interoperability, to support heterogeneous hardware architectures, and the data collaboration patterns. Third, we discussed innovative perspectives on vertical integration: some new viewpoints have emerged around the contextual and vertical integration of generative AI. That's what I've seen. Tina, do you have anything to add?

Yeah. As we conclude, it's clear that the intersection of generative AI and edge computing presents both significant opportunities and challenges. Our discussion today underscores the importance of a balanced deployment strategy that leverages the strengths of both the cloud and the edge. The rapid pace of change in this field demands continuous adaptation and exploration. By focusing on the development of new models and deployment strategies, as well as addressing interoperability and privacy concerns, we can unlock the full potential of generative AI at the edge. I'm excited about the future collaborations and innovations that will emerge from this dynamic landscape. Back to you.

Okay. Yes, I think it will be a solid trend to connect generative AI with edge computing, because edge computing can provide all kinds of scenarios: it connects devices, connects people, connects data, and provides low-latency scenarios. How we can leverage the powerful capability of generative AI in the cloud together with the low latency and varied scenarios of edge computing will be very exciting and interesting.

Okay. Ying, would you like to add something? Yeah, I agree with everyone. There is a lot of potential in the future, and we are facing pretty big challenges in bringing all of this over to the new world. That's fine.

Okay. All right. So it's a pleasure to engage in this dialogue with you. Any questions? I think we are open for discussion, because this is a panel discussion, right? You can also share your ideas, not just questions.

Hi, thank you for the discussion. I actually work for one of the big manufacturing firms, and we have exactly the problems you stated here, like taking advantage of LLMs and identifying manufacturing issues with edge computing, because we have limited computing on the manufacturing lines, for example. So can you elaborate on the challenges you have faced in your projects, so that we can take advantage of that? Please describe a scenario where you have applied LLMs on the edge and the challenges you faced. Thank you.

So the question is about adoption and usage for manufacturers, right? Yeah. One of the use cases we have seen in the KubeEdge community is people using KubeEdge to manage the whole production line. There is a lot of data to collect, and one of the manufacturers uses AI to detect the quality of the products they produce, which helps save human resources. Also, as you know, on a production line there is a lot of massive data, so you basically need some software filter to filter the raw data, collect the most important parts, and generate more useful information. That's what I have seen.

Yes, I can add some comments on usage in manufacturing. Like Kevin mentioned, I think the majority of manufacturers are using AI technology to do defect detection. Previously, the typical model for manufacturing was to deploy a small model on the front line but leverage the computing power at the back end. Now it has evolved to a new model: the edge side still detects the defect, but the large language model or generative AI at the back end can generate more information to guide you on what the problem is, giving you all kinds of useful information to reference. I think this is an upgrade from the traditional model to a generative AI model, and that is the first scenario.
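A rough sketch of the upgraded pattern Hongbin describes, where the edge still flags the defect and a back-end GenAI service turns that event into actionable guidance, might look like the following. The event fields and prompt wording are illustrative assumptions; the cloud-side GenAI call itself is out of scope here.

```python
from dataclasses import dataclass

@dataclass
class DefectEvent:
    """What an edge-side vision model might emit when it flags a part (fields are illustrative)."""
    line: str
    part_id: str
    defect_type: str
    confidence: float

def build_diagnosis_prompt(event: DefectEvent) -> str:
    """The edge sends a compact event; the back-end GenAI expands it into guidance for the operator."""
    return (
        f"A {event.defect_type} defect (confidence {event.confidence:.2f}) was detected "
        f"on line {event.line}, part {event.part_id}. "
        "List the most likely root causes, the production steps to inspect, "
        "and whether upstream raw-material batches should be checked."
    )

# Example hand-off: print the prompt that would be sent to the cloud GenAI service.
event = DefectEvent(line="assembly-2", part_id="P-10482", defect_type="surface scratch", confidence=0.93)
print(build_diagnosis_prompt(event))
```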
The second scenario we have met is source control and source tracing. For example, if you find goods with a quality issue, you need to trace where the problem came from, maybe from the raw material or from the production phase. Generative AI can give you some guidance on where the issue is likely to be. So those are the two scenarios we have met in manufacturing. Okay, thank you. More questions? Yes, please.

Hi, my name is Selena. I work at Ngrok. We're a reverse proxy, and we sell our product as an API gateway. Many of our customers are, maybe not manufacturing devices, but IoT devices that want to configure ingress into their devices. I was wondering, how do I phrase this question: for the configuration part, for configuring IP policies and filtering on requests going into those devices, do you use LLM models to help with that? Or are you just running into the same challenges that other people do when configuring ingress to their devices, in terms of scaling ports or configuring IP policies? Does that make sense?

Just to understand your question: you mean you are exploring the use of LLMs to help simplify the configuration, is that the question? Yes. I'd say it is actually not special to edge computing; it is a very general use case. If you have complicated configuration, an LLM, generative AI, is definitely very helpful, because first of all you don't need everyone to really understand what exactly each field in the config file means, and you can rely on the generative AI to generate a lot of configuration values, not just the default values, but values that fit the context people are really looking for. That's what I have seen people doing.

Okay. Yes, right now we are engaging with one customer on how to accelerate deployment, especially massive deployment, because even with edge computing there are a lot of devices you need to configure during deployment. So yes, we are working with a potential customer on a solution for how to leverage generative AI to generate not just the guidance but also the configuration files, to help them easily deploy massive amounts of equipment through configuration. But this needs some work, because every site and every device may have its own customized configuration.

Okay, thank you. Since we are running out of time, we are wrapping up for today. If you'd like to discuss more, please feel free to connect with us on LinkedIn. Thank you for listening. Thank you.