Hi, everyone. Thank you for being here today. My name is Kate Goldenring. I'm a software engineer at Fermyon and co-chair of the CNCF IoT Edge Working Group. Hi, and I'm Amar Kapadia. I'm co-founder at a startup called Aarna Networks. And we're here to talk about edge native application principles. So the CNCF IoT Edge Working Group wanted to work together to figure out what this phrase "edge native" means. We know what cloud native means, but what does edge native mean? So we're going to walk through the process that we went through as a working group, which culminated in nine edge native application principles that we put out a white paper around. We started off by defining what the edge is, and we came up with our own definition for that. Then we went through what cloud native versus edge native means: what are the differences, the similarities? And this led us into our nine edge native application principles. Then we're going to talk about some case studies, so some applications where you can see these edge native application principles being used. And finally, we'll talk about the working group. We really see this white paper as a draft, so we expect future revisions and future breakout papers. If any of this interests you, if you have feedback, agree, disagree, we'd love to hear it. Come join the working group and tell us how we can make the next revision even better. All right, so if you're at this event, you know what the edge is. You've heard enough definitions. But we are going to give you our definition just for context setting, so you know where we are coming from. We define edge computing as a paradigm which brings compute and data processing closer to the data source or the user, for example, a robot controlled in a factory. And there are four key benefits to edge computing, which, again, I'm sure all of you know, so I'll just quickly repeat them. Reduced latency, for example, if you're doing virtual reality or robot or drone control. Bandwidth management, if you have hundreds of cameras and, instead of pushing all that data to the public cloud, you want to process it at the edge. Increased privacy, depending on your local regulations. And uninterrupted operations, if you have poor network connectivity or connectivity is entirely absent in an air-gapped environment, say on a cruise ship, at a remote mining site, et cetera. So that's how we define the edge. Next we are going to classify the edge, and there are different ways to classify what edge means. We are going to use a geography-based classification: how far is the edge site from the data source or the user? We are actually going to reuse a definition from Linux Foundation Edge, which put out a white paper describing it. They broadly split the edge into two: the service provider edge, which is a shared edge, and the user edge, which is dedicated to a given user or organization. The service provider edge is further broken into two, the regional edge and the access edge. The regional edge looks very similar to the cloud, actually; it's multiple racks of equipment, and you can use a lot of cloud-like methodologies. Then, when you get into the user edge, you have the on-prem data center edge, which again has some similarities; it's traditional industry-standard server, storage, and networking. But then you start getting into the smart device edge and the constrained device edge, and this is where the form factor gets really tiny.
And that's the farthest end of the edge. So with that, now that we know what the edge is, let's take a step back: what is the definition of cloud native? A lot of us may know this, and the CNCF has its own definition. But cloud native is really that movement away from your monolithic, tightly coupled architecture and toward a microservice approach where you're able to run highly scalable applications. It's loosely coupled, and it's resilient because of that. A big part of it is that everything is very observable and you're able to take management actions based on what you observe. And the result is that you can make high-impact changes frequently. So if we transition to what edge native is, this definition was put out by the Open Glossary of Edge Computing. One thing to point out here is that edge native principles build off of what cloud native is. So we're not talking about an on-prem data center here; we're talking about something that ultimately connects up to the cloud and still adopts that loosely coupled architecture. But it also really takes into account the unique characteristics of the edge: resource constraints, different aspects of security that you need to take into account, latency, and autonomy. A lot of your applications are going to be a little more tightly tied to data. So if we take a look at the similarities between cloud native and edge native: like I mentioned, in both of them you want portability of your apps and services. You should be able to deploy the same application from one edge site over into another edge site, just as you can horizontally scale in the cloud. Observability: whether you're deploying your application on the cloud or the edge, you should be able to observe it all in one platform and see where your application is living and see its health; data, telemetry, metrics, et cetera should all still be a part of your edge native applications. Manageability: based on all that data that you're seeing, all of your observations, you should be able to take management actions. Whether you're doing that in the cloud or the edge, those actions may look a little different, but the idea there is certainly the same. And finally, one of the things about cloud native is the idea that you're not building monolithic applications where everything is in the same language. You can have different services using different languages, and that certainly does not change whether you're running your application only in the cloud or it's spanning to the edge. So now we get into the interesting stuff. If you are a cloud native developer and you have gotten comfortable writing cloud native applications, this is the slide that's really important in terms of the differences between cloud native and edge native, so you can start writing edge native applications that are more suitable for the edge environment. The first set of differences can be classified as state management; they deal with how the state of the application is managed. The first one is application models. In cloud native, we have gotten comfortable with stateless microservices that scale horizontally, with a centralized data store. In the edge environment, you can't always assume that. If you are in the service provider edge, it may look similar to the cloud, but as you get to the more and more constrained edge, your applications may start to get monolithic.
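A tiny sketch of that contrast, assuming Python with an embedded SQLite database standing in for state that lives alongside the application (the table name and readings are made up for illustration), might look like this:

```python
# Minimal sketch: an edge service that keeps its state locally instead of
# calling out to a centralized, remote database. Uses only the Python
# standard library; the table and reading values are hypothetical.
import sqlite3
import time

# State lives on the same box as the application (data collocated with compute).
db = sqlite3.connect("edge_state.db")
db.execute("CREATE TABLE IF NOT EXISTS readings (ts REAL, sensor TEXT, value REAL)")

def record_reading(sensor: str, value: float) -> None:
    """Persist a reading locally; no round trip to a remote data store."""
    db.execute("INSERT INTO readings VALUES (?, ?, ?)", (time.time(), sensor, value))
    db.commit()

def recent_average(sensor: str, window_s: float = 60.0) -> float:
    """Answer queries from local state, even if the uplink is down."""
    cutoff = time.time() - window_s
    row = db.execute(
        "SELECT AVG(value) FROM readings WHERE sensor = ? AND ts > ?",
        (sensor, cutoff),
    ).fetchone()
    return row[0] if row[0] is not None else 0.0

record_reading("temperature", 21.7)
print(recent_average("temperature"))
```

A cloud native version of the same service would more likely stay stateless and push those writes to a managed, centralized database so that replicas can scale horizontally.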
And in either case, whether it's the service provider edge or the user edge, you are probably going to have your data collocated with your application. So that's one big difference between the two. The second is in the data model. We just talked about it: in the cloud, you are going to have a centralized data store backing a bunch of stateless microservices. On the edge, you are going to have a lot more diversity of data models. You may have that exact data model I mentioned, but you might have caching if you are in a poor-connectivity or air-gapped environment; you might have streaming if it's a lot of video or content; it might be distributed, where your data is sharded between edge sites or between edge and cloud. So expect a lot more diversity of data models. Elasticity. In the cloud, you assume infinite elasticity; you just start scaling horizontally and use as many resources as you want. On the edge, that's not always the case, because the edge is constrained, so you're going to be elastic only up to a certain limit. And as you get more and more toward the user edge, which is even more constrained, you'll have extremely limited to no elasticity. In fact, you may want to scale vertically, and by that I mean you might want to scale to the public cloud: you might have a piece of your application at the edge and a piece in the public cloud. So elasticity has to be thought through differently. Resilience. In the public cloud, resilience is outsourced to the cloud provider. You assume that if I go across a certain number of nodes, and if I need even more resilience I go across sites, then I've got my resilience. On the edge, again, you can't assume that; on the edge, we are going backwards in time. Now your application may need to start thinking about: do I need dual redundant power supplies? Do I need dual redundant fans? How do I recover if there's a failure? You have to think of these things. And ultimately, on the edge, you might have to accept less resilience just by the nature of the edge. The next set of differences deals with orchestration and management, and these all stem from scale. The scale for the cloud is that you might have a few public cloud sites and a few applications. On the edge, the scale dramatically changes. Now you have tens of thousands of sites; two years ago, Walmart said they had 10,000 sites, and those are not even big numbers. And you might have hundreds of thousands of devices being managed by those sites. So the scale is orders of magnitude higher at the edge. What that means from an orchestration point of view is, again, when you're in the cloud, you are assuming centralized orchestration into a few sites, you're doing it for efficiency, and you're going to scale horizontally; that's what we just discussed. On the edge, you're going to have distributed orchestration. You might be taking the same application and putting it across a large number of sites. You might be taking the same application and splitting it across edge and cloud, or across multiple edges. And your orchestration may become location-specific. In the cloud, we generally don't worry about the location; on the edge, location is very important. I want to be, say, five milliseconds from the user, and so on. From a management point of view, both are centralized, so in that sense they're similar. But the big difference on the edge is that you might need both central management and remote management: if there are certain actions that are time-sensitive and need to be taken very quickly, then you might need sort of a distributed management framework.
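To make that split concrete, here is a minimal Python sketch of a local control loop that takes a time-sensitive action on site and only reports a summary upstream for central management; the sensor, threshold, and reporting helpers are hypothetical stand-ins for whatever drivers and management APIs you actually use:

```python
# Minimal sketch of local autonomy: act immediately on site, report centrally later.
# read_vibration(), stop_machine(), and report_upstream() are hypothetical
# stand-ins; in a real system they would wrap device drivers and a management API.
import random
import time

VIBRATION_LIMIT = 5.0  # hypothetical safety threshold

def read_vibration() -> float:
    return random.uniform(0.0, 6.0)  # stand-in for a real sensor read

def stop_machine() -> None:
    print("local action: machine stopped")  # must not wait on the cloud

def report_upstream(event: dict) -> None:
    print(f"queued for central management: {event}")  # best-effort, can be delayed

def control_loop(iterations: int = 10) -> None:
    for _ in range(iterations):
        value = read_vibration()
        if value > VIBRATION_LIMIT:
            # The time-sensitive decision happens locally, within the site.
            stop_machine()
            # The central view only needs a summary, not every sample.
            report_upstream({"event": "emergency_stop", "vibration": value})
        time.sleep(0.1)

control_loop()
```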
You have no staff on the edge, or even if you do have some staff, they're generally not trained to the same level as in the public cloud or the data center. So you have to have zero-touch provisioning of hardware and software. And if you are upgrading devices, those upgrades need to be robust, sort of brick-resistant upgrades. So those are some differences, and I'll hand it over to Kate for some more. So next we can think about networking. In the cloud, you have a very strong, highly available network that you're guaranteed in these big data centers. On the edge, that's not always guaranteed and oftentimes not expected. And this is when you take into account some of those data models: maybe you do more store and forwarding, storing when the network's not there and forwarding when it is. Another part of networking is the domain of the network. In the cloud, you have a public network. On the edge, your whole site may be isolated on a private network, and there are security benefits to that. The third part of networking is thinking about networking beyond IP, so the other protocols that your edge servers may be using to interact with the local devices that are generating all that data on the edge: protocols such as OPC UA for industrial machinery, ONVIF for IP cameras, MQTT, and more. You need to be familiar with all these different protocols and think about which one fits your needs on the edge the best. On the security side, it's quite related. On the edge, you have all these devices, and even though you may have a private network, your physical site is way more accessible than these locked-down data centers. Someone could come in and put a device on your network that should not be there. So part of this is knowing what should exist on your network and what shouldn't, and keeping an inventory of that, so that you're only using data from devices that you trust. That zero trust of everything on the edge is really important. And continuing this discussion of devices, as I was mentioning, you should expect a lot more external devices on the edge. The cloud is very homogeneous: racks of servers, all the same static system hardware. On the edge, the environment is way more varied; that's when we're talking about the user edge side of that spectrum. And those are the devices that you want to be leveraging; they're the reason we have that compute so close to them. So a huge part of an edge native application is thinking about those devices and using them. As we move into the servers themselves, the hardware is also more varied on the edge. While in the cloud you kind of have standard flavors of architectures and OSes, on the edge you may have more specialized hardware with specialized hardware interfaces that you want to use and access, or that you just need to be aware of. That also comes along with hardware awareness in the sense of what constraints you have on your hardware. You need to have a better sense of how much CPU and memory you have, because you can't easily just add another VM or another server. So, like I was saying at the beginning, as a working group we went through this discussion, and this led us to our nine edge native application principles, and these are going to be very much based on what we just talked about.
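Going back to the store-and-forward idea for a moment, a minimal sketch in Python might look like the following; the `forward_to_cloud()` helper is a hypothetical stand-in for whatever uplink you actually use (HTTP, MQTT, and so on):

```python
# Minimal store-and-forward sketch: buffer readings locally while the uplink is
# down and flush them when connectivity returns. forward_to_cloud() is a
# hypothetical stand-in for the real transport.
import json
import os
from collections import deque

BUFFER_FILE = "pending_readings.jsonl"  # survives process restarts
pending = deque()

def forward_to_cloud(reading: dict) -> bool:
    """Return True on success; replace with a real HTTP/MQTT call."""
    return False  # pretend the network is currently unavailable

def persist(reading: dict) -> None:
    with open(BUFFER_FILE, "a") as f:
        f.write(json.dumps(reading) + "\n")

def submit(reading: dict) -> None:
    """Store locally first, then try to forward everything that is queued."""
    pending.append(reading)
    persist(reading)
    flush()

def flush() -> None:
    # Forward in order; stop at the first failure and keep the rest buffered.
    while pending and forward_to_cloud(pending[0]):
        pending.popleft()
    if not pending and os.path.exists(BUFFER_FILE):
        os.remove(BUFFER_FILE)  # everything made it upstream

submit({"sensor": "camera-3", "people_detected": 2})
print(f"{len(pending)} reading(s) still buffered locally")
```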
And then we took the nine and broke them down into five smaller categories. The first larger category transitions from what we were just talking about: being aware of what resources and devices you have is really important on the edge. With hardware, you need to be aware of what you have on your servers and try to use any tooling you can to make that awareness as simple as possible. For example, if you're using Kubernetes, you can use the device plugin interface to understand what physical hardware is available to you and make that known to Kubernetes, so it can then make the scheduling decision, making that less of a decision you have to make as an operator and putting it on your tooling. For external device connectivity, you're going to want to use tools like device registries, like I was mentioning, and certificate rotation, and think about how you can choose the right set of tooling to make that as seamless a process as possible, because you're going to need to make sure your applications know the access endpoints for each of those devices. And ideally, that same application can be deployed from one factory site to another, one edge site to another, regardless of the number and the different addresses of the devices there. And finally, being aware of variable connectivity is, of course, important. That goes back to: how are you going to design your application for that? Is it going to be purely air-gapped? What gets forwarded up to the cloud? What alerts are important to move forward in that chain? The next larger category is at-scale management. As we've been discussing, edge native uses the ideas of cloud native. And this starts with the infrastructure: how do you make that a zero-touch onboarding experience for your physical infrastructure? You can do something like FIDO Device Onboard (FDO), which makes it so that you can literally just turn on your server, and it'll do a remote call to flash your firmware on, and then your platform and your application, and go from there. And actually, if you're more interested in FDO, in our working group talk last year Steven Wong talked a lot about FDO, if you want to learn how to use that to get your infrastructure up and running. The next level above that is application management. What is the platform that goes on top of it that's orchestrating your workloads to the appropriate sites and to use the appropriate devices? You're going to want to use some sort of orchestrator, just as you would in the cloud. And then finally, all of this should be centrally observable, and you may be using the same tooling that you would be using for your cloud, so Prometheus for monitoring and so on. And then the last three principles stand on their own. The first one is a caveat of sorts: like we mentioned, it's portable and reusable within limits. As you move more and more toward the user edge, toward the more constrained side of things, your application gets more tightly tied to its environment, and its portability and reusability goes down. As you move more toward the service provider edge, it gets more and more portable; the environment is more predictable. And the community in general is building out tooling for the user edge to try to make that as easy, reusable, and portable as possible, but we're still working on that. One that we may not have touched on as much is resource usage optimization. We've talked about how in the cloud you can easily add a server, and on the edge, that's not always possible.
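As a sketch of what that hardware and resource awareness can look like in practice, here is how a workload might declare explicit CPU and memory limits plus a device advertised through the device plugin interface, using the Kubernetes Python client; the image name and the `example.com/video-decoder` resource are hypothetical, and a device plugin on the node would have to advertise that resource for this pod to schedule:

```python
# Sketch: request explicit CPU/memory limits plus a device-plugin resource so the
# scheduler, not the operator, decides which constrained edge node can fit the pod.
# The image name and the example.com/video-decoder resource are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running on-cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="edge-inference"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="inference",
                image="registry.example.com/inference:latest",
                resources=client.V1ResourceRequirements(
                    requests={"cpu": "500m", "memory": "256Mi"},
                    # Device-plugin resources are requested through limits.
                    limits={"cpu": "1", "memory": "512Mi",
                            "example.com/video-decoder": "1"},
                ),
            )
        ],
        restart_policy="Never",
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```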
And so you need to be aware of how you can design your application to use as few resources as possible and then share them with other applications. That could be time-based sharing, for example. And then finally, spanning: deciding what stays where it is and what moves vertically up toward the cloud. Do you have an edge hub, and what is going there, what is further going to the cloud, and what management actions are coming down from the cloud to your edge? So far, what we showed you was perhaps a little abstract, so we're going to give you two examples. I'm going to give you one example and then Kate will give another one. This example is for 5G: how are 5G applications using these edge native principles? Now, first of all, many of you may not know that 5G networks are actually software-driven. Previous networks used to be hardware; you would buy appliances and hook them together. 5G is all software-driven, so other than the radio, everything is virtualized and now moving to containerized workloads orchestrated by Kubernetes. In the diagram, which hopefully you can see, the left-hand side is the enterprise edge at the cell tower, and there you're running something called the RAN, the Radio Access Network. So that's the piece running there. You might also be running some edge computing applications; that's possible. Then next you get to the telco or cloud edge. This could be the telco edge or a colo or a data center, and there you're running something called the 5G core, and you might be running more edge computing applications depending on the characteristics. And then you are in the public cloud, where you might be running voice services, collaboration services; this is just representative. Some people move things around, so, for example, some people might run the core in the public cloud. So anyway, with that architecture, let's see how it maps to the principle groups. Resource and device aware: the RAN software is interfacing with the physical radio, so that's where it actually has to talk to a device, and it's also aware of acceleration. The RAN software typically needs hardware acceleration; there's digital signal processing going on, so you need things like a GPU, a DPU, or an FPGA. At-scale management: tens of thousands of RAN sites are not unusual, and they all have to be managed centrally, because, as you know, there is no staff member at a tower. Spanning: we just saw the RAN, core, and voice services span all the way from the user edge, or the tower edge, to the public cloud. Resource usage optimization: this is very important. In a public 5G network, the RAN can make up 65% of the cost. If we can make that efficient, that directly affects the deployment cost of a 5G network, so for that reason, resource usage optimization is critical. Portable and reusable within limits: the company that is creating the RAN software or the core software or the edge computing applications has no idea whether it will be run on a public cloud, a hyperscaler-type edge, or a private cloud, and if so, whose Kubernetes it is or who the CaaS vendor is; they have no idea. So all of these have to be portable and reusable so that the maximum number of the company's customers can use them without any changes. Over to you for the second one. So I'll do the same: set the scene and then talk about how it maps to our principles, but more for the user edge side of things.
So say you have a factory floor, an assembly line with a lot of robots, maybe some cameras that are doing some visual inferencing on the robots so that you can do some predictive maintenance, and then maybe you have some sensors that are doing some health monitoring in the area. You could collect data from all these devices in the cloud, but it would be leaving your private network, you may have security concerns there, and there are latency concerns as well. So maybe you follow the laws of data gravity and move that compute closer to where the data exists, and you set up some servers right in your factory setting. And you can do some clustering; maybe you use KubeEdge, which has an edge hub, so you can cache data there and forward it to the cloud hub. Whatever orchestrator you're using, you're able to process the data from these devices and then only send up through a gateway what you need to the cloud. Maybe that final health alert is what you see in the cloud, and you can send a technician down, but only those final actions are going up there. So if we once again go through our principles: being resource and device aware is clearly very important here. You have a variety of devices, so having that inventory of all of them and being sure to handle their credentials matters. Also, in the scene that we just painted, we have multiple protocols going on, so you need to make sure that your applications are configured to know which devices they need to be talking to and how to communicate over each protocol. At-scale management: you need to remotely manage and configure these services, so deploying them to each of these sites. This is just one picture of one factory site; you may have multiple. So you need to be able to remotely deploy to multiple of those sites and, once again, configure your applications to use them. That term management, when we get to the user edge, goes beyond just your application to also managing these devices themselves, so making sure they're getting updated appropriately. One way you could do that, for example, is with digital twins. That's a common use case for this user edge, where in the cloud you're representing, in some textual way, the ideal state for your device, and then, when you make a change to it, you have something on the edge that's watching that and will reflect the change there. So, for example, say in your assembly line you want to speed up how fast the robots are moving. You may change that parameter in your digital twin in the cloud and see it be reflected by your edge applications, which are managing those devices. And spanning, once again, is very clear visually on this one. You're going all the way from maybe your private network, where you're processing data fast and making these real-time decisions, all the way to the cloud, where you're making more general decisions based on what was processed. Resource usage optimization, once again, expands beyond the server in this setting. So doing things like predictive maintenance to make sure you're using your resources outside of your servers as optimally as possible, reducing the amount of downtime of those, and also being sure to collect the data so that you're able to make those decisions. And then finally, portable and reusable within limits. This is where we're getting to the part where the portability is harder, but you should still be able to port the same applications from one setting to the other, and there are certain protocols that help with this.
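One of those protocols is MQTT. A minimal sketch of the pattern, assuming the paho-mqtt client library (1.x API) and a broker reachable on the local network (the broker hostname and topic layout here are hypothetical), might look like this:

```python
# Minimal sketch: one broker connection instead of per-device integrations.
# Assumes the paho-mqtt client library (1.x API) and a broker on the local
# network; the broker hostname and topic layout are hypothetical.
import json
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # A single wildcard subscription covers every device publishing telemetry.
    client.subscribe("factory/+/telemetry")

def on_message(client, userdata, msg):
    device_id = msg.topic.split("/")[1]
    payload = json.loads(msg.payload)
    print(f"{device_id}: {payload}")

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("mqtt-broker.local", 1883, keepalive=60)
client.loop_forever()
```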
With MQTT, instead of having to directly figure out how to connect to every device, you only need to connect to one broker and thereby get all the data from all the devices. So there are ways to generalize how you interact with all these devices on the edge. All right, so to summarize: if you're a developer and you're planning to develop edge native applications, we told you the similarities between edge native and cloud native, we showed you the differences, we gave you a set of nine principles which can be grouped into five, and we gave you two concrete examples. You're well equipped now. Now, as you can imagine, there are a lot of CNCF projects that help with edge native, and there are non-CNCF projects as well: Linux Foundation Networking, non-Linux Foundation, and so on. So we have started to create a spreadsheet. The spreadsheet has some initial projects that we have put in, but we need your help. So if you have the time and interest, please go through that spreadsheet, add projects that you think are relevant, and mark which principles each project applies to. Yeah, and this was in part inspired by the CNCF landscape. It would be great to have some sort of resource where you could decide which principle is your pain point and look at a document and see what projects or tools could help you with that. So, yeah, help us build up this resource. And beyond that, the working group in general: if you want to read the full text of this, there's a link to our white paper there. We fall under the CNCF Runtime TAG, so anything about us can be found on that GitHub page, including the white paper draft. We also have a Slack channel, wg-iot-edge, so if you want to chat at any time, a lot of people will just pose their questions there, and if you have any questions about edge, people can respond to them. If you're more of an in-person, or I guess virtually in-person, person, we have meetings every two weeks at 9 a.m., and that's on the CNCF community calendar. And more QR codes: if you want to leave feedback for this session, we would greatly appreciate that, and if you want to leave feedback for the working group, we'd also appreciate that. Right now, we've been focusing on the white paper; in the past, we've done presentations from people in the space where you can learn from them. So we're really looking for which of these edge areas are interesting to you, whether you would be interested in presenting, and so on. Yeah, happy to receive any feedback on either of those areas. And with that, I think we have about 10 minutes for any questions, or maybe five, if people have any. And then Steve, one of our other co-chairs, will be passing around the mic. Hi, I've got a security question. So, going to some stuff I wrote down here, I think I'm going to make some assumptions here. So initially you opened up saying that there are actually some benefits to the edge in terms of security. And I'm assuming that's because you can get away with killing data at the edge, right? Like you don't store anything there and just keep moving it on. But in order to have uninterrupted operations, how do you make sure that you don't store any customer data there? And I'll just add one other piece there. A lot of the examples I hear about are kind of, you know, even though you have remote devices that people aren't getting to, if it's in a factory, it's still sort of under the protection of that. Versus something that is actually sitting, you know, maybe in a CDN or a radio tower, right?
You know, you could have somebody go up there and steal a whole bunch of cell phone data, right? So those are my questions: how do you balance... what are some things that you've heard about for getting that uninterrupted operations while also making sure your customer data is not at risk? Great, thanks. I can start with the data part. So I wouldn't say that on the edge you generate data and kill it; that's not why it's secure. It's not the fact that you're getting rid of that data as quickly as possible. Rather, the thing that I was emphasizing there is that it's a private network, so the data is staying within a private network. Whereas if you're using the cloud, if you're sending data across a public network up to the public cloud, there's a security risk in that. And so it's actually really important; the data processing is the richest and most important thing about the edge. A lot of times when we're talking about that, we're also talking about data pipelines, so the services on the edge are adding data from their devices to those pipelines, and that's what creates this really rich experience of making informed choices on the edge. The other question was about... the 5G, right? 5G. Okay, and I forgot to rephrase the first question. So the second, follow-up question was: how do you make sure someone doesn't log in and take your server and use your data? Was that the question? So multiple things: listen in, and then physically access the data. I mean, these are the key security questions: how do I make sure someone can't take my machine and use it? I don't know the answer to how do I keep someone from accessing my physical site and taking my devices, but I would certainly hope that you have some sort of hardening of your devices there, so that even if someone does have access to one, they need some sort of credentials to go in and get access to the data; but I don't know beyond that. And then listening on the private network: yeah, you need to tighten your private networks, and, once again, if someone is spoofing devices, have a way of detecting that. Amar, would you add anything to that? I think it's really case by case, and you are right: if you're in a hospital and you have a hospital edge, there is going to be personal data. In general, you have data-at-rest and data-in-flight encryption techniques. So if you use those best practices, whatever you store, you store encrypted, and then you have secure key management. So I think those methods exist; you just have to make sure they are applied correctly, I would say. Hi. A question around the design principles that you talked about in the edge space, kind of the differences between cloud native and edge native. Have you seen, I mean, there's a lot of desired state there. Have you seen a pattern where this could be useful to be applied back into cloud native? And how does that look in your mind? I think that's a really good question. So the question was: we've talked a lot about how cloud native influences edge native, but is there a way that edge native should influence cloud native, things that we can bring from edge native to cloud native? I'm more on the small device side of things, so my first instinct was the device plugin interface, which I've worked a lot with. The reason it needed to be created was because we had this more varied, static physical hardware, so we needed a way to advertise more resources at the node level.
And so I think there are ways that we're bringing this idea of more varied hardware back into the cloud, so that clouds now can have more specialized hardware, such as servers with GPUs, et cetera. And that could have been inspired by more specialized hardware on the edge. And I think it's a really good idea, but I think it's very early. I'm sure it will spill over: a lot of the management and orchestration techniques for the edge will be a lot more sophisticated, so that sophistication could go back to cloud native for things like multi-cloud. So I agree with that, yeah. Okay, this is better. So if I take similar use cases and apply them to the energy and utility sector, there's a lot of cloud edge there. However, there are NERC CIP compliance rules that don't allow OT network data to ever come to the IT network or to the public cloud. So they have a clear separation. I know of companies who deliver services in a public cloud, but the way they do it is they actually take a physical hard drive from the OT network and plug it into a server that's part of the IT network, which has a connection to the cloud, to do stuff. So is there any work being done, or how do you bridge that gap, when it comes to electricity and the ability to use the public cloud with an OT network? So the question, if I get it right, is that there are unique needs for OT specifically, where we don't want that data leaving its private network; so are there strategies that exist today for letting only the data we want leave and go to the public cloud? Was that the general question? And the follow-up question there is: do we have any standards, are we creating standards, to enable there to be some interface that's allowed for sharing that? I do not know the answer to that. I think a lot of that is the point of edge computing. I think edge computing can help with that by terminating the OT traffic on premises and sending only, for example in the use cases I've seen, metadata or things that are acceptable. But I think that again becomes case by case, and I don't know if you can come up with a generalized sort of mechanism. I think the burden goes on the edge computing application, and then it would have to get certified or meet compliance. But this might be a good transition, because I think we're at time. So if you want a group to start thinking about that, feel free to join the working group, and if you have thoughts on that, you could present kind of a way for us to wrap our heads around it and see if we can build resources around that together. Alright, thank you. Thanks everyone.