Hi, this is Neil Oliver, and this is Sinura Gaddam. We are architects in the Network Platforms Group at Intel. Our topic is edge computing platform architecture. Intel has worked in this domain for several years, contributing to standards, open source code, and products, and we are seeing strong growth in the market and among our ecosystem partners. The topic we are presenting here is the architectural style of cloud native computing as applied to edge computing. While we started originally by implementing an edge computing platform, we are devoting significant attention to the development of microservices, which are one of the fundamental features of cloud native computing. We will describe our cloud native architecture, draw attention to the areas where the cloud native style is impacted by deployment on the edge, and then talk about a few interesting microservices which emerged from our investigation into these use cases.

Cloud native computing is an approach to building systems that run in the cloud. In the cloud, the number of users of a system is not only large, but may increase or decrease over orders of magnitude, and so the system itself must be able to scale over orders of magnitude. Because the user base is large, the system must be always on; the cloud has no downtime. But since no computing components, hardware or software, are infinitely reliable, a system in the cloud must be reliable even when built from unreliable components. The cloud native approach has had considerable success in creating scalable, resilient systems, so cloud applications are everywhere, providing services to consumers and businesses.

Edge computing, at a high level, can be thought of as cloud computing where the physical location of the cloud matters. An edge computing network can be thought of as a cloud with special constraints, and so cloud native techniques ought to be applicable to edge networks. We will call this an edge native system, and we'll talk about it for a few minutes. We will then move from theory to practice and talk about the Open Network Edge Services Software (OpenNESS) project and its approach to edge native computing, covering both the platform architecture that enables it and the microservices which it employs.

Cloud native computing does not have a precise, rigorous definition. It is partly a class of computer network architecture, partly a style of software engineering, and partly an organizational style and mindset for IT. Over the last few years of widespread use, however, a consensus has formed on its basic tenets. Stateless processing means that components of a system should not own their state, such as user information databases, but should retrieve it from external data sources where necessary. This is needed for resilience and scalability. New components can be started in a uniform manner, whether to increase capacity or to recover from a crash. Statelessness allows applications to scale out by increasing the number of compute components. Microservices follow from the tenet of statelessness. "Microservice" is a much-overloaded term, particularly the "micro" part, but it is basically a function that does a particular set of tightly related actions efficiently and is networked to other microservices. Applications and complex systems are expected to be built by composition, connecting microservices together. The connection method is usually a REST (representational state transfer) API, which is also a way that a microservice can remain stateless.
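To make the statelessness and REST tenets concrete, here is a minimal sketch of a stateless microservice, written in Go for illustration (it is not part of any particular platform). The handler owns no user state; it fetches what it needs from an external source on every request, so any replica can answer any request, and new replicas can be started to scale out or to recover from a crash. The endpoint path and the profile-lookup helper are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// fetchProfile stands in for a call to an external data source
// (a database, cache, or another microservice). The return shape
// is an illustrative assumption.
func fetchProfile(userID string) (map[string]string, error) {
	// In a real system this would be a database or REST call,
	// so the process itself holds no user state between requests.
	return map[string]string{"id": userID, "tier": "standard"}, nil
}

// profileHandler serves a stateless REST endpoint: every request
// carries everything needed to answer it.
func profileHandler(w http.ResponseWriter, r *http.Request) {
	userID := r.URL.Query().Get("user")
	profile, err := fetchProfile(userID)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(profile)
}

func main() {
	http.HandleFunc("/v1/profile", profileHandler)
	fmt.Println("listening on :8080")
	http.ListenAndServe(":8080", nil)
}
```

Because nothing in the process depends on which instance served the previous request, an orchestrator is free to start or stop replicas of this service at will.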
Strictly speaking, open source software is not a fundamental tenet of cloud native computing, but historically cloud native computing came from organizations with a habit of contributing software to open source. These organizations, however, often take a hybrid approach themselves, releasing community editions of software to open source but retaining a highly maintained, high-performance version for their mission-critical and proprietary requirements. Finally, the deployment of cloud native software is carried out with standard building blocks and run by standard processes. Virtual machines and containers are widely used, and orchestration frameworks and tools built on them, such as Helm charts and operators, formalize deployments and make them easier to debug and maintain.

There are various definitions of edge. Some definitions place the edge in an embedded system, such as a cabinet or enclosure in an outdoor environment. Others consider it to be a server cluster located at a telco central office or even a regional data center. The variations are determined by use cases, but what they all have in common is that they are responding to the laws of physics and sometimes the laws of nations, for example data sovereignty.

Edge native computing makes some important changes to the cloud native tenets. The most obvious differences come from the physical constraints of an edge cluster. It is safe to say edge clusters are more constrained in size and power, and therefore in compute and network capacity, than a data center. The number of distinct edge cluster locations, separated by wide area networks, is larger as well. Thus, the scale-out approach of cloud native orchestration is no longer homogeneous: past a certain capacity, the scale-out reaches out to other edge clusters. But time-sensitive applications must be located close enough to their data sources to work, so an orchestrator will ultimately have to optimize for compute capacity and density in an edge cloud. Depending on the workloads, optimizing for compute density will require the use of accelerators such as GPUs, NPUs, or FPGAs. Orchestrators must take these additional constraints into account; a sketch of how this can look through the Kubernetes API appears below. Finally, data sources reach the edge application via access networks. For many important use cases, these are mobile networks, which have control and data plane separation and full sets of protocols and APIs which must be used to steer data plane traffic to edge applications. The net effect is that an edge native system has more moving parts, more heterogeneity, and more constraints in operation than a cloud native system.

Well, that was all somewhat theoretical. We will now move toward an example of an edge native system, specifically the Open Network Edge Services Software (OpenNESS) toolkit. OpenNESS is a software toolkit that enables the construction of high-performance edge platforms. It is designed to provide cloud features to edge clusters within the edge native constraints described in the previous slide. The features listed in this slide depict the range of features supported, which enables commercial deployments. Implicit in this list is that the microservice approach to creating features enables real-world deployments.
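As a hedged sketch of what taking these constraints into account can look like from the orchestration side, the following Go program uses the Kubernetes client-go library to request a pod placement that is pinned both to a location and to special-purpose hardware. The site label example.com/edge-site and the extended resource example.com/fpga are hypothetical names invented for this illustration; a real cluster would use whatever node labels and device-plugin resources it actually advertises.

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Connect to the cluster's API server using the local kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "fec-app"},
		Spec: corev1.PodSpec{
			// Hypothetical site label: keeps the workload close to its
			// data source, per the latency constraint discussed above.
			NodeSelector: map[string]string{"example.com/edge-site": "site-42"},
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "example.com/fec-app:latest", // placeholder image
				Resources: corev1.ResourceRequirements{
					Limits: corev1.ResourceList{
						// Hypothetical extended resource advertised by an
						// FPGA device plugin; the scheduler will only place
						// the pod on a node that has one free.
						corev1.ResourceName("example.com/fpga"): resource.MustParse("1"),
					},
				},
			}},
		},
	}
	created, err := clientset.CoreV1().Pods("default").Create(
		context.TODO(), pod, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created pod", created.Name)
}
```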
Now let's look at the OpenNESS platform-level architecture. Depicted here are a representative edge node, on which edge applications execute, and a controller, which is used to deploy and manage edge nodes. An edge cluster consists of one or more edge nodes. Multiple edge nodes may be associated with one controller.

Note that I said representative. The green and blue blocks in the diagram are microservices. Not all OpenNESS microservices are shown here, because not all microservices of an OpenNESS distribution need be present in a given configuration. The platform is based on Kubernetes; thus, many of the microservices shown here are containers grouped together in pods so that they can be deployed together. OpenNESS supports virtual machines for applications, but does so via KubeVirt rather than natively. The color coding and stacking of the various subsystems suggests a certain layer-cake design, but this is more for ease of explanation. The microservices actually interact with each other via a container network. Cloud native design allows for elements to be deployed in a coordinated manner, and we follow that rule too.

The color coding does indicate where the building blocks come from. Blue blocks are standard open source ingredients. On the left side of the figure, we see CentOS, the host operating system, and Docker for low-level container management. Elsewhere, you can see the standard Kubernetes components, kube-proxy, kubelet, kubectl, etc., that comprise a working Kubernetes cluster. This is a stock Kubernetes distribution, although with some optimizations until they are upstreamed into open source. The green blocks are OpenNESS components. Most of them are microservices, although some of them might be more accurately called ingredients, enhancements, or plugins.

The first pod we will look at is the Container Network Interface (CNI) pod. This platform uses Kube-OVN as the default CNI, but can support other CNIs as well, such as Weave or Calico. Because many edge applications require multiple network interfaces, OpenNESS uses the Multus CNI to support this functionality.

We will now move to the system pods. These microservices enable time-sensitive applications and the other tenets of edge native computing. Time-sensitive applications need to have fine control over app deployment: over sequestered cores, the selection of NUMA nodes, the availability of special instructions, and other features. These microservices enable the controller microservices to control app deployment. The platform pods exist for a similar reason. In this case, rather than fine-grained control of the CPUs, these microservices are used to support computing and network accelerators. When an application is deployed, in addition to selecting a destination with enough computing headroom, it can also select a candidate cluster with the required special-purpose hardware and control how that hardware is configured. These microservices are specialized for certain families of hardware and work with other microservices in the controller.

Moving over to the controller, the OpenNESS microservices work with the edge node microservices, harvesting information from edge nodes and making orchestration decisions based on that information. The information received includes capabilities and attributes, as well as telemetry information.

The remaining blocks to mention here have to do with the access network. The OpenNESS configuration shown here assumes a connection to a mobile network, either 4G or 5G. The blocks along the bottom of the diagram are the radio access network and the core network. The mobile network and the edge platform must cooperate to steer traffic from the network to edge applications. OpenNESS provides reference implementations of the core network building blocks, as well as a special controller microservice, the Core Network Configuration Agent (CNCA), to accomplish this. We will explain this in more detail later in the talk.

Before moving on to microservices, a few more comments about CNI networks. Edge applications are often very time sensitive: they have requirements on latency and/or jitter in order to function properly, such as gaming apps, which need to minimize lag, or public safety applications, which need to be able to respond to signals raised by algorithms in a timely manner. Networking between microservices must be able to support these requirements. The CNCF community accepted a project, the CNI project, to develop specifications for container network interfaces for Linux containers. Today, there is a growing list of CNI implementations, such as Calico, Weave, or SR-IOV. OpenNESS works with many of these, and the installation charts provide procedures for installing them when configuring an OpenNESS system. For time-sensitive application support, OpenNESS in particular adopts OVN/OVS virtual networking. The Kube-OVN CNI may be used, as well as an enhanced CNI based on the Data Plane Development Kit (DPDK): this is OVS-DPDK. The Multus CNI is configured by default, as many edge applications require multiple network interfaces for their functionality.
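As a concrete illustration: with Multus installed, extra interfaces are requested through the k8s.v1.cni.cncf.io/networks pod annotation. The short Go program below simply prints such a manifest; the NetworkAttachmentDefinition name sriov-net is a hypothetical object that would be created separately, and the image name is a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	pod := corev1.Pod{
		TypeMeta: metav1.TypeMeta{APIVersion: "v1", Kind: "Pod"},
		ObjectMeta: metav1.ObjectMeta{
			Name: "multi-net-app",
			// The Multus annotation asks for a second interface backed by
			// the (hypothetical) NetworkAttachmentDefinition "sriov-net",
			// in addition to the default cluster network.
			Annotations: map[string]string{
				"k8s.v1.cni.cncf.io/networks": "sriov-net",
			},
		},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "example.com/app:latest", // placeholder image
			}},
		},
	}
	// Print the manifest rather than applying it, so the sketch runs
	// without a cluster.
	out, _ := json.MarshalIndent(pod, "", "  ")
	fmt.Println(string(out))
}
```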
A key microservice included in OpenNESS is the Topology Manager. This microservice, which is now supported in Kubernetes, makes the orchestrator NUMA-aware for optimum deployment of time-sensitive edge applications. On multi-core, multi-socket COTS systems, which provide a variety of memory and I/O features, an application whose CPU, memory, and I/O are not located on the same socket or NUMA node will have degraded performance. The Topology Manager microservice allows resources to be inventoried and tagged. When the container containing an application is deployed, the same Kubernetes mechanisms, CPU Manager and Device Manager, that use the labels to select a platform can also select appropriate hardware on the same NUMA node. An example of this might be a video analytics application that must run on the same NUMA node where the network card supplying its input stream is located.

The next microservice we will discuss is Node Feature Discovery, or NFD. COTS platforms used for edge deployment come with many hardware features to provide better performance and meet SLAs. The features may be intrinsic to the CPU, such as special instruction sets, or may be external, such as GPUs, TPUs, FPGAs, non-volatile memory, et cetera. For these features to be used, they must be discovered and tracked within the controller. Let's take an example: a container network function, or CNF, such as a 5G gNodeB, that implements layer one of a base station. This CNF needs hardware that includes FPGA acceleration for forward error correction, vector instructions in the CPU to implement the math functions, a real-time host operating system, and so on. The NFD microservice, which is implemented in Kubernetes as an add-on, provides this capability. There is an NFD worker component that inspects the edge platforms for hardware and software capabilities. The NFD worker puts this information into the etcd database, where the NFD control component uses it for scheduling. Our gNodeB would therefore be automatically scheduled on a node that has the hardware capabilities that it requires.
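The following illustrative Go program (a sketch, not OpenNESS code) prints a pod manifest that combines the two mechanisms just described: a node selector on a Node Feature Discovery label for AVX-512, and integer CPU requests equal to limits, which place the pod in the Guaranteed QoS class so that CPU Manager, Device Manager, and a kubelet running with the single-numa-node Topology Manager policy can align its CPU, memory, and devices on one NUMA node. The image name is a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Integer CPU requests equal to limits give the pod Guaranteed QoS,
	// which is what allows CPU pinning and NUMA alignment to apply.
	res := corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("4"),
		corev1.ResourceMemory: resource.MustParse("8Gi"),
	}
	pod := corev1.Pod{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "Pod"},
		ObjectMeta: metav1.ObjectMeta{Name: "video-analytics"},
		Spec: corev1.PodSpec{
			// NFD publishes labels in this form for CPUs with AVX-512;
			// the selector keeps the pod off nodes lacking the feature.
			NodeSelector: map[string]string{
				"feature.node.kubernetes.io/cpu-cpuid.AVX512F": "true",
			},
			Containers: []corev1.Container{{
				Name:      "app",
				Image:     "example.com/video-analytics:latest", // placeholder
				Resources: corev1.ResourceRequirements{Requests: res, Limits: res},
			}},
		},
	}
	out, _ := json.MarshalIndent(pod, "", "  ")
	fmt.Println(string(out))
}
```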
The next microservice to discuss is the Telemetry Aware Scheduler (TAS). It enables the user to make Kubernetes scheduling and descheduling decisions using telemetry information. The telemetry allows the scheduler to observe metrics such as CPU or network utilization and react to performance bottlenecks by moving applications to different nodes. Like Node Feature Discovery, this is an add-on to the Kubernetes scheduler developed by Intel and used by OpenNESS. The app developer is able to create a set of policies defining the rules to which pod placement must adhere. The Kubernetes scheduler has the ability to trigger an action if a policy is violated. Policies can be applied on a workload-by-workload basis, allowing the right indicators to be used to place the pod. The TAS may collect metrics from many sources, but in particular supports metrics received from Intel Resource Director Technology (RDT) and the reliability, availability, and serviceability (RAS) metrics provided by Intel CPUs.

The final topic we will consider is the integration of OpenNESS with a mobile network. This talk is not the place to discuss all of mobile network architecture; the standards run to thousands of pages and are different for LTE and 5G networks. The important part for this talk is that the mobile network routes data from the user equipment, or UE, i.e., a phone or tablet, to the endpoint where it is connected to the edge platform. We will restrict ourselves to the 5G network and will only cover a few of the network functions of 5G.

The white boxes depict the primary 5G network functions. There are numerous other functions that we are not going to cover. This description is not specific to edge computing; it is the way that any 5G network operates. In this diagram, the UE is the user equipment: a phone, tablet, or similar device. The gNodeB is the base station and the other functions that deal with the radio signal and turn it into an IP stream. The application function, or AF, provides services relating to applications. For example, applications need to steer traffic from the UE to the application and configure various mobile network settings. The network exposure function, or NEF, provides an interface so that untrusted third parties, such as, for example, an edge native platform, can request the mobile network to steer traffic. The NEF is already designed as a service by the 3GPP standards. It provides its own API used by other services, and the OpenNESS framework will also use it. The UPF performs routing and filtering functions for data flows. It receives traffic from the gNodeB and from data networks and forwards it based on policies and filtering rules that are set up by the functions of the control plane. In general, multiple UPFs exist in a 5G network, and they may forward user data between each other. Data routing from one UPF to another might happen in order to make sure that a data flow reaches a selected edge node. Finally, the data network, or DN, is the opposite end of data sessions with the UE. For us, the DN is the edge platform.

Now we move back to the OpenNESS platform architecture, which you see in the upper left-hand corner, focusing on the mobile network blocks in the main part of the diagram. The gray blocks in this diagram come from the previous slide. The green blocks are provided by OpenNESS, as before. The NEF is a reference implementation provided by OpenNESS. It interacts with the rest of a 5G network following the 3GPP standard. The AF, or application function, is a reference implementation that interacts with the NEF via the standard 3GPP APIs and also provides application interfaces that allow it to talk to other network functions, in particular to the CNCA.
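To give a flavor of this interaction, here is a hedged sketch of an AF-side request to the NEF's traffic influence API. The URL path follows the 3GPP TS 29.522 pattern for traffic influence subscriptions, but the NEF host name, the AF identifier, and the body fields shown are simplified, illustrative assumptions rather than a complete, conformant message.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// Illustrative subscription body: ask the network to route flows
	// matching the filter toward a data network access point (DNAI)
	// that fronts the edge platform. Field names follow the spirit of
	// 3GPP TS 29.522 but are simplified assumptions here.
	body := []byte(`{
	  "afServiceId": "edge-video",
	  "dnn": "edge",
	  "trafficFilters": [
	    {"flowId": 1,
	     "flowDescriptions": ["permit out ip from any to 10.0.0.0/24"]}
	  ],
	  "trafficRoutes": [{"dnai": "edge-node-1"}]
	}`)

	// Hypothetical NEF host and AF identifier.
	url := "http://nef.example.com/3gpp-traffic-influence/v1/af-edge-1/subscriptions"
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("NEF responded:", resp.Status)
}
```

In OpenNESS, this kind of request is driven on the operator's behalf through the CNCA, described next.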
The CNCA, or Core Network Configuration Agent, is a microservice that is part of the controller. It is a function that knows both where the edge platform is, in terms of IP addresses, and where the NEF is. It can therefore steer data flows from a particular UE to a particular edge application. It provides an API to allow it to be used by a telco operations support system, and also a user interface so it can be controlled by a system operator. In other words, the AF and NEF are reference implementation microservices that implement 3GPP network functions on the one hand and provide APIs that allow the controller microservices to steer data flows on the other hand. There is an equivalent set of network functions for LTE that works somewhat similarly, but we won't go into it here. Because of the complexity of mobile networks, and because telcos configure and operate their networks in many different ways, these microservices are only reference implementations. But they provide a starting point for an operator wanting to use the OpenNESS toolkit to add edge computing services to their network.

The previous two slides talked about enabling a mobile network to steer traffic to an edge application. But edge platforms can also host mobile network components. In the previous slides, the access network, or gNodeB, was just a single box. In reality, it is a very complex function requiring very demanding real-time processing, and it is often implemented with special-purpose hardware. But one configuration of the OpenNESS edge platform consists of an implementation of a gNodeB. The RAN architecture in the diagram comes from the O-RAN reference architecture published by the Open Radio Access Network (O-RAN) Alliance. In the configuration shown here, the application consists of distributed unit, or DU, functions following the O-RAN reference architecture. The DU is deployed as a pod in which various physical- and link-level functions execute. The 5G core can also be hosted by OpenNESS edge nodes. In this case, two different edge nodes are employed: one for the user plane, in which UPF network functions run, and the other for the control plane, in which AF and NEF functions run. The advantage of running 5G RAN and core functions in an edge platform is that the network functions can be moved closer to one another than if they were in discrete components, which improves performance. It also allows functions of a mobile network to be converged into a single server, which can then be deployed with much more flexibility.

We have covered a lot of ground in the last 30 minutes. We discussed the concept of cloud native computing and how it applies to edge computing. After this theory, we then took a specific example, the OpenNESS toolkit, which provides both a platform for executing edge microservices and also a growing catalog of microservices. We discussed the CNI network, the Topology Manager, Node Feature Discovery, and Telemetry Aware Scheduling microservices and building blocks; we talked about why they are implemented and how they are used. This talk only scratched the surface of a very large topic. We invite you to visit the OpenNESS website, www.openness.org, to browse through our engineering white papers and to download and evaluate the toolkit yourself. We hope you have found this talk interesting and informative. Thank you for your attention, and have a nice conference. Bye-bye.