Hello, everyone. I'm Xinjue Kui, and I represent the Open Networking Foundation. Today we'll be talking about programming the data plane with P4. Before we get started, we have to know how networks work. Basically, networks work in a distributed manner: switches exchange routes with each other in order to learn the path along which to forward a packet from one point to another. The key word here is distributed. Why distributed? Because traditionally switches were designed with a really complex control plane as well as a data plane, and the two are interdependent and tightly coupled. The feature set of the control plane is limited by what is available down in the data plane. Take any vendor, for example: the higher the price point, the more features you get and the more complex the ASIC down there, which means your switch can support more protocols, for instance.
In this case, traditionally, all the switches have to be configured in a distributed manner, as in, you have to go through all the switches manually to apply the corresponding configuration. The problem is that if you have a lot of them, managing them is difficult, and apart from that you are prone to human error. This is the main problem with traditional switch architectures.
So then we think about how we can make this better. That is where we have software-defined networking, where we try to reduce the complexity in the control plane and have a centralized controller. Each controller manages a group of underlying switches, so the controller has a global overview of all the switches underneath it.
So the idea here is that we remove the complexity from the control plane and offload it to a centralized controller. The controller and the switches then communicate through a protocol such as OpenFlow, which is one of the most popular ones. But although we have opened up the control plane, the capability of these switches is still constrained by the underlying data plane, the switch ASIC. Depending on the ASIC, and on the version of the ASIC, the feature set varies from one switch to another.
Let's have a more in-depth look at OpenFlow itself, as it is one of the most popular southbound protocols around. In short, OpenFlow provides a standardized model in the form of a match-action abstraction. The way the controller defines rules and installs them on the switch is through match-action entries. So it is a standardized protocol between the controller and the switch. Like I mentioned just now, we try to simplify everything: we have a centralized controller, which can be logically or physically centralized, that controls everything. With a centralized entity like the controller, you have a complete global overview of all the switches, and you know from end to end which paths are available for each and every packet. That removes the need for distributed algorithms such as OSPF or IS-IS, for instance; we don't need to wait for them to converge.
But there is a problem with OpenFlow, because not all operators are equal in terms of the features they demand. OpenFlow has evolved throughout the years.
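As a rough illustration of the match-action abstraction described above, here is a minimal Python sketch of a controller installing rules on a switch. It is a toy model, not the OpenFlow wire protocol: the class names `FlowRule` and `ToySwitch`, the field names, and the drop-on-miss behaviour are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FlowRule:
    match: dict                      # header field name -> required value
    action: str                      # e.g. "output" or "drop"
    params: dict = field(default_factory=dict)
    priority: int = 0


class ToySwitch:
    """Hypothetical switch holding match-action entries installed by a controller."""

    def __init__(self):
        self.rules = []

    def install(self, rule):
        # Keep higher-priority rules first so they are checked first.
        self.rules.append(rule)
        self.rules.sort(key=lambda r: r.priority, reverse=True)

    def process(self, packet):
        # Return the action of the first rule whose match fields all agree.
        for rule in self.rules:
            if all(packet.get(name) == value for name, value in rule.match.items()):
                return rule.action, rule.params
        return "drop", {}            # table miss: this sketch simply drops


# The "controller" side: push two rules to the switch.
switch = ToySwitch()
switch.install(FlowRule({"ipv4_dst": "10.0.0.2"}, "output", {"port": 2}, priority=10))
switch.install(FlowRule({"eth_type": 0x0800}, "output", {"port": 1}, priority=1))

print(switch.process({"eth_type": 0x0800, "ipv4_dst": "10.0.0.2"}))
print(switch.process({"eth_type": 0x0806}))
```

Note that the switch in this sketch can only match on fields it already understands as dictionary keys; a real fixed-function switch is limited in the same spirit to the header fields its ASIC and its OpenFlow version support.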
OpenFlow started with 12 match types, that is, 12 header fields, and now it supports up to 40. But the problem is that every iteration brings a new version of the standard. Vendors manufacture OpenFlow-based switches against the current standard, so whenever there is a new iteration, those switches are not forward compatible. In order to support more features, as an operator you either have to replace the switch or stick with the limited match types. This poses a problem because it limits innovation: it limits what can be done by the controller and it limits the applications. So this is how an inflexible data plane slows down development.
Then comes the question: what if we can open up the data plane itself? In the case of programmable data planes, we have P4 as a programming language to program the data plane itself. Here we open up the data plane so that we can change the packet processing pipeline down in the data plane on the fly, or almost on the fly, to suit our needs, so that the network operates the way we want it to. And it is not only programmable switches: P4 has also been used on programmable NICs (network interface cards), on software switches, and with the extended Berkeley Packet Filter (eBPF). So P4 supports various platforms as well.
So why do we want to talk about P4? Because P4 is currently supported by various vendors, operators, universities and startups. Because of its capability and its potential for innovation, it has attracted interest from many parties. So in short, what is the benefit of P4? What is the benefit of programmable data planes?
Traditionally, we have one ASIC with all the complicated protocols fitted into one chip; now we can actually decide what we want. We reduce the complexity by programming only the protocols that we need. For example, if we don't need OSPF, we don't need to include it, and if we are only doing IPv4 forwarding, we don't need to care about firewalls and such; we can constrain our program to do IPv4 forwarding only. That is how you reduce the complexity.
Apart from that, if you want to add new protocols, even a custom protocol that does not exist in any standard, you can always add it. As an example, VXLAN took four years to gain wide availability, not counting the standardization process; in P4, if you want VXLAN, it is around 175 lines of P4 and you get it right away. Apart from that, with P4 we have greater visibility into the network. For that we have INT, in-band network telemetry, which is a brand new use case brought by programmable data planes. We will discuss it further in the next few slides.
Also, since we are able to program the data plane itself, various use cases have been investigated for offloading stateful applications into the data plane. And since P4 is a programming language and the data plane is a target that we can program, we can now adopt software-style development: we can go through the SDLC cycle, so you can build applications or fix bugs in the data plane on the fly.
So in short, depending on your requirements, you can devise your own programs and try out your own ideas with P4 quickly.
Here are some use cases that P4 has enabled. Some of the most famous ones are a layer 4 load balancer, as well as NetPaxos, which offloads a consensus algorithm into the data plane and has achieved significant performance improvements. Next is in-band network telemetry, which is one of the killer applications brought by P4. Current monitoring techniques are out-of-band: we usually poll all the routers for the current packet flow statistics. The way in-band network telemetry works is that, as a packet travels from end to end, each router appends some statistics to the packet itself as a custom header. It could be the queue depth, it could be timestamps, et cetera. All the routers append their information, and at the penultimate hop the router strips off the custom headers and exports them to a monitoring agent. So it provides more detail and finer granularity without the polling overhead, and it is one of the state-of-the-art monitoring techniques that can be used with P4.
Now we will go through a quick introduction to P4_16. Why 16? Because there are two versions of P4: P4_14 and P4_16. P4_16 is now the current, widely supported version. To recap, P4 was first published back in 2014, in SIGCOMM CCR.
Then the specification was officially released in spring 2015, and the current P4_16 specification was finalised in May 2017. The P4 community is a very active community. You can check out their GitHub, where there are various repositories and nice tutorials for everyone to look at, and there are mailing lists with quite a lot of members. Apart from that, workshops are held around the world every year; usually there is one in Europe and one in the US. And last but not least, on the academic side, there is a lot of research focused on programmable data planes, and you can find relevant publications in ACM SIGCOMM, SOSR and so on. In terms of working groups, these are the five that we have in P4: language design, control plane, applications, architecture, and education. I will not go much into this.
As a programming language, P4 now has a very robust set of tools for us to get started with. Depending on which platform you want to deploy P4 on, it could be a NetFPGA, it could be Barefoot Tofino, it could be eBPF, and all of these have their own compilers already. Apart from that, you may want to communicate with a P4 switch using different control-plane protocols. The de facto one is P4Runtime, but if you want to run OpenFlow or the Switch Abstraction Interface, SAI, you can implement that protocol in P4, so it depends on which control plane implementation you want to run.
Apart from that, you have simulators and various testing tools to verify the correctness of your P4 program, and you have various sample programs contributed by people around the world, so newcomers can learn from the P4 examples about the feature set and the limitations. We also have plugins as well as tutorials.
In terms of the development workflow, first you have to know the requirements of your P4 program. Then we, as the user, do the programming in P4. But before you write the program, you have to know which architecture you are programming for. For example, if you program for a Tofino switch, there is the Tofino architecture. If you are programming for the software switch, you have BMv2, which is based on the v1model architecture. If you have an FPGA, it would be based on the FPGA architecture. You have to know this because the different underlying architectures may expose different extern libraries, which may affect the implementation. After that, once your P4 program is ready, you compile it and eventually get it installed on the switch. That is what the workflow looks like.
So this is the v1model architecture. This is, in general, what a switch should look like in terms of the components it should have. There are three things that a P4 switch must have: a parser, an ingress stage, and an egress stage. The ingress and egress stages are where the packet processing occurs; this is where you have all your match-action tables and where your control logic happens.
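To make the parser, ingress, and egress stages concrete, here is a small Python model of a packet moving through such a pipeline. It is only a sketch of the idea, not P4 and not the v1model API: the function names, the `PORT_MAC` table with its made-up MAC addresses, and the choice to rewrite the source MAC in the egress stage are assumptions for illustration. A final reassembly step is included so the model can emit bytes again.

```python
import struct

ETH = struct.Struct("!6s6sH")        # destination MAC, source MAC, EtherType

# Hypothetical per-port MAC addresses used by the egress stage below.
PORT_MAC = {1: bytes.fromhex("020000000001"), 2: bytes.fromhex("020000000002")}


def parser(raw):
    """Extract the Ethernet header; everything after it is opaque payload."""
    dst, src, ether_type = ETH.unpack_from(raw)
    return {"dst": dst, "src": src, "ether_type": ether_type}, raw[ETH.size:]


def ingress(eth, meta):
    """Ingress stage: decide where the packet should go."""
    meta["egress_port"] = 2 if meta["ingress_port"] == 1 else 1


def egress(eth, meta):
    """Egress stage: per-port edits; here the source MAC is rewritten."""
    eth["src"] = PORT_MAC[meta["egress_port"]]


def deparser(eth, payload):
    """Reassemble the (possibly modified) header and the payload."""
    return ETH.pack(eth["dst"], eth["src"], eth["ether_type"]) + payload


def pipeline(raw, ingress_port):
    meta = {"ingress_port": ingress_port}
    eth, payload = parser(raw)
    ingress(eth, meta)
    egress(eth, meta)
    return meta["egress_port"], deparser(eth, payload)


frame = bytes.fromhex("ffffffffffff0a0000000001") + struct.pack("!H", 0x0800) + b"payload"
out_port, out_frame = pipeline(frame, ingress_port=1)
print(out_port, out_frame.hex())
```

Each stage only reads and writes the parsed header and the per-packet metadata, which mirrors how the stages of a switch pipeline hand a packet from one to the next.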
As for the parser, it is how you extract all the relevant headers that you want to process. So these are the three main things.
Here we have a summary of what P4 can do. It can do layer 2, layer 3, or even layer 4. It is quite high level, and although it looks like C or C++, there are no pointers and no loops. Why? Because we want to bound the execution time. In the data plane there is a limited time budget for each and every packet, and you want packets to be forwarded as soon as possible. That is the reason there are no loops. Memory is statically allocated: each stage already has pre-allocated memory, so there is no need for a malloc. And since memory is limited, you cannot do recursion, because you don't have the stack depth, and so on. You can also have sub-parsers to parse headers, do packet header rewriting, and such. That is what the language offers.
In terms of the target, if you look at the lower part of the slide, this is defined by the vendor. If you have the Tofino target, you have the Tofino target description as well as the extern libraries supplied by the Tofino architecture. If it is the software switch, that refers to BMv2 and v1model, and so on. What the user can define is this: you get access to various data types, and then there is a parser, there are match-action units, and there is programmable reassembly, as in reassembling the packet after processing before it is sent out.
This is an example P4 program, and it summarizes the components that should exist in a P4 program.
First and foremost, you need to include core.p4, which is the core package of P4. Secondly, you see that there is v1model.p4, which is your architecture. If you are programming for a different target, it could be psa.p4, it could be tofino.p4, depending on the architecture you are programming for. Subsequently, you define the headers that you are interested in. In this case we are only interested in Ethernet and IPv4, so we only define these two headers. If you want to define more and go deeper, into layer 4 with TCP and UDP, or up to the application layer with DNS, that is your choice. Then parsers: once you have defined the headers, you need a parser to actually parse out the relevant headers. You have checksum verification, then you have ingress and egress processing; these two are where we write the packet processing code. After processing, you go through a checksum update, and last but not least, once everything is done, you have to deparse, which is to reassemble everything before sending it out as a packet.
Here is an example of how you can write some control logic in the ingress or egress block, in the apply block to be specific. You can write if statements and you can have actions. This is an example of a very simple forwarding program, which forwards anything that comes in from port 1 to port 2 and vice versa. This is one way of writing it, with if statements; or you can convert it into a table. With a table, you match on a specific key, and when there is a match you apply the corresponding action. Here, the key that we match on is the ingress port, and the action is to set the output port.
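The two styles just described, if statements versus a table keyed on the ingress port, can be compared side by side in a short Python sketch. This models the idea only and is not P4 syntax. The names `ExactMatchTable`, `set_egress_port`, and `drop` are hypothetical, and dropping packets from any other port is an assumption, since the talk only covers ports 1 and 2.

```python
def forward_with_ifs(ingress_port):
    """If-statement version: port 1 -> port 2 and port 2 -> port 1."""
    if ingress_port == 1:
        return 2
    if ingress_port == 2:
        return 1
    return None                       # anything else: no output port (dropped here)


# Actions are small functions that modify per-packet metadata.
def set_egress_port(meta, port):
    meta["egress_port"] = port


def drop(meta):
    meta["egress_port"] = None


class ExactMatchTable:
    """Table version: look up a key, then run the action stored for it."""

    def __init__(self, default_action):
        self.entries = {}
        self.default_action = default_action

    def add_entry(self, key, action, *params):
        self.entries[key] = (action, params)

    def apply(self, key, meta):
        action, params = self.entries.get(key, (self.default_action, ()))
        action(meta, *params)


# The entries are data, so they can be changed without touching the logic.
table = ExactMatchTable(default_action=drop)
table.add_entry(1, set_egress_port, 2)
table.add_entry(2, set_egress_port, 1)

for port in (1, 2, 3):
    meta = {"ingress_port": port}
    table.apply(meta["ingress_port"], meta)
    print(port, "->", meta["egress_port"], "| if-version:", forward_with_ifs(port))
```

The practical difference is that the if-statement version fixes the behaviour in the program, while the table version leaves the entries to be filled in at run time, which is what lets a control plane change forwarding without recompiling.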
It is the same thing as the previous slide: if a packet comes in from the first port, it goes out from the second port. That is how we define a table. After defining the table, you have to apply it in the apply block.
Here is one more example, of how we can do IPv4 forwarding. For IPv4 forwarding we are interested in the destination address. How do we match on an IP address? It could be an exact match or it could be a longest prefix match; in this case we are using longest prefix match. After defining the key, you define the actions. Then you can define the size, which is optional. And last but not least, you can define a default action: if there is no match, then in this case you take no action.
In terms of limitations, the core language of P4 is very small, which is why it is portable to many targets. I wouldn't say it is a very complete language, though; in terms of expressivity, there are limits on what can and cannot be done. Accelerators can provide additional functionality depending on the vendor, who can offer their own additional APIs as extensions to the core P4 language. And as point number three says, P4 defines some standard architectures to follow and some standard accelerators such as counters, meters and so on. The current standards are mostly for switches only; for FPGAs or NICs there is no standard architecture, so it depends on what the vendor, the author of the architecture, decides to support. So what is missing from P4? One thing that does not exist in P4 is floating point.
Floating point is missing because floating-point operations are not essential to packet processing. Another reason is that if you want floating-point operations, you need a floating-point unit, an FPU, which is a waste in terms of chip area. That is why floating point has been excluded from the core P4 language. You also don't have pointers or references, you don't have recursive data types, and you don't have dynamic memory management, which I mentioned before. There are no loops, because we want to process packets as soon as possible. There is no recursion and there are no threads.
In summary, P4 is a standard language in which you can specify how you want to process packets, not only on switches but also on eBPF, on software switches, on FPGAs, and on NICs. It is expressive enough for us to define how we want packets to be processed, but it is not as expressive as conventional programming languages, where you have loops and so on. It is type safe and high level, which is important. And as it is a programming language, we can treat P4 programs as software, so current software development practices from the SDLC, such as requirements gathering, design, analysis, and testing, all apply to P4. Last but not least, with P4 we have potentially revolutionary applications such as network monitoring, to be precise the in-band network telemetry I mentioned before. These are brand new use cases enabled by programming data planes with P4.
Here are some references that I've used; you may refer to them for further information.
Last but not least, I want to thank you for your time, and I apologize for not being able to be physically present this year due to the virus outbreak and some organizational policies. If you have any questions, please do not hesitate to message me on Telegram. I hope that you are now aware of the P4 programming language, and I do hope that in the future we will have a more in-depth discussion on P4, so that I can provide more well-written examples and give better clarity on P4. Thank you for your time. Bye-bye.