Okay, it's time to start. Good afternoon. My name is Jun Myung Kang from Hewlett Packard Labs. On our team we also have Mario Sanchez from Hewlett Packard Labs, and we have JK Lee from Barefoot Networks. Today we're going to talk about Policy Canvas, where we can easily draw our own policies for OpenStack services. First, we're going to show the current policy management in OpenStack. Then we're going to show some motivating examples in terms of how policies are defined. Then we're going to present our Policy Canvas as our solution, and finally we're going to show a live demonstration of Policy Canvas on the OpenStack dashboard. In large networks or cloud environments such as OpenStack, it's very critical to manage the multiple infrastructure components as a whole: compute, storage, networking. As the many different types of services in OpenStack keep growing, we need a well-defined policy management framework for OpenStack. As you know, there are multiple policy writers who want to define their own policies, such as cloud operators, network administrators, or application developers. Sometimes a software-defined application itself can define its own policies. We also deal with many different policy dimensions: security, performance, availability, and middleboxes such as firewalls, DPI, and IDS. In this talk we're going to tackle networking policies, but we believe our insights can be applied to other types of policies as well. Okay, as you know, currently we use security groups and rules, which can allow or deny certain protocols or ports from sources to destinations, and we can associate these groups and rules with a specific VM or a specific network port. In terms of automation and cloud orchestration, we can use the Heat service with predefined templates, which can guide how to define groups or rules. For application-specific policies, you can use the Murano service.
However, this definition of groups and rules is too low-level a command interface, which makes it hard to use for many users. The interface is also fragmented: the commands are scattered across multiple data sources, so even for the same policy we have to touch multiple data sources. For high-level, intent-based management in OpenStack, Group-Based Policy, GBP, has been introduced. Instead of writing policies for specific endpoints, GBP defines groups of endpoints and writes policies for the communication between EPGs, endpoint groups. However, EPG definition, creation, and association with endpoints can be done only via the GBP APIs; there is no way to leverage endpoint properties that already exist in OpenStack while writing policies. And as you know, we have multiple policies from multiple writers, and GBP doesn't handle this multiple-writer problem. Finally, we have the well-defined Congress service, a global policy management framework for OpenStack, which has been actively integrated into recent OpenStack releases. Congress can collect the system state of the cloud from many data sources such as Nova, Neutron, Swift, and so on, and then verify whether the cloud violates the policies. As you can see in this example, Congress uses a variant of the SQL-like Datalog language to define the policies, and it can provide simple table joins across multiple policies. This Datalog actually has very good expressiveness. However, it requires a steep learning curve, and it also doesn't handle the multiple-writer problem. To overcome these limitations, we need to decouple the high-level intent from the underlying low-level interfaces. The interface should be easy to use, intuitive, and high-level. We also need policies to be deployed to the infrastructure automatically, and we need scalability.
So we are proposing Policy Canvas, with which we can draw our network policies for OpenStack services. We actually got the idea from a real-world example. When we define or update policies, usually we have a face-to-face or online meeting with a bunch of people from different organizations. Mostly we use a whiteboard for discussing the policies, because it is very easy to use and user-friendly. After discussing and deciding on the policies, we assign tasks to the relevant IT people, and they use CLI commands or a management framework, that is, low-level commands, to implement those policies. Our idea is to let users express policies as easily as drawing diagrams on a whiteboard, and have our system take the diagram and generate all the relevant commands or API calls automatically. Okay, so we have multiple policy writers with multiple data sources, and they can create their own policies using our Policy Canvas, and Policy Canvas generates all the relevant rules using the OpenStack open APIs. Okay, I'm going to hand over to JK for more details about our Policy Canvas. Thanks, Jun. Okay, so Policy Canvas is built on top of two main concepts: graph abstraction and graph composition. As you know, a graph can naturally express various networking policies through its nodes and the edges between them. A graph is easy to read and write for human users, but it's also readable by the system, so you can leverage existing graph theory and graph algorithms to analyze, compose, and verify the policies written as graphs. So we take those multiple input policies, expressed as graphs from different policy sources, and compose them into one big combined, system-wide graph, conflict-free. We are resolving the conflicts and doing this composition proactively, prior to deploying those policies down to the runtime system.
So it gives each policy writer a chance to review their policy in its final form in the composed graph, and it basically prevents unexpected system behavior at runtime, because it would be too late to detect and then fix such a conflict in an operational network. This kind of composed graph can also be used as the ground truth for policy enforcement and runtime troubleshooting. So let me talk a little bit more about PGA, the policy graph abstraction, and its composition algorithm. The PGA graph model has nodes and edges, and each node represents a set, a group of endpoints sharing the same properties. In this example, we have a policy P1 from an application admin. It allows communication from the marketing employees to the web servers for HTTPS traffic, and the traffic should go through the load balancer, the LB function box, in the middle. So it's an ACL policy and also a middlebox policy, all expressed in one graph. And we are using a different color to express the user intent that this communication is allowed exclusively to marketing. Then we have a similar graph from the cloud operator, P2, allowing some traffic from campus to the cloud, and the traffic should be monitored by the byte counter, the BC box. As you can notice here, the EPG nodes are defined with logical labels like marketing, web, cloud, and campus. They are not just random words; they are well-defined logical names, or labels, and they are defined from existing data sources. For example, the location information comes from the OpenStack Keystone or Nova data services, and the employee identities or the application tiers like web and database can be extracted from Murano and other data sources. Using those logical labels, we decouple the policy expression from low-level specifics like IP addresses or MAC addresses, so that the policy can be reused portably on multiple different network targets.
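The two example policies above can be sketched as small data structures. This is a minimal illustration in Python, not the actual PGA service API; the class names (Epg, Edge) and field names are our own assumptions for the sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Epg:
    """An endpoint group: the set of endpoints carrying all of these labels."""
    labels: frozenset

@dataclass
class Edge:
    """One PGA policy edge: classifier plus middlebox chain plus intent flags."""
    src: Epg
    dst: Epg
    classifier: str          # e.g. "tcp/443" for HTTPS
    chain: tuple = ()        # function boxes the traffic must traverse
    exclusive: bool = False  # src has exclusive access to dst

# P1 (application admin): marketing -> web over HTTPS, through a load
# balancer, and the access is exclusive to marketing.
p1 = Edge(Epg(frozenset({"marketing"})), Epg(frozenset({"web"})),
          classifier="tcp/443", chain=("LB",), exclusive=True)

# P2 (cloud operator): campus -> cloud, monitored by a byte counter box.
p2 = Edge(Epg(frozenset({"campus"})), Epg(frozenset({"cloud"})),
          classifier="tcp/443", chain=("BC",))
```

Both the ACL aspect (the classifier) and the middlebox aspect (the chain) live on the same edge, which is the point JK makes about expressing both in one graph.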
So for the composition, we compute the union of the input policies using a set-theoretic Venn diagram analysis. We detect the intersection between two policies, P1 and P2, and see if there is any conflict. If there is a conflict, we resolve it using the constraints specified in the original input policy graphs. For example, there is a conflict between P1 and P2 for the HTTPS communication from non-marketing employees to the web server in the cloud, which we will show in the next diagram, and that conflict is resolved by using the exclusive requirement from P1 as a constraint in the composition process. After composition, we store the composed graph as a set of disjoint policies. This normalization is pretty useful because it lets the runtime system look up policies in constant time, so you can build a fast and pretty scalable runtime system. So that was the theory, and this is the example that we composed from the two input policies. For that we need some relationships between labels. For example, here we leverage the information that marketing employees are on campus and web servers are deployed in the cloud, so we know that the marketing label and the campus label need to be composed together, but there is no need to compose the marketing label and the cloud label, because marketing employees will not be placed in the cloud anyway. And with this composed graph, you can easily walk through it and verify what's going on in the final version. For example, there is no edge from non-marketing-and-campus to web-in-cloud, correctly implementing the exclusive access requirement from P1. So this is the same policy written in English, but it's not really readable by a machine. A human user would need to read through it, detect the overlapping relationships and conflicts, and somehow compose them into this kind of prioritized rule set using GBP, group-based policy, or maybe OpenFlow rule table expressions.
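The Venn-diagram step above can be sketched with plain set operations. This toy example uses hypothetical concrete endpoints (alice, bob, carol, web1); the real PGA composer reasons over labels rather than enumerating endpoints, so treat this as an illustration of the idea only.

```python
# Hypothetical endpoint memberships for the two source groups.
marketing = {"alice", "bob"}
campus    = {"alice", "bob", "carol"}   # carol is a non-marketing employee
web_cloud = {"web1"}                    # the web server in the cloud

# P1: marketing -> web, exclusive, via LB.  P2: campus -> cloud, via BC.
# Venn split of the overlapping source groups:
both    = marketing & campus            # marketing employees on campus
p2_only = campus - marketing            # non-marketing campus employees

# P1's "exclusive" constraint resolves the conflict: only the intersection
# keeps access, and its traffic must satisfy both middlebox chains.
composed = {
    (frozenset(both), frozenset(web_cloud)): ("LB", "BC"),
}
dropped = frozenset(p2_only)            # non-marketing access is removed
```

The composed result is a set of disjoint entries, which is what makes the constant-time runtime lookup mentioned above possible.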
This kind of careful insertion of prioritization can be done when composing a few policies, but it becomes challenging when composing many, maybe more than dozens of policies. And we believe that composing policies in such a prioritized rule form is too low-level; the rule sets are already a kind of compiled version of the intents and policies. So we need a higher-level abstraction, like this graph abstraction we have here; then the system can automatically compose them into one big graph. So here we have an advanced version with several different policy graph examples, and we are using different colors, node shapes, and edge shapes to clearly capture the intents of the policy writers. We also have label namespaces, or label trees, defining the relationships between labels. Like I said before, in order to proactively compose policies written in labels, we need those relationships. Here we have three types of trees: for example, a tenant tree, a location tree, and a security status tree saying whether a virtual machine is in normal status or quarantined status. These are maintained by different sources; we extract them and maintain them in tree data structures. For labels with a parent-child relationship, for example the application and database labels, there is an overlapping relationship, so we know for sure that we need to compose the policies written for application and database. But labels like campus A and campus B have a sibling relationship; they are mutually exclusive. No virtual machine will be placed in campus A and campus B at the same time, so we don't need to compose them together. That basically scopes down the composition and makes the composition algorithm way more scalable.
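The parent-child versus sibling reasoning above can be sketched as a small overlap check over a label tree. The tree contents and function names here are illustrative assumptions, not the real PGA data model.

```python
# A toy location label tree, stored as child -> parent.
LOCATION_TREE = {
    "campusA": "campus", "campusB": "campus",
    "campus": "root", "cloud": "root",
}

def ancestors(label):
    """Walk up the tree collecting all ancestors of a label."""
    seen = []
    while label in LOCATION_TREE:
        label = LOCATION_TREE[label]
        seen.append(label)
    return seen

def may_overlap(a, b):
    """True if the same endpoint could carry both labels: the labels are
    equal or one is an ancestor of the other.  Siblings are mutually
    exclusive, so policies written for them never need composing."""
    return a == b or a in ancestors(b) or b in ancestors(a)
```

Only label pairs for which `may_overlap` is true enter the composition, which is the scoping-down that makes the algorithm scale.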
To test the scalability, we took 20,000 ACL policies from the global HP IT network, written for hundreds of departments, augmented the ACL policies with service policies, and composed them into one big graph with a million edges, taking about 30 minutes. While we were doing this composition, we realized that existing ACL policy expressions, for example, do not clearly reveal the hidden intents of the users. When you write a whitelisting policy, you know for sure that you want to allow the HTTPS traffic from the source to the destination, but it's not clear what you want for the rest of the traffic. Do you actually want to deny it, or do you just not care? If it's an application policy, you just want to allow HTTPS traffic so your web server operates correctly, but you may not care about the other ports. It's similar with blacklisting. To handle this problem, we define four types of intent for ACL edges: must-allow; can-communicate, which is a kind of weak allow; block; and conditional. These are quite useful to clearly capture the actual user intents for whitelisting, blacklisting, or application demands, and they drastically reduce the chance of conflict between allow and deny policies, from 50% with the conventional model down to something like 12%. So it greatly helps the systematic composition of different policies. While Jun and Mario will talk about our PGA implementation in OpenStack, we also have a graph compiler and ACL intent APIs adopted by the OpenDaylight NIC project, which will help us render the policy down to the network. There are also full papers, demos, and talks; if you want more information, please refer to them. Let me hand over to Jun. Thank you, JK. Okay, let's dive into the more interesting part: how we implemented Policy Canvas on OpenStack.
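The four edge intents can be sketched as follows, with a toy conflict check showing why a weak allow reduces the conflict rate. The exact PGA semantics, especially for conditional edges, are richer than this illustration; the names below are our shorthand.

```python
# The four intent types for ACL edges described in the talk.
MUST_ALLOW  = "must-allow"      # hard whitelist: traffic must be permitted
CAN_ALLOW   = "can-communicate" # weak allow: permitted unless blocked
BLOCK       = "block"           # hard blacklist
CONDITIONAL = "conditional"     # allowed only under some condition

def conflicts(intent_a, intent_b):
    """Toy rule: only a hard allow meeting a hard block is a true conflict.
    A weak 'can-communicate' silently yields to 'block', which is how the
    refined model shrinks the conflict rate versus plain allow/deny."""
    return {intent_a, intent_b} == {MUST_ALLOW, BLOCK}
```

Under a plain allow/deny model, the `CAN_ALLOW`-versus-`BLOCK` case would also count as a conflict needing manual resolution; here it resolves automatically.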
Like other OpenStack services, Policy Canvas is composed of three main parts: the PGA back-end service, a Python client, and a Horizon GUI as our Policy Canvas GUI module. Using the Policy Canvas GUI, we can manage different types of policy graphs, and we can also create our own policies by just drawing the policy graph. The Python client provides CLI commands and Python binding APIs. The PGA service manages the PGA resources, such as input, composed, and deployed policy graphs, label trees, and function boxes. We have three types of underlying drivers: label drivers, compilation drivers, and enforcement drivers. In the next slides, I'm going to show you how these driver modules support the PGA service. Okay, I'm going to show how to create a policy graph and then how to compose multiple policy graphs. As JK mentioned previously, we use the label tree for creating the policy graph. We get the system state through the label drivers, which automatically create the label tree, and then you can use this label tree to create your own policy by just drawing the policy graph. We can get these policy graphs from multiple data sources, multiple users, or multiple applications, and then we give the multiple policy graphs to the PGA service. The PGA service uses a compilation driver to compose these multiple input policy graphs, resolving conflicts as much as possible. Currently we have our own PGA graph composer, and also, as JK mentioned, we have another graph compiler in the OpenDaylight NIC project. The compilation driver returns the result as the composed graph, and then we show this result graph in the Policy Canvas. Okay, after composition, we can deploy one of the composed graphs through the PGA service. Basically, this composed graph is a set of nodes and edges, which means a set of classifiers and a set of actions.
We create these classifiers and actions through our enforcement drivers. Currently we have a Neutron driver, which can create security groups, SFCs, or middleboxes through the Neutron open APIs for the actions. And we have another Congress driver for the classification: we can automatically generate Datalog rules based on the EPG definitions as the classifiers. Okay, so after deploying all of our policy graphs to the infrastructure, say we have one new virtual machine. Congress can detect the creation of the new virtual machine, update the Congress table, and notify our Congress driver. The Congress driver could use the Congress execute function, but currently we use our Neutron driver to associate the EPG-based security group with the endpoint. So we don't need to create anything manually here; the whole process automatically associates security groups with endpoint groups. Okay, I'm going to hand over to Mario. Mario will show the live Policy Canvas demonstration. Thank you. So yeah, now we're going to actually dig into a video demo of the system working. This slide just shows you really quickly what our setup is going to be for the demo. In this case, we have our Policy Canvas installed in the Mitaka OpenStack release using DevStack. We're going to have four different compute nodes in our setup. The machines will be divided into a couple of different availability zones. Then we're going to use our Policy Canvas to create policies from the standpoint of two different stakeholders that we call the host manager and the zone manager. We're going to have different policies to deploy into the system, and we're going to see how our system composes them and deploys them onto the infrastructure. Don't worry so much about the specific policies right now; they're going to be shown in the video next. So let me hop into the video.
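What the Neutron enforcement driver has to emit is, essentially, one security-group-rule request body per composed edge. The dict layout below follows the Neutron API's `security_group_rule` schema; the translation helper and the group IDs are our illustration, not PGA's actual code.

```python
def edge_to_sg_rule(sg_id, remote_group_id, protocol, port,
                    direction="ingress"):
    """Build a Neutron security-group-rule request body for one policy
    edge: traffic of `protocol` on `port`, coming from members of
    `remote_group_id`, permitted on security group `sg_id`."""
    return {
        "security_group_rule": {
            "security_group_id": sg_id,
            "direction": direction,
            "ethertype": "IPv4",
            "protocol": protocol,
            "port_range_min": port,
            "port_range_max": port,
            "remote_group_id": remote_group_id,
        }
    }

# The demo's zone policy, AZ1 -> AZ2 over HTTP, becomes an ingress rule
# on AZ2's security group with AZ1's group as the remote group
# (the "sg-az1"/"sg-az2" IDs here are hypothetical):
rule = edge_to_sg_rule("sg-az2", "sg-az1", "tcp", 80)
```

A driver would then POST each such body to the Neutron security-group-rules endpoint, which is what "automatically creating the required security groups and rules" amounts to in the demo.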
Okay, so our demo scenario, like I said, has four different compute nodes split across two availability zones, AZ1 and AZ2. In this case, two of the nodes belong to AZ1, which is highlighted in green here, and the other two nodes belong to AZ2, which is highlighted in blue. We're going to spin up four different virtual machines, each of which will run on a different compute node, and all of them will be connected to a single virtual private network. In our DevStack scenario, as I'm going to show you next, we have only a single security group defined, which will be applied to all of the VMs that get created. This default security group has only a single rule, which allows us to SSH into the VMs, but that's the only thing it will do. Next, we're going to fast-forward the video a little bit just to show the automatic creation of the VMs using the Nova CLI, and here on the right is the Horizon view of the VMs being created. Once the VMs are created, we open two different terminals. The one on the top left corresponds to VM1, which is instantiated on a compute node that belongs to AZ1, availability zone 1. The second one corresponds to VM3, a VM that belongs to availability zone 2. Next, we start using our Policy Canvas to create the policies from the standpoint of the two different stakeholders. In the first case, we have a zone manager who only cares about connectivity between availability zones. This zone manager wants to specify a policy that allows HTTP traffic from VMs instantiated in availability zone 1 towards VMs running in availability zone 2, because those are, for example, web servers. In order to express this through the canvas, all we have to do is create a new, empty input policy graph.
We specify a description and the domain it's going to be hosted in. From there, we click into the canvas, and we drag from the label tree in the center the labels that actually matter for this policy. In this case, we drag the labels that correspond to AZ1 and AZ2 from the label tree into the canvas. Once we do that, all we have to do is create a unidirectional edge from AZ1 to AZ2 and allow HTTP traffic on it. Then we save the graph, and we're basically done; the policy is specified. This is what the graph looks like once it's actually composed. Next, we create the policy from the second stakeholder's point of view, which is what we call the host manager. In our toy example, for some reason, this host manager only cares about connectivity between compute nodes, not about connectivity between availability zones. She wants to define a policy that allows SSH and ICMP traffic from VMs instantiated on compute node CNC32 towards VMs created on compute node CNC3. We do that by again creating an empty input policy graph through our canvas. We go into the graph and drag from the label tree the labels that correspond to CNC32 and CNC3. Once we do that, we create an edge between CNC32 and CNC3, and we specify that ICMP and SSH traffic is allowed between these two endpoint groups, from CNC32 towards CNC3. Then we save the graph, and we're done again; the policy is specified. From here, the next step is to deploy the graph. To do that, first we open a couple of different windows on the left to show you the original input policy graphs we just created, so you can keep them in mind before composition.
Here at the top, we're showing the host policy graph, and we show the zone policy graph underneath. From there, we compose the graphs. We go into the policy graph canvas on the right, we select compose, and from there we just have to select the two policies that we want to compose, which are the ones we just created: the host policy and the zone policy. We click compose, and the composition is done. This is what the composed graph looks like on the right. From here, you can see that our PGA service automatically creates a unified, conflict-free graph. Had there been any conflicts in the composition of these two policies, it would have shown us a pop-up telling us there was a problem that needed to be addressed. In this case, there were no conflicts, so the graph is created. The last step, once we've composed, is to deploy that into the system. So the last thing we do is go into the canvas, select deploy graph, and select the recently composed graph for deployment. From there, we select the graph and click deploy. Once this happens, we're basically done; the policies have been deployed down to the infrastructure. In this case, Policy Canvas automatically creates the required security groups in Neutron with the desired rules, and automatically applies those security rules to the VMs that match, depending on the different EPGs that were created using our labels. So here we're just going into the security group tab, where you can see that the groups were already created automatically, with rules that make sense according to the graphs. Then we show one of the VMs and how the specific rules were applied to that particular VM.
Next, we do a quick connectivity test for one of the policies, to show you that this thing is actually working. We're going to test the availability zone policy: HTTP traffic between VM1 and VM3. VM1, remember, is hosted in AZ1, and VM3 is hosted in AZ2. We're tailing the log of the web server running on VM3, so you can see when the GET actually hits the server. Then we launch a wget from VM1 towards VM3. Because it was specified in the policies, it should go through, and it actually does. But if we do the reverse test, an HTTP GET from VM3 towards VM1, it should not go through, because it wasn't explicitly allowed in the input policy. And indeed, when we do that wget from VM3 towards VM1, it doesn't go through. We could do the same test for the other two policies, but in the interest of time, we're going to skip those. The last thing I want to show you is how our Policy Canvas automatically discovers new VMs, assigns them to specific EPGs depending on the characteristics of the VM, and applies the actual security policies without us having to do anything; everything is done automatically. In this case, we use the Horizon view to create a new VM that belongs to AZ2. We click through a couple of default options and hit Create. When that happens, like I said, our Policy Canvas automatically identifies the VM, tags it with the appropriate labels, decides which security groups apply to it, and actually applies them. Here we're seeing the VM being created, and the last thing is just to show that the security groups were actually applied to it. So that's what we have for the demo; we're going to jump back to the slides.
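The automatic discovery step just described can be sketched as a tiny classification function: a new VM's properties map to labels, and the labels select which security groups to attach. The zone-to-group table and all names here are hypothetical, purely to illustrate the flow.

```python
# Hypothetical mapping from an availability-zone label to the security
# groups that the deployed policy graph created for that EPG.
ZONE_TO_SG = {"AZ1": ["sg-az1"], "AZ2": ["sg-az2"]}

def classify_vm(vm):
    """Tag a newly discovered VM with labels derived from its properties
    and pick the security groups implied by those labels."""
    labels = {vm["availability_zone"], vm["host"]}
    groups = ZONE_TO_SG.get(vm["availability_zone"], [])
    return labels, groups

# The demo's new VM lands in AZ2, so it gets AZ2's security group
# without any manual step:
labels, groups = classify_vm({"availability_zone": "AZ2", "host": "CNC3"})
```

In the real system, the detection comes from Congress noticing the new VM and notifying the driver, which then performs this label-to-group association via Neutron.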
When we think about future work, or next steps, for our Policy Canvas: I don't know if you noticed, but many of the things we showed have to do with ACL-like policies, allow and deny in the network. We're thinking about being able, for example, to extend the canvas so that we can import policies that were not specified through the canvas itself. For example, an application may have created security rules directly in Neutron. We should be able to load them into our Policy Canvas, express them in the same graph format, and use the composition algorithm that PGA provides. We're also thinking about, for example, composing newly added Neutron features: QoS specifications on specific links should be fairly easy to implement in this graph-like abstraction. And finally, for example, being able to express the port-level connectivity that is implied in these ACL-type rules. If you're allowing a connection, that assumes the connection already exists, and you are providing the allow entry on top of it. We see a way to easily express port-level connectivity in the graph, so that policies like "this VM must be connected to this particular network" can be expressed. So I'm just going to hand it back to you, JK. Okay, so let me switch gears a little bit here. So far, we talked about how to express policies and compose them into one big composed policy graph. Once we have it, we need to push it down to the network data planes to enforce the policies. But the data planes can have many different forms: hardware, software, edge, core. And the question is whether a given network can implement my policy correctly or not, because different data planes have different capabilities and limitations.
So it would be great to have some common, clean abstraction to capture such different properties of data planes. Taking one step further, if we can actually program the policy directly into the data plane, then we don't need to retrofit our policy onto a fixed data plane; we can redefine the data plane in such a way that the policy can be enforced in the best way. And finally, once you have a policy enforced in the data plane, you want to verify it, which means you want some kind of complete network visibility. For that, there are a few industry-wide open-source efforts, starting from the OpenFlow Data Plane Abstraction; the Open Compute Project's Switch Abstraction Interface, SAI; and P4 as a language to program flexible data planes. For now, let me talk a little bit about P4 and its benefits for policy enforcement. P4 is a high-level language to define the data plane. It's a kind of C-style language in which you define how a table operates in terms of match and action, and you can also define the control flow between different tables. It provides an abstract forwarding model, so it's protocol-independent and target-independent. It's an open-source approach under an Apache 2 license, and it's been adopted pretty well by the community. For example, HP's OpenSwitch network operating system replaced Open vSwitch with P4 as the default data plane emulator for development and testing, and the Open Compute SAI has adopted P4 as an abstraction language. With P4, we can do things like this: you can pick the protocol best suited for your policy, for example VXLAN or whatever encapsulation protocol, and remove all the others that are not really needed, or you can define your own custom protocol for service function chaining.
You can also instrument the switches to embed switch information, like the switch ID, port ID, or the latency experienced by the packet, directly into the data packet, so that you have complete visibility of the network, the path, and much more; you can easily verify the policy that way. Traditionally, to push a policy down to the data plane, we use controller APIs to populate the rules of fixed data plane tables. But with P4 you can define your best data plane program in the P4 language and compile it onto the programmable data plane, whether hardware or software. The compiler can also auto-generate APIs for the controller to populate the rules in the new data plane. So this could be a better way to push your policy down to the data plane. With that, let me hand over to Jun to conclude. In this talk we showed our Policy Canvas effort, which provides a simple and intuitive abstraction for writing our policies, just like drawing a graph. It is portable, and we can automatically create all the low-level commands from the high-level graph model and deploy them to the infrastructure. As for our current status: we have a working GUI on the OpenStack Horizon dashboard, we have the PGA service and all the relevant APIs, and we have two drivers, Neutron and Congress, for deployment. We also have another graph compiler in OpenDaylight NIC. In terms of collaboration with the OpenStack community, we need more use cases and feedback on what we can express in graph form, and we would also like to contribute our code to the OpenStack community. Okay, thank you. And now we can take some questions. Thanks, really interesting stuff. I'm a little confused about what you call your label tree, because when I think of labels I don't think of a tree at all, and I'm trying to understand what kind of constraints come along with it. I know you talked about it early on, but you went through it really fast.
Could you maybe just spend a quick minute again on that? Sure, that's a great question. You can start by thinking of a label as the common tagging or metadata mechanism used in, for example, Docker or many policy management systems; even SELinux, Security-Enhanced Linux, allows users to define policies using high-level labels. So labels are just a kind of tag or metadata that can be mapped to any endpoint, and there must be a mapping mechanism from labels down to actual endpoints. OpenDaylight actually has such a label mapping service, which provides a kind of database, or a set of query services, so that whenever a virtual machine or a network endpoint shows up in the system, its properties, like location and owner, can be expressed as labels. We are leveraging that, so we can express the policies purely in labels and later enforce them down to the infrastructure using such a mapping system. The label tree is something else: it captures the relationships between labels, so that we can proactively analyze the policies written in labels and compose them. There is some inference going on there; it's actually an ongoing effort. For example, the location label tree we showed in the demo is automatically constructed by reading the databases of Nova and Keystone, because that's where the availability zone and compute node information is stored, and the relationship between a compute node and its availability zone is already available in the database. We extract this information and model it in a tree data structure. And there are relationships between different types of labels, like location and tenant ID, that cannot be modeled as one tree, which is why we have multiple trees. The relationships between labels belonging to different trees are modeled in a separate mapping data structure. Yep. Hey, thank you, very good presentation.
One question I had: are there plans to do policies across regions, where I could define a policy and then select instances from multiple regions? Yeah, so the algorithm and the abstraction itself don't limit that; it's actually designed in such a way that we can apply the same policy across different regions. If different regions have different region managers, they have their own regional policies, and you probably want to define a tenant policy or application policy once and enforce it down to the different regions. Yeah, that's definitely possible. We haven't implemented it that way so far, because we were not able to model regions in our small environment, but theoretically it's definitely possible. You could build a layer of abstraction on top for multi-region. Thank you. Right, then thank you very much. If you have any other questions, feel free to grab us at the podium. Thank you.