So, hi everyone, welcome to the NetSeminar. Today's talk is from Vyas Sekar. Vyas did his PhD at CMU, then went to Stony Brook University, and a few months ago he went back home to CMU, where he is now a professor. His work has been on internet video, middleboxes, and security, and today's talk is about SDN and middleboxes: a few ideas on how we can integrate them, both through backwards-compatible mechanisms and through some new SDN abstractions for middleboxes. Before we start, just a note: we're going to have a NetSeminar next week as well, with Edward Barton from Princeton, and I'll send more details in the next few days. Thanks Yiannis for the introduction, and thanks for hosting me. It's great to be here. This talk is based on work that's really driven by two of my students, Seyed, who's now a student at CMU, and Zafar. So really all the hard work is done by them; I'm just the messenger here. This has also been done in collaboration with Jeff Mogul at Google and Minlan Yu at USC. What really prompted or motivated this line of work on middleboxes is a fundamental disconnect between what we teach in Networks 101 and the reality out there in the real world. What we teach in Networks 101 is that the network is a dumb network: it's just pushing bits around, just basic plumbing. But then people want security, they want performance, they have all sorts of policy-compliance requirements. So how does that actually happen inside the network? The reality is that there's a lot of complex in-network processing going on: a lot of stateful middleboxes and network appliances, things like intrusion detection systems, firewalls, proxies, application gateways, and the whole gamut of complicated processing happening inside the network. 
So to understand the reality, we started by doing surveys of operational networks. The graph here is from a survey of 57 network operators at NANOG, and the table is data from one large enterprise network that we surveyed. The staggering statistic, at the bottom of the table, is that the number of middleboxes, the firewalls and so on, is almost on par with the number of traditional routers and switches in this enterprise network. So in some sense these seem to be a critical part of the infrastructure, paralleling traditional networking equipment, but as a community we seem to have ignored them for a while. In fact, if you look at market surveys, the market for just network security appliances is close to tens of millions of dollars. And of course this is not just an artifact of the one large enterprise we looked at. We did a much broader survey and found that across different market segments, large enterprises, small enterprises, ISP networks, there is a consistent trend where the number of middleboxes, all these diverse boxes, is comparable to the number of traditional switches and routers. So this is a consistent trend across many different network settings. Now, clearly people perceive a lot of value in these middleboxes: operators turn to these boxes for the critical security, performance, and policy-compliance requirements they have to meet. But at the same time, managing these networks is a lot of pain. In the same surveys, we talked to people and asked about this, and we found they actually spend a lot of effort managing the network around these middleboxes. In fact, the large enterprise we surveyed told us they have dedicated teams for each kind of appliance: a WAN team, a load-balancing team, a firewall team, an IDS team, and so on. 
And managing these middleboxes causes a lot of problems. For example, service-chaining policies often end up misconfigured, because operators have to set up crazy static routes inside the network. A lot of failures are caused by middlebox overload, because these middleboxes are not just simple switches; they're doing fairly deep packet inspection, sophisticated stateful processing, so overload causes a lot of failures. This has also been seen in Microsoft's data centers, where they find that middleboxes are often implicated in a large fraction of failures and outages. And people do spend a lot of money employing personnel to manage these middleboxes. The chart at the bottom shows the reasons why each of these middleboxes fails. This is from a recent survey where we asked: what percentage of problems, what percentage of your time, do you spend dealing with these kinds of issues? Something like 63% of their time is spent on this. So really the motivating question for us came from this: these middleboxes are here to stay, and there's a lot of value that people perceive in them. What can we do, as the academic community and as the SDN community, to help simplify middlebox management? We take the canonical view of SDN: you have a logically centralized controller, the administrator gives some high-level policy, and these are compiled down into flow-table rules, say using something like OpenFlow. That's the canonical view of SDN. What we're really asking is: imagine an administrator comes up with these more complex service-chaining policies, saying, oh, web traffic needs to go through a firewall, an IDS, and a proxy, in that particular order, in that particular policy sequence. Can SDN help the administrator simplify the management of all these complex policies? That's really where we're coming from. 
And in some sense we're not alone in recognizing this challenge and opportunity for SDN. There's a market survey that was done sometime in 2012, I think, of vendors and CIOs, and people really did feel that a lot of the value they perceived in SDN was at a much higher level than, say, traditional routing and forwarding. The value they saw was in these more advanced functions and services, beyond the canonical ones, and some of them felt this was a critical need that SDN could fulfill for them. So really what we think is that this is both a necessity and an opportunity for SDN. It's a necessity in the sense that here is a critical piece of the networking market that's not quite in the SDN fold yet. And it's an opportunity because now we can demonstrate, from an SDN point of view, a tangible value that SDN can provide network operators in these different market segments. That's really where we're coming from on the SDN side of things. At a high level, what makes this problem hard, or interesting, or challenging, is that these boxes, say a proxy or an IDS, introduce new dimensions to SDN that are not quite traditional routing and forwarding. Our focus in SDN has mostly been access control, routing, forwarding, very canonical network functions. But these boxes introduce new dimensions that go beyond those traditional functions. For example, as I mentioned, you have this requirement of policy-based steering; the application world calls it service chaining, we call it steering or policy composition. The other dimension they introduce is that you now need new resource-management mechanisms: how do you deal with middlebox failures, how do you deal with middlebox overload, and so on. So there are new objectives, requirements, and goals that they introduce to SDN. 
And finally, these boxes are proprietary, they do a lot of sophisticated logic inside, and they introduce hidden actions and hidden transformations on packet headers that are not exposed to SDN mechanisms. That creates a challenge in integrating these boxes into the current SDN fold. So that, in a nutshell, is what makes this problem hard: you have these new dimensions beyond traditional routing and forwarding tasks. Are you including tunneling in this last one? Meaning, by new dimensions, are you talking about mechanisms? Tunneling is a potential mechanism to solve some of these problems. Well, it's also a mechanism that can cause some of these problems. Certainly, tunneling definitely causes some of these problems. In fact, a lot of the way people deal with these middleboxes today is to create a lot of complex tunnels, and that does create a lot of problems. But at this high level I'm not assuming tunneling is creating the problem; there are much more first-principles ways to think about adding these boxes to the network and asking how that creates problems. I was just asking for an example of those. We will get to that in a minute. So what we've really been doing, and I think we're just scratching the surface here, is quite preliminary work. One piece is a system we built called SIMPLE, and it's really a backwards-compatible way of enforcing service-chaining policies over these middleboxes using existing SDN mechanisms, like, say, OpenFlow 1.1, and legacy, unmodified middleboxes. So it's asking: OK, what can we do for the world of today? And going forward, in this process, we said, OK, let's try to push the limits of the current world. 
We did hit some bottlenecks, and these were fundamental bottlenecks that we couldn't quite solve with a purely backwards-compatible approach. So we have a new extension to SDN called FlowTags, which is really about how to handle some of the proprietary, hidden actions of middleboxes, and which allows a cleaner way to integrate them into the SDN fold. So this talk is really about these two pieces of work: one a legacy, backwards-compatible mechanism, and the other more forward-looking. Say you're thinking of NFV, you're thinking of service chaining: what's a clean way to integrate middlebox processing functions into the SDN fold? OK, so in the rest of this talk, I'm going to first motivate the design of the SIMPLE system, which is the traffic-steering solution, then go into FlowTags before I summarize. At a very high level, SIMPLE is, as I mentioned, a policy enforcement layer. Imagine you have the service-chaining policies that an administrator comes up with, saying which traffic needs to go through which boxes. What SIMPLE lets the administrator do is specify this chaining policy at a very high level, and it then compiles the policy down into flow rules at the data-plane level. So it's an in-between layer sitting at the network controller that translates these high-level policies into an actual physical realization, and the key here is that we're going to use legacy middleboxes. You can bring any middlebox you want, and we're going to use existing OpenFlow mechanisms, no extensions whatsoever. So what are the challenges? Let me elaborate on what made this interesting or challenging. The first challenge is that we're looking at something beyond just forwarding: we're trying to layer a policy-composition logic on top of routing. 
So, for example, in this simple topology you have two switches, S1 and S2, and three middleboxes, and what the administrator wants is to chain these middleboxes such that the packet goes through the firewall, the IDS, and the proxy, in that particular sequence. So we have this sequential composition logic that we want to impose on the network. What could go wrong? It seems simple; it seems like I should be able to do that. The problem is that as the packet goes through the network, the same packet arrives at switch S2 multiple times, coming in on the same interface with the same headers, and then you don't quite know what to do with this packet. Switch S2, just by looking at the headers, can't decide whether it needs to send this packet to the destination or, in this case, to the IDS. Because you only have flow-level rules that match on headers, you can't implement this composition logic using existing OpenFlow rules. In fact, there is a logical loop in this network, because the same packet traverses the same link multiple times, and the simple flow-level rules that you would naively use would not work, even in this toy example. So that's one challenge we came across in trying to use existing OpenFlow rules: here's a simple example showing that plain flow rules may not suffice. The solution to this problem is actually not that hard. The real insight is that we need some mechanism to distinguish different instances of a packet. Conceptually, the packet that arrives at switch S2 has different incarnations, and we need to be able to distinguish these incarnations. So when we generate rules in the data plane, we need to make sure we can distinguish them. 
So conceptually, what we need is a logical tag that lets you identify the processing state of the packet. When the packet comes in, it's in an unmodified, native state. Then, as it goes through the network, we tag it and say: OK, it's gone through the firewall, it's gone through the IDS, it's gone through the proxy. Once you have these additional logical tags, switch S2 can distinguish the different incarnations and use the processing context, the processing state, to decide what the forwarding action should be. In this case, because I know the packet is post-proxy, I know I should forward it to the destination; but if it were post-firewall, I would send it back to the IDS. So you're breaking the loop by adding state inside the packet header itself. And the nice thing is that we can implement this mechanism very simply using existing OpenFlow features: we can use existing header fields, the switches can just add a few bits, and the controller can install the actions that add these tags so that they track the processing state of the packet. So this is about how you do composition in practice. In this network, are you thinking of these labels on a per-flow basis? Is the chain the same for all the packets? In this case, you can think of it as per-flow or per-packet; it doesn't really make a difference. Sorry? And you'll have too many rules to handle, at least? Not really; you can actually do a lot of this proactively. Because the controller knows the topology and knows what service chain, what physical chain, it's using, it can pre-compute and say: this particular tag means post-proxy for this particular chain. It doesn't need that many tags. In some sense, the number of tags you need is really the length of the processing chain, not much more than that. 
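To make the tagging idea concrete, here's a minimal sketch of how a controller might pre-compute tag-based rules for one chain. The function name, rule format, and action strings here are all hypothetical, purely for illustration; this is not SIMPLE's actual implementation nor real OpenFlow syntax.

```python
def compile_chain(chain, egress_switch):
    """Pre-compute tag-based forwarding rules for one service chain.

    chain: ordered list of (middlebox, attached_switch) pairs.
    Tag i means "the packet has completed the first i middleboxes",
    so the number of distinct tags is just the chain length plus one
    (the untagged, native state).
    """
    rules = []
    prev_tag = 0  # tag 0 = unprocessed packet entering the network
    for i, (mbox, sw) in enumerate(chain, start=1):
        # Packets carrying the previous tag go to the next middlebox,
        # and are re-tagged on the way, so the S2-style ambiguity
        # (same headers, same interface, different incarnation) cannot arise.
        rules.append({"switch": sw, "match_tag": prev_tag,
                      "actions": [f"send_to:{mbox}", f"set_tag:{i}"]})
        prev_tag = i
    # Fully processed packets are forwarded to the destination.
    rules.append({"switch": egress_switch, "match_tag": prev_tag,
                  "actions": ["forward_to_destination"]})
    return rules

rules = compile_chain([("firewall", "S1"), ("ids", "S2"), ("proxy", "S2")],
                      egress_switch="S2")
```

With the firewall-IDS-proxy chain from the example, this yields four rules matching on tags 0 through 3, and S2 can now tell a post-firewall packet (tag 1) from a post-proxy one (tag 3).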
So the policy chain, or rather the tags, are basically the state of the packet? It's the processing state of the packet that's being exposed to the network, and conceptually the number of tags we need is really the length of the chain. So this is about how we do composition in practice. There's actually a more subtle problem. The subtle problem, as I mentioned, is that you have new resource-management requirements when you handle these middleboxes. In particular, one common cause for these boxes to fail is that they cannot handle the load sent to them; they get overloaded. So the natural thing you want to do is balance the load across different middlebox instances. In this network we have two IDS instances, IDS1 and IDS2, and what we want is to split the load between these two boxes such that each one gets half the traffic. And again, SDN is nice here, OpenFlow is nice, because it lets you do an in-network load-balancing solution. Otherwise, what people would do is put a dedicated load-balancer box in front of each kind of appliance to do the load balancing for it. So SDN is nice because it lets you do in-network load balancing. The challenge, though, is that when you have a lot of these policies and a lot of these load-balancing requirements, you start hitting limits on the number of rules you can install in the available TCAM in the switches. Even though newer switches are coming out with more TCAM space, we're really talking about roughly a few thousand rules, and if you have a large network with many of these policies, you would easily run into problems where you don't have enough TCAM space to install all of these rules. 
So the challenge is: can we do this load balancing in a near-optimal fashion while not violating the resource constraints of the switches? You have a joint problem where you have to deal with switch constraints as well as middlebox constraints. Yes? So by optimality you mean optimizing the load balancing? So here, optimality is with respect to the load balancing across the boxes. In this case, the optimal solution is a 50-50 split in terms of middlebox load; that's really the objective function, and the switch rules are the constraint, saying: oh, I have only 4,000 rules, do optimal load balancing given that you only have 4,000 rules. You could formulate it the other way around as well. But how can the IDS or firewall or proxy handle the special tag header? A third-party box cannot look at a special header like a tag. Oh, so you're asking about the previous case? Yeah. Actually, no, the tags have no semantics for the middleboxes. The tag semantics are purely for the switches in the data plane. In fact, the switch can remove the tag before it sends the packet to the proxy. So this solution is completely oblivious to what the middleboxes are doing; the middleboxes may never even see the tags. In the second half of the talk we'll get to a setting where middleboxes do need to understand tag semantics, but hold on for that. In this particular case we just want to do forwarding, and the tag semantics are completely invisible to the middleboxes. So the question is: we have multiple objectives and constraints, we have boxes across which we want to do load balancing; how do you do it in a near-optimal fashion? And ideally, what you need to solve is some sort of joint optimization problem. 
You say: I have switch constraints, I have middlebox constraints, I have some policy requirements, I know what my topology and traffic patterns look like, and I want the controller to set up the data-plane rules in such a way that it does not violate the switch TCAM constraints and balances the load across the middlebox instances. That's the formulation of the problem. It turns out this is theoretically hard, and many people have seen this in multiple settings: when you start talking about these TCAM constraints, you get integer, discrete constraints, and the minute you have integer or discrete constraints in an optimization problem, it very quickly becomes NP-hard. In fact, in our case it's conceptually even harder, because given a particular configuration, say, do 50-50 load balancing, we can't even figure out whether that configuration is feasible or not; I can't even tell you whether there is an option. So our solution to this resource-management problem rests on a simple but effective insight: you can decompose what looks like a theoretically hard problem into an offline stage and an online stage, and in the offline stage we deal with the hard part. So you have these hard switch constraints, like a hard limit on the number of rules a switch can handle. 
We can do the hard optimization offline, making sure you have a feasible set of rules: we pre-compute, given these traffic patterns and this topology, what the feasible paths and feasible rules are that you can set up. We leave the load balancing to the online stage; that is, if the traffic changes, how do I redo the load balancing? That's a much simpler problem, just a simple linear-program optimization, and it can be solved really fast. So the insight is that we can decouple this problem into an offline and an online part: we deal with switch constraints in the offline part and with the resource management of the middleboxes in the online part. We're exploiting the structure of the problem to get a practical solution. Is this clear? Yeah. So in some of the policy-chaining examples you have looked at, could you give us a sense of how quickly this actually blows up? How many switch TCAM rules do you really need to service-chain a simple example like the IDS one? Yeah, good question: do we actually hit these rule limits in practice? Just to give you a back-of-the-envelope calculation: imagine you have a network with 100 ingresses and 100 egress nodes, so 100-squared ingress-egress pairs. Now imagine you have, say, 4 different policy classes for each ingress-egress pair, like: traffic from me to you has to go through policy one, traffic on port 22 goes through policy two. Now you have 4 N-squared already, so 4 times 10,000. Some core routers in this network might actually need rules for all possible pairs, so even doing wildcard rules you would need something like 4 N-squared rules, and that quickly blows up. But there's a hierarchy built into this, right? You're talking about N-squared being services, as opposed to the ingress-egress pairs that you're actually interested in. So you could maybe have a network just handle the 
service level, and then go down to the flow level, and maybe even sub-flow level. Certainly, you could do some of those things; that's exactly what Nicira and most NFV people propose. Some of these things, in the fabric model: you have a fabric at the center and do all the complex rules at the edge. That's something you could do, but you may, for example, already have middleboxes in your network. If you're doing a clean-slate design, say a data center where you can do whatever you want, maybe that's an option; but if you actually want to deal with legacy boxes that are already out there, and you want to load-balance over that existing infrastructure, that may not necessarily be an option. But yes, certainly, the edge decomposition is another thing you could do. Most load balancers are essentially PCs, so the edge doesn't have to be physically the edge. I mean, it's not conflicting; I think it's complementary. Also, the hardware-switch TCAM resource constraint is a very OpenFlow 1.0 view of the world; there are many more tables in a switch, which the later versions of OpenFlow have subsequently exposed through an abstraction, so that resource constraint may just not be real. I don't deny that: you can have more interesting match tables, you can have sequences of tables, and there are more optimized ways of building these TCAMs as well. But the deal is, it's a view that a lot of people have. And it may not be real, because it's just an old view. It's possible; I'm not a hardware person. In some sense, for the modeling, if a switch does, let's say, a million rules, this problem becomes easy for me, and I'm happy if this problem becomes easy. But you can always come up with an example where there are many 
more fine-grained policies. Imagine, for example, you're running a cellular network and you want a policy for each customer, or each class of customers. I can always come up with more and more complex policies that will hit the rule bottleneck, unless you say the limit is infinity. So it's nice to have a systematic solution for this particular framework, and then you say: well, if you have more rules, it just means this particular framework has more degrees of freedom to do the load balancing. Think of it this way: I'm giving you a building block, and the building block may have more degrees of freedom. If newer hardware gives me 10x more rules, fine; using that, it can do a better job. But the point is well taken. I'm not saying there are only 2,000 rules in the hardware; there could be 20,000, there could be 200,000. Yes? Can we summarize all the middlebox actions as a kind of computing action? There's a forwarding action and some kind of computing action, these middleboxes can potentially be used for those, and your tag basically tells some entity, or the switch, what kind of computing action it should take. Is that the concept? Not in this part; you're describing the previous part. OK, but I'm talking in general: for forwarding you're looking at the flow ID, but you could potentially look at the forwarding ID and also at some policy action, and that policy could be any of those middleboxes. Sure, so you're saying the rules could be coarser than per-flow. In fact, we're doing wildcard rules, not per-flow, micro-flow rules. But you're saying the number of rules could track the number of policy classes rather than the number of actual physical flows; certainly, that's something we can do. All of these are cool optimizations we could do on top of what we're doing. 
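The offline/online split described above can be sketched roughly as follows. This is a toy formulation of my own, not the paper's exact optimization: assume the offline stage has already produced, for each traffic class, the set of middlebox instances reachable within the switch TCAM budget, and the online stage only rebalances load across them, here with a greedy approximation of the min-max-load objective rather than a full linear program.

```python
def rebalance(traffic, feasible_instances):
    """Online stage (sketch): assign each traffic class to the currently
    least-loaded middlebox instance among its offline-computed feasible set.

    traffic: {class: volume}
    feasible_instances: {class: [instance, ...]}  # from the offline stage
    Returns ({instance: load}, {class: instance}).
    """
    load, assignment = {}, {}
    # Place the largest classes first so the greedy split stays close to even.
    for cls, vol in sorted(traffic.items(), key=lambda kv: -kv[1]):
        inst = min(feasible_instances[cls], key=lambda i: load.get(i, 0.0))
        load[inst] = load.get(inst, 0.0) + vol
        assignment[cls] = inst
    return load, assignment

# Hypothetical traffic mix; every class may use either IDS instance.
load, assignment = rebalance(
    traffic={"web": 40, "mail": 30, "dns": 20, "ssh": 10},
    feasible_instances={c: ["IDS1", "IDS2"]
                        for c in ["web", "mail", "dns", "ssh"]})
# Each IDS instance ends up with half of the 100 units of traffic.
```

When traffic volumes shift, only this cheap online step reruns; the hard, TCAM-constrained path computation stays fixed, which is the point of the decomposition.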
Here's a third, even harder challenge: these middleboxes may do hidden things. For example, a NAT may rewrite headers, a load balancer may rewrite headers, and a proxy may break connections and set up new connections to external sites. If there are these dynamic actions that are hidden from the SDN controller, how do you set up the rules such that your policy chain is implemented correctly? Imagine you have two users and you want different service-chaining policies for them: user one needs to go through the proxy and the firewall, while user two goes through just the proxy and is then allowed out to the internet. It's like saying: OK, faculty here don't have to go through all that. Now, what happens is that the proxy breaks the connections and spawns new connections to the internet, so at switch S2 you don't know which is the green flow and which is the blue flow. What you see are completely new flows, and you've lost the context of what each flow originally was. This is actually a much more fundamental problem. Here I'm talking about it in the context of service chaining, but it turns out these hidden actions of middleboxes also cause a lot of problems for diagnosis, attribution, and so on. We'll revisit this in the second half of the talk, but for now, even if you just want to do service chaining, these dynamic actions cause a lot of problems. The way we address this problem borrows an idea from the security world. In the security world there's a problem called stepping-stone detection, which is basically: imagine an attacker who compromises your machine and then uses it as a stepping stone to attack another machine, so instead of sending the 
attack flow directly, the attacker uses an in-between hop and sends the attack traffic through that. The solution in the stepping-stone world was: to detect that the intermediate machine was compromised and being used as a stepping stone, I can correlate the flows coming into it and the flows going out of it, correlating the packet payloads and the timings, to infer that it's being used as a stepping stone. We use that insight: you can treat this as an instance of stepping-stone detection. What we want to do is take the packets going into the proxy and coming out of the proxy and correlate the payloads of these flows, and this is something the controller can do using some lightweight algorithms. Once the controller detects, using this payload-similarity algorithm, that, you know what, green was really red and blue became orange, it can reactively set up the right set of rules, micro-flow rules, at the switch, to make sure the policy is not being violated. Again, this is a somewhat expensive process, because you have to collect the first few bytes of the stream, send them to the controller, hold them there, calculate the payload similarity, and then install the rules. But if you want to do it in a perfectly backwards-compatible manner, this is a reasonable solution to the problem. It's an approximate solution? It's an approximate solution, absolutely; it's not always accurate. This is really the stumbling block we hit, and we said: OK, we need a cleaner solution to integrate these dynamic actions, which is the second half of the talk. Do you have to assume that you have a single enterprise? Sorry, I should have said: all of this is in the context of a single administrative domain. If it's federated, I just hold up my hands and say, sorry, I cannot help you. All of this is a single administrative domain; that's a different rat hole I'm not going into.
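A hedged sketch of the flow-correlation idea: compare the first bytes of flows entering and leaving the proxy, and pair each outgoing flow with its most similar incoming flow. The difflib-based similarity score and the flow names are illustrative only, not the paper's actual lightweight algorithm, and like any payload-based scheme this fails by construction on encrypted traffic.

```python
import difflib

def correlate(incoming, outgoing, threshold=0.8):
    """Map each flow leaving the proxy to its likely originating flow.

    incoming/outgoing: {flow_id: first_bytes_of_payload}.
    Returns {outgoing_id: incoming_id} for matches above the threshold.
    """
    mapping = {}
    for out_id, out_bytes in outgoing.items():
        best_id, best_score = None, 0.0
        for in_id, in_bytes in incoming.items():
            # Similarity of the byte prefixes; a stand-in for whatever
            # lightweight payload-similarity measure the controller uses.
            score = difflib.SequenceMatcher(None, in_bytes, out_bytes).ratio()
            if score > best_score:
                best_id, best_score = in_id, score
        if best_score >= threshold:
            mapping[out_id] = best_id
    return mapping

# Toy example: the proxy re-originates "green" as "red" and "blue" as "orange".
mapping = correlate(
    incoming={"green": b"GET /index.html HTTP/1.1",
              "blue":  b"GET /video.mp4 HTTP/1.1"},
    outgoing={"red":    b"GET /index.html HTTP/1.1",
              "orange": b"GET /video.mp4 HTTP/1.1"})
```

The controller would run something like this reactively on the first few bytes of each new flow, then install the micro-flow rules that reattach the correct policy context.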
So are you further assuming that the traffic is not encrypted? Good question. Yes, here we presently have to assume the payload is not encrypted, because we can only do payload similarity; if the payload is encrypted, we cannot do it. So this has fundamental limitations: it cannot handle encryption. It looks more like the security setting, where it's becoming inexact. Exactly. If you want a clean solution, this is not it, but it's an OK solution for many common cases. It seems like a better solution would be virtualizing the proxy and splitting it, or something like that; but of course you're trying to be backwards compatible. We have a different solution, but that's also a potential option; we can get to that. OK, so putting the pieces together, what we really had were three building blocks addressing three fundamental problems we hit when integrating middleboxes into this SDN framework. One is the resource manager, which handles setting up the rules such that the load balancing is near-optimal without violating the switch constraints. We have the rule generator, which carefully sets up the forwarding-table rules on the different switches so that you avoid loops when you have this policy-composition logic. And finally, we have the reactive module, which informs the rule generator: it collects the packets, computes the payload similarity, and then sets up the rules. OK, I won't have too much time to go into the results, but the short takeaway is this. We implemented this using POX and Open vSwitch, with the older versions of OpenFlow, and what we show here is one of the benefits of doing this fine-grained load balancing using the SIMPLE system. The y-axis is the maximum middlebox load, normalized with respect to optimal, so one is the 
best you can achieve — you want to be as close to one as possible. The two bars show the SIMPLE system, and the other option, which is the status quo if you can't do any of this correctly, is to do something like the fabric approach: do everything at the edge, meaning you do all the middle box processing at the edge before traffic goes into the core of the network. It turns out the ingress-based solution is not always optimal for load balancing, because you could have a particular ingress that carries a lot of the load, so you might actually need, say, 7x more middle boxes in New York or San Francisco just because there's too much traffic there. So it's not optimal from a load balancing perspective, whereas SIMPLE achieves close to optimal.

I think that's a good argument. The thing about the edge is that your computational capacity scales linearly with your computation — it also means you have to put 7x more resources there — the traffic is generated by CPUs, and the CPUs' computation scales linearly. I think you need more of a data-center-centered view of the problem. I think your edge there is really a data center, Open vSwitch view; this is more of an enterprise-network view, where the edge is more like: I have a site in Santa Clara, I have a site in San Francisco, and one in New York. With that view of what the edge is, the edge doesn't quite scale up. I would say an aggregated edge rather than a sharded edge.

So this is really SIMPLE. There are many more results — the controller is scalable, the accuracy of the flow correlation is actually not too bad, close to 90% for realistic patterns, and so on — but I won't have time to go into that. In a nutshell, what the SIMPLE system did was push the envelope of how far you can take SDN to integrate these boxes; we came up with these three backwards compatible mechanisms to do that. So in the next
half I am actually going to zoom in on one problem we saw when we did SIMPLE, which is the dynamic actions of middle boxes: they may change headers, they may change payloads, they may create sessions — they may do all sorts of crazy stuff. This is where we took a step back and looked at the original SDN work; we actually looked at Ethane. Ethane nicely sets up the tenets the network should satisfy — it came up with three tenets, and I'll focus on two of them. One tenet is this notion of origin binding: at any point in the network, if you have a packet, you should be able to determine the origin of that packet — which host, which particular physical machine sent it, or which user was authenticated to send it, and so on. There has to be this fundamental notion of origin binding, because if you want to do attribution or authentication or any sort of policy management, you need the ability to bind a packet to its origin. The second fundamental tenet was that the path you set up in the SDN network should follow the policy: I shouldn't have to hard-code all these crazy rules in the network; the policy mandates what the path should be. And what we find in the context of these dynamic actions of middle boxes is that they fundamentally violate these two tenets. If a NAT rewrites headers, and that rewriting logic is proprietary to the NAT and not exposed to the world, you easily violate origin binding. Similarly, with the proxy example that creates sessions, we don't have a policy-mandated path, because you don't know how to set up the path given the hidden actions of these boxes. So it's kind of interesting: we stepped back and asked where these problems are stemming from, and the real problem was that these middle boxes are violating these SDN tenets, and
as I mentioned, it's actually a very fundamental problem. One problem: suppose you have a security system and you want to do some counting logic — say I want to detect whether your machine has been compromised and is sending a lot of outgoing connections. What I would do is put an IDS there and have it count the number of scans you're sending out. The problem, of course, is that if there's a NAT in front of it, your counting logic is totally wrong at this point: you're really looking at the count for the NAT, as opposed to the count for the host I'm trying to detect.

The other problem is that diagnosing what's happening inside the network becomes very hard. There's been very cool work on in-network debugging out of Stanford, but if you think about this problem in a network with middle boxes, even if you had every possible log and every possible packet trace, you could still not solve this problem, because it's difficult to correlate the packets. NDB has a nice word for the reports a packet sends to the controller — postcards. It's difficult to correlate what these postcards might be, because the header you see at S1 is very different from the header you may see at S2. Imagine you have a load-balanced network and a particular user is seeing a very high page load time, and you want to diagnose what went wrong: maybe I need to scale up the firewall, maybe I need to scale up the load balancer, maybe I need to scale up the servers. In this case it's difficult to correlate the network logs, because the headers look very different at every hop on this path. I'm not sure NDB is the good comparison — I was using NDB's analogy. If I were architecting a network box, I would think along the lines of Google's Dapper, which is sort of an implementation of X-Trace: it tells you precisely, for a flow, how much time was spent in
the network, at the middle box. Even something like X-Trace would not work here, because you need to correlate from the host through the network; you would need something tracking the provenance of the flow through the network. I was just using NDB as an analogy — you're right, this is not a problem NDB is seeking to solve, so I'm not trying to say this is a limitation of NDB. So again, even if you want to do network diagnosis, these middle boxes make things very hard, and this is something I think even the service chaining group at the IETF has recognized — it's difficult to diagnose a chain, and so on.

And finally, there's a very subtle example where these dynamic actions may violate your policy, and you may not even know they're violating it. Imagine you have two users, and you want to block user 2 from accessing some website xyz.com. The way you do it is: you have a proxy for accelerating performance, and then you bought a separate web access control filter to block people from accessing that particular site. What could happen in this case is that host 1, user 1, sends a request, and the proxy may cache the response — ah, it's xyz.com, let me put it in my content cache — and when the next request, from user 2, comes in, it will actually get the cached response back from the proxy. This is completely opaque to us in the network, because we don't know what the proxy is actually doing. So what we see here is that this lack of visibility could actually cause violations of the policy itself. And you can come up with a lot of such examples where the hidden and dynamic actions cause problems for network management.

Of course, at this point there are many possible solutions you could think of. You could say, why don't I place the middle boxes in the
right place? Why can't I do tunneling, such that the middle boxes directly do this composition? Why can't I do consolidation, where I run all the middle box functions in one box, so I don't have the routing problems? Or you can do correlation, like what SIMPLE did. I think of each of these as patches — band-aids for a particular symptom — but they're not really solving the root cause. Remember, the root cause is that these middle boxes are violating the SDN tenets: they're violating origin binding, and they're violating the path-follows-policy tenet. What we really need to do is address the root cause, as opposed to coming up with band-aids for particular scenarios.

I would argue that with the policy violation problem, you're trying to solve a problem that exists in the hardware network with middle boxes already — which is kind of surprising to me, because you're not just re-implementing what we already have; you're actually trying to solve a much more fundamental problem. Even if you're just wiring things together, even in the physical network, you still have the same problem — it's a much more fundamental problem that we stumbled on. He said tunneling just recreates that same physical wiring, so it's not going to solve the problem. Tunneling has the same problem — it existed before tunneling — and tunneling actually makes some of these things harder. Tunneling makes diagnosis even harder: if you start putting in these tunnels, you don't know how to correlate what happens, because you have many more black boxes inside the network. But it virtualizes the switches and makes sure that the paths are the paths you express. Sure, but it doesn't solve, for example, the dynamic proxy problem, or any of the others.

So on this point — the high-level idea we came up with, and it's sort of, in
hindsight, obvious: these boxes are fundamentally causing the problem, so why not ask the boxes to tell you what they did? Why not have the middle boxes help restore these tenets in some way? In the proxy context, for example, this is really the only option you have — you cannot do correlation; there is no other option. What we want these middle boxes to do is expose this missing contextual information, so you have the causal context of what the packet went through. We want to expose that as what we call flow tags: each flow is contextually tagged with its processing context, or its provenance, and we need that to be exposed by the middle boxes. For example, the NAT may tell you, here is the public-to-private IP mapping; or the proxy might tell you, this packet was the result of a cache hit, and this other packet was the result of a cache miss — and now it's up to you to do what you want; I'm exposing the processing context to you. And you can imagine that now you have an enhanced SDN controller that, in addition to controlling the forwarding actions, is also going to control this tagging logic: it says, I need these middle boxes to expose this context to me. So you also need the mechanism to control the tagging logic, and the tags can be used by both switches and middle boxes.

So let's see a more concrete example of how flow tags can be used. Going back to the example where you want to block a subset of users from accessing the internet and let others go through: the nice thing now, with our tag-based solution, is that the administrator can write the policy in terms of the original principals he had in mind, rather than having to reconstruct and reverse-engineer what the NAT would have done. I can just say: I really want to
block 192.168.1.1 and 1.3, and that's my mental model for writing the policy. What the controller does is set up these tag-based rules at the middle boxes and the switches. First we need to generate the tags: in this case, the NAT maps packets from 1.1 to tag 1, 1.2 to tag 2, and so on, and the NAT writes these tags into the packet. The switch is a consumer of these tags: it says, if I see tag 1, I need to forward it on; if I see tag 2, I can just send it to the internet — and these actions can be set by the controller. And finally, the other middle box is a consumer as well: it needs to decode the tag back to the origin that sent this packet. So in some sense, switches are passive consumers of the tags, while middle boxes are both generators and consumers. Of course, we do need modifications to middle boxes to support tag generation and tag consumption; switches can just use existing OpenFlow, as long as the tag is encoded in some header field that's OpenFlow-compatible.

So this is the architecture we envision going forward, if you want a clean integration of middle boxes with these hidden actions inside SDN: an architecture with a new API between FlowTags-enhanced middle boxes and the controller, or network operating system, to control this tagging logic. Again, the reason we decouple the middle boxes and the switches is that these are two very different classes of vendors, and we don't want to couple their innovation paths: we want middle box innovation to be independent, and switches continue to use the existing flow tables and existing OpenFlow mechanisms, as long as they can match on the tag bits. So we have these new FlowTags APIs between the controller and the middle boxes, new policy mechanisms, new steering logic, and new verification mechanisms that
use the tagging logic, and of course we do need modifications to the middle boxes that produce and consume these tags.

Is there any way to just infer the mappings from the configuration? Like, you know that this firewall blocked some previous packets for these two hosts, and you infer that this one would have been forwarded and it wasn't? So that's getting back to the correlation idea I was talking about earlier. You could do a lot of these things; the problem is it's somehow unsatisfying — it fundamentally doesn't solve the problem, and you don't know if you got 100% accuracy or not. If you really want 100% accuracy, you want a clean solution. You could come up with fancier machine learning algorithms to do this correlation, but I think it doesn't fundamentally solve the problem of these SDN tenet violations.

How is the tagging mechanism different from, say, virtual networks — your tag 1 is virtual network 1, your tag 2 is virtual network 2? The concept of tagging is not new; a tag, or a label switch, is a plumbing mechanism. The question is what semantics the tag is trying to capture. In our case, the tags need to come from the middle box — this was source 1, or this was a proxy cache hit; it's the semantics. Having a tag in a network is not new — nobody would claim that's new — but what semantics the tag captures is very different depending on the application. The difference versus virtual networks — we can discuss this offline, but I think the point is: the idea of tagging itself is not new, it's been in networking for a long time, but what semantics the tags are capturing is kind of subtle.

Okay, so there are actually a lot of practical challenges in realizing this architecture. I unfortunately won't have time for all of them, but I can point you to the paper. So one question is what
semantics should these tags capture — I told you at a high level, but how do you systematically generate these tags? Second, how do you build a practical FlowTags-enhanced controller? And finally, a more practical question: is it possible for a middle box vendor to support FlowTags? How easy is it to modify an existing middle box — suppose you have the source code, how easy is it to add FlowTags support? This is really key for adoption: if the vendor has to completely re-architect how they wrote the middle box to support FlowTags, I think that might be a non-starter. But we actually show that it's not that difficult.

So, the semantics. This looks like a complex figure, but really, to capture the semantics of the tags we need to understand all possible dynamic data paths a packet may go through. Going back to the proxy/ACL example, we need tags that say what the origin is and what the current processing context — the processing state — of the particular packet or flow is. The transitions between the different nodes in the network capture the origin and the processing context, and that is the semantics your flow tags must capture. In this case, the transition on the edge from proxy to ACL says that if it's host 1, whether it's a hit or a miss, you need to send it to the ACL.

There's another practical question: can we actually have enough tags — somebody asked, can you encode these tags in practice? It turns out we can reuse tags across flows, we can reuse tags spatially, and we can do it with around 16 bits. We have to repurpose something in the header, be it the VLAN field, the IP ID field, or the IPv6 flow label field, but it's possible; the requirement is not too high — it's like 16 bits. So this is the semantics, and again
this goes back to the tenets. If we think about it in the context of service chaining, we need a richer abstraction for what service chaining means: you need to actually capture these dynamic transitions within the service chain for NFV or service chaining to make sense, to avoid the policy violations we were talking about. You can't just have a static policy chain that says web traffic goes to firewall, then IDS, then proxy; you actually need it annotated with the possible dynamic data paths that particular traffic could take. So it's a richer policy abstraction that you need in the context of service chaining.

Could you explain why that actually solves the problem of the proxy serving the content to H2 that it wasn't supposed to get? Because that's really not obvious — it looks to me like the proxy should consult the ACL before it returns the data to H2. So what we're doing is saying that we want to route traffic from the proxy to the ACL irrespective of whether it is a hit or a miss. Oh, okay — sorry, I see it now.

So, what the FlowTags-enhanced controller needs to do — it's a reactive controller, and that's something we fixed on; what we know how to do is a completely reactive system. First, it has to translate this dynamic policy graph into an actual data plane realization — using something like what we did in SIMPLE to translate the policy into an actual physical set of data plane flow rules. In addition, it has to support new event handlers: similar to the packet-in or flow-mod handlers you have in OpenFlow, you need tag-consume and tag-generate handlers in the controller. It also has to handle expiry — when a flow expires, I can repurpose its tag — so you need corresponding flow expiry handlers. So we built custom controller extensions to support the tagging logic. And finally,
sort of here's why I think this is hopeful. We — meaning the students — did a lot of hard work figuring out how you modify these middle boxes to add the FlowTags support code, and it turns out the number of lines of code needed is actually pretty small. And this is us as complete non-experts in the middle box space: we knew a little bit about how these work, but not much, so it took us a month or so to figure out where to add these lines — we had to use control flow graphs and so on — but the actual lines of code are not that many. If you compare it to the source code of the actual middle boxes, the FlowTags support is pretty minimal. And imagine if the developer who wrote Squid, or the one who wrote Snort, had to do it: it would be very quick for them. It took us, non-experts with very little background, a month; for someone like Marty Roesch, who wrote Snort, this must be a piece of cake. So the nice thing is it requires minimal code modification, and this gives us hope that maybe this can be adopted pretty soon.

We have also shown — again, because it's a reactive controller, there's a natural question about how scalable this controller is — that with a few optimizations, we take very little time to compute the flow rules for each flow. Again, similar to OpenFlow, we're not doing this per packet; we're doing it per flow, or per session, so that's kind of nice in the FlowTags setup, and the overhead of doing it is pretty small relative to the overall flow setup time.

Before I come to the conclusion, I want to talk about a broader research agenda that we've been working on. This talk was particularly about how we can extend SDN and integrate these boxes into the SDN fold, but really, if you step back and look at what are the
kind of pain points with these middle box deployments — it's not just management complexity, which is really the focus of this talk, how you use SDN to simplify the management. There are also other problems, the kind the NFV world is trying to solve: these boxes are expensive, they require high capital investment, and they're difficult to extend. Imagine you're an enterprise and you now want to support people bringing their own mobile phones: the only way for you, as an operator, to support that is to bring in yet another appliance. In some sense, any time your policy changes, the reaction of the operators is to go buy more of these boxes, which increases the sprawl and increases the complexity of the network. So wouldn't it be nice if these boxes were more flexible — say, a virtual system where you can invoke a function on demand?

To address these other concerns, we've done other work. For example — and this comes back to some of the things we were talking about — can you think of the hardware box as a software box instead? If you can decouple the software and the hardware, you can think about a general-purpose, consolidated platform where you can run different kinds of software modules, and we've shown there are a lot of benefits to doing that, both on the management side and on the capital expenditure side. Going even further, if you think about these boxes as just regular compute, you can say: in fact, I can outsource a lot of these functions to the cloud, the same way NFV is thinking about it. If it's just a compute engine — and a lot of people are doing this; you can buy a virtual appliance that has an IDS and a firewall today — can I just offload this whole function to the cloud, and redirect the traffic such that it comes back to me through the cloud IDS?
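As a back-of-the-envelope illustration of this redirection trade-off, one can compare the native path against a one-hop overlay through each cloud PoP. This is only a sketch: the PoP names and latency numbers below are made up for illustration, not measurements from the paper.

```python
def best_redirect(direct_ms, pops):
    """Compare the native BGP path against one-hop overlays through cloud
    PoPs (client -> PoP -> destination). Returns (choice, latency_ms).
    `pops` maps a PoP name to (client_to_pop_ms, pop_to_dest_ms)."""
    choice, best = "direct", direct_ms
    for name, (c2p, p2d) in pops.items():
        overlay = c2p + p2d  # latency of relaying via this PoP
        if overlay < best:
            choice, best = name, overlay
    return choice, best

# Hypothetical numbers: a well-connected PoP can beat the native path
# because internet routing does not obey the triangle inequality.
print(best_redirect(70.0, {"oregon": (25.0, 30.0), "virginia": (40.0, 12.0)}))
# → ('virginia', 52.0)
```

The same comparison is what makes the overlay-routing argument in the talk plausible: the redirect only wins when some PoP's two-hop sum undercuts the direct path.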
and now this solves the flexibility problem and the overload problem, because I get the benefits the cloud gives: I can do elasticity, I can do dynamic policies, I can dynamically invoke new functions, and I just pay for what I use. But the reason people have middle boxes and don't run them in EC2 is that they don't want to pay the expense and the latency. Great question. We've actually shown in the paper that if you have something like an Amazon-style footprint, the latency is actually not that high — in fact you get the benefit of what overlay routing does: sending traffic to a well-connected node in the backbone is actually better than the native BGP path. We show that the latency impact is not that bad; the reason is that a lot of these cloud providers are much better provisioned, so if, instead of going through the native BGP path, you go through a one-hop overlay via a well-connected node, you actually get pretty good latency. I find it surprising to think that an IDS running in Oregon to secure our Stanford Wi-Fi is going to be better than one running locally — but I guess you're right; internet routing does not have triangle inequality routing properties.

So stepping back — as I said, we're just scratching the surface here; there's a broader set of interesting questions and research opportunities that arise. For example, if you think about SDN, what are the right northbound policy interfaces you want to expose? There are interesting questions about what the right policy languages are. How do you automate these extensions to middle boxes — we are doing FlowTags, somebody else wants to do something else; how can we provide tools for vendors to integrate new SDN capabilities into middle boxes? Another thing: thinking about NDB and ATPG and HSA and so on — they're really cool tools, and they've changed the way we think about these problems — but the question is, if you
start thinking about these dynamic actions and these stateful actions, you start hitting bottlenecks with a lot of the existing testing and verification tools. Can you test a whole network with both middle boxes and switches — what does it take to do that? And finally, as I said, the consolidation work shows one way to do it, but the broader question is: what's the right hardware/software platform to support more sophisticated data plane capabilities? And there are many more questions — this is really just scratching the surface.

So, to conclude: the real motivation for us came from looking at the industry surveys and the earlier work we had done. These middle boxes are both a necessity and a challenge for SDN — a necessity, a challenge, and an opportunity, in some sense. It's an opportunity for SDN to show tangible value to the operators who have these advanced functions; at the same time it's a challenge, because it pushes the boundary of the abstractions SDN currently offers. The challenges come from the composition requirements for service chaining, new resource management abstractions, and the hidden and dynamic modifications made by middle boxes. In our work we have taken baby steps towards the problem of integrating middle boxes into the SDN fold: one was the backwards compatible traffic steering solution, SIMPLE, and the other a cleaner architectural solution to integrate boxes with dynamic and hidden modifications. The way I think about it is that this is all part of a broader middle box manifesto: we have ignored these middle boxes and treated them as second-class citizens for a very long time, but maybe it's a good time for us to rethink these kinds of advanced data plane functions — in the context of service chaining, in the context of
NFV — and how these middle boxes are designed and managed from the ground up. I think it's a good time for us to rethink a lot of those issues. So again, thanks for having me, and I'm happy to take more questions.

It seems that the flow tags live in a domain-wide namespace, right? Have you seen any complexity in managing this namespace — what happens when you add a new box, do you statically assign some tags per middle box, etc.? So, right now we're talking about a single domain — are you worried about multiple domains? No, no, a single domain, but you have multiple boxes and a limited space of flow tags; how do you assign them? I see. So we do have some mechanisms to encode the tags: the number of bits you need is really a function of the number of edges in the DPG and the number of possible policy classes. If we have, say, T possible dynamic transitions, it's roughly log base 2 of T, plus — well, we can talk about it offline — that's really the expressivity you need from these tags, and as long as you have that many bits you're fine; in fact you can do a lot of spatial and temporal reuse and bring it down even further. So it's logarithmic in the number of policy classes and the number of edges, the dynamic transitions.

Are you saying that you have structure in the tags, so that we can have a portion of the tag devoted to the IDS and a portion dedicated to the proxy? That's a great question. We don't currently do that, but it would certainly be interesting: you can imagine the tags being a stack of tags, or a hierarchy of tags. Right now we treat it as a flat namespace, because we're trying to optimally encode it inside the 14 bits that we have. But you could certainly imagine that if you have a service chaining header, and that header is longer, you could have a nice hierarchy of tags, or a stack of tags. We don't currently do that, but that's a great observation. As
the network gets congested, or some other sensor triggers and kicks in, how do you influence these tags in real time? So the FlowTags work is actually a reactive controller: as the flow goes through — similar to what Ethane, or the original OpenFlow agenda, was doing — we run a reactive controller. So as long as the controller can track the current state of the network, meaning your network information is reasonably up to date, you can actually handle a lot of these dynamic issues. What's the timing? The timing in the table we had was around 0.03 milliseconds — very small. The real bottleneck is actually pushing the rules to the switches, and how fast the switches can update their rules; the computation at the controller is really, really small, a tiny fraction. It's the round trip to the controller that really kills you, and that's not unique to FlowTags per se — that's the problem with any of these centralized solutions running a reactive controller. The overhead is like 0.03 ms over plain OpenFlow.

Have you considered generating the tags at the end points, when they're bona fide? That's a great question: instead of having these tags generated on the fly, you could have the provenance where the host tells you what it is. That certainly helps a little bit — you can certainly do that — but some of these tags have to come from the middle box. Think about it: there are two things. One is origin binding — who did this packet come from — and there you could have the source tag it and say, I'm IP address 1, and put that in the tag. But if you have these dynamic transitions — say the light IPS triggered an alarm and you need to send the traffic to the heavy IPS, or the proxy decided it was a cache hit and says this is a cached response — those kinds of tags have to necessarily come from the
middle boxes. So it's certainly possible that for some of these use cases we can have tags propagate passively from the end points, but there are certainly examples where we need active tags from the middle boxes themselves. So they could concatenate on the way down, because what was generated upstream certainly has value downstream — and I think that could work with a stack of tags; you could certainly do that.

I think this point is interesting: going back to Ethane, all the policies really should be specified in terms of the user and the application, and if the whole network of middle boxes supported that abstraction, it would give people a very simple, uniform way to express policy across the network. Right — there's a third tenet that I didn't talk about, a tenet Ethane calls high-level names: the policy should be expressed in terms of high-level names, not MAC addresses and IP addresses, because those bindings are dynamic. We didn't go up to the high-level names, because that is a much more radical restructuring of the middle boxes as well — today a ruleset is not written in terms of these high-level names, though it should be. Well, it wasn't too hard to modify the middle boxes to deal with your tags; maybe it wouldn't be too hard to modify them to deal with high-level names. Certainly — Squid, for example, does support some authentication and some abstraction of users, so it would be possible.

I have one more question — sorry, I have a testing background — I just want to know who validates the tag. Say a tag claims this packet was already processed by the IDS, so the packet will again go back to the proxy? No, great question: we are treating the tag generation, whether from the middle boxes or the switches, as coming from a trusted entity, and the controller is controlling the action. So nothing else is trusted? We are assuming the tags themselves can be trusted. But if you imagine there is something
else, like some attestation that the packet went through the box — the end host saying, I am a trusted packet, I went through the IDS, and so on — we don't currently deal with that, but it would be something interesting: the grander vision of some sort of authenticated packet, where the packet tells you, I have been attested by these boxes saying I am okay. Right now we are implicitly trusting that the tags only come from the boxes. One option is that the ingress gateway can zero out the tag and say, the host cannot tell me anything — I don't believe anything the host says. Again, the network itself could be corrupted, but let's assume the routers cannot be corrupted and so on; at least you can close part of the attack surface by zeroing out the tags at the edge. But that is certainly something to look at. Okay, thanks a lot again.
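To make the tag produce/consume pattern from the talk concrete, here is a minimal Python sketch of the NAT/switch/ACL example. Everything here is illustrative: the IP addresses, tag values, and rule tables are hypothetical, and a real deployment would encode the tag in a header field such as the VLAN ID rather than a dict key.

```python
# Toy sketch of the FlowTags produce/consume division of labor:
# the NAT generates tags, switches passively match on them, and a
# downstream middle box decodes them back to the origin host.

NAT_TAG_MAP = {"192.168.1.1": 1, "192.168.1.2": 2, "192.168.1.3": 3}
BLOCKED_HOSTS = {"192.168.1.1", "192.168.1.3"}  # policy stated in terms of origins

def nat_generate(pkt):
    """The NAT rewrites the private source IP but exposes the origin as a tag."""
    return dict(pkt, tag=NAT_TAG_MAP[pkt["src"]], src="4.4.4.4")  # public IP

def switch_forward(pkt):
    """Switches only consume tags: blocked-user traffic is steered to the
    ACL device; everything else goes straight out to the internet."""
    return "to_acl" if pkt["tag"] in (1, 3) else "to_internet"

def acl_consume(pkt):
    """The ACL decodes the tag back to the origin and applies the policy,
    even though the packet's source IP was rewritten by the NAT."""
    origin = {t: ip for ip, t in NAT_TAG_MAP.items()}[pkt["tag"]]
    return "drop" if origin in BLOCKED_HOSTS else "allow"
```

The point of the sketch is that the administrator's policy (`BLOCKED_HOSTS`) stays written in terms of the original principals, and the tag restores the origin binding that the NAT's rewriting would otherwise destroy.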