Okay. Good morning everyone, let's get started. I'm Shlomi Al Fassi from ConteXtream, which is an HP company; we were acquired a few months ago. What we have done for HP, and now within HP, is implement an SDN controller based on OpenDaylight, and what I'm going to show today, in about half an hour, is how we implement one of the SDN use cases: service function chaining, which is one of the six use cases we are currently working on. I also want to show you how our implementation differs from the way other vendors currently implement service function chaining. As for the agenda, I will start by explaining what service function chaining is. Some of you were probably in the previous session, so I will try to go a little deeper into what service function chaining means. I will show several approaches: the traditional one that existed before SDN, and the one currently used in OpenDaylight. After that I will describe our solution, which is subscriber-aware, and explain the difference between our implementation and the previous ones. Once we understand what subscriber-aware service function chaining is, I will go even deeper into our solution: what we at ConteXtream have done and how we solve the problems that being subscriber-aware introduces. I'll start with the very high-level definition from the IETF, the dictionary definition of the term. Service function chaining describes the way we introduce a set of service functions into the network and how we steer traffic through them. A service can be a firewall, load balancing, lawful interception, any kind of service the carrier wants to offer its customers.
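To make that definition concrete, here is a minimal sketch of a chain as an ordered list of service functions a packet is steered through. The function names and packet shape are invented for illustration; this is not any vendor's code.

```python
# Illustrative only: a service function chain modeled as an ordered list
# of functions that a packet is steered through in sequence.
def firewall(packet):
    # Toy policy: only allow web traffic through; None means "dropped".
    return packet if packet["dst_port"] in (80, 443) else None

def lawful_interception(packet):
    # Toy tap: mark the packet as mirrored and pass it on unchanged.
    packet["mirrored"] = True
    return packet

CHAIN = [firewall, lawful_interception]

def steer(packet, chain=CHAIN):
    """Pass the packet through every service in order; None means dropped."""
    for service in chain:
        packet = service(packet)
        if packet is None:
            return None
    return packet
```

The whole talk is about how, and for whom, that `CHAIN` list gets built and how real traffic is steered through it.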
So when we say service function chaining, we need to understand where we want to place it and when it should run. In a carrier environment, the left side here is where the traffic of each subscriber enters: it could be a mobile network that goes through the PGW, a cable network, an optical network, and so on. Each of these networks has its own edge router before the place where we want to introduce the service function chaining. On the right side, you can see that this traffic can go either to the internet or to an internal application of the carrier itself; for example, here we can have an IMS application that the carrier offers its customers. And in between you can see a set of service functions: content filtering, video optimization, caching, and so on. All of these are services the carrier wants to offer its customers, and we want them to be part of each of the chains. As I said, I will go over several approaches to implementing service function chaining, and the traditional one is what we had before SDN: physical boxes. If the carrier wants to introduce a new service, it has to put the new box in line with all the others. In this example, if you want to introduce content filtering, you have to place it somewhere in line with all the others, and the traffic of every subscriber, on its way from the mobile device to the internet, has to pass through this new service. All of them are physical devices; this is what we have had so far. And the problem with this approach, as you can probably imagine, is that every subscriber is affected by any new service introduced into the system.
It doesn't matter whether a subscriber is even entitled to use this function or not; his traffic will go through it either way. Another problem the carrier has with this approach is that each of these service functions needs to be scaled to the maximum, because all subscribers run through every function, which means the bandwidth each function has to handle is the full bandwidth of all subscribers. You cannot introduce a lightweight function, because all the traffic has to go through it. And one more problem: when one of them fails, everything is blocked. They are all in line, so if one fails, traffic cannot continue and everything stops. Okay, so that is the traditional approach. Then SDN came along, and I'll show how SFC is actually implemented in ODL. ODL is OpenDaylight, an open-source controller, and it has its own plugin for SFC, which follows the IETF spec. ODL uses something called NSH, an encapsulation added to the packet itself in order to distinguish between chains along the way. Here you see a set of blue boxes. The SFF, the service function forwarder, is in this case the switch; it knows how to identify the NSH encapsulation and forward the traffic according to it. The classifier is the point where the subscriber's traffic enters the network. Whenever the classifier sees new traffic for a subscriber, the first thing it does is classify that traffic to a specific NSH ID. Once the traffic has been classified, the packet is marked with this ID, and all the elements along the path know how to forward the traffic according to this ID.
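That ID is carried in the NSH service path header, which per RFC 8300 is a 24-bit Service Path Identifier plus an 8-bit Service Index. A small sketch of how that word is packed and updated (a simplification for illustration, not ODL's actual code):

```python
import struct

def encode_sph(spi, si):
    """Pack a Service Path Header: 24-bit Service Path ID, 8-bit Service Index."""
    assert 0 <= spi < 2**24 and 0 <= si < 2**8
    return struct.pack("!I", (spi << 8) | si)

def decode_sph(header):
    (word,) = struct.unpack("!I", header)
    return word >> 8, word & 0xFF          # (spi, si)

def after_service_function(header):
    """Each SF (or its proxy) decrements the Service Index before re-injecting."""
    spi, si = decode_sph(header)
    return encode_sph(spi, si - 1)
```

An SFF can then look up (SPI, SI) to decide which service function, or which next switch, gets the packet.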
So in this example, SFF1, because the packet carries the purple SFC encapsulation, forwards it to SF1, and when the traffic comes back, it forwards it to the next switch, and the next switch forwards it to the second function. In this approach, the classifier is pre-configured with a set of NSH IDs, each of them mapped to a set of services in a chain. And this is statically configured: you have a set of IDs that map to a set of chains, and it does not change dynamically according to the traffic in the system. Whenever you want to introduce a new chain, you have to reconfigure the classifier and reconfigure each of the service functions to identify the new NSH ID and to know what to do with it. So these are the capabilities and challenges of NSH. One of the benefits you get from this RFC is metadata: in order to forward data better along the chain, NSH lets you add metadata to the packet to tell other elements in the chain how they should forward, so they can make a better decision than from the NSH ID alone. The next element in the chain can use this metadata to forward differently. Also, when you are using NSH, the transport between the switches is transparent to you; the basic headers are used to forward traffic between the switches. So those are the advantages. The challenge in this approach is that every network function, or service function in this case, needs to know how to handle this NSH encapsulation: it needs to encapsulate and de-encapsulate the header on the packet, and not all of them currently know how to do that.
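One way to picture that burden is a thin shim that strips the encapsulation before handing the payload to a legacy function and restores it afterwards. This is a toy sketch under the simplification that the "header" is just a fixed 4-byte prefix; it is my illustration, not the actual SFC proxy implementation.

```python
HEADER_LEN = 4  # pretend the NSH encapsulation is a fixed 4-byte prefix

def legacy_service_function(payload):
    # A pre-existing function that knows nothing about NSH: it only
    # ever sees and returns a plain payload (here it just uppercases it).
    return payload.upper()

def proxy(packet):
    """Strip the encapsulation, run the NSH-unaware function, re-attach it."""
    header, payload = packet[:HEADER_LEN], packet[HEADER_LEN:]
    return header + legacy_service_function(payload)
```

Without such a shim, every service function itself has to parse and rebuild the extra header.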
You cannot expect a network function that is already deployed and running, and that is not going to be changed, to have a new protocol introduced into it, so it is very hard to use existing network functions as part of the chain. This can be solved by an SFC proxy, which knows how to encapsulate and de-encapsulate the packet. The proxy runs between the SFF and the service function: it handles the NSH and passes on to the service function a regular packet, like any other packet running in the network. That is from the service function's point of view. Another problem is that OpenFlow does not support NSH, at least as of version 1.5, which means you also need to manipulate the packet somehow in order to use existing switches. So because of all these limitations, we at ConteXtream decided to go with a different approach, which is subscriber-aware, and in the next slide I will explain what it means to be subscriber-aware. So let's start with this flow. Here on the left side you can see the PGW, which in the mobile network is the place where all the subscriber traffic enters the list of service functions, and in this setup we have a set of available network functions: TCP optimization, video optimization, content filtering, and analytics collection.
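Being subscriber-aware means each subscriber gets their own ordered subset of these functions, rather than one fixed pipeline for everyone. A toy sketch of the idea (subscriber names and chain contents are illustrative):

```python
# Each subscriber is classified to an ordered list of service functions;
# functions absent from the list are simply bypassed for that subscriber.
CHAINS = {
    "subscriber-a": ["tcp_opt", "video_opt", "url_filter"],  # skips analytics
    "subscriber-b": ["tcp_opt", "url_filter"],
}

def next_hop(subscriber, current=None):
    """Where to steer this subscriber's traffic after `current` (None = ingress)."""
    chain = CHAINS[subscriber]
    if current is None:
        return chain[0]
    i = chain.index(current)
    return chain[i + 1] if i + 1 < len(chain) else "egress"
```

A function that appears in no subscriber's list receives no traffic at all, which is exactly the bypass behavior described next.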
If you remember the first slide about the traditional approach, in order to carry traffic between the PGW and the firewall, we had to go through each one of these hops. So now let's say we have two subscribers, the purple one and the red one. We want the purple one to use the TCP optimization, the video optimization, and the URL filter, and to bypass the analytics. And we have another subscriber, the red one, who uses only the TCP optimization and the URL filtering. In this approach, neither subscriber is affected by the other: since neither of them uses analytics at all, the analytics function gets no traffic, and the traffic of these two subscribers runs in parallel without interference. Now I want to go a little deeper and see how the data flows. In the previous slide we had the controller, which configures the traffic inside this network in order to achieve these two chains, and once this is configured in the switches, we can see here how the traffic and the data actually flow. I'll deep-dive into only one of the hops, just to see how we use the switch tables and the rules inside an OpenFlow switch to forward the data. So I'll illustrate the traffic from the TCP optimization to the URL filtering. All the v-switches here are running, and actually not just the v-switches: the VNFs and the controller as well, all of them are running inside OpenStack. OpenStack is not aware that we are running something special that forwards traffic; to OpenStack we are just a set of VMs that it loads. We use Nova to load them, and Neutron to configure the tunnels, a set of OpenStack tunnels between data centers if we need to cross data centers. So we start with the TCP
optimization. The VNF sends back traffic for a specific subscriber, and the first thing we do is go to a subscriber switch. This subscriber switch is a regular OpenFlow switch with a set of tables; the only difference between the subscriber switch and the tunnel switch, which I will talk about later, is the size of the tables. Both use the same mechanism of a match on one side and an action as a result, like any OpenFlow rule, but the subscriber switch handles millions of rules, because in a large-scale setup where the carrier has millions of subscribers, each of these subscribers has its own rules in the switch, and we need a very efficient switch that knows how to handle large-scale tables. The current switch implementations are optimized for data-center applications, which do not have too many five-tuple matches to look up, since they just forward to a limited set of servers that does not change dramatically; it is pretty much static. When you go to subscribers, subscribers are connecting and disconnecting, rules are added to and removed from the table, and you need a very efficient switch that can handle that. One could say we can use the top-of-rack switch, or install something on the hypervisor switch, but the implementations we tried were not efficient enough. So at ConteXtream we have our own v-switch that runs in a VM and can handle up to 100 million subscribers with very low latency, and by latency I mean something on the order of tens of microseconds per subscriber. Now that we know which table the subscriber switch uses, we need to understand how it actually forwards the data. We identify a subscriber by its
IP address, and maybe the VLAN it came from. So the first thing we do in the subscriber switch's rule table is try to match the IP and the interface the subscriber came from. Once we have identified the subscriber, the action we need to take is to go to the next VNF, so we go back into the forwarding engine, and in this case we may decide that we want to go to a different data center, so we need to identify which tunnel to use in order to continue. So we have a tunnel switch, which is actually the same v-switch with different tables; in this tunnel switch we match only on the interface we came from, so it is not a large-scale switch, and the action of the rule in this tunnel switch is where to go: which hardware switch and which tunnel to use in order to forward to the next VNF. In this particular example you can see the subscriber switch and the tunnel switch as separate, but we can basically have them in the same OpenFlow switch; it depends what access we have to the network. If we can configure the hypervisor switch, the v-switch in the hypervisor, and the top-of-rack, we may consider putting the tunnel switch there and using the high capacity of those elements. If we cannot, for whatever reason, we can join these two switches into one process and combine the subscriber switch and the tunnel switch into a single switch. Each of them uses different tables, so there is no overlap in the information, and there shouldn't be a problem running both of them together. Okay, so once the tunnel switch has decided, in this case, that the traffic should leave the network and go to a different data center and a different OpenStack, we forward the traffic to the hardware switch, and from the hardware switch it just goes along the same path until it gets to the URL
filter. Okay, so what are the challenges? It's nice to say that we want to handle millions of subscribers, but there are challenges that come with that. The first challenge is the number of subscribers, the number of rules we want to introduce into each of the OpenFlow tables; I talked about the capacity issues switches have with this number of rules. And it's not just the number of rules, it's also the rate of change: in a mobile network, where users are connecting and disconnecting, it can reach tens of thousands of changes per second, and that is something a subscriber-aware approach needs to handle. I will show later how we solve these problems. Multi-data-center: as you know, OpenStack today has some issues with multiple data centers, and we need to overcome them to forward between data centers; we use a federated controller between them in order to move traffic efficiently across data centers. And since we are a carrier-grade solution, we need very efficient high availability and redundancy: every element in this solution needs to be redundant, starting with the VNFs, the controller, and each of the switches, and all the connectivity between elements needs to be redundant and highly available. I have a slide later on how we manage this. Another challenge is reusing VNFs. Even though they are virtual network functions and we can load as many as we want, in the end we want to reuse a VNF across a set of chains; we don't want to introduce a network function per chain, this VNF should be shared between chains. And we want to use physical functions as well: we want our chain to be agnostic to physical or virtual, and this approach just forwards the packet to the element without even knowing whether it is virtual or not. And this way
we can reuse the network functions we currently have, and we are not forced to change all of them just because of this new approach. This subscriber-aware approach also overcomes the limitation of the ODL approach, which uses NSH: in our solution there is no need to understand, encapsulate, or de-encapsulate the NSH header on the packet; we forward it just according to IP, MAC, any traditional protocol we have. Okay, advantages. Now that we are finished with the challenges, what do we gain from being subscriber-aware? We are using the SDN fabric to do it, which means SDN knows how to steer the traffic to a specific network element very transparently, and that is an advantage for us. We can enable and deliver services per subscriber, which means we can introduce a new set of chains very easily and very dynamically, and this is not just to satisfy customer needs; it is also for introducing new services. As a carrier, you want to test a service before you provide it to your customers, and once you isolate a specific chain just for your testing, no one else is affected by it, which is very important. Security as well: the same way you can introduce a new service chain, you can disable a specific service chain without affecting all the others. In the traditional approach, if you stop one of the elements, it can block the entire chain; with NSH, a chain is probably shared with many other subscribers, and changing it will affect all of them. With subscriber-aware, everything is very easy to automate, very elastic. I won't go over all these details; we covered them in the previous slides, so I would just be repeating myself. Okay, so now that we understand a little more about subscriber-aware, let me go deeper and show you how we at
ConteXtream and HP solve this problem, and what we have done so far to make it a carrier-grade solution. A little bit about the architecture. We have the network layer: on the left side you can see the mobile network, in the middle we have the underlay, which is the connectivity between our elements, and then the network itself. Between these two we have the overlay network, which can be configured by OpenStack; it is a set of tunnels, whatever tunnels we want, it could be MPLS, GRE, VXLAN, or whatever, and it connects the OpenFlow switches on the two sides of the network. On top of them we have the ConteXtream controller, which is an SDN controller based on OpenDaylight, and it is a federated controller, which means it can reside in several different data centers but still act as one; these controllers know how to talk to each other to get better performance and better utilize the network. The controller configures each of the switches using OpenFlow. That is what we have right now, but basically, as long as there is an API to the switch, we can configure it any way we want. Now, I said the controllers are federated, but there is no direct connection between one controller and another; we use a mapping service, which uses a distributed database, to propagate information from one controller to the others. The information in this database can be large-scale tables, like subscriber tables, which are measured in millions. A single data center can be located, say, on the east coast of the United States and handle a portion of the subscribers, while another one on the west coast handles the other portion. The mapping service uses LISP, the Locator/Identifier Separation
Protocol, and we want to be able to identify each subscriber across all data centers, so we use a distributed database that knows how to propagate all the information between the nodes. Each ConteXtream controller accesses a specific context map, which is an interface into the mapping service. Each of these triples of switch, control, and map we call a context node. A context node is located in a single data center, and as you can see, these two data centers can communicate with each other. The issue we have with OpenStack when working with multiple data centers is solved by this approach: we use the capabilities of a distributed database to propagate information between two or more OpenStack instances. As for the northbound side of this solution, we have a set of brokers. It could be a REST broker, or AAA if we want to identify and learn about new subscribers from a AAA server; it could be RADIUS, it could be Gx/Diameter, and we have a broker for each of them. We have a context management component that exposes the information, like any other management UI or CLI, and shows the configuration of this entire system, and we have a performance tool that collects analytics and additional information from our system. On top of all this we use OpenStack; it doesn't necessarily have to be OpenStack, but this is what we use right now to create all these instances. Each of these blue boxes here actually runs in its own VM, and we use Nova to create an instance, to remove one, and to manage their life cycle. In addition, as I mentioned earlier, for all the overlay connectivity, the tunnels we use, and the connectivity between the components
themselves internally, we also use Neutron to configure all this connectivity. Now, is this suitable as a carrier-grade solution? The major thing we want to handle is failures and high availability. In the regular case, say a rack is handling a set of VMs that belong to applications which are not carrier applications; in case of a failure, high availability is most of the time solved by the applications themselves. If you are running in Amazon and one of the racks fails, Amazon knows how to handle the failure and the applications know how to recover from it. In the carrier situation, you cannot afford to lose a rack, because once you lose that rack, millions of subscribers lose their connection. So we want to make our solution very efficiently highly available. I showed you that each context node has a control, a map, and a switch; in order to make it highly available, within the same node we can have a set of instances of each of these elements, which gives us N+1 all-active instances of each element. Whenever there is a map failure, we know how to connect to a different map, and once that map comes up again, the distributed database knows how to sync it. As for the controller, it is a federated controller, and when one of them goes down it does not affect all the others, because the work is split among the other controllers. As for the switches, we use redundant connectivity between them to make sure that in case of failure we won't be affected. Okay, so I talked about a controller that can fail and how we make sure it does not affect us. In each one of
the nodes we have a leader, and this leader's responsibility is to split the work among all the current controllers in its node; this way we reduce the load on a single controller. The leader does a very simple job: it identifies which controller should handle the work that needs to be done. We want the controller that configures a switch to be very close to that switch, in the same hypervisor, the same node, so we need to select the most appropriate controller to do the job. That is what the leader does, a very simple job of dispatching the work to the other controllers. For high availability, in case one of the controllers fails and stops working, the leader identifies it and simply stops forwarding work to that controller, so nothing is harmed by a controller failure. As for the connectivity between the switches, we have multiple underlay paths for redundancy: we cannot allow a single path failure to stop all connectivity, so each of these elements has at least two redundant paths. The same goes for the brokers: if we are learning about new subscribers from RADIUS, we need multiple RADIUS listeners; each broker is highly available, and each of its instances can do the same job. Now for VNF failure: we do not configure just a single path for each subscriber, we configure two or more, so in case one of the VNFs fails, the alternative path starts working, because the switch identifies that the path to that network function is not accessible and uses the second path. This way we achieve high availability
for all the elements we have, and this is the level of availability a carrier-grade solution should have. Okay, so that is high availability. How do we gain improved scalability? We are talking about having several context nodes: in order to better utilize the network, we don't want a single node handling all the millions of subscribers in one place; we want them as close as possible to where the subscribers actually are. This is a distributed solution, so we can introduce several hundreds of context nodes, and in each context node we have, as I showed before, a set of controllers. Since the ConteXtream controller is OpenDaylight-based, it is Java-based, and Java has its own limitations regarding garbage collection and memory footprint, so in the same controller VM we can have several JVM instances running the controller code, each of them limited to a smaller memory footprint, in order to reduce as much as we can the effect of garbage collection on the entire system. As for scaling out VNFs, since everything is a VM we can scale out as much as we want: we can introduce new VMs, collect them as a group, and just forward the traffic to them, and we can support up to a thousand VNF interfaces. That is pretty much it. We are running a demo downstairs at the HP booth that shows exactly what I showed here, how dynamically we can add new services to the system per subscriber. Thank you everybody; if you have any questions... I can hardly see with the light in my face. Okay, they'll give you a microphone.

Q: My question focuses on the ODL controller layer. The federation you mentioned across multiple data centers, it's all ODL, so I guess it's essentially a clustering capability, right?

A: Yeah, the federation.

Q: The capabilities you have here, federation or clustering, high availability: are these efforts within HP, or are some of them part of ODL community projects, with upstreaming, so that some of this capability is going to be part of standard ODL?

A: The federation solution we made for the controller is not something that is part of ODL. ODL has a clustering mechanism to distribute information between instances of a controller; we are not using that clustering. We are using, as I mentioned before, the mapping service in order to federate the knowledge about what exists and what does not exist in the system, about statuses, subscribers, and things like that. ODL as a platform is currently not federated the way we need it.

Q: Okay, that's clear, thank you. Then the SFC part of it: you mentioned that you're not using NSH and basically route based on MAC and IP address to identify the subscribers and do personalized chains. The same question: does this part of the SFC have anything to do with what's going on within the ODL project?

A: No, our SFC is different. The SFC I mentioned here is a plugin, part of the OpenDaylight repository; ours is external to that. We have considered contributing our solution to ODL, but it's not something we have done so far, although we potentially expect to.

Q: Okay, great, thank you.

Q: Do you have right now some kind of mechanism to provide assurance for service chaining? Because you can have a chain, yes, but the performance of this chain can degrade. Do you have some kind of mechanism for that right now?

A: Are you looking for overload of the chain, or something like that, to understand where you want to split it? What do you mean?

Q: For me, the chain is like a transport. I just care about the sequence of VNFs I should pass through, but that doesn't mean I have a guarantee that my packet will be handled well.

A: We can troubleshoot to see where the problem is, but the first thing the controller does is configure this chain specifically for you, for your IP, and it shouldn't be interfered with by others. If there is some kind of failure in the network that blocks your traffic, we have troubleshooting to see it, but you shouldn't be affected by others, so a failure will probably harm only your session. And we have a mechanism that identifies the state and accessibility of each network function, so if your network function fails, we know how to move you to a different network function, with the same capabilities of course, and maintain your session as much as we can.

Q: Okay. So how do you identify subscribers? You mentioned IP address; are you using anything else in addition to the IP address?

A: Yes. I mentioned the AAA broker. The PGW, for example, notifies our system with a RADIUS message about a new subscriber introduced into the network, and as part of this information we usually use the IMSI, the MSISDN, and the IP address, but basically we can use whatever we like. That is the basic concept of LISP, the Locator/Identifier Separation Protocol: it's not just IP. IP is the traditional way to identify a subscriber, because most of us use IP, but basically it could be whatever you like. So we identify subscribers in our system by IP plus IMSI, because in NATted networks you can have the same IP address for different subscribers, so uniqueness is achieved by also using the subscriber's IMSI. Okay, so how do I unwire myself?
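The uniqueness point in that last answer can be sketched in a few lines: keying the subscriber table on (IP, IMSI) keeps two subscribers distinct even when NAT gives them the same IP address. The values below are invented for illustration.

```python
# Behind NAT, two subscribers may share an IP address; keying the
# subscriber table on (ip, imsi) keeps their rules distinct.
subscriber_table = {}

def learn(ip, imsi, chain):
    """Install the per-subscriber rule, e.g. on a RADIUS accounting message."""
    subscriber_table[(ip, imsi)] = chain

def classify(ip, imsi):
    """Look up the chain for this subscriber; None if unknown."""
    return subscriber_table.get((ip, imsi))

learn("10.0.0.7", "001010000000001", ["tcp_opt", "url_filter"])
learn("10.0.0.7", "001010000000002", ["tcp_opt"])  # same IP, different IMSI
```

With IP alone as the key, the second `learn` would overwrite the first subscriber's chain; the composite key avoids that.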