Okay, so good morning, everyone. Thank you for coming. My name is Yuki, and I am from the CloudBand product management team. With me I have Moshe and Elisha from our engineering team, and we will talk to you about taking telco applications to the cloud: opportunities and challenges. We will go briefly through the transformation NFV is going through, describe the business perspective of EPC, talk to you about actual use cases, describe the architecture and the NFV platform, and then conclude with a demo of root cause analysis.

So, just briefly about Alcatel-Lucent. I'm not sure how many telco guys we have here. Telecom guys? Quite a few. So for those who are not telco guys, we'll tell you a little bit about Alcatel-Lucent. Alcatel-Lucent is a telecom vendor, one of those vendors that sell high-performance, high-reliability boxes, so that when you pick up your phone you can speak to your family or text your colleagues with high quality and with little chance of failure. Back in 2013, Alcatel-Lucent introduced its Shift strategy, and as part of it, it is shifting to the cloud. As part of this, Alcatel-Lucent now provides a full end-to-end NFV solution, starting from the NFV platform, the SDN platform, and the actual telco applications, be it the virtual IMS, EPC, vRAN, or virtual routing. Helping us innovate are our strategic partners in the infrastructure layer.
We have early access to technologies that enable innovation on the NFV front. We also demonstrate our commitment to openness through the CloudBand Ecosystem Program, which allows other vendors, not only Alcatel-Lucent, to leverage the platform and run their applications on top of it.

So, a little bit about CloudBand. CloudBand is a business unit that was founded by Alcatel-Lucent four years ago. It was given a single task, which is to provide an NFV platform, a cloud platform that allows service providers to run telco applications in the cloud: Alcatel-Lucent applications as well as other applications. When we started back in 2011, the term NFV did not exist, and we were quite alone in this area. Only in October 2012 was the first NFV white paper published by a group of service providers, and January 2013 saw the NFV ISG's first meeting.

If you look at what happened on the OpenStack timeline during this period (this is an OpenStack Summit, after all): in Portland, you wouldn't find a person who knew what NFV was. In Hong Kong, there was an NFV mini-summit held in a remote hotel with a handful of participants. Only around spring 2014, pretty much around the Atlanta summit timeline, was the Telco Working Group in OpenStack established, and some awareness of NFV started to be created. For those of you who were in Paris last November, we saw that NFV was really starting to take center stage. Another thing to mention is the establishment of the OPNFV group in late 2014. By the way, was anyone here at the OPNFV day on Monday? Okay. So those of you who didn't come, you did well, because the room was fully booked and you wouldn't have had a place to get in. Which actually demonstrates how top of mind NFV is for OpenStack. So here in Vancouver, if I go around and speak to people, I'm willing to make a bold bet and say that everyone at least knows what NFV is about.

Okay, but you didn't come here to hear about the history of NFV. We're here to talk to you about the transformation NFV is going through and how it affects actual applications going to the cloud. As I said at the start, CloudBand has been in this area for four years. Two, two and a half years ago, we started to see the first customers going into NFV. Those were CTO and innovation teams trying to build an NFV platform along the lines of ETSI NFV, something very pure, very futuristic, which allows whoever deploys this platform to leverage all the benefits of the cloud, elasticity and distribution, and reap all the benefits of OPEX and CAPEX reduction as well as service agility. And this is what we built the CloudBand platform for.

However, in the last 12 months or so, we're seeing a transformation. Operations teams, those teams that have the big budgets in their hands and are in charge of maintaining and growing the existing telco business, are now coming to telco equipment vendors like Alcatel-Lucent and telling us: what we would like to buy from you is not those boxes that you were selling us for all these years. Now we have NFV, and we would like to have those applications as VNFs to run in the cloud. I think this is a huge thing for the industry: at present, NFV is not being pushed only in a purist way by evangelists and CTO teams, but rather from the ground up by the operations teams.

Okay. So we at Alcatel-Lucent have the full portfolio of telco applications. We have IMS, we have EPC, virtual RAN, we have the routing portfolio, you name it.
I arbitrarily chose EPC to focus on. For those of you who don't come from telco, and I saw that some of you don't, I created this slide to explain briefly what EPC is about. When each of you takes your smartphone and wants to browse the web, for instance, the traffic goes through the radio access network on the left and to the web, and of course the other way around. On the way we have several gateways. This seems quite straightforward, but actually it is not that simple. To describe it in a simplified way: when you try to browse the web, the gateway contacts the policy server, the PCRF, which in turn goes to the subscriber database to pull your profile and to authenticate and authorize you based on your credentials, and then you're able to get onto the web. But let us not forget that this is a mobile network, which is why we have this element on the top left, the MME, which is responsible for handling mobility. When you transition from one gateway coverage area to another, this element makes sure that your session is handed off correctly. This entire suite of applications is called EPC, the Evolved Packet Core, and taking it to NFV means that each and every one of these elements needs to be virtualized and run in the cloud.

So now that you're all experts on EPC, let's look at it from a business perspective. What we're showing here are the main drivers for mobile traffic growth in the next few years. First we have VoLTE, voice over LTE. VoLTE is about taking the legacy voice that has been around for the last 20 or 30 years and moving it to a data-based architecture, which allows mobile service providers to discard their old equipment. This application is growing at a 145% compound annual growth rate. To describe it in a more intuitive way, it grows by a factor of about 2.5 each year from 2012 to 2017. Video: we all know video is king. Now we have HD.
We have 4K coming in, and we will have something else coming in after that. This constitutes a huge demand for data, and the networks have to grow to address that. Machine to machine: all those smart meters, cars, all these elements that talk among themselves. This one talks about revenues, but obviously the increase in revenues also means creating networks that are able to address these requirements. And last but not least, enterprise. Enterprise is a huge use case for mobile, and we will talk a little bit about an enterprise use case later on.

So as we saw, this is a huge opportunity. Service providers that are able to address this will be able to grow their business and grab market share. But in order to do all that, we need to enable rapid innovation, service agility, scalability, and quick ROI. We will show in the next few slides how NFV helps us address those requirements, with a few examples.

The first example is the introduction of a new enterprise gateway. Again, as I told you, enterprise is a huge use case for mobile. If we want to do enterprise VPN, for example, one of the best ways to do it nowadays is to introduce an enterprise gateway that terminates the VPN for the enterprise itself. In the old days, you would have to get a box, put it somewhere, and connect it. Now, with NFV around, you're able to do it at the click of a button. This is a completely new operational model that allows you to introduce the service and take it out without investing in CAPEX. Service agility specifically is huge: you can order one day and get it the day after.

That was the first example. Second example: scalability. Capacity needs to be scaled, sometimes temporarily, sometimes permanently. The use case you see here on the right is a stadium, for example a public event, where capacity needs to be scaled up for a limited period. In the NFV era, we're able to monitor what's going on, and then, based on the actual KPIs that we have in the network and the actual traffic, we can predict when we need new capacity and grow it automatically. And if it's something temporary, we can take it out again. So from a CAPEX perspective as well, it's not as painful as it once was.

Service chaining. Service chaining is about inserting advanced services in the data plane so that users, either enterprise users or residential users, can enjoy advanced services. What that would mean in the legacy world is that for each such service, you would have to take a box, put it in a data center, connect it to the network, and configure the switches and the routers. And there you have it: you have a service.
So this is heavy lifting, and no one would have wanted to introduce such a service in the past. Now that we have NFV and SDN, creating a service actually means creating a template. Once we want to instantiate it, we do it like that, and then we have all these appliances interconnected, with the SDN steering the traffic along this service chain path. A new service can be applied for one enterprise and for a second enterprise in the same way, or in a slightly modified way. So obviously this enables new business models for mobile operators. For example, you can now offer the service for a trial period. If you want to take it out, no problem: you don't have to repurpose the equipment, and you don't have to cause service outages. Again, something that was not possible in the legacy world.

So I told you about all these benefits, but this session is about opportunities and challenges, so I should speak about at least one challenge. Let's take it from a service assurance perspective and talk a little bit about alarm correlation: what happens when there is a failure. In order to explain the complexity, I have to dive deeper and explain how such a VNF is composed. So that was very simple; let's go to the next slide. What we have here describes how a VNF maps to actual virtual resources in this layer and to physical resources in the layer beneath. As you can see, the mapping is not that trivial. So think about what we had in the old days, right?
We had a blade in a chassis that used to provide the service. This blade had a blinking LED on it. If the application was down, we could restart it. If the blade was faulty, we would send a truck to replace it with a new one. Now that we have this architecture, we cannot really tell whether an issue an application is experiencing is due to the application itself or to something that happens on the physical infrastructure. For example, maybe another application running in the same data center, on the same physical infrastructure, is causing this trouble, what we call a noisy neighbor problem. Or maybe it's an application problem.

Looking at it from a different angle: if we remember what we had in the old days, we had a chassis with two blades configured in active-standby mode, talking to one another in a keepalive protocol that would rely on, let's say, a 50-microsecond turnaround time, which was possible because there was a high-throughput, zero-latency switch fabric behind it. Now, when this blade is transformed into a VM, we need to make sure that those VMs are placed with particular affinity requirements, and we need to address the quality-of-service requirements of those VMs so that the application can even function.

So much for the use cases. Let me give you a couple of architecture slides. Those of you who are familiar with the ETSI NFV reference architecture will recognize a simplified view of it. And this is how the Alcatel-Lucent EPC solution maps to this architecture. At the bottom, we have the CloudBand node.
It's a cloud in a box. I will talk a little bit more about it later, but it has the VIM, the OpenStack that manages the cloud, in it. At the top right, we have the CloudBand Management System, which is the orchestrator, and which assists with several tasks in all these use cases that I just described. In the middle on the left, we have the VNFs. Those are the network functions that make sure that the mobile functionality actually works. And on top we have the 5620 SAM, which is an element that contains the EMS, the element management server, which takes care of managing all the functionality of EPC, as well as the VNFM, which makes sure that life cycle management of these VNFs, like deployment, scaling, and healing, is handled correctly.

So, a little bit more about the NFV platform. Again, the CloudBand node that I talked about is a cloud in a box. It contains all the infrastructure: compute, distributed storage in the form of Ceph, and SDN in the form of Nuage Networks from Alcatel-Lucent. In addition to that, we have monitoring capabilities for the entire cluster, and we have life cycle management of the cluster itself, allowing it to function autonomously. Because, you know, those things don't go into a managed data center. They go into a curbside central office, and we don't have personnel to maintain them. On top we have the CloudBand Management System, which manages multiple nodes, multiple OpenStack instances.
So imagine that you have multiple data centers. This piece on top makes sure that you have a single pane of glass to manage all of them. Let's talk a little bit about what we have inside. This is a set of services that can be used all together or one by one, separately, all accessible through APIs.

For example, we have application life cycle management. As I mentioned, this capability is not used in the EPC use case. But if, for example, you have a simpler application that doesn't have a VNFM, a virtual network function manager, as part of its infrastructure, then you could use this as a one-stop shop for an application catalog and life cycle management.

We have a distributed resource management capability, which allows you to make sure that all the resources in your infrastructure, in all the different clouds, are synced. For example, if you need to introduce an image and make sure it's available here and there, you don't have to do it manually. A policy-based deployment makes sure that it is available wherever it's needed. This is also true for users, for key pairs, and for any other resource.

Skyview makes sure that you have a consolidated data model that allows you to get to proper placement decisions, allows you to do broad root cause analysis, and provides you with a view of your virtual infrastructure from a single pane of glass.

Advanced network integration and SDN: as I mentioned, if you would like your applications to use networking that is distributed across multiple data centers, as most of our applications do need, then we have the capability to deploy distributed networks across data centers and the WAN.

Finally, we have Insights, which is all about monitoring, collecting data and maintaining it, performing root cause analysis, and generating alarms. Which is actually a good point to move to the next part of the session. We will dive deeper into these capabilities in the CloudBand Management System, and I'd like to call Elisha and Moshe to proceed with that part.

Hi. So when we say root cause analysis, what do we mean, specifically in the context of NFV? As was mentioned, we have several layers over here in which things are happening in parallel. First of all, we have the application layer, where the actual VNF is running. We have the virtualization layer, the IaaS services in a sense. And we have the physical layer. In each one of these layers things are happening, and there could be problems. These problems can also be correlated. A problem on the host level can propagate all the way up to the application level, and when we have applications that are sharing the same physical assets, they can also affect each other.

So when we start having problems in the system, when faults begin to occur, there's going to be a deluge of alerts pouring into our monitoring system, and we need to start making heads or tails of it. Specifically, that means trying to find what the core issues are, what the root cause of a specific event is, so that we can focus our energies on where we can really address things.
We can also assign responsibility to the correct service or the correct people in the organization who should address it, and also deal with the problems automatically when possible. Okay, and this is effectively what root cause analysis is all about. It's especially important in the context of NFV.

Another aspect which we also work on in CloudBand is what we call deduced alerts. Sometimes, especially with OpenStack having only initial monitoring capabilities, there will be situations where there are certain things we know are occurring because we understand our system. We understand that when something happens it can cause something else to happen, but we don't see it explicitly. For example, say the physical infrastructure has a storage switch through which a VM accesses its storage, and the switch goes down. The VM might still be alive from Nova's perspective, but it can't really do anything because it doesn't have access to its storage. If we notice that there's a problem on the storage level, on the physical level, we can also raise an alert on the VMs, which can notify the people managing the VNFs that there could be a problem on their VM due to a problem in the infrastructure. So this is what we call deduced alerts. Whereas RCA is about taking all the alerts together and finding what the source of the problem is, here it's about understanding that certain things affect each other, so we can deduce how A affects B and report it accordingly.

And this is the long-term view of what we would like to have. We would like to have something along these lines, and actually our system could already support this now if we had enough information. You can see over here, on the very bottom, you have an application-down alert. That indicates that there's a problem in that application. The application can be divided into several tiers. That's the second layer from the bottom, which is based on the state of certain VMs that are part of this application. Finally, as we go up, we can see that there are problems with the hosts upon which these VMs are sitting, and the problem on the host is because of specific issues such as memory load, or there being no access to the host. So this is the long-term view of what we would like to see. And the engine we developed can support this specific scenario as long as we have enough information about the system that relates these different entities one to the other. We're going to see a smaller version of exactly this when we show the demo.

It's also important to mention, for those of you who were at OPNFV and some of the talks yesterday: they talked about Doctor. Doctor is one of the analytics projects within OPNFV.
This is exactly the view that they're promoting there. We need to have this holistic view of the system, where we see the physical and the virtual coexisting in terms of our alerts and monitoring, so that we don't have this problem where one system monitors the physical layer and another system monitors the virtual, and then you have to go and make heads or tails of it.

So let me just talk about what we're going to show in the demo, and then we'll start the demo itself. For the purpose of this demo, we have Nagios monitoring our physical infrastructure, and we're going to create a load on the CPU of one of the two hosts that we have running there. We're going to have an application, a Heat stack, running on this node, which is, by the way, sitting all the way across the Atlantic in Israel. We're basically going to have Nagios notify of a problem with the CPU. This is going to cause an alert to be raised for the VMs, indicating that they have a high CPU load. This simulates a noisy neighbor scenario, where the VM is suffering because its neighbors on the physical infrastructure are taking up a lot of the host CPU. This information will be processed by CloudBand's RCA engine, which will then notify the VNFM, the VNF manager, that the problem is because we have this issue on the host. The VNFM can then take action and decide to move the VMs from one host to another where this problem does not exist. Okay, and this is exactly the kind of fast reaction we would want and expect from our system. So without further ado, I'll pass over to Moshe for a moment to show you how in CloudBand we interact with stacks, and then we'll move on to the actual demo.
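The flow described for the demo (a host-level alert, a deduced VM-level alert, and a link back to the root cause) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Alert` class, the `TOPOLOGY` and `DEDUCTION_RULES` tables, the alert names, and the host and VM identifiers are hypothetical and are not CloudBand's actual data model or API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Alert:
    name: str                            # alert type, e.g. "HOST_HIGH_CPU_LOAD"
    entity: str                          # host or VM the alert is raised on
    caused_by: Optional["Alert"] = None  # link to the alert that caused this one


# Hypothetical topology: which VMs are placed on which host.
TOPOLOGY: Dict[str, List[str]] = {
    "compute-0-0": ["vm-a"],
    "compute-0-1": ["vm-b"],
}

# Hypothetical rule table: a host-level alert type implies a VM-level alert type.
DEDUCTION_RULES: Dict[str, str] = {
    "HOST_HIGH_CPU_LOAD": "VM_CPU_SUBOPTIMAL",
}


def deduce_vm_alerts(host_alert: Alert) -> List[Alert]:
    """Raise a deduced alert on every VM placed on the affected host."""
    vm_alert_name = DEDUCTION_RULES.get(host_alert.name)
    if vm_alert_name is None:
        return []  # no rule relates this alert type to the VMs
    return [
        Alert(vm_alert_name, vm, caused_by=host_alert)
        for vm in TOPOLOGY.get(host_alert.entity, [])
    ]


def root_cause(alert: Alert) -> Alert:
    """Follow the caused_by links until reaching an alert with no known cause."""
    while alert.caused_by is not None:
        alert = alert.caused_by
    return alert


if __name__ == "__main__":
    host_alert = Alert("HOST_HIGH_CPU_LOAD", "compute-0-0")
    for vm_alert in deduce_vm_alerts(host_alert):
        cause = root_cause(vm_alert)
        print(f"{vm_alert.name} on {vm_alert.entity} "
              f"is caused by {cause.name} on {cause.entity}")
```

Keeping the relationships in data tables rather than in code is one way to get the behavior the talk describes, where more knowledge about the system can be added without rewriting the engine.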
Thank you, Elisha. So we'll start with the demo presentation. Yeah, okay. So we're going to look at the CloudBand user interface, and we'll see the demo that Elisha described. Here you see the deployment screen. This is an aggregated view of all applications deployed across different OpenStack instances, with various application types like Heat and TOSCA. I'll just go ahead and deploy a Heat application. I just give it a name. Here you can see I can choose the OpenStack instance this stack will get deployed on. I just add the template. Okay, this template simulates a media gateway application. It's very simple: two servers and a volume. The stack is created. Okay.

So this stack is now getting deployed. I have here a media gateway application that already got deployed earlier on. I'll just go here to the runtime view. Okay, this is the topology of our application. CloudBand is auto-detecting these resources from the OpenStack instance and creating a CloudBand model of the application, in which we know how to link the virtual resources to the physical hardware in order to do all the root cause analysis that Elisha mentioned. We can see here the network topology. We can see the servers. We can see the volumes, all the interesting components of our application. And we also have here a distribution view, where we can see that we have one VM on compute zero zero and another VM on compute zero one.

So let's go ahead to the next step. Once the stack has been deployed, what a VNF manager would do is subscribe to alerts on that stack. This simulates an HTTP request that the VNF manager would make. Basically, what we say here to CloudBand is: I am interested in alerts on this stack ID, any alert that is related to a VM resource, and once an alert occurs, call back this URL of the VNFM, the application manager, with all the details of the alert, allowing the VNF manager to decide what to do. So we'll send this. Okay, and the alert subscription was created. I'll hand it off to Elisha.

Thank you. So, as mentioned, over here in the distribution view of the application you can see on the bottom that the current status is that this application is divided between two hosts, right? One on compute zero one, one on compute zero zero. We're going to raise the load on compute zero zero, and then we're going to see what happens. Hopefully it works. So here we are on compute zero zero. We're going to use a Linux stress tool to raise the CPU load on this compute, and back here.

So what's happening in the background now, of course, is that the load is being raised on the compute. Nagios is monitoring things and detecting when it reaches a certain threshold, which is why we're waiting here for a little bit. Once that happens, the alerts are going to be sent to CloudBand. CloudBand will process the Nagios alerts, and potentially other physical layer reports can be processed as well. It's going to process this, determine that it also impacts the VMs, understand that these two are related, and notify the VNF manager that this is happening so it can take action and migrate the VMs.

Now, what's important for me to note is that this mechanism that we built here is not a mechanism that is hard-coded.
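The subscription made earlier in the demo and the notification step just described can be sketched as follows. This is only an illustration under stated assumptions: the field names (`stack_id`, `resource_type`, `callback_url`), the function names, and the example URL are all hypothetical, and no real CloudBand endpoint or request format is implied. The sketch builds the request bodies without sending them.

```python
import json
from typing import Any, Dict


def build_subscription(stack_id: str, callback_url: str) -> Dict[str, Any]:
    """Hypothetical body of the VNF manager's subscription request."""
    return {
        "stack_id": stack_id,          # only alerts on this stack
        "resource_type": "VM",         # only alerts related to VM resources
        "callback_url": callback_url,  # where the alert details are sent
    }


def matches(subscription: Dict[str, Any], alert: Dict[str, Any]) -> bool:
    """Decide whether an alert should be delivered to this subscriber."""
    return (
        alert.get("stack_id") == subscription["stack_id"]
        and alert.get("resource_type") == subscription["resource_type"]
    )


def callback_body(alert: Dict[str, Any]) -> str:
    """Serialize the full alert details for the callback request."""
    return json.dumps({"alert": alert}, sort_keys=True)


if __name__ == "__main__":
    sub = build_subscription("stack-123", "http://vnfm.example/alerts")
    alert = {
        "stack_id": "stack-123",
        "resource_type": "VM",
        "name": "VM_CPU_SUBOPTIMAL",
        "cause": "HOST_HIGH_CPU_LOAD",
    }
    if matches(sub, alert):
        # A real system would send this body to sub["callback_url"] over HTTP.
        print(callback_body(alert))
```

Including the cause in the callback body reflects the point made in the talk: the VNF manager can decide what to do without needing its own view of the host.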
We haven't just said, oh, we have these three use cases which we know are going to happen, and we're going to write them down in code, and now this is what's going to happen in the system. Rather, what happens is a much more sophisticated thing. Let's also go over and quickly see if we can catch the VMs migrating before the alerts arrive, because sometimes it happens so quickly you can't even see it happen. Okay. We'll get a notification once the alert goes on. Anyway, so what we did was... okay, it's still running, very good. Okay. What we did... oh, here we are, the alerts have arrived. Let's head over as quickly as we can to see the migration in process. Sometimes it happens too fast. Let's see if we can see it. Okay, let's move over to the deployment view and see if we can find it happening there. So right now the alerts have arrived. You can also see that little icon on the top saying the alerts have arrived. Now the migration process is going to start.

Now, as I began to say, what we built here really is an engine. What this engine needs is to understand the relationships between the different hosts and the different entities in the system. The more you understand the system, the more you can do: you can simply enter some sort of template indicating the relationships between the entities, and you can say that when an alert of one type is raised, it could cause another alert. Then we can link these two together and understand that one is the cause of the other. So this is really a tool which you can upgrade and improve the more you understand your system, whether you specify the relationships yourself, as we understand them for a specific system, or detect them automatically using machine learning tools. All of these are possible.

So while we wait for the migration to take place, I want to show you for a moment what's happening in terms of the root cause analysis. Here we have an alert, VM CPU suboptimal performance, which was raised because we had an alert on the host. This is the view of the VNF manager, of the VNF owner. So I'm looking at my system, asking why I have a problem with my CPU, and I can see here that the VM CPU suboptimal alert is actually caused by another alert, the host high CPU load. So even though I don't have access to the host view, and I can't actually see what's happening on the host level, I can still get information from CloudBand indicating that this is the source of the problem.

Let's see if the migration is completed. Okay, so it takes a bit to refresh. While we wait for this to refresh, if we have any questions, I think we have five more minutes till the end of the session. Right, so what's going to happen as we watch this is that we're going to see these VMs migrate to a new location. Specifically, in the end you're going to see over there, in the bottom left, that both VMs are placed on compute zero one, not on compute zero zero. And that will show that CloudBand has notified the VNF manager of the current status of the VMs, and the VNF manager can then decide to migrate them to a location where there is no load on the host.

Okay, so we gave you a crash course on EPC, walked you through the use cases, and showed you a demo of how we're using a cloud platform to address the practical difficulties of going to a cloud deployment. So if anyone has any questions, now is the time. Please, if you can head to the mic, that would be excellent.

Specific to NFV in the cloud, what about the bearer path? I mean, you had an SGW there.
That's a bearer plane. You had a PGW. That's a bearer plane at large scale, and I represent a very large-scale company. How do you actually envision that functioning in the cloud? Is the network good enough, anywhere near good enough, to handle the kind of load and the complexity that's there, versus something like an MME that's essentially a computer to begin with?

Okay, so excellent question. These are the exact questions that are being asked and are being addressed. Obviously it's a journey. It's not that bare metal is going to be replaced by the cloud all at once. So I would assume that initially the lower-scale deployments will be addressed by NFV (which, by the way, is another advantage: you're able to right-size the deployment to the actual requirements), and only later will terabit routers be replaced by VNFs. I can tell you that technologies for data plane and general performance acceleration, like SR-IOV and DPDK, are getting huge interest and are being pushed, and already in 2015 we hope to have some applications using those technologies and able to reach significant performance. But again, this is something that will evolve during the next few years. Any other questions? Please.

So I guess for this demo the source of information is generated from Nagios. But do you have any plan to integrate with a monitoring service like Ceilometer as a source of information for the entire cloud infrastructure?

Okay. So the question was: here we're showing getting information from Nagios, and do we want to also get information from Ceilometer. Actually, any source of monitoring information is fair game in this context.
We're also considering Ganglia, and Ceilometer has now added, I think it also had some before, but now it has added a few more capabilities in terms of hardware monitoring. So as far as we're concerned, the more information the better. The whole point of this capability is to be able to take in a lot of information without it confusing the user, so you can tease it apart and understand what's the central issue and what's the peripheral one, and focus your energies. So the more information the better. It's also worth mentioning that we have Monasca picking up as an OpenStack project, which will perhaps constitute a more generic replacement for Nagios. This is something we are involved in and will be monitoring closely.

Other questions, please.

What's the typical migration time for your VMs? I mean, how dirty are they? How many memory writes do they do? So how much time does it take to move one from one host to another when you have an issue?

Okay, so here in this use case we use the migrate capabilities of OpenStack. I personally can't testify to the exact capabilities; we're just relying on OpenStack's migration capabilities. That's what the VNFM, the VNF manager, is going to request from OpenStack, or from any other service, to migrate the VMs. Our responsibility here, from above, is to be able to motivate that migration and make sure that it happens as fast as possible.

So perhaps to add to that: I would say probably many seconds to minutes is a good guess. But again, we need to understand that healing is only a supplement to high availability, right? We rely on high availability to make sure that service is not disrupted, and healing just makes sure that the active and standby are there.

Okay, so we're being signaled that we're running out of time. So thank you all for coming. See you next time.