Welcome everybody, I'm Fred Oliver, and I'm here to talk about some of my OpenStack experiences and some of what we've done and been able to do at Verizon. I'll just have a few slides here and then we can open it up to questions, so feel free to jump in wherever it fits. This is just Verizon's regulatory slide: where we are, we're large. A few things about me: I was part of a small startup that was acquired three years ago, and before that various startup companies and various large companies, mostly doing networking systems work.

So, what's our problem? Typically our biggest problem is that network growth is outpacing our ability to monetize it. The costs are still very high and we can't get revenue to rise as quickly. Return on investment is hard: networks are still very expensive, and we're deploying more and more high-speed long-haul networks, all of which are very expensive and hard to monetize. So we're trying to find another way to make that work.

Then there are our existing structural problems, and this is probably a carrier problem; I'm not sure how many of you are aware of carrier issues. We tend to divide things into vertical silos, and we have lots of single-vendor installations of network equipment. We're trying to get out of the mode of having a stack of equipment that serves one particular purpose, that can't be reduced and isn't easily or incrementally expandable. So our current problem is that there's tons of equipment, all single-use: single equipment, single service. We'd like to turn this into virtualized functions. There's a group that started up under ETSI to add virtualization to this environment and see whether that will enable the separation of some of the hardware environments, and enable software functions to run on a more common, less single-purpose platform.

On top of that, probably the largest gain of all is automation. All of our installation today is single-purpose and done by manual operation; there are these large operations manuals that our operating staff follow to address a particular problem. We need to get out of that mode, enable automation, and trust automation. I think the trust aspect is actually an organizational problem: how do we make that work?

Benefits to the operators, as I've listed here. Again, common hardware platforms: part of our goal is to reduce these special-purpose, purpose-built hardware environments. If you look inside these frames, it turns out they're often common hardware built by common server vendors, but assembled in a specific configuration for a special purpose. We'd like a more general-purpose environment on which we can leverage different software components. Then, classically, faster time to market and reduced business risk. Our average deployment time for a new service is six to nine months; we'd like to reduce that by an order of magnitude. And along with that long deployment time comes business risk, because we can't deploy a new service cheaply.
It limits our ability to provide new services, because there's yet another process involved. And if something is deployed and we've made a mistake, we have dead equipment sitting there consuming electricity and making heat. So we need to get into a mode where we can reuse equipment, reuse the environment, and leverage automation to make deployment work better, easier, and faster.

Graceful software upgrade is another interesting one. Today, to do a software upgrade, our environment basically needs a complete separate system on which to do the upgrade, and only once that's proven to work can you take the existing system offline. So it's a very large deployment task. We'd like to get to a mode of incrementally adding equipment, slowly migrating, perhaps live-migrating, operations from one set of equipment to another, and doing all of this without any service interruption. Service interruption is probably the biggest problem we're trying to address: whenever a cell tower or anything else goes down, that's the first thing we get a call about. Any network failure is bad for us. This is the classic carrier-grade five-nines-and-above level of operation we try to run at.

So there's a European standards organization, ETSI, that has started an effort called NFV, Network Functions Virtualisation. The goal is basically to put a standard model together where application developers can develop their applications independently and deploy them on a common set of hardware, with common interface points and common control paths, so that everyone has a unified environment: develop the application once, and develop standard interfaces to the element management systems. Those are classically done today on a single-hardware or single-application basis; NFV defines the standard interface into that management environment. So again: develop these things once, run them on many different environments.

How does OpenStack come in? I think one thing that's become apparent, both in the NFV community and among application developers outside it, is that OpenStack has become the de facto implementation of the common deployment environment for existing vendors to port their applications into. There's a lot of deployment happening here, a lot of different applications and a lot of different infrastructure environments, but there are people specifically targeting whole sets of infrastructure: specific hardware environments, specific orchestration, and, frankly, specific limitations of OpenStack, which I'll talk a little bit about.

My task, my goal, was to define the common platform, and this encompasses the hardware aspects as well: be hardware agnostic, but network focused. Most of our traffic is north-south, to the internet or our internal network; we don't have a large majority of traffic going east-west inside the environment. There is a bit of that in what we call service chains, where one function feeds the next, but a lot of traffic ends up going in and out of the internet or into our network. And then operational service, which is again probably our biggest concern.
How do we get five nines or six nines of behavior out of an environment where the hardware is probably capable of three or four nines, and the software, right now, probably two or three? How do we make that reliable? The process we're going after is to work within the OpenStack community and with some of our partners, Red Hat and others, to build an environment and push functionality upstream that would make it more reliable. Because carriers are not the largest part of the OpenStack community, we don't necessarily have the largest voice, so some of the capabilities we're looking for aren't getting addressed. We'd like to push those upstream as much as possible, again leveraging the community.

Here's the silo picture again. Today we have all these vertical silos, from the hardware up through the virtualization layer to the application itself, each typically managed top to bottom by a single operational staff. We'd like to break that into a horizontal environment: a platform that can host multiple of these applications, with an operational staff that's knowledgeable about the hardware and the virtualization environment. And let me step back: it doesn't necessarily need to be completely virtualized. A lot of the value is the automation and orchestration of the environment, so there may be hardware components that need to be orchestrated; in fact we do bare-metal provisioning of some functions that are not VM instances in this environment.

Security, certainly security and isolation, are very important for us. How do you provide non-interference between multiple internal customers, meaning multiple running applications in this environment? How do you separate the resource allocation, and can you do resource reservation? I'll show a small sketch of that in a moment, and I'll come back to more of this a little later.

So this is my environment; I actually have a couple of environments that I know people have seen. This is one rack of my equipment. Part of my task is a lot of evaluation; I've been talking to just about all the suppliers in this space, and this is what I've built up: one rack of equipment. It's interesting because these are all QSFP copper cables, 40-gig connections, which is my high-speed path for testing. As you can tell, it's not professionally wired; it's all my own wiring, hence the tangled cables. In this environment I've tested a lot of variations. I currently have six OpenStack environments running, basically four or five simultaneous proofs of concept in this lab, and that's been my ongoing process for the last year and probably the next year and a half. From this I've tested a lot of different integrations, which I'll talk about in a minute. It's all direct-attached storage here: SAS and SATA drives in the front of these storage servers. I'll say a little more about the different tests I've done in this environment.
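Coming back to the resource-separation question for a second: in OpenStack the basic mechanism is per-tenant quotas. Here's a minimal sketch using python-novaclient; the credentials, endpoint, and tenant are hypothetical placeholders.

```python
# Minimal sketch (hypothetical credentials): capping what one internal
# "customer" (tenant) can consume so applications can't starve each other.
from novaclient import client as nova_client

nova = nova_client.Client('2',                       # compute API version
                          'admin', 'secret',         # hypothetical user/password
                          'admin',                   # hypothetical project
                          'http://controller:5000/v2.0')  # hypothetical auth URL

tenant_id = 'cdn-poc-tenant-id'    # hypothetical tenant for one application silo
nova.quotas.update(tenant_id,
                   instances=40,   # max VMs this application may run
                   cores=160,      # total vCPUs across the tenant
                   ram=512 * 1024) # total RAM in MB
print(nova.quotas.get(tenant_id))
```

Note that a quota is a ceiling, not a guaranteed reservation; true resource reservation is exactly one of the gaps I was just talking about.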
Shared direct-attached storage is probably the best answer there. These are all the companies and subject areas I've talked to over the last year or so, and I've tested out their various capabilities. Part of our problem is: if you get a random piece of hardware, how do you initially provision it and load an operating system onto it? There's Chef and Puppet, and certainly all the distributions, Red Hat and the rest, have provisioning environments. I've dealt with all of them. They're all incomplete, I'll say, but they're maturing pretty well, so that part is coming along.

On the storage side, most of my path has been Ceph and Gluster. These are environments where locally attached drives are served out over a shared network, and that fed back into my 40-gig environment. I basically ran a split plane: one 40-gig port for the storage path and another 40-gig port for the communication path, and each one of my compute servers is a storage server as well. I've also been playing around with RDMA and iSCSI, and I'm actually starting to experiment with RDMA under Ceph. The point is reducing compute load: because I'm sharing the compute server with the storage server, I want to reduce the compute cost of my storage operation as much as possible, and something like iSCSI over RDMA is one way to do that.

One thing I tested, in the context of a particular evaluation for the proof of concept I'll talk about in a few slides, is the volume allocation function: how do you get volume allocation in a Ceph or LVM cluster to actually place the volume near the VM that will use it, particularly if you're running in a distributed environment? I'll show a sketch of what that filtering looks like just below.

Neutron: again, a lot of our tasks are network-focused, so we're looking at how best to leverage the Neutron capabilities. I've done a lot of testing of various combinations: standard OVS; a couple of NICs that have embedded switches inside them and can take OVS processing into the card; and various commercial overlays, pretty much all the commercial overlays that are out there. OpenFlow or not is another interesting question, and L2 versus L3 overlays is another. From what I've seen so far, the industry and OpenStack seem to be moving toward an L3 topology, and that doesn't match really well into our core network. So we're trying to figure out how to match that environment, or what to do in our core network to work well with it.

The last piece is the policy functions we'd like to have. Because of the network overlay, or the network control we want in the environment, we really do want to apply policies to sets of VMs and sets of applications: make sure they path through an environment, or that a policy gets applied whenever you scale, start, or stop an application. Network policies will be an important part for us.
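Before I get to the proof of concept, here's roughly what that volume-placement question turns into in code. This is a minimal sketch of a custom Cinder scheduler filter, assuming the filter scheduler's BaseHostFilter interface; the 'site' capability and the 'same_site' scheduler hint are hypothetical names for illustration, not the actual filter work I'll mention later.

```python
# Sketch of a custom Cinder scheduler filter that keeps a volume at the
# same site as the VM that will use it. The 'site' capability and the
# 'same_site' hint are hypothetical names used for illustration only.
from cinder.scheduler import filters


class SiteAffinityFilter(filters.BaseHostFilter):
    """Only pass storage backends whose 'site' matches the requested one."""

    def host_passes(self, host_state, filter_properties):
        hints = filter_properties.get('scheduler_hints') or {}
        wanted_site = hints.get('same_site')
        if not wanted_site:
            return True  # no hint given: any backend is acceptable

        backend_site = host_state.capabilities.get('site')
        return backend_site == wanted_site
```

You'd enable something like this in the scheduler's filter list in cinder.conf and pass the hint at create time (roughly, cinder create --hint same_site=site-42 ...), so a volume for a VM at a remote site lands on storage at that site.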
So I'm going to talk about one proof of concept we did, something we call distributed sites: a CDN-style distribution network as a use case we were working through. Again, our issue is how to reduce the network traffic. What we have is content somewhere over on the edge, at some parent location, that we get from somebody; YouTube, Netflix, all those guys are out there on the edge, as well as our own internal content, which we have somewhere in here and deliver out of the environment. How do you reduce this network path so we don't need terabits of connectivity between all the points in the network? We want to put as much of the content as we can close to the user, so they get the low latency they want and we reduce the amount of core network we're using. Our goal is basically to have lots of sites, as many as possible, distribute this work effectively, and still manage them as a single cloud: multiple sites out there that we know about and control from one central, highly available location.

Some operational issues we ran into. First, high availability of the control elements. At the point we did this, and this is six to nine months old now, the HA was pretty lacking, so we ran active-passive HA for all the control paths. The current Icehouse environment actually has pretty reasonable HA, but at the time it was pretty poor. And the control elements of the cloud, yes; again, we treat this as a single cloud, with lots of small locations and central control.

Another issue was provider networks. We're trying to plug into our existing network environment: Verizon, FiOS, and the wireless side, because we're deploying content out to all of them. So there are lots of different sites and lots of existing networks; how do we plug into them? We found some interesting issues allocating large numbers of these networks, particularly when deploying to small sites out in the network and connecting them to existing networks: we had to deploy a provider network out to each individual site, and a large collection of them overall. And because we want to manage the environment while it's running and actually look at the operational model, it turns out that on the control path, things like log collection are a big issue. It takes up a lot of our bandwidth, it's effectively a management path, and managing that is an interesting problem.
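For flavor, here's roughly what allocating one of those per-site provider networks looks like through the Neutron API. A minimal sketch using python-neutronclient; the physical network name, VLAN ID, and CIDR are hypothetical placeholders, and you'd repeat this for every site.

```python
# Sketch: creating a VLAN provider network for one remote site and giving
# it a subnet on the site's existing address space. All names and IDs are
# hypothetical placeholders.
from neutronclient.v2_0 import client as neutron_client

neutron = neutron_client.Client(
    username='admin', password='secret', tenant_name='admin',  # hypothetical
    auth_url='http://controller:5000/v2.0')                    # hypothetical

net = neutron.create_network({'network': {
    'name': 'site-042-provider',
    'provider:network_type': 'vlan',
    'provider:physical_network': 'physnet1',  # mapped to the site uplink
    'provider:segmentation_id': 1042,         # VLAN carried on that uplink
}})['network']

neutron.create_subnet({'subnet': {
    'network_id': net['id'],
    'ip_version': 4,
    'cidr': '10.42.0.0/24',        # the site's existing subnet
    'enable_dhcp': False,          # addressing managed by the existing network
}})
```

Multiply that by hundreds of sites and you can see why bulk allocation, and cleaning up afterwards, became an operational issue for us.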
This is the layout: a bunch of different networks created for every single site. Our goal was to have orchestration here at the central site, along with a storage environment, also at the central site; this is actually where a lot of our internal content is stored. Then we have these little sites scattered around the environment doing different functions. There's a ROADM, a reconfigurable optical add-drop multiplexer; this is our core network gear, with 100-gig generally passing through it, moving up to multiples of 100 gig in the future. A lot of the connectivity comes through there, and we're basically managing the connection of these networks onto a wavelength going across one of those fibers.

A little more on the end site: we were looking to deploy 4, 8, or 12 nodes. Each site would have an OpenFlow switch and support a scalable caching server that would scale up and down as people came on, and the rest of the environment would scale with it. A shield cache would act as a local cache for the deployment: we could do a predictive push of the data we expected to be loaded, plus pull in all the updates happening in the deployment on request.

Then the control cluster. The origin file server is where the original content was stored, and the tracker tracked which nodes had which segments. The interesting thing here is that, because we're doing adaptive bitrate, each video is actually chopped up into 18 different versions, and each of those into 2-to-5-second chunks; that's what we deliver to the clients, little chunks of seconds. Once a chunk is in a local environment, the tracker tracks it. When a client request comes in, the server authenticates, authorizes, and bills. One of the issues is identifying the user's location: the current approach is mapping an IP address to a location, and there isn't a clean way to do that, particularly for a mobile phone that ends up at a proxy point inside the network, where the address doesn't actually reflect the locality of the client. You select a distributed site that's relatively close as the point that serves the data, and that becomes the follow-on caching site. Log aggregation is how we pull all the logs together from each distributed site. And at each distributed site we're using the OpenFlow switch to create all these networks and manage the connectivity and quality of service back to the control server and out to every client site.
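To give a feel for that last piece, here's a minimal sketch of the kind of flow steering you can do from an OpenFlow controller. It uses Ryu, one controller option, not necessarily what we ran, and the port numbers and subnet are hypothetical; it just installs one rule sending traffic for the cache subnet out a chosen port.

```python
# Minimal Ryu sketch: when a switch connects, install one flow that steers
# traffic destined for the (hypothetical) cache subnet out a specific port.
# Ports, subnet, and priorities are illustrative placeholders.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class CacheSteering(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser

        # Match IPv4 traffic toward the caching servers' subnet.
        match = parser.OFPMatch(eth_type=0x0800,
                                ipv4_dst=('10.42.0.0', '255.255.255.0'))
        # Send it out port 3 (hypothetically, the link toward the caches).
        actions = [parser.OFPActionOutput(3)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]

        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                      match=match, instructions=inst))
```

Quality of service works similarly: you match the flows you care about and point them at queues configured on the switch ports.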
So how did we go about this? We deployed an emulated environment in our lab and created a test setup where we can have many hundreds of thousands of clients. And it's interesting: we have fiber spools, I'm not sure if you've seen these things, 400 kilometers of fiber spools connecting everything, so we can actually pretend it's a real network. We leveraged Red Hat because we wanted to go through a vendor deployment, and Red Hat deployed their OpenStack platform; at that point it was based on an earlier release. They found this provider-network configuration problem, provided us with a patch, and also pushed it upstream. Red Hat also helped us develop some of the Cinder filter rules: we wanted VMs deployed at these remote sites and the volumes allocated to those VMs to actually be near the site, so we had to use some filtering on where the volumes landed and manage that storage locally. Same with Glance images: whenever we wanted to deploy a new version, we didn't want to haul all the Glance images around; these servers and trackers are almost 100 gigabytes of image, so we pre-deploy them off-hours, pushing the Glance images out to the remote sites and running from there. And again, we deployed active-passive HA at our central control site for the OpenStack part of the environment.

This was our process, and we went through several iterations. We have a production content network, so we could get logs off of it, track what all the requests were and what the hot content was, and pull that into our lab environment. Then, with our test equipment, we could replay the previous day's requests and test how the system behaved. We also injected faults into the environment and generated various scenarios so we could test and understand how the system would react. And that was the model: is this CDN a valid use case for this environment? That's what we were trying to test.

Lessons learned. OpenStack HA needs work; Icehouse is currently looking better. I have an existing OpenStack Icehouse HA environment, active-active, and that seems to be working pretty well. Neutron's, Quantum's at the time, handling of some network functions is still non-existent. A particular problem we have: we want some of the VMs to see multiple segmentation IDs, in OpenStack terms, and we'd like to be able to specify that multiple segments end up on a single port; that's not there. SR-IOV doesn't work very well yet. I'm working with a couple of hardware NIC vendors on it, and right now the vendors have solutions that don't fit easily into the publicly available OpenStack environment. I think that will change, slowly; it is important, and SR-IOV has a big impact on our network. Linux and KVM are certainly still ongoing work: one-gig huge pages just came into the Linux kernel. We're on RHEL 6.4 and 6.5 today; RHEL 7 is coming along and has a lot of these things, particularly huge pages and NUMA handling, and SSD-as-cache is in RHEL 7 as well, so RHEL 7 looks promising.
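On the huge-pages point: OpenStack later grew a way to request them per flavor. A minimal sketch, assuming a Juno-or-later Nova and hypothetical credentials and flavor names; at the time of this PoC we were tuning the hosts by hand instead.

```python
# Sketch (hypothetical names): a flavor whose guests are backed by 1G huge
# pages. Assumes Juno-or-later Nova and hosts booted with 1G pages reserved
# (e.g. default_hugepagesz=1G hugepagesz=1G hugepages=64 on the kernel line).
from novaclient import client as nova_client

nova = nova_client.Client('2', 'admin', 'secret', 'admin',
                          'http://controller:5000/v2.0')  # hypothetical

flavor = nova.flavors.create(name='cache-node.hugepages',
                             ram=32768,   # MB of guest RAM
                             vcpus=8,
                             disk=40)     # GB of root disk
flavor.set_keys({'hw:mem_page_size': '1GB'})  # back guest RAM with 1G pages
```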
And I set this up by hand. In the environments I had, these were all direct-attached drives: depending on the configuration, one or two SSDs and four or five hard drives in each of the servers. That's actually how we were able to get 10-gig streaming out of these boxes. They were not RAIDed for this environment, because this was purely a caching environment: if we lost a drive, we could just get the content back from another cache, as long as we could detect the failure. With the SSDs, because we could segment the SSD space, our CDN software had a tie-in for fast storage and would deploy the hot cache content onto that tier. One thing I'm looking for would be to use SSD as a kind of automatic cache, so we wouldn't have to segment that space ourselves; that was out of scope here, but there are a couple of approaches out there that do something like it, so there are possibilities.

The other lesson learned is that Verizon doesn't have a lot of OpenStack experts; we need help in this environment. I can do a lot of this in the lab, but deploying it in a real field environment is how you find and fix some of the bugs and then push them upstream. That's actually happening right now: pushing our desires and needs upstream, finding and fixing the bugs, and pre-testing the environment before we go.

For future work: HA everywhere, that's kind of our mantra. How do we make this a reliable environment across hundreds of sites? Imagine all of our cell sites having a component of this. Geographic storage redundancy: we didn't need it in this PoC, but that's another area we need to spend more time on and turn into a useful function. Network policies, again. And the multi-hypervisor piece: I actually have one of these running in my lab, one OpenStack environment with KVM and VMware ESXi hypervisors in one pool, and it's been very useful. A lot of the existing applications we get from vendors, if they've been virtualized at all, have been virtualized on VMware ESXi. So it's reasonable, particularly in my lab, for them to develop and test on VMware while, in parallel, there's a KVM environment where they can test the new version, and we can compare side by side what the behavior is. It actually seems to work: I'm able to bring this up easily and manage it as two kinds of hypervisors, and scheduler filters for the ESXi hosts were really the only change I had to make.
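For the mixed pool, the trick is letting the scheduler route each image to the right hypervisor. Here's a minimal sketch, assuming Nova's ImagePropertiesFilter is in the scheduler's filter list and using hypothetical image IDs and endpoint; it's one way to do it, not necessarily exactly what I ran.

```python
# Sketch: tag each Glance image with the hypervisor it needs, so Nova's
# ImagePropertiesFilter schedules it onto matching hosts. Image IDs and
# the Glance endpoint/token are hypothetical placeholders.
from glanceclient import Client as GlanceClient

glance = GlanceClient('1', endpoint='http://controller:9292',  # hypothetical
                      token='ADMIN_TOKEN')                     # hypothetical

# Vendor appliance built for ESXi: only schedulable onto VMware hosts.
glance.images.update('vendor-vnf-esxi-image-id',
                     properties={'hypervisor_type': 'vmware'})

# The same function rebuilt for KVM: only schedulable onto KVM hosts.
glance.images.update('vendor-vnf-kvm-image-id',
                     properties={'hypervisor_type': 'qemu'})
```

With that in place, booting from the ESXi image lands on the VMware cluster and booting from the KVM image lands on KVM, so you can run the side-by-side comparison inside one cloud.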
Then, acting on some lessons learned from our earlier trials, we're actually going to be deploying some of these into the field in the coming quarter, and once we do, we'll have real live traffic; the first users will probably be our own employees. Q3 is when we intend to be out in the field. And I believe that's it. Questions?

How did KVM compare to ESXi? In the environment we were running, KVM was quite a bit slower, and we looked at the reasons why; we identified some of them and pushed some things upstream. Things like huge pages and some of the scheduling aspects were among the reasons, so lots of projects are in process for making KVM better there. The one place where that did change is where we specifically had a DPDK environment tuned for the KVM stack; that performed better than ESXi.

On the legacy question: a large amount of legacy requirement is what we have in our network. Part of how the VLAN issue came about is that our vendors supplied hardware, racks of active equipment, part of which was typically a switch or router function, and we're trying to decompose that into virtual network functions. For our operational structure we tended to create a VLAN as a customer separation or a function separation, because that's what the equipment supported, so we ended up bringing all those VLANs into one environment and the applications ended up depending on them. Retrofitting all of that is, I think, kind of an artificial operational constraint on the environment, but at least for us it's an interesting problem. And yes, in the end it will be on the vendors to rewrite their applications to be less demanding of that functionality, that's correct. That's actually one of the things we're finding: the first generation of apps coming into this NFV, or VNF, technology are really just ports of the existing application. They take the code that's running on a blade and make that a VM, and then in software they pretend that what they have is a chassis with a bunch of blades they can control. That's the legacy that gets placed on this. We expect it to be rewritten; it's a step in the approach, and I think it'll soon come to the point where it's separated from that.

On why OpenFlow: what drove us to look at the various switching environments and overlays was that some things weren't addressed well. Performance was not ideal; we're looking for tens of gigabits of throughput in this environment, and at the time we were looking at basic OVS. We were also looking at how to apply our policies to a scaled VM or a newly created VM set; service chaining is a particular case of this, and there was no convenient way to do it in a native OVS environment. That's what led us down this path: the functionality didn't seem to be there, the performance didn't seem to be there, and there were alternatives. OpenFlow was one alternative, to directly manage the environment. There are several embedded-switch options, which take OVS out of the compute path, and then there are all the overlays that do a lot of the integration for you, and I guess some MPLS environments that do a lot of that integration too, though of course they can't necessarily give you the performance or the results from that.

On the production timeline, this is part of our Verizon reality: our goal is to go into field trial in Q3, and Verizon will typically spend six months in field trial and then go into production. One thing we'd like to change in this model is to break that process down from six months to a couple of weeks, but that's not going to happen yet, so I think we'll probably be in production at the beginning of next year. Any other questions? Yes, we're actually looking at both x86 and ARM from a hardware perspective.
And bare metal, Docker, KVM, and ESXi are all in play on the hypervisor side. Verizon is telling me that I can't share the results, but they both work, they're both out there, and they're relatively easy to set up. We work with Canonical as well; that's more of a relationship thing than the environment we have with Red Hat. There are a lot of players out there, but I have the most experience with Red Hat and CentOS, and our partners have been doing Red Hat; they've been very responsive and done a good job for us. But we've also got Canonical, and, well, Verizon told me I can't say more than that. Thank you all for coming, I appreciate it.