Good morning. Thank you all for coming. My name is Gary Kevorkian. I'm with the Cisco events team. I'd like to welcome you all to the Cisco-sponsored room this morning. We're going to kick off our day of events with a presentation from Steven Pierce and Mike Cohen. Steve is one of our cloud engineers. Mike is a director of product management on one of the many cloud teams at Cisco; I'll let him go into a little bit more detail on that. I'd also just like to remind everybody who came in: I think you all got a little business-card-size card. I'm going to be collecting those at the end of the session in this little fishbowl. We have a drawing for an Apple Watch, so one lucky person in the audience will walk out of here with an Apple Watch today. But again, thank you for coming. I'm going to keep my remarks brief and we'll pick those up at the end. Turn it over to Mike and Steve.

Thank you. Thank you very much for that introduction. My name is Steven Pierce. I'm a member of technical staff in the infrastructure architecture and design group, and I've been the OpenStack solutions architect for the last year, year and a half or so. Today we'll talk about Cisco IT's implementation journey. We'll talk about the enterprise application trends that we have found while implementing applications in our OpenStack cloud. We'll talk about the lessons learned and our next-generation architecture. Mike will talk more about the OpenStack ACI integration, and we'll have a demo at the end and then Q&A.

So initially, when we deployed OpenStack about a year and a half ago in Cisco IT, we obviously used Cisco infrastructure, Cisco servers. Our compute, network, and management nodes are all based on the UCS B-Series, while our storage cluster, which runs Ceph, runs on the UCS C-Series. Our network fabric is Nexus 5K and 9K switches. Front-ending that, to allow people to provision into our cloud, we use our Prime Service Catalog and Process Orchestrator with automation packs built specifically for OpenStack, which allow our users to provision projects within the cloud. After our users provision their projects, they use the OpenStack APIs to leverage the cloud resources (a small example is sketched at the end of this passage).

In Cisco, we have many data centers all over the world. Our first year, we deployed into three of our US facilities: our RTP, or Research Triangle Park, data center, which is a non-production data center very close to the campus there, and our Allen and Richardson, Texas, data centers, which are our production data centers that run our most critical applications at Cisco. In addition, within the next year, we plan on deploying into Amsterdam and India in support of applications which are in those geos.

So, we started with our Grizzly release as part of the Express environment. This environment allowed us to get experience with operating, running, and deploying OpenStack. The Express environment was specifically set up so that users with just a CEC username and password could get a VM or two for up to 90 days. That allowed us to do POCs and other investigations, and it allowed us to build experience with running an OpenStack cloud; we did that in September of 2013. In July of 2014, we deployed Havana into our first data center, and that was in Research Triangle Park.
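To make that last step concrete, here is a minimal sketch of what consuming the cloud looks like once a project has been provisioned through the catalog. The project name, credentials file, image, flavor, and network ID are hypothetical placeholders, not Cisco IT's actual values; the commands are the standard era-appropriate OpenStack clients.

    # Hypothetical example: after a project is provisioned, a user sources its
    # credentials and drives the standard OpenStack APIs directly.
    source dev-project-openrc.sh             # credentials for the newly provisioned project
    neutron net-list                          # confirm the project's networks are in place
    nova boot --image ubuntu-14.04 --flavor m1.medium \
        --nic net-id=<dev-net-uuid> dev-vm-01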
In December of 2014, we deployed into our first Texas data center in support of production applications, which I'll get into later on in the presentation, and in August of 2015, we did our Juno release into our second Texas data center to allow us to do data center failover in Texas. We plan on deploying, in January 2016, OpenStack with ACI integration based on the Juno or Kilo release.

So, while we were deploying applications into our cloud, we discovered that we needed a framework to help classify the applications so that we could help the application architects leverage their infrastructure better, and we discovered that we needed to basically break the applications into three groups. We had the cloud-tolerant applications; these are the legacy or monolithic application architectures, the traditional applications that you would expect to see on traditional infrastructure like legacy virtualization and bare-metal servers. This kind of cloud-tolerant application maintains state in its components, and its resiliency is limited to the infrastructure's capacities. In other words, the service itself, as written, doesn't provide redundancy; it expects the application to be made redundant by the infrastructure. Then, on the other side, we have the cloud-native application, where the application leverages APIs to do things like auto-scaling, monitoring, and fault tolerance, where the application itself expects the infrastructure to be unreliable, and the service is a P1, not the infrastructure.

So, this is the Cisco Commerce Renewals cloud. This is an $8 billion application, and it was fully deployed into our OpenStack cloud in Texas. This is a cloud-native application that leverages the APIs within the OpenStack cloud. Notice the technologies that are used on the platform side. Things like HAProxy, RabbitMQ, Elasticsearch, Kibana: these are all application components which are redundant in themselves and don't rely upon the resiliency of the infrastructure. In addition, they can also leverage the APIs in the cloud to build more Kibana or Elasticsearch or Tomcat engines when needed (a minimal sketch follows at the end of this passage). The important things about this application are that it scales linearly, that we have zero downtime, because it's a very critical application and a lot of revenue goes through it, and that we have low latency. The OpenStack infrastructure provided all of this.

What did we learn while deploying applications in OpenStack? Well, we learned, first of all, that running an OpenStack infrastructure is not trivial. There is a steep learning curve to build the skilled teams. We discovered that we needed to provide additional build and operational tooling in the enterprise to support an OpenStack cloud infrastructure. Obviously, the update cycle is very fast: every six months there's a new version of OpenStack, and we have to plan for these upgrades. As you can see from our timeline, we got a little bit behind, but we're catching up now. We need to always be looking toward that upgrade every six months. More importantly, we needed to understand the interactions between the application, the network, and the infrastructure. We needed to work with the application architects to classify their applications so that we could help them write cloud-native applications. We also discovered that application deployments are still too difficult.
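As a minimal sketch of what "leveraging the APIs to build more engines when needed" can mean in practice: the check below is purely illustrative, with hypothetical endpoint, image, flavor, network ID, and threshold, and a real deployment would more likely drive this from an orchestration or auto-scaling layer rather than a one-off script.

    # Illustrative, simplified scale-out decision for a cloud-native tier
    # (all names, thresholds, and the metric source are placeholders).
    cpu=$(curl -s http://es-lb.internal:9200/_cluster/stats | jq '.nodes.process.cpu.percent')
    if [ "$cpu" -gt 80 ]; then
        # The application tier asks the cloud for another node via the same
        # APIs an operator would use; no ticket, no human in the loop.
        nova boot --image rhel7-elasticsearch --flavor m1.xlarge \
            --nic net-id=<app-net-uuid> "es-node-$(date +%s)"
    fi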
The developers are application-centric, which kind of makes sense, but the deployments are infrastructure-centric, and we tried to figure out ways to solve that problem. Our new OpenStack architecture is going to help us solve it. Based on Juno and Kilo and group-based policy, which we'll get into in a moment, we will integrate OpenStack with ACI and get a policy engine to help us deploy applications into the cloud. We're also going to be leveraging native VXLAN from the server, with NIC offload, for performance within the cloud. Thanks a lot, everybody.

I'm going to take over now, and I want to give you more detail about the underlying technology that Steven's group is deploying in Cisco IT, leveraging ACI and OpenStack. I'm going to start by highlighting some of the advantages of the ACI environment, in particular when you're running OpenStack. This is, again, part of what attracted Cisco IT to the platform, and a number of other customers as well.

One of the key things you get is distributed, scalable virtual networking. Part of our architecture, enabling things across both the physical and virtual environment, allows us to give you fully distributed networking services: distributed Layer 2, because we're running a VXLAN fabric; a distributed anycast gateway, which can live both locally in the hypervisor as well as in the physical switch; and distributed services like metadata and DHCP, so you no longer need to run them on a centralized Neutron node and you have fewer things to HA in your environment. We can also offer distributed NAT and floating IP capabilities across the entire fabric, so you no longer need to run those through a central device for capabilities like NAT (a short sketch of that workflow follows this passage). And we also give you the choice of using group-based policy or the regular Neutron API to take advantage of these services.

Also innate in the system is hardware-accelerated performance. This can happen both at the physical switch layer, where we run VXLAN directly in the top-of-rack switch across the ACI fabric, as well as teamed with capabilities like NIC offload in the server, if you choose to run VXLAN up to the fabric as well. And one of the big advantages you get here is not wasting CPU cycles, which you could either be reselling or using for your applications, on doing things like tunneling in software.

Probably one of the biggest areas of advantage that comes from the ACI environment, though, is having operations and telemetry that are tied across the physical and virtual environment. One of the big problems you have as you start scaling up your OpenStack cloud is: what do you do when one of your tenants calls you and says the virtual network they created in Neutron doesn't work anymore? Could that be a problem in the virtual switching layer, in Open vSwitch? Could it be at the physical switching layer, in the underlay environment? It's very hard, and it takes multiple teams and often multiple steps and different tools to debug these kinds of problems. With ACI, we can give you a central portal, and we can actually help you get to the bottom of those kinds of problems nearly instantly. This is one of the huge advantages of the platform. We can also give you things like health scores and counters and capacity planners that let you head off problems before they occur. Another big advantage of the ACI environment is integrating both the underlay and the overlay.
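To make the distributed NAT point concrete, the calls below are the ordinary floating-IP workflow using the standard Neutron client; the network name and IDs are placeholders. The point is that the API is unchanged: with the ACI drivers, the resulting NAT is realized across the fabric and hypervisors rather than through a central network node.

    # Ordinary Neutron floating-IP workflow (placeholder names and IDs);
    # under ACI this NAT is programmed in a distributed fashion across the fabric.
    neutron floatingip-create external-net
    neutron floatingip-associate <floatingip-id> <web-vm-port-id>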
So, part of integrating the underlay and the overlay is that you no longer need to manage them separately. You get a fully managed VXLAN underlay that's automatically built as you connect the ACI switches, and then you can also connect this in your overlay environment with different physical servers and multiple hypervisors; we'll even extend it across different container technologies as well.

Also native to ACI is the ability to do service chaining. We actually have an ecosystem of partners where we've developed device packages that allow us to integrate with different load balancers, firewalls, and other devices and stitch them in at both Layer 2 and Layer 3.

Another advantage that comes in the ACI environment, because we're dealing with both physical and virtual, is the ability to offer a higher degree of security than you may get in a virtual-only environment. What I mean by that is, if you're worried about an environment where, say, a hypervisor may become compromised, we have an additional enforcement layer at the physical switch. So we can provide improved network security and make sure that if your hypervisor were to get compromised, it can't compromise the network and it can't reach out to other hypervisors and jump onto different overlay networks. And we can provide that enforcement, again, in a physical switch, outside of the hypervisor environment itself. So this gives us two points of control inside the cloud and a point of added security.

One of the technologies I mentioned that you can take advantage of in this scenario is group-based policy. Group-based policy is an option we have when you're using ACI. It's also an upstream component that's been delivered by Cisco working with a community of different developers. It's 100% open source, Apache licensed, and the goal is to offer an application-driven API on top of OpenStack, where you can describe application intent without worrying about the details of the underlying network, and allow that intent to be automated across the infrastructure. This was actually taken from ideas we had developed with ACI, but it's completely hardware agnostic and can be deployed across any Neutron plug-in. Today, this solution is for networking; it allows you to describe network profiles, although in the future we hope to take on more with it as well.

So, if I were to highlight the key features of ACI and OpenStack that attract users to our platform: one is this choice of either using the Neutron API or, if you're going the cloud-native route as Steven's group is doing, using group-based policy to describe application profiles and application intent. Others are achieving an automatically managed underlay with a multi-tenant overlay in a fully integrated manner; having the deep operations, telemetry, and troubleshooting that come from the physical and virtual integration that ACI can offer; and the fully distributed architecture, where you no longer have gateways or bottlenecks inserted into your network architecture. We can offer all of the networking services distributed across every node, and do it in a platform that can offer a rich service-chaining model where we already have an ecosystem of partners across all the major commercial vendors. As I mentioned, we have two different drivers that you can use with ACI and OpenStack.
One is a regular Neutron mechanism driver for ML2, which allows us to consume the Neutron API, convert that into our ACI policy, and then use that to build up the capability inside ACI and APIC on the network. Many of our customers are using this approach today. We also have customers that have chosen to use group-based policy, which gives them the application-centric view directly as a native OpenStack API. This allows them to build application profiles directly inside OpenStack itself, through Horizon or the CLI, and then drive those through APIC to the underlying infrastructure.

One of the other significant innovations we're introducing right now in ACI and OpenStack is OpFlex. We announced OpFlex over a year ago as an open protocol for transmitting policies from a remote controller down to a particular device. What we've done now is create a fully open source implementation, one that works on top of Open vSwitch, and we've leveraged that with ACI to extend the ACI policy model directly down into the hypervisor. This allows us to have APIC managing not just the physical switches in the ACI fabric, but also the Open vSwitch in the OpenStack environment. This opens up another point of control and capability for us in ACI, and one enabled purely through open source technologies. This is how we achieve things like our distributed NAT capability, or local switching and routing inside the hypervisor, because we're controlling not just the physical fabric layer but also the virtual switch as well. And again, in this capability we stand out in the market because we have both physical and virtual tied together as part of our solution.

Now, with just a few minutes remaining, I want to show you a quick demo of group-based policy in action. I'm showing you two views of what I can show you in the demo here. One is a policy view, which is the policy I created through group-based policy. It's a very simple concept, where I have a set of web servers, noted here as Apache, which happen to have a NAT policy associated with them. The reason I did that is that each of these VMs will automatically get a floating IP when it starts up. That could be something I would have automated separately, but since I know this is going to be an externally visible group, group-based policy allows me to define a policy for applying floating IPs automatically. I also have a MySQL group, which represents the database in my scenario. And then here I've also hooked up an external network configuration for my web servers to reach the outside world. On the right side of the picture, you'll actually see the ACI environment this is mapping to, which is, again, a very simple two-server scenario running OpFlex with APIC and an external router connected to it.

I'm going to be driving the presentation off of video today because my lab happens to be back in San Jose, and I've always had mixed luck with cross-continent VPN. But hopefully I can give you about the same experience. So quickly here, you can see an example of the policy I created. This is what I showed you in the picture, and we can provide this for reference, but essentially it gives you an example of the CLI that can be used to create group-based policy (a representative sequence is sketched below). This could also have been automated via a Heat template; in this case, I showed an example with the CLI. You can also see here I have a view of the different VMs I created in OpenStack.
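For reference, the groups and contract relationships in this demo correspond roughly to a CLI sequence like the one below. This is a sketch reconstructed from the python-gbpclient command set, not the demo's exact input; flag syntax varies by release, and it assumes the rule sets and NAT service policy it references have already been created (those lower-level objects are sketched after the next passage).

    # Sketch only: policy target groups and their provide/consume relationships
    # (names follow the demo; exact flags may differ by GBP release).
    gbp group-create mysql \
        --provided-policy-rule-sets "mysql-ruleset=true"
    gbp group-create apache \
        --consumed-policy-rule-sets "mysql-ruleset=true" \
        --provided-policy-rule-sets "web-ruleset=true" \
        --network-service-policy nat-policy          # auto-assigns floating IPs
    gbp external-policy-create datacenter-out \
        --external-segments default-external-segment \
        --consumed-policy-rule-sets "web-ruleset=true"
    # Each VM is then booted against a policy target (a port) in its group, e.g.:
    gbp policy-target-create web-pt-1 --policy-target-group apache
    nova boot --flavor m1.small --image cirros \
        --nic port-id=<port-id-from-web-pt-1> web-01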
So in that view there are two database VMs and two web VMs, and floating IPs were automatically attached to the web VMs, as I pointed out. If we start looking at the policies, we can see there's a group-based policy tab now available in Horizon when I have the proper packages installed, and I have two different groups, one for Apache and one for MySQL. In the case of the MySQL group, it is providing a MySQL contract; this is the contract that allows the web servers to speak to the database servers. And then we have our Apache group, which is consuming the MySQL contract, allowing it to speak to the database servers, and also connecting to the external world over the external contract that was created. If we go slightly across here, we can actually see I also have an external group. In this case, this represents access to the external network that I've configured, and we have what we call a "datacenter-out," which is an external policy for reaching the outside world.

If we dig in a little deeper, we see that I've created different policy rule sets. In ACI, these map into different contracts, and they contain different policy rules. Policy rules are nothing more than different traffic classifiers and actions. In this case, my actions just allow traffic to flow from one to the other, and the policy classifiers describe specific types of traffic. So this model is very easy in that it allows you to describe the API of a particular set of machines, and then the provide/consume relationships allow you to describe who is providing that API and who is actually calling that API. This is why we call it an application-centric model: it's based off of how APIs work rather than how networks were actually built and designed.

So we see here that we have policy classifiers and actions, and then we also have a number of other things. One is an L3 policy. This describes an isolated IP address space, and it also includes, essentially, a supernet range that can be handed out automatically to different groups. This allows tenants to create groups without worrying about exactly what subnet to use; they'll just receive the next one from a pool that an administrator picked out. It's a simple bit of automation, but it answers the questions of: well, what subnet is free? What should I be using? How am I not going to collide with someone else? Group-based policy allows you to create multiple L3 policies that can be spread across different VRFs as well. There's also a service policy here for NAT. I mentioned that in one of our groups we attach floating IPs automatically; that's what this NAT policy is allowing me to do. I could also have manually attached the floating IPs. (A sketch of these lower-level objects follows below.)

Now, if I spin ahead quickly to APIC, I'll show you the policy that gets built up as a result of what I did in OpenStack. Keep in mind, no work was required in APIC to set up this environment other than bringing up the fabric and installing the OpFlex agents correctly. But what I've been able to do is build up an entire policy as part of ACI that includes the different EPGs, the contracts between them, and the external networking, fully driven by OpenStack and fully driven by group-based policy. So it allows me to get the most out of the ACI environment and drive a really rich policy model directly through OpenStack. I think on that note, I'll terminate the demo at this point and leave some time available for questions if people have them. So with that, I'll hand the mic back.
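For reference, the lower-level objects described above, and referenced by the group commands sketched earlier, correspond roughly to the following. Again, this is a reconstruction from the python-gbpclient commands rather than the demo's exact input; names, ports, and pool values are illustrative, and flag syntax varies by release.

    # Classifier + action + rule + rule set: the "API description" for database traffic.
    gbp policy-action-create allow-action --action-type allow
    gbp policy-classifier-create mysql-traffic \
        --protocol tcp --port-range 3306 --direction in
    gbp policy-rule-create mysql-rule \
        --classifier mysql-traffic --actions allow-action
    gbp policy-rule-set-create mysql-ruleset --policy-rules mysql-rule

    # L3 policy: an isolated address space plus a supernet from which group
    # subnets are carved automatically.
    gbp l3policy-create default-l3p --ip-pool 10.10.0.0/16 --subnet-prefix-length 24

    # Network service policy: automatically attach a floating IP to every
    # member of a group that references this policy.
    gbp network-service-policy-create nat-policy \
        --network-service-params type=ip_pool,name=nat_fip,value=nat_pool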
OK, any questions for Mike or Steve? That's my call for questions. It was that good? You covered all the bases that well? I'll walk up to the front.

In the example with the MySQL and the Apache services, what type of apps or workloads were you running there in reality?

So in this case, again, this is just a simple web app. When we built up this example, I think we were using one that was a little finance app that graphed historical NASDAQ data or something. So it basically had a database of historical stock information and built a graphical view out of it: a very standard web app sort of view. We've played with other scenarios as well, but that tends to be one of the ones that demonstrates this technology in probably the simplest yet most effective manner.

Any other questions? Based on that, I'm not sure where to walk now, so I'm going to sort of position myself in the center here. Any other questions for Mike or Steve? I believe you guys are clear. Anything else you guys want to add?

The only thing I'd like to add is that, from our perspective, group-based policy encapsulates the application architect's intent, and because of that, we can use group-based policies to more accurately deploy applications into the cloud. That's kind of the entire reason why we're looking at group-based policies. And the tight integration of group-based policy with OpenStack, where we're able to do things through OpenStack and have them put into the APIC, is a key feature for us in IT for driving applications into OpenStack.

Actually, there's one more thing I can show you, because I think I may have an extra minute. If we scoot along in the demo a little bit, one of the things you can see here, since we have our virtual and physical integration, is that directly through APIC you can get operations and telemetry data through our console. This includes stats on your virtual networks or on your virtual endpoints, which we can show you in a very graphical manner, and we can also give you really rich operations and telemetry data. So, for example, I was running a ping in this demo across two of the VMs, and in this case I'm actually bringing down a physical port inside my fabric, so I'm essentially breaking the ping. You see here that the ping, which had been working, actually ends up stopping. From an end-user perspective, this would mean that my network broke: I have no idea why something's down, or how this happened. Let me show you how APIC allows you to debug this.

So if I open the endpoint group for my database, you can actually see that the health score is suddenly impacted by 50%. In my particular scenario, one of the links is down, making one of the two VMs inaccessible, so that immediately dinged the health score. You can generate alerts on this information, and that would immediately tell you that something happened to that group even before the end user, or right as the end user, may have noticed it. But the next thing ACI does is actually pretty compelling. If you open the health chain, it'll show you exactly what happened. You'll be able to dig in and say, oh, on node 401, there's a particular port down, and that port has a health score of 0. And I can immediately know what port it is and go investigate that at the physical layer.
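As an aside, the same health and endpoint data shown in the GUI can also be pulled programmatically from APIC's REST API. The sketch below assumes a placeholder APIC address and credentials, and the exact query options can vary by APIC release.

    # Hypothetical APIC REST queries (host, credentials, and query are placeholders).
    # Authenticate, then fetch endpoint groups along with their health information.
    curl -sk -c cookies.txt -X POST https://apic.example.com/api/aaaLogin.json \
        -d '{"aaaUser":{"attributes":{"name":"admin","pwd":"password"}}}'
    curl -sk -b cookies.txt \
        "https://apic.example.com/api/node/class/fvAEPg.json?rsp-subtree-include=health"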
So what we're doing is closing the loop across the physical and virtual environment, and we're able to show you how you can troubleshoot a physical problem that manifests in the OpenStack environment. Just a quick addition to that: the infrastructure is completely separate from the application side. So the ability for operations engineers to debug the infrastructure is completely separate from the application policy, and we can deploy application policies into many different kinds of infrastructure without changing the application policy at all.

So I do have one more question. Heat and auto-scaling in OpenStack: trying to get away from troubleshooting, and toward having these applications be self-predictive, self-healing, et cetera. Any plans to integrate the logic and the telemetry data out of the APIC layer into that auto-scaling type of engine in OpenStack?

So we've talked about a couple of things here, and there are a couple of different places we can go with this. Ceilometer is one we've actually been looking at: can we make some of these things inputs to Ceilometer? Auto-scaling would be another one. So far there has not been work done on that, but it's obviously an area where there are things we can do on fault tolerance and things we can do on auto-scaling based on the telemetry data we have, because we actually have pretty deep capacity information about the underlying network, and that can be used in scheduling and in scaling in a lot of different ways. There is some other work at Cisco going on in this domain; I'm probably not the right person to speak on it because I'm not an expert in it, but there is work going on around smart scheduling more generally, where we can place VMs in an intelligent manner. It's not specifically connected to ACI data at this point, but we'll get there.

I still think it's good; I just think it's another API endpoint you'd interact with to have a holistic view. You'd hit auto-scale, you'd also hit the APIC stuff, and make a determination on what to do. Well, the auto-scaler could take an input, could call the API and actually use this as one of its inputs. Yeah, great. Okay.

Unless there's anything else, no hands going up. Thank you to Mike and Steve for presenting today. Let's hear it for them.