So, good afternoon, and thank you for joining. This session is on common networking operations across Kubernetes and OpenStack with Calico. There's a really good echo in here — imagine being an announcer at a stadium.

My name is Mark Baker; I'm part of product strategy at Canonical. I'm joined by Karthik, a Calico expert and director of solutions architecture at Tigera; by Larry, who works on OpenStack-Helm and various other container initiatives at AT&T; and finally by Steve, who does the same.

We're going to talk about a number of different topics. To kick off: Canonical produces an Ubuntu OpenStack distribution, as I'm sure you know, and it's been successful out in the market, with lots of people using it. But increasingly we've seen people wanting to connect it with containers — whether that's running containers inside OpenStack, alongside OpenStack, or underneath OpenStack. The interaction of OpenStack with containers is a challenge many people are looking to address today.

In fact, I have a nice slide build that emphasizes this: containers run throughout Canonical's Ubuntu OpenStack today. We have them at the control plane — we've been running Ubuntu OpenStack in LXC containers for many years (several years, I should say). Many of the production deployments we have at telcos around the world run in LXC containers, so we have a containerized control plane, and now we're looking at how to bring that into immutable containers with technologies like Docker — or Kolla, I should say. We also have containers at other layers of the stack, and we see customers deploying them in the same way: whether it's LXC containers on top of KVM, machine containers via a hypervisor driver called nova-lxd (which I'll talk about in a second), or Kubernetes running on top of OpenStack. Containers are sprinkled liberally throughout our OpenStack deployments.

Anybody who's been using OpenStack for a while knows that successful OpenStack requires successful networking, and I'm sure the same is true of Kubernetes. So figuring out how these things work together is very important.

For those of you not familiar with the different types of containers we saw in that previous diagram: there are virtual machines, of course, based on KVM, which have their own networking implications. There are machine containers — essentially a VM built from container primitives, but which looks and behaves just like a VM — with technologies like LXD, or perhaps OpenVZ if you know that one. There are process containers — things like Docker and rkt, which I'm sure you know and love already. And there are different styles of application containers in the IoT or embedded space — things like Snappy and Flatpak. We're going to be talking primarily about process containers — Docker containers — and how we can network Kubernetes environments and OpenStack environments together. I'll skip through that.
We do use LXD with OpenStack: if you want to run machine containers in conjunction with OpenStack, we can do that using LXD. That's slightly separate from running Kubernetes, but it's networked in the same way, so everything you'll hear applies in a very similar way.

And finally, Kubernetes: as I said, a lot of the customers engaging with OpenStack today are looking at Kubernetes and deploying it on top of, or alongside, OpenStack. If that's something you're interested in doing, the solutions we'll talk about over the next half hour — the integrations between these technologies — work both with Canonical's Kubernetes and Ubuntu OpenStack and, of course, with their upstream parents.

So let's look at the three scenarios, which Karthik will deal with in more detail when I hand over in a second. In fact, let's ask the audience. Put your hand up if you're running Kubernetes alongside an existing OpenStack — alongside, on bare metal. Is anybody doing that today? Okay — a good opportunity for Kubernetes distributions there, thank you. Is anybody running Kubernetes inside OpenStack today? Okay, that's good. And anybody running OpenStack in Kubernetes, using things like Kolla and so on? Okay, so a good mix — and somebody over there doing all three. Brave man.

At this point, as I say, whichever way you're doing it, getting the networking right is going to drive success — or rather, getting it wrong is going to cause you lots of headaches and sleepless nights. With that, I'll hand across to Karthik from Tigera, who's going to introduce Calico and how they've been addressing some of these problems.

Absolutely. First of all, to introduce Tigera: Tigera is the company behind Project Calico, which is a pure layer 3 approach to networking — a very simple and scalable approach. We are also co-maintainers of flannel, and co-maintainers of CNI, the networking abstraction used in Kubernetes.

Before we dive into Calico, I want to give you a little context for how Calico came about — or rather, what approaches were taken previously, starting from the early days of virtual machine networking. This, by the way, is a standard Neutron-with-Open-vSwitch slide, and you can see the amount of complexity there: bridges, vSwitches, overlays, security enforcement points, and so on. The reason for that complexity is that in the early days of VM networking, if you needed applications to talk to each other, you put them in an overlay; you exposed the virtual network concept to users; and you conflated isolation with network topology. As a result, as your applications grew, the number of overlays grew, and you kept adding layers of complexity. Then you started looking at east-west flows versus north-south flows, and you needed things like virtual routing to get between overlays, and the whole thing soon becomes a total mess — as I'm sure many of you know. If, like me, you've spent nights and weekends troubleshooting Open vSwitch, and why packets going into a virtual wire don't come out the other end, you know what I'm talking about. That, fundamentally, is an artifact of the fact that people have conflated
isolation with network topology, and done it by creating overlays and complex overlay networking.

As we move to microservices, that overlay approach starts to break down. If you look at the larger microservices deployments — and this is a picture from Netflix from a few years ago — the concept of creating overlays for individual application instances, or for groups of application instances that need to communicate, simply does not scale from a networking perspective.

So what you'll find is that the container orchestrators like Kubernetes have taken a fundamentally different approach to networking. It started with Kubernetes, but all of the other container platforms are moving in the same direction: assume an IP per pod, or an IP per container, depending on which orchestrator you're talking about — in Kubernetes it's an IP per pod. You assume that communication between pods within the cluster is over IP, and that those IP addresses are unique within the cluster. We assume the world is IP — that's the fundamental assumption.

The second major difference — again, all of the container orchestrators are moving in this direction, and Kubernetes chose to start with this model — is to have a separate set of policy constructs that applications can declare, typically in a YAML syntax, describing what sort of isolation they want. Kubernetes has this concept of labels and selectors, and you can create network policy that defines what should talk to what. So if you have a bunch of objects labeled as LDAP servers, a bunch of objects labeled as LDAP clients, and labels for production or QE, then with network policy you can define things like: I want all LDAP servers to talk to all LDAP clients, as long as they're in production and labeled for project A (there's a sketch of this below). You start to decouple isolation from networking, and that allows you to move to a much simpler network fabric that scales. An example of that is Calico.

So what is Calico? Calico is a pure layer 3 networking abstraction built for cloud-native platforms, and it works across multiple cloud-native platforms. Kubernetes tends to have Calico as one of its most popular network plugins — today, in most Kubernetes deployments, you'll typically see Calico as the network plugin of choice. But Calico also works with OpenStack, with Docker, with Mesos — across multiple orchestrators. The way it works is that we use the plugin mechanism defined by the orchestrator: in Kubernetes that's CNI, in OpenStack it's Neutron, and among the other orchestrators, Mesos has also adopted CNI. Using that plugin mechanism, we create virtual Ethernet devices — in fact, I have a slide coming up on this — that connect the workload (a virtual machine in OpenStack, a pod in Kubernetes) directly to the host namespace using veth pairs. We treat every compute node — every worker node in Kubernetes — as a router: there are routes in the host routing table pointing to the workloads, and we advertise those routes within the cluster using a standard internet routing protocol, specifically BGP. This is all within the Kubernetes or OpenStack cluster. Super simple.
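To make the label-and-selector example above concrete, here is a minimal sketch of a Kubernetes NetworkPolicy. The label names and namespace are hypothetical (they're not from the talk), but the structure is the standard NetworkPolicy API:

```bash
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ldap-servers-allow-ldap-clients
  namespace: production            # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: ldap-server             # applies to all pods labeled as LDAP servers
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: ldap-client         # only pods labeled as LDAP clients may connect
    ports:
    - protocol: TCP
      port: 389                    # standard LDAP port
EOF
```

The policy says nothing about subnets, VLANs, or overlays — which is exactly the decoupling of isolation from topology being described.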
We don't use Open vSwitch, we don't use bridges, no complex overlays — we keep things extremely simple.

So if you think back to the picture I showed earlier of Neutron with Open vSwitch, and you look at the equivalent picture for Kubernetes networking — this is it. Every node is a router. We use the standard Linux routing table: Calico creates routes in the routing table pointing to the local pods on that host, and then we have a daemon called BIRD — a BGP daemon — which creates BGP peerings amongst the Calico nodes and exchanges routes between them. So all nodes in the cluster know how to reach a given pod using its IP address. Extremely simple. And, surprise surprise, when you look at Calico for Neutron, it's exactly the same architecture: no vSwitches, no bridges, no complex overlays, no tunnels.

That's the networking perspective — we've radically simplified networking. Now for isolation: the way we do isolation is with policy. In the case of Kubernetes, we actually use Kubernetes network policy, and the reason is that the team behind Project Calico helped develop the network policy implementation for Kubernetes, and its design. So you'll find that Kubernetes network policy mirrors Calico's policy infrastructure almost identically. In the case of Neutron — because Neutron has more coarse-grained artifacts like security groups, and it's not nearly as dynamic as Kubernetes network policy with its labels and selectors — we have a concept called profiles, and we map security groups into profiles.

From an overall operations perspective, the focus of Calico, like I said, is on both simplicity and scalability. For deploying Calico on Kubernetes: if you go back and look at the recording of the session I did yesterday — an introduction to Kubernetes networking for OpenStackers — I did a live demo of deploying a Kubernetes cluster across multiple nodes with Calico networking, and it took two minutes. It's three commands, and the networking portion of it is one command: kubectl apply calico.yaml. If you take more than two minutes to deploy a simple Kubernetes cluster, chances are you're doing something wrong. That's the focus on simplicity.

The way you operate Calico in production is with the tools you've used for network planning, engineering, operations, and monitoring for the last 30 or 40 years. There's a lot of thought that's gone into building network tooling, and because we're not creating complex overlays — or overlays on overlays — but doing simple IP routing, all the tooling people have built up over the years can be leveraged directly with Calico. That's a huge win in terms of operational scalability.

And from a troubleshooting perspective: troubleshooting container networking with Calico — or networking in OpenStack with Calico — is no different from troubleshooting when your laptop can't connect to the internet. Can I ping my destination? If I can't ping, let's do a trace and see where packets are being dropped. If they're being dropped on the host, you look at whether there's a routing table entry for the destination; if there is, is there a policy rule preventing that traffic from being received at the destination? That policy rule is simply iptables — actually, for scalability, Calico uses ipsets, not long lists of plain iptables rules.
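That troubleshooting workflow translates directly into ordinary commands. A sketch, assuming a destination pod IP of 10.65.0.5 (illustrative) — every command here is standard Linux or Calico tooling:

```bash
ping 10.65.0.5                   # can I reach the destination at all?
traceroute 10.65.0.5             # if not, where along the path are packets dropped?
ip route get 10.65.0.5           # on the host: which route (if any) matches it?
ip route | grep cali             # Calico's per-workload routes via "cali*" veth devices
ipset list | head                # the ipsets Calico programs for policy matching
iptables-save | grep -i cali     # the iptables chains that reference those ipsets
```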
So it's a really simple approach from a deployment, operations, and troubleshooting perspective — and for those of you who have deployed Neutron with OVS and all of the complex layer 3 approaches on top of it, this is something you will really appreciate.

So, coming to the scenarios of deploying OpenStack side by side with Kubernetes, or one on top of the other: going into these scenarios with Calico, there are two sorts of things you have to think about. One is how you network the workloads — how do you network workloads running in Kubernetes and connect them with workloads in OpenStack? The second is how you provide isolation between them: when you have different flows and different projects, how do you provide isolation, especially when applications are running in both domains?

From a Calico perspective, because the networking is really simple, every workload — whether it's in OpenStack or Kubernetes — has an IP address, and we use standard IP routing. All it is is IP routing: if you'd like, you can peer across the clusters, or you can use your existing infrastructure, and if you're running in a public cloud that's perfectly fine too. It doesn't matter whether your clusters are in the same data center or in data centers at opposite ends of the planet — it's simple IP, and we know how to do IP routing at scale. It's a solved problem; this should be no different.

That's one part of it. For the policy and isolation part, the way Calico does it is, like I said, we use network policy in Kubernetes; in OpenStack, we map OpenStack security groups to profiles in Calico's key-value store. In our data model we have the concept of workloads — which could be in Kubernetes or in OpenStack — but we have a common data store, and that common data store for us is typically etcd. As long as you have a common etcd data store across both OpenStack and Kubernetes, managed as part of Calico, you can define, through labels and profiles, what needs to talk to what. So if you have an LDAP server on OpenStack talking to a set of LDAP clients in Kubernetes — clients that are being spun up and down and changing dynamically as your microservices require — Calico dynamically adjusts, keeping that policy under control and enforcing it through ipsets across the infrastructure. It's a really simple way to scale your isolation and enforce it right at the endpoint, dynamically, as your workloads spin up across multiple orchestrators.

To give you a pictorial illustration: the networking is really simple — it's just IP routing, nothing to it.
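For a sense of what the security-group-to-profile mapping looks like, here is a rough sketch of a Calico profile resource. Field names follow the projectcalico.org/v3 calicoctl schema and the names are hypothetical; in a real deployment the Neutron plugin generates profiles from security groups automatically — this is shown only to make the mapping concrete:

```bash
calicoctl apply -f - <<'EOF'
apiVersion: projectcalico.org/v3
kind: Profile
metadata:
  name: sg-ldap-servers                    # hypothetical, mirroring a security group
spec:
  ingress:
  - action: Allow
    protocol: TCP
    source:
      selector: role == 'ldap-client'      # hypothetical label on the client workloads
    destination:
      ports: [389]
EOF
```

Because the profile lives in the shared etcd datastore, the same rule is enforced whether the matching workloads are Kubernetes pods or OpenStack VMs.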
And we use policy — defined typically in YAML syntax for Kubernetes, or by mapping OpenStack security groups, with things like tags, into a common policy model.

By the way, the same approach also works when parts of your application run on plain host Linux instances. If you have an Ubuntu Linux instance running in the cloud somewhere, you can run the policy part of Calico there too and apply the same ipset rules. So if you have, say, a database running on the host Linux instance, and database clients running on Kubernetes and on OpenStack, the policies are enforced dynamically at the very endpoint as instances come and go — and enforced on all sides of the connection. It's very distributed and dynamic: there are no centralized SDN controllers; it's truly distributed and truly dynamic, so it keeps up with your cloud as instances come up and down.

With Kubernetes nodes running inside virtual machines on OpenStack, again, there are different approaches you can take. If you have an existing OpenStack installation with an existing SDN, you can absolutely keep using it and run Calico for Kubernetes — Calico is one of the more popular network implementations for Kubernetes — so you'd essentially run Calico as part of the Kubernetes installation and keep it really simple. But if you want to get more sophisticated, and you're looking for a simpler networking solution for OpenStack as well, guess what: you can run Calico for OpenStack too. At that point, the way you connect the two is simply BGP peering between Calico running on Kubernetes and Calico running on OpenStack (sketched below). As far as Calico and the BGP instance inside the Kubernetes node are concerned, they think they're just talking to a top-of-rack switch — it's that same concept. So it's simple IP routing: routes in the Linux routing table and normal packet forwarding. Calico is not in the data path; Calico just sets up the routes. Super simple. And then you use the common etcd data store to enforce policy between applications running in Kubernetes and those running in OpenStack.

And then there's the interesting use case — one I think is emerging much more rapidly these days, and which AT&T is going to talk about — where you run OpenStack as a containerized application on top of Kubernetes. Again, from a Calico perspective, it's the same model: simple network peering with BGP, and policy with the Calico policy model applied across both Kubernetes and OpenStack. But I'll hand over to the AT&T folks to give more detail on this. By the way, I see a few faces in absolute shock — it's not possible to make Neutron this simple, is it?
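Before the demo, a rough sketch of what the cross-cluster BGP peering just described can look like. The peer IP and AS number are made up for illustration; peering can also be configured per-node or left as the default full node-to-node mesh:

```bash
calicoctl apply -f - <<'EOF'
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-to-openstack-cluster
spec:
  peerIP: 192.0.2.10      # illustrative: a Calico/BGP speaker in the other cluster
  asNumber: 64512         # illustrative private AS number
EOF
calicoctl node status     # verify the BGP session shows as Established
```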
Yeah — so we're going to be doing a bit of a demo for you today, but before that we should talk about the environment. Right now we're running a three-node cluster on bare metal with Ubuntu 16.04. The CNI plugin we're leveraging is Calico, so Calico is providing the BGP mesh our nodes communicate over, as well as the network policy Karthik talked about. It's also worth noting that our cluster was created using kubeadm, and Calico has a hosted kubeadm manifest: it's one command, and it configures the nodes with the ACLs, sets up an etcd instance for you, and deploys the policy controller.

So — misclick there — we already have a version of OpenStack installed, running Mitaka, and we have an instance created. The instance has a floating IP associated with it, and at the end of the day it's just a web server with a video loaded onto it, so we can show that during the upgrade there is no service disruption.

Yeah — and we're also going to start a script (it's that good, by the way, and the folks in the back are able to see it). This script is just going to keep calling Keystone, asking for a token. We're going to kick off our upgrade to Newton — real estate is at a premium up here — and we're going to start a ping against that running VM.

So now the upgrade is happening. You can see that some of the pods are being de-scheduled as the liveness probes and readiness probes kick in, and you can also see new pods coming up in the init state. Our init containers run and basically make sure the dependencies for the new pods that are about to be scheduled are met: we check that the Keystone jobs are complete, and of course that the database comes back up before the pods are rescheduled. If I remember right, this happens pretty quickly once the pods get terminated — the entire process usually takes about three to five minutes.

But the big takeaway, like Larry mentioned, is that the video is still running, and we're still able to keep getting a token from Keystone. That's because we're leveraging rolling updates here. With rolling updates, Kubernetes gradually takes a service or deployment down one replica at a time and replaces it, so there's no interruption to the service at all. If you're looking at this in terms of keeping availability high for your OpenStack deployments, it's pretty important — and it's also a pretty awesome thing to watch, especially during upgrades.

I think there was a quick question here — oh, do you want to take questions now or later? Sure, we've got time.
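A minimal sketch of the rolling-update behavior the demo relies on. The names, image tag, and replica count are illustrative, not the actual OpenStack-Helm chart values:

```bash
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keystone-api                 # hypothetical name for the Keystone service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: keystone-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1              # take down at most one replica at a time
      maxSurge: 1                    # allow one extra replica during the rollout
  template:
    metadata:
      labels:
        app: keystone-api
    spec:
      containers:
      - name: keystone-api
        image: example/keystone:newton   # hypothetical image tag for the new version
        readinessProbe:                  # gate traffic until each new pod is ready
          httpGet:
            path: /v3                    # Keystone v3 API root
            port: 5000
EOF
```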
Yeah — repeat the question. So the question was: can you explain what's happening behind the scenes to make this magic happen? Sure — you want to take this? At the end of the day, it's what Steve just touched on with rolling upgrades: it slowly tears down pods one by one, making sure that at least one replica of each pod is always able to serve requests. No — the follow-up question was whether it's happening because new pods are being created faster than old ones are torn down. It's a built-in mechanism of rolling updates in Kubernetes: if a deployment uses a rolling update, it will never schedule fewer than one replica. As it tears pods down, it leaves at least one replica up, so the service always maintains an entry in the load balancer and can keep serving requests. It's not a matter of speed — it's a built-in feature of Kubernetes, via the API, to make sure there's always something there to serve requests.

I think it's done running — look here. The token has likely expired, but... yeah, so you can see that the formerly gray side menu bar over here now has the Newton bootstrap blue theme. Our VM is still running, still serving — the video stream hasn't been interrupted, and we can check: there was no packet loss on the ping to the VM, so there was no service disruption to our VM workload. All right, that's it.

So, with the few minutes left we should probably take some questions, but I also want to leave you with some parting thoughts — Mark, I'm sure you can add some as well. This is, in my opinion, a hugely impressive demo: a containerized OpenStack, deployed very quickly with simple networking, live-upgraded while an application is running. Being able to pull off a demo like this on stage, I think, is super impressive. That's your moment of truth: can you do this live on stage and have it work? I kind of had my heart in my hands, but you guys seemed pretty comfortable. So this is where the future is going. Join the community — there's a lot of activity going on in OpenStack-Helm, and Calico has a strong and thriving open source community as well. Join us — and obviously Canonical is a great supporter of this whole initiative.

Thanks for the presentation — I was just wondering, is this out-of-the-box OpenStack-Helm, or did you guys make any edits to get it working like this? Well, this should be out of the box.

A question about using Calico as a controller for OpenStack: you mentioned you don't use virtual bridges — sorry, don't use what, sir? You mentioned you're not using virtual bridges, correct, or switches? Right, or vSwitches. So the virtual machines are running on the compute nodes — where do you get the networking ports from? Yeah, so what we do is essentially create a virtual Ethernet pair connecting the workload's network namespace — the VM's network namespace — down to the host namespace, and then we have routes in the Linux routing table that say: to reach that workload's IP address, send traffic into that virtual Ethernet device. It's really simple — no complex bridges or vSwitches, or combinations of bridges and vSwitches. And we apply our iptables and ipset rules at the entry point into the workload, so we enforce at the very edge. So you're using Linux namespaces? Correct — we absolutely use Linux namespaces.
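On a compute or worker node, that answer is directly visible. A sketch — interface names and the address are illustrative, but the "cali*" veth naming and per-workload host routes are how Calico actually plumbs workloads:

```bash
ip -br link | grep cali           # one host-side veth device per local workload
ip route | grep 10.65.0.5         # e.g. a route like "10.65.0.5 dev cali1a2b3c4d scope link"
sysctl net.ipv4.ip_forward        # the host forwards packets, i.e. it acts as a router
```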
And when you have VMs on different hypervisors connected to the same virtual network, how is that traffic isolated from other virtual networks? So, first of all, in terms of different hypervisors: every workload has an IP address, and the networking is simple IP routing. The isolation comes from policy, which in the case of OpenStack means artifacts like security groups — a security group that says what traffic you're allowed to receive. We also have the concept of tags in Calico, where you can do a bit more labeling; the equivalent in Kubernetes is a more sophisticated set of labels, but OpenStack doesn't have labels to the same degree. So it's basically a mapping: this workload has this security group, which gets mapped into a profile — something applied to a workload in Calico — and that profile can be applied across multiple virtual machines and multiple nodes. The same profile is enforced at the endpoints, at the veth devices that connect to the workloads.

And — last question, I promise — can you still have the same layer 2 broadcast domain across multiple hypervisors? So, Calico is a pure layer 3 solution. It's not focused on layer 2, and it doesn't use bridges — essentially, every node in the Calico domain can be considered a router. It's a pure layer 3 solution.

A question right here: does your upgrade involve any database schema changes, and if yes, how is that done? Yeah — every time you run an upgrade, or even an install for that matter, we have a set of jobs that run for every service. We have a db-init job, which starts up, creates the user, and grants permissions, and then a db-sync job, which preps the schema for the version of that container, for whatever service it belongs to. What kind of jobs are these — Kubernetes jobs? Yes, Kubernetes Jobs, which we combine with init containers that check dependencies. Okay — and does Kolla provide that out of the box? Is it a feature of Kolla, or, sorry, of OpenStack-Helm? Which project are those init jobs part of? Yeah — for the dependency checking, in the past we used Stackanetes' entrypoint; I think right now, in master, OpenStack-Helm is leveraging kolla-toolbox, which has a dependency-checking application in it.
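A condensed sketch of the db-sync pattern just described: a Kubernetes Job runs the schema migration once per upgrade, and the service pods' init containers wait for it before starting. Names and the image are hypothetical, not the actual OpenStack-Helm manifests; `keystone-manage db_sync` is Keystone's real migration command:

```bash
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: keystone-db-sync
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: db-sync
        image: example/keystone:newton             # hypothetical image for the target version
        command: ["keystone-manage", "db_sync"]    # migrates the schema to this version
EOF
```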
So, Calico being a layer 3 solution, can you talk about how that may or may not impact acceleration technologies like DPDK or SR-IOV? Sure — that's actually an answer in two parts. The really simple answer is that the way Calico does networking is by programming routes into the Linux kernel — essentially normal Linux routing — so the data path for application flows is the normal Linux data path; Calico is not in it. (Calico can also work with things like IP-in-IP tunnels or IPsec tunnels and other transports if you need them — Calico policy still applies on top of those — but they're not required.) In the default case, where you're using simple IP routing and simple packet forwarding in Linux, Calico is not in the data path. What that means is that, depending on the technology you're using, accelerated data paths generally just work. And even if you're not using acceleration, the thing to keep in mind is that because Calico is not in the data path — unlike an overlay network plugged into Open vSwitch, where every packet gets encapsulated — Calico isn't doing any of that work. So even in the default case, you generally get much better performance with Calico, because it's whatever your Linux router can do: normal packet forwarding. Does that help? Yeah.

I had a question regarding service function chaining, or service insertion and that kind of thing. How would that work in OpenStack — since Calico has a Neutron plugin, right, it probably has to do something for that — and also, in a pure Kubernetes environment, how does service insertion or service chaining work? Yeah, there are different ways you can do that — it's actually a more complex question, and there are different use cases to get into. Let's start on the Kubernetes side. You have the concept of Services in Kubernetes, and depending on whether you're using cluster virtual IPs, node ports, or load balancers, there are different ways Services can work. Typically those Services work with something like kube-proxy, which does the translation — and kube-proxy just works with Calico (there's a minimal sketch of this below). They operate at different layers: Calico works on the workload IPs, and kube-proxy also works using iptables, but they work alongside each other; Calico is fully compatible. You'll find that some network plugins for Kubernetes have to replace kube-proxy — that's not the case with Calico; it just works.

Now, that said, there are more advanced things you can do. One example I'll call out on the Kubernetes side: some load balancers — F5, to call one out — have the ability to do things like BGP peering. F5 has actually written up a document on how to do BGP peering between F5 load balancers and Calico, so that F5 can do more intelligent things with how traffic gets into and out of the cluster. So there are different approaches you can take to optimize the different elements of your network. Same thing on the OpenStack side: Calico does certain things and does them well, and there are other things you can do outside of Calico — it's a function of how you architect your overall network, depending on what specific service functions you're looking to chain. Happy to talk offline — I might have some recipes for you. Reach out to us; we have a public Calico Users Slack channel, and we'd be happy to give you more detail on some of the possibilities.
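As mentioned above, kube-proxy's Service translation coexists with Calico. A minimal ClusterIP Service sketch (names hypothetical, reusing the earlier LDAP example): kube-proxy programs iptables for the virtual IP, while Calico routes and polices the pod IPs themselves:

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: ldap                 # hypothetical, matching the earlier example's labels
spec:
  type: ClusterIP
  selector:
    app: ldap-server
  ports:
  - port: 389
    targetPort: 389
EOF
```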
So for example Neutron firewall as a service I haven't looked at that recently But generally users would use calico policy to provide that same function up to up to users Two quick questions if I may depending on how much time we have one is Do you use how do you solve overlapping IPs across tenants? Do you use something like multi protocol bgp or whatever? Yeah And the second one is related to l2 for vlan aware vms Do you handle that? Is there any implication? Yeah, let me answer Let me take the second one first and like come to the second first one as well So, uh, you can do your layer to sort of abstractions on the need from a physical network fabric Uh, essentially think of every calico node as a router. So it's a super clean architecture If you want to clear your routers over an l2 fabric with Beef spine l2 architecture great More people these days the general approach in the industry Even outside of cloud infrastructure when you look at the web scale providers Most of them are moving towards l3 all the way to the edge Not just the top of rack when you look at the large web scale providers They're moving to l3 up into the servers as well, right? And so the calico really Fits that model now if you want to use vlands on the physical infrastructure and do other things It's not required. It's uh, it's typically making your network more complex But there's different reasons why people tend to do that and that's fine Coming to your other question overlapping IPs There are approaches that calico can use for overlapping IP address spaces Calico does not support overlapping IPs today. That's not in our current product set There are ways we can do that But surprisingly we have calico deployed on some very large service provider scale infrastructures And we have yet to Get some get requests from service providers saying oh, you need to do overlapping IPs and the reason is this The world we are moving to is a cloud native world The kinds of workloads being moved out into the cloud and into cloud like infrastructures Are workloads where you have instances coming up and down and in a clean IP Layer 3 model like calico When you can keep the networking really simple and do policy Independent the networking you can essentially get get that same effect using policy and a simple networking fabric So don't see a question calico does not do overlapping IP support today. It's something that we actually have Something scoped out, but we haven't had to implement that yet I think um, I think no, no, no, I think we're uh, we're over time now. So Um, uh, we'll obviously hang around here if you want any follow-up questions But just like to say thank you very much to car thick for presenting calico pieces and uh, But steve and larry for giving us a great demo if you have any questions, please please follow up with us here Great. Thank you very much