Well, hello everyone. I'm Neil Jerram and I'm going to talk about networking-calico. So networking-calico comes from the open source project Calico, which is sponsored by my employer Metaswitch Networks, but already has quite a large community beyond Metaswitch, and hopefully that will continue to grow. networking-calico is a networking back-end for Neutron which is based on just IP and IP routing. It's a Neutron stadium project, which means it's part of OpenStack.org, an official part of Neutron; you can find the docs and the Git repository on OpenStack.org. And it's an example implementation of a new kind of network for which a spec has just been merged in Neutron, the routed networks spec. This is a spec that's been worked on for a long time, since Vancouver or I think even before then, and it's been owned and shepherded wonderfully by Carl Baldwin, and much kudos is due to him for pushing it through the at least three major forms that the spec has been in. It's finally landed now: it was approved and merged about a week ago, in a very good form. So kudos to Carl there, and to everyone else who contributed to the review of that spec.

The basic premise of Project Calico, and therefore of networking-calico, is that there is a vast number of data centre workloads out there, perhaps the vast majority, that only need IP-level connectivity between themselves. So they don't necessarily need the emulation of being on a single layer 2 segment, a layer 2 broadcast domain, which is the emulation that traditional Neutron networking implementations provide. And the theory, which I think is borne out by practice now, is that if you accept that premise, if you've got workloads which only require IP-level connectivity, then there are various simpler things that you can do to connect them together. networking-calico is one of the ways of doing that, and I'll be describing in some detail in a moment exactly how we do it.

This concept applies to any kind of virtualization platform: Mesos, Kubernetes, as well as OpenStack. Project Calico is that concept in general; networking-calico is the integration of that idea into OpenStack, and specifically into Neutron. So networking-calico is suitable for connecting VMs that are going to talk to each other at layer 3 or above, in other words using IP-based protocols. For those workloads, we think that IP routing is a simpler and potentially better performing way of connecting them, as I'm going to show, than the layer 2 emulation approaches, which require bridging and some form of tunneling between compute hosts. I will admit that if you do have workloads that rely on layer 2 broadcast or layer 2 multicast, or something like VRRP which builds on those, then those are not going to work out of the box with Calico. So I hold my hands up there: this is for IP-based workloads, and the basic premise of the project is that we think there are a lot of those.

I just want to look a bit more at that routed networks spec that I referred to a moment ago, which has recently been approved. One of the things that spec does is allow a Neutron plug-in or driver to advertise that its networks do not provide layer 2 adjacency between VMs. This is new, because until now the implicit assumption has been that if you connect two VMs, or any number of VMs, to a network, there is layer 2 adjacency between them.
So there's going to be a new read-only layer 2 adjacency property on the network object. For all traditional Neutron networks, if you read that property you'll get the value true, but a routed networking implementation like networking-calico can arrange that when that property is read you'll get false. In general, that doesn't necessarily mean that there's never any layer 2 adjacency; it just means you can't rely on it. But in the networking-calico implementation specifically, there really is never any layer 2 adjacency between any pair of VMs. You can imagine that this is useful, because it means that Neutron now formally recognises the possibility of networks that provide layer-3-only connectivity. And you can imagine that someone might provision a data centre so as to have both traditional and routed networks available in it. Then a user of that data centre, someone launching a batch of VMs, would be able to look at the Neutron API, see that this network says layer 2 adjacency true and this one says layer 2 adjacency false, and attach their VMs to the network which was correct for their workloads.

So how does Calico connectivity work? What is its data path? I'm going to go through that here. That's a bit small on my screen, but I'm pleased to see it's nice and big on the monitor. The key ingredients here are, first, the tap interfaces: you've got a tap interface coming from each VM, eth0 on the virtual machine side, a tap interface on the host side, and we leave that unbridged. Again, I keep harping on about most existing Neutron implementations: most of them plug that into some form of bridge, Linux bridge or OVS. We don't do that; we just leave it dangling. And we enable proxy ARP on that tap interface. That means that if an ARP request comes in on that tap interface, the compute host will reply to it, regardless of what IP address is being requested, with its own MAC address. The other ingredient is that there are routes, in a routing table on the compute hosts, to all of the possible VM destinations.

So imagine, I've got two VMs in this picture, one with a .2 address and the one on the right with a .3 address, and imagine one of those is sending data to the other. The source VM has just a traditional subnet routing entry in its routing table: 10.65.0.0/24 dev eth0. That means it thinks it's directly connected to anything else in that subnet. It actually isn't, but we maintain that illusion here. Because of that, when it's trying to get to 10.65.0.3, it will send an ARP request to discover the destination MAC, and because of the proxy ARP, the local compute host will reply with its own MAC address. So then there's a single layer 2 hop to the local compute host. But the packet isn't IP-addressed to that compute host, so what happens is that it pops out of layer 2 and goes up to the next routing table. And in the routing table on that local host, the source compute host I should say, there's a route basically saying that 10.65.0.3 is reachable via the address of the destination compute host. So this is an indirect route, and the packet gets routed on to that next host, which is the destination compute host.
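To make that concrete, here's a rough sketch of what the source side might look like. The 10.65.0.x addresses follow the example above, but the interface names, the destination host's address and the exact output are just illustrative:

```
# Inside the source VM (10.65.0.2): only the normal DHCP-installed routes.
$ ip route
default via 10.65.0.1 dev eth0
10.65.0.0/24 dev eth0  proto kernel  scope link  src 10.65.0.2

# On the source compute host: proxy ARP is enabled on the VM's tap interface,
# so the host answers the VM's ARP request for 10.65.0.3 with its own MAC.
$ sysctl net.ipv4.conf.tapa1b2c3d4-e5.proxy_arp
net.ipv4.conf.tapa1b2c3d4-e5.proxy_arp = 1

# And an indirect route, learned over BGP as I'll describe in a moment, sends
# the packet on to the destination compute host (address illustrative, ending
# .44 as in the example).
$ ip route | grep 10.65.0.3
10.65.0.3 via 172.18.203.44 dev eth1 proto bird
```

The VM itself just believes it's on an ordinary /24; all of the routing intelligence lives on the compute hosts.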
Then on the destination compute host, there is a route like the one on the right: the direct route down that tap interface. Obviously that means that if you want to get to that address, go down the tap interface; so the packet goes down that tap interface and is delivered to the VM it was intended for. Pretty simple.

How did all of those routes get there? The route on the source VM is there through perfectly normal DHCP. That's absolutely standard: DHCP tells the VM that its address is 10.65.0.2, tells it the size of the subnet, and Linux responds to that by programming a directly connected route like that one. The tap route on the destination compute host gets there because we have an agent running on every compute host. The Calico agent is the thing that we call Felix, and Felix programs that route. What happens is that the networking-calico ML2 driver handles the port creation; it then passes information about that port, including its IP address and the tap device name, to the Felix agent running on that compute host, and the Felix agent responds by programming a route like the one you see there. And the final route gets there through BGP. As well as running this Calico agent called Felix on every compute host, we also run a BGP speaker on every compute host, specifically BIRD. Those BIRD instances peer with each other and propagate routes around the compute host network, and that's what generates the indirect route for 10.65.0.3 on the source compute host. Basically, the direct route, the tap route that you see on the right-hand side there, is exported over BGP by the compute host that has that .44 IP address, and on all of the other compute hosts that causes the indirect route, 10.65.0.3 via the .44 address, to be programmed. That's just standard BGP operation; it's exactly the same as how routing in the Internet works. So there you have it: that's the Calico data path, three simple IP hops. There are no overlays, bridges or tunnels here. And also note that we're not using any namespaces, so everything I've shown you here is in the default namespace.
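To tie those pieces together, here is roughly what the programming on the destination compute host amounts to. These are not the literal commands Felix runs, and the interface name and host address are illustrative, but Felix does the equivalent of this when the port is created, and BIRD then takes care of the rest:

```
# On the destination compute host (172.18.203.44 in this sketch): enable proxy
# ARP on the VM's tap interface and add a direct /32 route down it.
sysctl -w net.ipv4.conf.tapf00dfeed-aa.proxy_arp=1
ip route add 10.65.0.3/32 dev tapf00dfeed-aa

# BIRD picks that route up and advertises it over BGP, so every other compute
# host ends up with the corresponding indirect route:
#   10.65.0.3 via 172.18.203.44
```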
I should say a few words about use cases, provisioning, and a bit about isolation and IP addressing, because you might be wondering, based on what I've said, how it could work for different tenants to use overlapping IP ranges, for example, or how we provide isolation between tenant networks. And we do have some slightly different answers to those questions. Calico is primarily intended for provider networks, in other words for networks that the cloud admin provisions ahead of time and then makes available to the cloud's users. It is also possible for a tenant to create a Calico network, if your Neutron server is configured with the Calico driver, but semantically that network basically behaves the same as if a cloud admin had provisioned it ahead of time. Its only tenant-related aspect is that if the tenant doesn't create it as a shared network, then only that tenant will be able to see it, and so only that tenant will be able to use IPs from the CIDRs which are associated with that network. And specifically, it's not possible for a Calico tenant network to have its own address space or address scope, and hence not possible for different tenants to use overlapping IP ranges. As things stand at the moment, all Calico networks are in the same address space. We do have a design for overlapping IPs if we need it, but so far none of the partners and customers and interested parties that we've talked to about using Calico have had a firm need for that, and so it's unimplemented at this point.

When it comes to isolation, Calico doesn't implement any automatic isolation between its networks. In general, OpenStack doesn't really specify, I hope I'm correct on this, whether isolation exists between provider networks. It clearly does for tenant networks, but for provider networks it's not specified. In Calico we take the view that we want all Calico provider networks to be reachable from each other, and that whatever isolation you need, you achieve using security groups. And that's actually very simple. To take an example, suppose you had a Calico provider network with a massive 10/8 IP range and two tenants using it. The typical case would be that each of those tenants wants its VMs to be isolated from the VMs of the other tenant; that's an absolutely standard use case. They each have a security group called 'default', and typically that security group only allows inbound access from other VMs in the default group. As long as they launch their own VMs in that default security group, they will get the isolation that they want. It's slightly confusing because they both have a security group called 'default', but even though the names are the same, these are actually scoped to the tenant: they're actually different security groups with different UUIDs, so you really do get isolation through that mechanism.

Another networking-calico feature is efficient handling of both public and private fixed IPs, and we do still allow floating IPs as well, where that's useful. In a typical Calico deployment, the cloud admin would provision two Calico provider networks: one with a range, probably a small range just because of scarcity, of public, by which I mean Internet-routable, IP addresses, and one with a much larger range of private, RFC 1918, addresses. When a VM is launched that needs to be contactable inbound from the Internet, it can be attached to the first of those networks, and so it gets a public fixed IP. Otherwise it gets attached to the second of those networks and it gets a private fixed IP. By the way, as per normal, it's still possible for a VM with a private IP to access the Internet; you just need to have NAT somewhere, ideally on the border between the data centre, as I've shown here, and the outside world, so as not to do unnecessary NAT for traffic that stays within the data centre. If you're communicating between the Internet and a VM that has a public IP, you don't have to have NAT anywhere. And if you're communicating within the data centre between a VM with a private IP and a VM with a public IP, or more generally between any two VMs on different Calico networks, then that works with exactly the same data path as I showed earlier; routing obviously still works when the source and destination addresses are not in the same subnet. That's also related to the fact that we haven't implemented any isolation by default between these Calico networks.

I promised a bit more about how we make DHCP work, which I think is not obvious, because normally DHCP requires the server and all potential clients to be on a layer 2 broadcast domain, and that's very much what we're not doing in Calico. As I said before, we leave all these tap interfaces which come from the VMs dangling; we don't bridge them.
So how does DHCP work in that scenario? Happily, dnsmasq helps us here, because dnsmasq has neat features that allow us to deal with this. There's the interface that dnsmasq regards as its primary DHCP interface, which has to have an address defined in the subnet that it's going to allocate from; that's the one on the left there, dnsmasq's primary interface. And dnsmasq has features that allow it to listen on many other interfaces as well as that one, so specifically on all of those tap interfaces, and to treat all of those tap interfaces as aliases of the primary interface, in the sense that if a DHCP request comes in on one of those aliased interfaces, it will satisfy it using the DHCP information that it has for the primary interface. We had to upstream some very minor deltas to dnsmasq for that, too; in fact that behaviour was already implemented for IPv4, but not yet for IPv6, so we contributed the change there for IPv6. Generally, by the way, even though I'm putting IPv4 addresses on all of these slides for brevity, everything you see here works for IPv6 as well, with only very minor technical changes, such as changing ARP to NDP in the IPv6 case. So, we use these dnsmasq options: interface, bind-dynamic and bridge-interface. The interface option says which interfaces to listen on, so all of the tap interfaces; bind-dynamic tells it to watch for any new tap interfaces that may appear at any time and to listen on those too; and bridge-interface is the option which tells it to basically behave as though the interfaces were bridged, even though they are in fact not. So basically we just have to modify the DHCP agent a bit in order to drive dnsmasq in that way, and then DHCP works for us.
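To make that concrete, here's roughly the shape of dnsmasq invocation that results. The interface names, tap names and paths are all illustrative, and in practice the modified DHCP agent generates this for you:

```
dnsmasq \
  --interface=ns-dhcp0 \
  --bind-dynamic \
  --bridge-interface=ns-dhcp0,tapa1b2c3d4-e5,tap9f8e7d6c-b1 \
  --dhcp-range=10.65.0.0,static \
  --dhcp-hostsfile=/var/lib/neutron/dhcp/hosts

# --interface:        the primary DHCP interface, with an address in the subnet
# --bind-dynamic:     notice and bind to interfaces that appear after startup
# --bridge-interface: treat DHCP requests arriving on the listed tap interfaces
#                     as if they had arrived on the primary interface
```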
Just a note about releases. We've been working on this for a while. When we first developed Calico, there were a couple of key ingredients missing from vanilla OpenStack which we needed in order to demonstrate the Calico connectivity approach. Firstly, in both Nova and Neutron, there was no concept of having this unbridged tap interface, so that's something we had to add; in the code base it's now called VIF type tap, meaning a tap interface on the compute host side that isn't plugged into any kind of bridge. And secondly, in Neutron, we needed to add some enhancements to the DHCP agent code to allow it to drive dnsmasq in the way that I've just described. Because of that, for the first few releases that we supported, Icehouse, Juno and Kilo, we actually had our own forks of Nova and Neutron with those patches in, but by the time Liberty came along we had managed to get those upstream, so we were working with vanilla OpenStack. Formally speaking, even with Liberty and Mitaka there was still something missing, which is something explicit in the Neutron API to say that this network only provides layer 3 connectivity. But that's the thing I was talking about earlier: the layer 2 adjacency property from the routed networks spec. So the spec is now there, and the implementation of that will land in Neutron. It was always possible to demonstrate Calico in those prior releases as well; it's just that you needed to have an understanding that the Calico network was going to give you slightly different semantics, whereas from Newton onwards that will be absolutely explicit.

I'd like to talk about a couple of developments that we've added more recently to networking-calico. The first of these is support for floating IPs, for which I'm indebted to Nick Bartos, formerly of Piston and now Cisco. A floating IP requires a one-to-one DNAT on the inbound path, because the VM itself is not aware of its floating IPs. In mainstream, reference Neutron this is done by the L3 agent, but in the Calico implementation we pass the information for the mapping from the floating IP to the fixed IP to the Felix agent that I mentioned before. Because Felix is already doing a load of other iptables programming for things like security and metadata, it handles that as well, and because that's running on every compute host, it's a bit like what DVR does. We have a slight modelling issue here, which is that I said earlier that Calico networks are really provider networks, but the modelling of floating IPs as it stands in Neutron at the moment basically requires a mapping from a floating IP pool, which is defined on an external network, through a virtual router, to a tenant network. So this is perhaps one reason to use tenant networks with Calico, so that you can do that. But we think there's no fundamental reason why floating IPs have to be modelled like that, so in the medium term we hope the model can be generalised to allow floating IPs to target IPs in provider networks as well as in tenant networks.

The second development is using a Calico-specific DHCP agent instead of the reference Neutron DHCP agent. A consequence of the DHCP approach I was talking about earlier is that we have to run a DHCP agent on every compute host, rather than, say, just on a network node. We found that when we did that with the reference Neutron DHCP agent, the RPC communications put too much load on the Neutron server and caused other requests to the Neutron server to time out. We think that's to do with things like the agent state reporting and the fetching of port information from the server by the DHCP agents. Calico's Felix agent gets its information from the ML2 mechanism driver through a distributed database; we use etcd. And that etcd database already included almost all of the information that we needed for the DHCP provisioning, so we thought it would be interesting to make a Calico DHCP agent which was driven not by the Neutron RPC mechanisms but by the information that we had in that etcd database. With the reference Neutron DHCP agent we were struggling beyond about 250 nodes; with the Calico DHCP agent our current scale target is 500 nodes and we're reaching it, so that seems to help us. And I think there's a broader conversation going on within Neutron about the scalability of these RPC communications. I should say that although the top level of that DHCP agent is now different from the reference Neutron one, we're still benefiting from a lot of the internal classes of the reference implementation, such as all the logic for generating dnsmasq config and for driving dnsmasq, so we're still getting a lot of value out of the reference implementation there.
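Going back to the floating IP support for a second, here's a rough illustration of what that one-to-one inbound DNAT amounts to. The addresses are made up, and this isn't the literal rule that Felix generates, just the general shape:

```
# On the compute host where the VM lives: traffic arriving for the floating IP
# (192.0.2.10) is DNATed to the VM's fixed IP (10.65.0.3). Connection tracking
# then takes care of translating the reply traffic back the other way.
iptables -t nat -A PREROUTING -d 192.0.2.10/32 -j DNAT --to-destination 10.65.0.3
```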
So I'll skip over this quite quickly, because I think we should be getting on to questions soon, but we have addressed that plug-in, and I'm going to show a video very shortly which shows it being used. It works with Liberty. We have packages for various operating systems, Ubuntu Trusty and Xenial and Red Hat; we have Juju charms; we have Fuel plugins. While at this summit I've submitted a change which hopefully makes Calico work with Kolla, and I've had people approach me about integrations for Ansible, among others. So hopefully those will land before much longer.

And what next? networking-calico and the wider Calico project are already completely open source, but I'd like us to do more in the sense of operating fully in the OpenStack spirit: to do a few more of our design discussions in public and get a bit more community input. So I do want to set up an IRC meeting for networking-calico, and I plan to do that soon, and I'd really appreciate your input on all of these things. Step one has been this talk, which is the first opportunity I've had to really introduce this project in detail to the community; step two will be that meeting. We do have a list of things that could come next for networking-calico. One of those is IPAM, for example, to help with the clustering of the routes that we generate. But I think that more generally we want to keep an open mind and take community input on what else could be added.

So that's it: networking-calico. We think it's a simple and scalable way to network IP workloads. Security is on there too; I haven't talked about security very much in this presentation, but basically, in an OpenStack context, it just renders Neutron security groups using iptables. It does actually allow various richer, more expressive kinds of security policy as well, such as, for example, allowing the cloud admin to provision policies that are applied before any tenant-specified security groups and so can't be overridden by tenants, but that's not currently available through the OpenStack API. So I'd like to thank you all for coming to listen to this presentation today, and to encourage you to participate in the IRC meeting when I set that up. Thank you very much.