All right, so this is "Providing OpenStack High Availability Through Anycast Routing." My name is Richard Raisley, and I'm a systems operations engineer at Puppet Labs.

Before we dive into the topic, I'll give you a little bit of my background, so you know where I'm coming from. I've been involved in IT for about a decade and have had various system administration, engineering, and architecture roles at companies like Concur Technologies and Microsoft. I was fortunate enough to become involved with OpenStack in early 2012, and have been equally fortunate to be able to leverage that into the role I have today, which is thinking about OpenStack, automation, and associated things on a full-time basis. I am also, along with my esteemed friend Christopher from Arantas, the co-organizer of the OpenStack PDX user group down in Portland, Oregon. Please stop by if you ever find yourself down that way: third Thursday of every month. Thank you, Christopher.

So here's the big question: why are we all here? For many of us, this summit is going to be your first exposure to OpenStack, the technology and the community. On the other end of the spectrum, many of you are running large production clouds, and I'm sure there are a lot of places in between. But we're all here because we've all had the same important realization, and that realization is that high availability is important. It's not just important; it's absolutely critical. Hardware and software will fail.
They will fail, especially at scale. It's not a question of if; it's a question of when, and in what way. So as we start thinking about our environments and modeling these various scenarios, we have at least one additional realization: high availability is difficult. It's difficult because, in addition to the innate complexities of managing OpenStack and all the components that go along with it, you're now forced to manage an additional software layer, its configuration, and all of those associated pieces.

My goal for today is to present an additional option. I just want to give operators and administrators another tool in their belt, so that when they think about the high availability options out there, they have a wider array of choices and can pick the right tool for the right job, which is obviously always the most important thing.

Before we get started, I want to give a little buyer-beware. These are technologies that we've run in our lab, and anycast itself is really not on trial here, because it is obviously very widely used, used daily by everyone. But in the interest of full transparency, I should say that we have not tried this at scale. We're not running a large public cloud with this.
This is more of an academic exercise at this point.

With that, I'm going to jump into the agenda for today. First, we're going to introduce anycast, talk about what some high-level traffic flows look like, and see what that scenario looks like. Secondly, we're going to loop back around, dive down into all the supporting components, and add a little more technical meat to the discussion. Then we're going to put it all together and figure out how we can build with these blocks to actually give us something useful. Finally, we'll come to some conclusions and figure out what we actually learned today, if anything.

For this initial section, we're going to do an introduction to anycast. I'm going to put a definition up on the screen, then we're going to walk through what makes anycast traffic special, and then look at what an example traffic flow looks like in an anycast environment.

This is the Wikipedia definition of anycast, which is as good a place to start as any. It says that anycast is a network addressing and routing methodology (not a protocol; that's an important distinction) in which datagrams from a single sender are routed to the topologically nearest node in a group of potential receivers, though they may be sent to several nodes, all identified by the same destination address. To realize what that means in practice,
let's first take a look at something much more basic, and then we'll build upon that. This is a typical unicast traffic pattern in a routed IP network. When client zero wants to send traffic to server zero, it has essentially one path it can possibly take, and that path consists of two hops: a hop from client zero to router zero, and then a hop from router zero to server zero.

Now, this slightly more complex diagram adds a couple of additional routers and servers to the mix, but it's still relatively straightforward: client zero can reach any of these servers in a pretty straightforward way with unicast traffic.

That's all good and well, but what if all of these servers are running identical services, which is often the case in an HA scenario? Furthermore, what if we want to abstract away the need for a client to actually understand the underlying service topology, which is actually what we want? Traditional HA methods, for example load balancers and cluster managers, do this through the use of something like a VIP. But again, in the interest of presenting some additional options today, I want to show how anycast can also solve this problem.
So we'll go ahead and throw it into the mix. The first thing we're going to do is provision our anycast address on top of our servers. We're going to do this using a virtual interface in the form of a loopback interface: not a physical interface, but a virtual one.

Now, through the use of software, which we'll dive into in more detail later, these nodes will advertise their address to the routing infrastructure, and essentially this advertisement is an assertion that "I, server zero, can service requests destined for my anycast IP," which is 10.10.10.10 in this case. The server will talk to the routers, and those routers will propagate that information through the rest of the routing topology, which essentially allows us to build a map of the underlying network.

Now, when client zero sends traffic destined for the anycast address, the process is a little bit different. The packets will egress from client zero and reach router zero, but now the router says, "I have three different ways I can reach my destination: I can take one hop from router zero to server zero; I can take two hops from router zero to router one to server one; or I can take three hops from router zero to router one to router two to server two." In this scenario, one hop is the shortest path available for our data to take; therefore, this is the path the traffic will follow.
This is what is meant when we talk about sending data to the topologically nearest node. This is an industry-standard model, one that is used very commonly and a very good one for anycast, but I'm actually interested in applying it in a slightly different way. So let's take a look at what that looks like.

We still have our single client, and we still have three servers, but now all of them are directly connected to router zero. As before, the anycast address is still being advertised, from the servers in this case, to the router, and this is again going to allow our routers to build the appropriate topology map, understanding where the anycast IP address is. So this time, when client zero initiates traffic to our anycast address, the packets egress from client zero and come into router zero, and router zero says, "Well, I still have three ways to get there, but they look a little bit different: I have a single hop from router zero to server zero, a single hop from router zero to server one, and a single hop from router zero to server two." We now have what we'll call three different equal-cost routes to get to our address.

Using a technology called equal-cost multipathing, which I'll talk about more later, the router will then load balance these packets across the equal-cost routes. In this model, as data comes into router zero destined for the anycast address, the first packet would go to server zero, the second to server one, the third to server two, and so on. In this way, we've achieved really basic layer 3 load balancing.

So now let's move into talking about some supporting components. Up until this point, I've moved a bit fast over some really essential technical details and terminology, and I want to dig into that now. I'm trying to assume a really minimal working knowledge of networks.
We're obviously not all network engineers, so I'll start with some fundamental concepts and then work our way up in complexity. I want to make sure we're all working with the same definitions, so we arrive at the same place.

The first thing I want to level-set on is a network. In my world, a network is a collection of nodes, which are computers in this case, and links, which allow those nodes to exchange data. A packet is simply a structured unit of data carried by a network. A router is a networking device which is responsible for moving packets between networks.

Routers employ two primary functions to accomplish this task. The first function is, unsurprisingly, called routing. Routing is the act of finding the available paths which packets could take to any known networks, along with their associated metrics, if any, and then maintaining a record of all that information in something called the routing table. The second function a router employs is called forwarding. Forwarding is the act of actually moving a packet from one interface to another on its way towards its destination address. The information used to decide the appropriate interface, including ports, MAC addresses, metrics, and things of that nature, is stored in a data structure called the FIB, or forwarding information base. Part of the data that exists in the FIB is populated from the routing table.

An autonomous system is a collection of networks and their associated devices, generally routers, under the control of a particular entity; that can be a corporation, a university, or something similar.

Open Shortest Path First, or OSPF, is an implementation of what's called a link-state routing protocol. At a high level, this means that each router participating in an OSPF topology builds and maintains its own map of the underlying network, not just its adjacent nodes, so it understands what is happening across the entire network. That map is built from router advertisements
from other routers within the same autonomous system.

Equal-cost multipathing, or ECMP, is a routing strategy whereby packets bound for the same destination can be sent over multiple available best routes. I touched on this earlier with the diagram of three servers all connected to the same router. ECMP generally employs a round-robin load balancing strategy, a really simple strategy which basically means that as data comes in, the first packet will go over the first equal-cost link, the second will go over the second equal-cost link, and so on. If you've bought an enterprise-grade router in recent memory, this is likely a supported feature.

Flow pinning, and again this is not standard networking terminology because none exists, is a routing strategy whereby packets from a particular source IP and port pair, bound for a particular destination IP and port pair, will get pinned to the same route. So if I have multiple equal-cost routes, this is what allows me to use connection-oriented protocols like HTTP, as opposed to just connectionless ones like DNS. Different vendors have different names for this, as I alluded to earlier: for example, it's called per-destination load balancing on Cisco devices.
It's called, somewhat misleadingly, per-packet load balancing on Juniper devices. Again, if you've bought an enterprise router in the last five years, this is likely supported.

The final piece I want to talk about with regard to our components is Quagga. Quagga is an open-source network routing software suite. What it does is implement various routing protocols, such as BGP and OSPF, which we're particularly interested in, on a variety of Unix-like platforms. This is what allows our servers to talk OSPF even though they're not routers in the traditional sense.

So let's put all this together and see how we can actually build something that is useful for us. We'll start by bringing up some shiny new nodes, adding our interfaces, installing Quagga, then our shared services and OpenStack services, and then we'll look at what traffic flows look like and how we can handle failures in this environment.

Let's imagine that we've got some shiny new nodes upon which we want to run some highly available API services. We're going to provision the interface which will hold our anycast address. Again, this is a virtual interface in the form of a loopback. In our case, we'll stick with our earlier example and continue to use 10.10.10.10.

Now we'll install and configure Quagga. This is ideally done via some automated process or automation software; that can be Puppet, or that can be something different. The configuration itself is fairly basic and consists of three primary parts. The first is the router ID, which is how this particular device will represent itself to the rest of the network. The second is the networks we're advertising; in this case, we're not advertising networks in the traditional sense.
We're advertising a single IP address, so that would be 10.10.10.10/32. Optionally, we can provide authentication information, which will prevent unauthorized actors from viewing or manipulating our routing topology in some way.

Now we're going to bring our shared services into the mix, and generally in OpenStack that will consist of a database and a message broker. Their configuration is outside the scope of this talk, but it goes without saying that we're doing this in some automated fashion: potentially with Puppet, potentially with something else.

Now that we have these prerequisites in place, let's bring our OpenStack services into the mix. Again, and I sound like a broken record, we're doing this in some automated fashion, possibly using the excellent set of Puppet modules that are now part of the OpenStack project, possibly using another one of our friends' solutions. Much of this is beyond the scope of our talk today, but I want to call out two specific bits. Firstly, we want to make sure we configure our services to listen for requests inbound to our anycast address, not the particular address of one node or another. Secondly, we want to make sure that when we configure our Keystone service catalog, we're configuring it with the anycast address, or a DNS record associated with that address, again, not with the individual node IPs.

Now that we've got this in place, let's take a look at what a traffic flow might look like in this scenario. First we'll bring a client into the picture.
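Before we follow that client's request, here's a rough sketch of what the interface provisioning and Quagga OSPF configuration just described might look like on a Linux node. Treat this as an illustrative configuration fragment, not a tested deployment: the interface names, router ID, and authentication key are made-up values, the exact config layout varies by Quagga version, and these commands require root:

```shell
# Provision the loopback-style (dummy) interface that carries the anycast /32.
ip link add anycast0 type dummy
ip addr add 10.10.10.10/32 dev anycast0
ip link set anycast0 up

# Minimal ospfd.conf: the router ID, the single /32 we advertise, and
# optional MD5 authentication so peers can't spoof or tamper with routes.
cat > /etc/quagga/ospfd.conf <<'EOF'
interface eth0
 ip ospf authentication message-digest
 ip ospf message-digest-key 1 md5 s3cretkey
router ospf
 ospf router-id 192.0.2.11
 network 10.10.10.10/32 area 0.0.0.0
 area 0.0.0.0 authentication message-digest
EOF
service quagga restart
```

In practice you'd have your automation tool, Puppet or otherwise, render this file rather than writing it by hand.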
We'll call him client zero. Client zero makes a request to our highly available API service, having gotten the anycast address from our Keystone service catalog. Packets egress from client zero and arrive at router zero. Router zero conducts a lookup in its forwarding information base and sees that it has three different equal-cost ways to get to our destination service. Implementing the equal-cost multipathing strategy we talked about earlier, it's going to pick one of those links to start with and send a packet out that interface. Because of flow pinning, subsequent packets from the same flow, so the same source IP and port and the same destination IP and port, are going to be bound to the same backend member, the same path.

Packets arrive at the node, and the OpenStack API services accept those packets because they're listening for traffic on our anycast address. In an ideal world, they will fulfill those requests and then send back a response. Now, as we add more clients, and those clients are bringing up and tearing down more sessions and making more requests, we'll eventually see a more uniform distribution of traffic across our participating nodes. So assuming all of these components cooperate and things work the way we think they're working, we now have a functional, or semi-functional, load balancing scenario across our participating nodes.
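That per-packet versus per-flow behavior can be sketched in a few lines of shell. This is only a stand-in for the router's logic: the server names are from the diagrams, and the `cksum` hash simply substitutes for whatever hash a real router vendor uses over the flow's addresses and ports:

```shell
# Plain ECMP round-robin: packet number N goes out route N mod 3.
pick_round_robin() {
    echo "server$(( $1 % 3 ))"
}

# Flow pinning: a deterministic hash of "srcip:port->dstip:port" selects
# the route, so every packet of one connection takes the same path.
pick_flow_pinned() {
    echo "server$(printf '%s' "$1" | cksum | awk '{ print $1 % 3 }')"
}
```

Calling `pick_flow_pinned` repeatedly with the same flow tuple always returns the same server, which is what lets a TCP session survive, while successive packet numbers fed to `pick_round_robin` cycle through all three routes.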
This is great, but now we need to think about what we do when something fails. Remember, it is only a question of when, not if. There are certainly a variety of scenarios we want to protect against, and we couldn't possibly enumerate them all today, nor could I know what's relevant for your particular environment. But we do have a common mechanism to help us deal with these failures, and that mechanism is the loopback interface upon which our anycast IP address lives.

Because each node is actively participating in our OSPF topology and actively talking to our routers, if our interface is brought down, for example by doing something as simple as an ifdown, as you can see I've done there on the left node, the OSPF daemon supplied by Quagga becomes immediately aware of this change and withdraws the route advertisement from the router, effectively evicting this particular node from the participating group of anycast servers. This happens fairly quickly, generally in less than half a second.

If we want to bring a node back into our anycast group, we do the reverse, for example with an ifup. Once that interface comes back up, our OSPF daemon detects that change, detects that it has a new network to advertise to the rest of the world, and does so. This effectively brings our node back into our anycast group. This also happens very quickly, in the same half second or so.

So since the state of this interface is pretty easily managed, we can implement a variety of different scenarios here. This could be something as simple as a local script which runs and checks services, and says, "If my services aren't running, bring down my interface." It could be something more complex, like a trigger from an event in a monitoring system, or really anything you want to do, anything that's valid for your use case. The beauty is the simplicity of the mechanism.
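As a sketch of that simple case, a periodic local check might look like the following. The service name, the interface name, and the use of systemd are all assumptions for illustration, and it would need to run as root (say, from cron or a systemd timer every few seconds):

```shell
#!/bin/sh
# If the API service is down, withdraw ourselves from the anycast group by
# taking down the interface that carries the anycast address; if the
# service is healthy, make sure the interface is up so we rejoin.
SERVICE="openstack-keystone"   # hypothetical service name
IFACE="anycast0"               # interface holding 10.10.10.10/32

if systemctl is-active --quiet "$SERVICE"; then
    ip link set "$IFACE" up      # (re)join the anycast group
else
    ip link set "$IFACE" down    # OSPF withdraws the /32 within ~0.5s
fi
```

The same pattern extends to deeper checks: replace the `systemctl` test with an actual request against the API and a verification of the response.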
It's really flexible in what we can do with it. Additionally, if a failure creates a situation where we're unable to gracefully bring down this interface, OSPF does have some built-in health check mechanisms which will result in the eviction of that member. That might be a slightly slower process, but again, that's tunable. Again, the beauty is the flexibility.

So let's talk about some conclusions. The first conclusion is that anycast is a viable option for HA. The tests that we've run, and the confidence in the usage of anycast in general, bear this out.

The second conclusion is that there are some special networking capabilities and knowledge required to implement this. I see that as an opportunity, maybe, more than a challenge: go talk to your networking guys and bring them into the fold. We're all in this together.

Third, it introduces some additional troubleshooting considerations. If I have packets leaving a client bound for an anycast address, it's not so deterministic to tell where they went through the network. So you're going to have to think about that and figure out what options you have available to keep track of those things.

The fourth is that this has not been tested at large scale, and when I say "this," I mean anycast as it specifically relates to OpenStack. If you use DNS on the internet, you are using anycast today. So again, don't take my word for it: go and test, go and test.

The fifth conclusion is, again, know the trade-offs, and that's really a combination of the things we've already covered. Do I have the right networking equipment? Do I have the right in-house expertise, and am I comfortable deploying these additional daemons? Does it just make sense for my use case? Again, I can't answer that for you.

And the final conclusion, a conclusion that should be on every slide, is: use the right tool for the right job.
There are no silver bullets. The goal was just to present an additional option to add to your tool belt, and to make you aware of it. With that, thank you, and I'll take any questions you might have.

Yeah, so it seems like you're trying to conceptualize anycast for the API services from a front-end perspective. I'm assuming you really haven't looked at the back-end services, many of which are active-passive, in terms of, for example, when certain services need to talk to a database, only one of them can be active at a given time, right?

Yeah, so that's obviously an example of "use the right tool for the right job." A lot of the OpenStack services fit into two or three buckets, depending on how you look at them: you have the stateless front-end services, you have stateless services that hang off a message broker, and you have various other types of services. Certainly there are some services where this doesn't make sense; you may need a more traditional cluster stack like Corosync and Pacemaker, or something simpler like keepalived. That's part of knowing your environment and figuring out the right tool to use at the right level.

Understood, and this is a good front-end way of scaling as well, which makes sense.

Yeah. Additional questions, comments, or statements?

Yes, so OSPF and other routing protocols have the ability to do health checks. If an advertisement hasn't been updated or re-advertised after a certain time period (I may not have the terminology exactly right), you have the ability on the routing side to withdraw that advertisement, and that would probably result in a little bit greater turnover time, I'm guessing. But that's tunable, and it's just a matter of figuring out the right value.

So obviously this kind of competes with HAProxy deployments, because if you have a hardware load balancer, then of course that's a different set of issues. In terms of efficiency, how does this perform? Have you done any measurements to see what kind of operational efficiencies you get by eliminating HAProxy, let's say?

So I have not done performance testing on this setup. Anecdotally, my usage of other anycast services has shown that it's highly performant. Routers are good at routing; they're good at pushing packets over the wire. So if you can bring that logic up into the router, you're really taking advantage of that specialized equipment. I think it was CloudFront that had a nice white paper on the performance of their anycast network, which had a lot more numbers in terms of the performance they've seen.

It seems like more of a replacement for, or an alternative to, HAProxy rather than an alternative to Pacemaker. In that great HA talk I just came from, by David Vossel, one of the questions at the end was, should we use HAProxy or Pacemaker, and he said, well, why not use both together? They're actually a great fit for each other. It seems maybe that could apply here: you could still be using Pacemaker to manage the services.

Yeah, I think a certain set of services are going to be more stateful; they're going to require more care and feeding, you're going to put a little more care into them, and use something like Pacemaker and Corosync. The last time I looked at it, the OpenStack high availability guide was prescribing the full cluster stack for things like the API services, and I think that's a little bit overkill, but I absolutely agree with the model: Corosync and Pacemaker make sense for those services, and they might be used in combination with this, or with HAProxy, or whatever that particular service calls for. Yeah, definitely.
It's a hybrid approach. I think the argument David was maybe getting at is that, okay, something with fencing might be a bit overkill because it's stateless, but there are still benefits to having Pacemaker just from the resource management and service ordering point of view, that kind of thing. So maybe there's value in having it alongside. Sure, it adds complexity, but still.

I totally believe that. One question: since you said you tried it out in, presumably, a lab environment at small scale, what was the end-user, the client, behavior when you had, say, a flapping API server, or you flapped the API server? Obviously a TCP connection is going to get reset when you move over to a different server, right? What was the client behavior in that case when it was talking to, say, the Cinder API? You're trying to do a volume create, and suddenly, in the middle of that API call, the server changes, so the TCP connection gets reset. What happens? What did you notice?

So that is part of the testing we're continuing to develop: trying to figure out what all those behaviors are. I don't necessarily have on the tip of my tongue exactly how to reproduce all those scenarios and see what those are, but I'm hoping that as we go through this process we can publish more documentation and talk about the pragmatic side effects of a design like this. So, I'm sorry,
I just thought of something else. What would happen if one of the API services hung? So the box is still routable, the network layer is all fine, and maybe the API process is even still running, but it's just not responsive.

Well, at that point it depends on how you're managing that particular node and how you're managing the anycast interface. If you're not checking for an unresponsive service, then I imagine that node would hang around and continue to do damage in your infrastructure. However, if you're building your checks so that you're detecting that unresponsiveness, then you can manage that interface in a way that will remove it from the network. Again, it's all about what checks you've implemented and what you're looking for. There's nothing inherent that says that if the API service goes down, the anycast address, or the route advertisement, will be withdrawn. It's all about what you implement in terms of your monitoring.

So what checks have you tested?

We've started with the super simple things, like "if the service is not running, bring down the interface," that type of stuff. We're working on figuring out whether we want to get a little bit more complex and implement not just that light layer of sanity checking but also deeper intelligence: what do I expect to get back? Not just "is the service up and running," because that actually doesn't mean a whole lot in terms of the actual functionality.

I just wanted to say something about large scale. We run some OpenStack services with anycast across Europe and North America.

I'm sorry, what company are you with? I can't see down there.
I What I work for OVH French company Something you have to keep in mind is that internet router don't do flow piding and Routing table on internet change a lot So you don't want to have a long connection Manage on an anycast IP So anycast works perfectly for short connection like keystone or horizon You don't want to put an anycast IP on Swift for example Sure, that's I appreciate you bringing up that point and that's one of the specific reasons I highlighted like OSPF over BGP because it's more commonly used like within a geographic site Which is kind of the scenario where we're targeting not across, you know those distances. So it's definitely a good point Thank you. Thank you everyone