Hi everyone, thank you for coming to FOSDEM and thank you for attending our presentation. I'm Răzvan Crainea from the OpenSIPS project, and along with my colleague Liviu I will be showing you how to build a multi-node platform using OpenSIPS.

So first of all, why use a multi-node setup? There are a couple of reasons. One is to scale: you have some resources, then more customers become interested in your services and you need to add extra capacity. You can add it vertically or horizontally, preferably horizontally because it's cheaper, right? You want to be able to use commodity hardware. Another reason is geographic distribution. Imagine that you have customers in one part of Europe and customers in another part. You want to have physical nodes as close as possible to those clients in order to offer a better experience for the services you sell. You also want to do some sort of load balancing. For example, you don't want to power up a lot of resources to serve just a small number of clients; you want to size things dynamically, depending on your platform or your business size. And probably all of you are interested in high availability, in order to be able to fail over in case a node crashes or something else happens.

So imagine that you have this kind of setup, with some nodes in a data center in London and others in Amsterdam — your nodes spread all over Europe for geographic distribution. You want your customers to have a unified experience: all customers should be able to see each other no matter where they are located. You want customers calling from London to be able to reach the ones in the Netherlands, right? You don't really want them to be aware of your topology, and you never want to, for example, assign entry-point IPs to your customers just to do some sort of manual load distribution.
You want to allow them to use any entry point in your network that is available. And in order for them to have a unified experience, all of them need to share the same profiles: wherever they are, they need to see the same thing, and they also need to use the same resources.

So here are a few use cases you run into with a unified platform. First of all, user location. Imagine that you have different clients registered, some in London, some in Germany, some in Paris, so each of them uses a different entry point into the platform. Now what happens if the guy from London calls the guy from Paris? He doesn't really know where the green guy is; he doesn't know whether he's in Paris or somewhere else. To handle that, we need to distribute the location data among all the POPs, so that when the call gets to node 1, node 1 knows that the endpoint is registered on node 2, and is able to see the customers on node 2. There are a couple of models we use to do this. One is full mirroring: all the nodes share the entire location data. In practice this is not very useful on its own, because these clients are usually behind NAT, so even if node 1 has the entire user data, it will not be able to reach the guy in France directly; it will have to route the call through node 2, which holds the NAT binding, and use that node's information. The other model is federated, which basically means that each node has its own set of users, and all a node must know is which node a user is registered on, and then let that node handle it.

Another use case is profile sharing. For example, you might have limits on the concurrent calls a customer is allowed to make. Let's say a customer is paying for a limit of 100 concurrent calls, and he starts by sending 80 calls to node 4. What happens if he then somehow decides to send 60 more calls to a different node?
In this case, he's using 140 channels of your platform, although he's only allowed to use 100, because that's what he paid for. So again, you have to replicate the number of concurrent calls happening on each node, so that when node 5 — or whoever — gets the extra calls, it can reject them and enforce the limit of 100.

Similarly, if you want to use Anycast to do load balancing, you have a different problem. Let's say that you want to reach the guy registered on node 2, and you use node 2 to contact him. You send the INVITE and then you get a reply. But the reply, according to the Anycast rules — which are something completely different, not SIP related but basically layer 3 routing — may end up at a different node. What happens in this case? Well, the call will complete correctly through node 5. However, node 2 will never see a reply, so it will start doing retransmissions, and since all the replies go to node 5, it will eventually time out and tear down the call. You really don't want that. To avoid this, in OpenSIPS we developed a mechanism that informs node 2 about transaction changes. What's interesting about this is that node 2 remains the only node holding the real transaction state, so node 5 does not construct a full transaction of its own, which — as you may or may not know — is very heavy in terms of resources.

One more use case is high availability, ensured using hot backups. Imagine that you start a call through node 2, but that node crashes. What happens with the call? What happens with the CDRs? Well, in this case you will have to reach another node, using Anycast or DNS or whatever failover mechanism you are using.
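The shared concurrent-call limit described above can be sketched with the dialog module's profiles. This is only a sketch under assumptions: the parameter names (`profile_replication_cluster`, `profiles_with_value`) are taken from my reading of the OpenSIPS 2.4-era dialog docs and the profile name `caller_calls` and the limit of 100 are made up for illustration — verify against the documentation for your version.

```
loadmodule "dialog.so"
loadmodule "clusterer.so"

# a profile keyed by a value (here: the caller), replicated in cluster 1
# (assumed parameter names -- check your version's dialog module docs)
modparam("dialog", "profiles_with_value", "caller_calls")
modparam("dialog", "profile_replication_cluster", 1)

route {
    if (is_method("INVITE") && !has_totag()) {
        # platform-wide count of this caller's ongoing calls
        get_profile_size("caller_calls", "$fU", "$var(n)");
        if ($var(n) >= 100) {
            send_reply("503", "Too Many Calls");
            exit;
        }
        create_dialog();
        # count this new call against the caller's profile
        set_dlg_profile("caller_calls", "$fU");
    }
    # ... normal routing continues here ...
}
```

Because the profile counters are replicated cluster-wide, node 5 sees the 80 calls already running on node 4 and rejects call number 101.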
But for this node to be able to generate proper CDRs, it needs to know when the dialog — the call — started, right, plus all sorts of other information about the call setup. So again, you have to replicate information about calls, the start time and so on, in order to be able to generate proper CDRs. You also have to make sure that only one node does the dialog-related jobs, for example timeouts. Imagine that all the nodes in our architecture have information about a dialog that never receives a BYE. They will all generate a timeout, they will all try to close the same call, and they will all generate a CDR — say, five CDRs for a call that only happened once. You really don't want that, so you have to do some sort of delegation of each dialog to one node. But that's what my colleague will cover in the following minutes. In this presentation we only talk about how to organize these nodes into clusters, how to use these clusters and assign them to different modules depending on their purposes; all you have to do is let the modules do their jobs. OK, I will hand the presentation over to my colleague.

Thank you, Răzvan. Speaking of clusters, how many people here have played with OpenSIPS? Quite a few, OK. And how many of you have played with the OpenSIPS clustering features? OK, that's what I was thinking. I was expecting this, actually, because I saw a lot of new faces, and I'm going to explain the new clustering features using some practical examples.

Let's start with a basic RFC 3261 implementation of SIP. You can achieve this with OpenSIPS by just loading the user location module and using the default routing config. And let's apply it in real life: what is an obvious problem with having this type of SIP registrar, which users can sign in to and make their calls through, either to each other or to the PSTN? What is an obvious problem? Yeah, OK, I guess. What if it crashes, right? We're going to have downtime.
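That basic registrar boils down to only a few lines of routing logic. A minimal sketch (module list trimmed to the essentials, classic quoted-parameter script style; adapt to your OpenSIPS version):

```
loadmodule "signaling.so"
loadmodule "tm.so"
loadmodule "usrloc.so"
loadmodule "registrar.so"

route {
    if (is_method("REGISTER")) {
        save("location");          # store the contact binding in memory
        exit;
    }

    if (!lookup("location")) {     # find where the callee registered from
        send_reply("404", "Not Found");
        exit;
    }

    t_relay();                     # forward the request statefully
}
```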
Not only that, but we cannot even handle a simple restart. If you restart the daemon, you lose all the registrations, and the phones may take another hour, maybe four hours, to re-register. So we can add another layer on top of it: database persistence. All good and done — that's just a couple of lines of config. Change the user location DB mode, hook in a MySQL URL, and we are good to go.

What's still wrong with this? It can still catch fire. If that server goes down, you run into more downtime, you have to answer to your boss or, maybe even worse, to your clients, and start handing out emails, answering calls, tickets. So you can take this one step further: we can add high availability on top of all of this. We throw in a backup node, we use a shared database, we put a virtual IP on top of this and instruct all the endpoints to register to the VIP, using some sort of VRRP implementation — keepalived — that handles the moving of the IP and all that for us. The plan is that when the active node fails, the backup runs this little failover procedure that says: let's reload all the dialogs from the shared database, let's re-cache all the registrations, and maybe afterwards the service will survive. That's basically what we had up until OpenSIPS 1.11, when we started moving into the clustering area.

But what's wrong with this? This is still not enough. What's the downside? It's getting more subtle now. One problem is that this re-caching procedure may take time. It's all good if we're talking about a thousand phones or a thousand dialogs, but what if you have a million dialogs on one instance? How long does it take to query all that and re-cache it into memory? That can take as much as one minute — so although you're saying "I have high availability", you have high availability after one minute.
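The "couple of lines of config" for persistence are roughly the following. The database URL and credentials are made-up examples; the `db_mode` values are from the usrloc module docs (0 = memory only, 1 = write-through, 2 = write-back, 3 = DB only):

```
loadmodule "db_mysql.so"

# write-back: bindings live in memory, flushed to MySQL periodically
# and reloaded on restart, so a daemon restart no longer loses them
modparam("usrloc", "db_mode", 2)
modparam("usrloc", "db_url",
         "mysql://opensips:opensipsrw@db-host/opensips")
```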
Another problem: what if the master simply reboots? Once it boots up, it re-caches all those dialogs into memory. The VIP correctly moves over to the backup instance, but now you've got a setup where both nodes hold duplicates of the data. OK, you're saying, that might not be that bad, but how about this: they will both start writing CDRs now. So you start getting duplicates there, and not only that, but only half of the CDRs will be correct — the ones on the backup, because those calls properly close thanks to the SIP signaling when you hang up. On the master, though, the dialogs will just hang there, and all of them will time out at whatever you set the maximum call duration to, typically something like two hours, and you will get duplicated CDRs with maximum duration. Try explaining that to the client: "OK, you have this call, it costs 200 pounds", or something like that.

Another thing about this data duplication is that oftentimes you need to continuously ping your clients in order to keep the NAT binding alive, and the master node will also attempt to generate those pings. It will not be able to send them, because the operating system doesn't let it — it does not own the virtual IP address — and it will flood the logs with "Operation not permitted" errors, blah, blah, blah, and you want to prevent that.

So now, enter the OpenSIPS clustering. What is the idea? The idea is that we need to give the setup more state — make the nodes more stateful with respect to each other — and replicate the data live to the backup registrar, such that when the master catches fire we are able to instantly fail over to the backup, as fast as the virtual IP solution can do it; keepalived typically fails over within one second. So this solves the failover problem. The other half is the data ownership, and solving the dreaded duplicated CDRs.
So what we came up with — and you will start seeing it in the documentation from OpenSIPS 2.4 onwards — is the sharing tag concept. Although both instances hold the same dialogs, duplicated, they use this sharing tag, let's call it "vip", that basically says the following: if I own the VIP, I also own the data; if I don't own that tag, I pretty much have nothing to do with that dialog. In 2.4 you can change these tags with the dlg_set_sharing_tag MI command, and in 3.0 we moved that logic over to the clustering module — I guess that's just a detail. You can hook these commands into the keepalived switchover procedure and automate things, such that when the switchover happens, it also switches the tags — it moves the ownership of the data to the proper node.

Enabling this — I mean, I talked a lot about it, but enabling this is done in something like three lines. All you have to do is switch the user location persistency mode to full sharing, give it a cluster, do the same for the dialog module, and after you create a dialog, just tag it with the "vip" tag. That's all you have to do.

OK, so is this enough now? Well, it's actually pretty good now, but customers always want more; it never stops. They will say: OK, I want 20 of these. I want 20 of these setups. So how do you handle that? That's where the federated idea came in: as Răzvan was saying, we don't fully replicate all the data to all the nodes. We rather keep it local, and all we share are some location pointers kept in that NoSQL metadata part, so that when a call comes in in London, it looks up the metadata and says: OK, where is my exit node? It's Hong Kong — send it over there, because that node will be able to handle it. A cluster table for these six nodes would look like this.
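Those "three lines" look roughly like this. A sketch under assumptions: the parameter names (`working_mode_preset`, `location_cluster`, `dialog_replication_cluster`, `dlg_sharing_tag`) and the `set_dlg_sharing_tag()` script function are from my reading of the OpenSIPS 2.4 docs, and the clusterer IDs and DB URL are made-up examples — check the module documentation for your version:

```
loadmodule "clusterer.so"
modparam("clusterer", "current_id", 1)       # this node's ID in the cluster
modparam("clusterer", "db_url",
         "mysql://opensips:opensipsrw@db-host/opensips")

# user location: every binding replicated to all nodes of cluster 1
modparam("usrloc", "working_mode_preset", "full-sharing-cluster")
modparam("usrloc", "location_cluster", 1)

# dialogs: replicated in cluster 1; the "vip" tag starts active on this node
modparam("dialog", "dialog_replication_cluster", 1)
modparam("dialog", "dlg_sharing_tag", "vip=active")

route {
    if (is_method("INVITE") && !has_totag()) {
        create_dialog();
        set_dlg_sharing_tag("vip");  # whoever holds "vip" owns timeouts/CDRs
    }
    # ... normal routing continues here ...
}
```

On failover, keepalived's notify script would flip the tag ownership (via MI), so only the node that owns the VIP fires dialog timeouts and writes CDRs.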
Notice that we have a global cluster made up of six nodes, but they are grouped using this sip_addr column. That's the only use case for that column — the federated user location is the only feature that uses it — and you just use it to group the nodes two by two.

As a wrap-up, I went through the examples that handle the reachability problem and the hot-backup problem, and the solution for sharing profiles and counters is pretty much the same thing: you just enable the cluster for that respective module and the module does all the work for you. Similarly for Anycast — there are probably a couple more script functions to call, but it's all well documented. So I guess the takeaway of all this is that we've put a lot of work into the clustering features, both in 2.4 and in 3.0, which is going to happen by May, hopefully. It's well tested, we've deployed it to a lot of our customers and a lot of companies are using it, so we highly encourage you to give it a spin. Also, we have this promo code for the Amsterdam OpenSIPS Summit this year; if you are interested, you are more than welcome over there — you can use it to register and you'll receive a 15% discount. Thank you very much.

Do we have time for more questions? One question? OK. Any questions over there?

Hello. Is the user location synchronization also working when you front it with the mid_registrar?

Yes, the mid_registrar is compatible with the user location cluster and all of that. Any other questions?

Hi, in the past I've had a look at your FAQs, your how-tos for building a cluster including Asterisk as a media server. My question is: every how-to approaches this as building it from scratch, so what about PBX functions that are not purely SIP — voicemail and all of that? If you already have existing customers, is it feasible to move them to this model, or to the model with Asterisk as a media server?
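For reference, a federated clusterer table along those lines might look like the sketch below. The column subset is from the clusterer module's table definition as I recall it, and all node IDs, BIN URLs and SIP addresses are made-up examples — the point is only that pairs of nodes in the same POP share one sip_addr:

```
 cluster_id | node_id | url                | sip_addr
------------+---------+--------------------+-------------------------
          1 |       1 | bin:10.0.1.1:5555  | sip:london.example.com
          1 |       2 | bin:10.0.1.2:5555  | sip:london.example.com
          1 |       3 | bin:10.0.2.1:5555  | sip:berlin.example.com
          1 |       4 | bin:10.0.2.2:5555  | sip:berlin.example.com
          1 |       5 | bin:10.0.3.1:5555  | sip:paris.example.com
          1 |       6 | bin:10.0.3.2:5555  | sip:paris.example.com
```

One global cluster (id 1) carries the replication traffic, while the sip_addr column groups the six nodes two by two into the London, Berlin and Paris POPs for the federated user location lookups.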
Maybe we should have put in a slide towards the beginning: we don't do media at all, we just handle the signaling part. The solution, although it's clustered here over the signaling, would also use Asterisk servers — or FreeSWITCH, whatever — in the back. You would have an instance sitting in front of all these media servers, able to dispatch or balance — let's say partition — the clients across each instance you have; but you might have that in your core network, so you would use the nodes that I presented earlier just as an entry point into the platform.

Thank you for your talk. My question is: when you deploy multiple nodes over a geographic region, and the network capacity is rather limited, and you have stations roaming from one region to another — are you able to handle roaming, and are you able to handle the limited capacity in the network?

Yes. As long as you replicate the information about the users on one node to the others, they will all know the same thing; basically, at every single point, all nodes know the profile and the location of a user, so it's very easy to move a client to another node. In terms of hardware, or what? Is this a question about how much traffic it can handle? I'm sorry, we can continue the discussion afterwards — it's highly scalable, it will handle tons of packets, thousands of CPS. Thank you.