OK, so, thank you for coming. Today I'm going to introduce a bit of HAProxy. HAProxy is very famous; it's one of the most used open source load balancers. Everybody uses it for load balancing, more or less, and I wanted to show you that it can do much more than load balancing. That's the purpose of today's session. Since I trust HAProxy and my computer, I'm going to do live demos. Hopefully there won't be any problem.

So, who are we? First, I am Baptiste Assmann. I'm one of the maintainers of HAProxy. I work for a company whose name is HAProxy Technologies, a company which develops and maintains HAProxy community and makes business with HAProxy, more or less. We have two main products: an application delivery controller, which is an appliance (you may know the famous F5 or NetScaler, so we compete with them on this one), and an enterprise product, which is software you can use to load balance any kind of application very easily. If you want to get connected with us, we have a website, Twitter, a lot of stuff. If you want to take some pictures, don't forget to tag me or to tag our Twitter account. That way I can show them to my children later. Then we'll be happy.

One point about open source. Usually, people consider that contributing to open source means contributing code. And since the code is in C, and furthermore HAProxy is purely event-driven, it's quite complicated to produce code for it. But I just want to remind you that contributing to open source is not only contributing code. Contributing to open source is a lot of things: you can participate in the mailing list, you can write blog posts about whatever open source software you love, you can help people, you can report bugs. Anything you can do that helps, and I'm not speaking about HAProxy only: whatever open source software you're using, anything you can do is very important. As soon as you do something, you are a member of the community, and this is how the product you love is going to get better and better. That's how open source works. So bear that in mind: not only code, whatever you can provide is very important.

Something important as well: I'm going to introduce a few features of HAProxy, so don't blindly apply what I show here to your architecture. It's very complicated; each application, each network stack, each operating system you are using is different. So the settings I'm going to propose, the values I'm going to use, are maybe not the ones you need in your environment. That said, if you need help, don't hesitate to contact the mailing list, or if you are a happy customer of ours, don't hesitate to contact the support. I hope you have your 3D glasses, because you will see that the pictures are amazing. If you don't have the 3D glasses, you will see something else which is amazing as well. And remember that this is a non-exhaustive presentation: HAProxy has a very wide feature set that can be used in a very wide range of use cases, so there is a lot of stuff I'm not going to mention that you could do with HAProxy.

About the lab: I'm using Docker on my computer. I had some problems with Docker. Each time you upgrade Docker, it changes a lot of stuff and sometimes it breaks everything, so that's a pain. When I compare that to HAProxy: if you take a configuration file from HAProxy 1.0, which is 14 years old now, it still works with HAProxy in 2016.
We guarantee backward compatibility. Docker is very new software, that's why it's evolving a lot; I think they are still learning, still on the beginner side. So, I am using Docker: I have HAProxy, I will have a few web servers, and I have a Docker container that will be my syslog server, so I can send all my logs there and do some stuff with them, and I'm going to demonstrate all of this.

Quickly, how does HAProxy work? In HAProxy we have "HA" and we have "proxy". Proxy means that I have one connection with the client and another connection with the server, and HAProxy stands between both of them. These are two separate TCP connections, so you can do whatever you want on the client side and whatever you want on the server side. Basically, you can speak IPv4 or IPv6 on the client side and IPv4 or IPv6 on the server side, totally decorrelated. We use it a lot for TLS as well. Imagine you have a weak SSLv3 server that you want to put on the internet: put HAProxy in front of it and configure HAProxy to do TLS 1.2. HAProxy will do TLS 1.2 with the client and SSLv3 with your server. And you can apply this to any type of protocol that can be delivered through HAProxy.

So HAProxy is split in two parts: the frontend, which is the part that talks to the client, and the backend, which is the part that talks to the server. Between the two, HAProxy forwards the hot potato. I call it the hot potato because usually people think that HAProxy sees packets. HAProxy does not see any packets; HAProxy is a proxy. The kernel aggregates the packets into a buffer for HAProxy, and HAProxy sees this as a session. That is what I call the hot potato. Basically, the frontend is where you are going to specify the type of logs, the statistics and so on, and the backend is where you are going to configure the health checking, the load-balancing algorithm, the queuing mechanism, all that stuff and much, much more. So bear this diagram in mind; it will be very important for what comes next.

We saw that HAProxy stands between the client and the server. This is a very important point: it can report a lot of information about the client side and the server side. First, it reports internal state: for example, what is the state of a server, how many connections you have on the server, how many connections you have on the backend, whether you are queuing, and so on; all that stuff can be reported and used within HAProxy. At the connection layer, we can report information such as the number of failed connection attempts, and at the session layer, which is more or less HTTP, we can report the number of requests, the number of bytes, the number of everything that has passed through HAProxy.
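To make this frontend/backend split concrete, here is a minimal sketch of a configuration. All names, addresses and certificate paths are illustrative assumptions of mine, not taken from the demo, and force-sslv3 is the 1.5/1.6-era server keyword for speaking SSLv3 to a legacy server:

    # Frontend: talks to the client (bind, TLS, logging).
    frontend fe_www
        bind :80
        bind :443 ssl crt /etc/haproxy/site.pem no-sslv3   # modern TLS with the client
        option httplog
        default_backend be_app

    # Backend: talks to the servers (checks, algorithm, queuing).
    backend be_app
        balance roundrobin
        option httpchk GET /health
        server app1 10.0.0.11:80 check
        server app2 10.0.0.12:80 check

    # The weak SSLv3 server scenario: TLS 1.2 in front, SSLv3 behind.
    backend be_legacy
        server old1 10.0.0.13:443 ssl verify none force-sslv3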
There is one notion which is quite interesting as well: the time since the last session. That's important when you want to put a server in maintenance. If you know that a session stays alive in your application server for 10 minutes, and HAProxy tells you that no session has been forwarded to the server for the last 11 minutes, you know you can shut it down, because basically you won't kill any user session. We also have information about the queuing, the load-balancing algorithm, well, a lot of stuff. This will be on SlideShare, so you will be able to look at all of it later.

We can read the statistics using two different methods. One is a UNIX socket; we call it a UNIX socket, but we can make it listen on TCP as well, and even over SSL, so if you want to use telnet or socat or whatever against HAProxy, you can. And we also export statistics through a stats page, a web page more or less. I'm going to configure HAProxy to show you exactly how it works. For our HAPEE and ALOHA customers, SNMP is enabled as well. Quickly, how we configure the UNIX socket: "stats socket", the path to the socket, and that's all; it's very simple. You can do the same over a TLS connection, so if you want to encrypt the traffic to your stats socket, you can. And we can also have a stats page; that's the last code snippet I have here. Same idea: on the bind line you say "I want to encrypt the traffic, I want to listen on port 443, and use that certificate for this purpose." Of course HAProxy supports client-side certificates, so if you want mutual authentication between HAProxy and the client, to make sure nobody else is collecting your statistics, you can configure HAProxy to ask for a client certificate and install a client certificate in your own client.

So, time for the first demo. I'm keeping it simple: since I don't use OpenStack on my computer (OpenStack is a bit heavy for my computer), I am an old-style DevOps, let's say the DevOps of the previous decade, and we used makefiles a lot at that time, so everything is automated within my makefile. Basically I run the first demo exercise, and it starts up and kills the existing containers. So I should now have the stats page somewhere, which is here. We don't see any traffic yet, but we can see that we have some backends configured, some frontends configured, and we can see that we have one application server enabled. It is green, which means the server has successfully passed its health check.

I have a couple of commands. Basically, when we benchmark applications for customers, we run two types of benchmarks. One is: I hammer the website with as many queries per second as I can; usually that works. The other test, which is interesting, is a spike, a sudden spike. So with the inject tool I'm going to send as many queries per second as I can, and with the httperf tool I'm going to open 1,000 connections, or 500 in this case, a big bunch of 500 connections at the same time, send one query on each of the connections, and see what happens on the server. So, my inject is running. Inject is a home-made tool; it's available on the haproxy.org website. The most important column is the green one: the number of queries per second on my application. If I refresh my stats page, we can see that some traffic was passing through the website and everything seems to work smoothly.
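Here is a rough sketch of what that stats setup could look like. The paths, ports and certificate names are assumptions of mine, not the demo's actual files:

    global
        # UNIX socket, usable with socat or telnet-style tools
        stats socket /var/run/haproxy.sock mode 600 level admin
        # the same interface can also be exposed over TCP
        stats socket ipv4@127.0.0.1:9999 level operator

    # The stats web page, over TLS, with mutual authentication
    listen stats
        bind :8443 ssl crt /etc/haproxy/stats.pem ca-file /etc/haproxy/ca.pem verify required
        stats enable
        stats uri /
        stats refresh 10s

You would then point socat at /var/run/haproxy.sock and issue runtime commands such as "show stat".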
So let's try the next test. I'm running httperf. Here, httperf is going to open 500 connections and send only one query per connection, and you can see in red that some of the queries finished with a 500 error. So your application, which seems to run properly at a few hundred connections per second, fails when a huge spike hits the application server. We'll see how we can fix that. Something important: on the main stats page we don't see the errors. If you move the mouse around the stats page, wherever a number is underlined you get more statistics, more and more and more, and we can see that the 254 errors we had from httperf are here in the stats page. Oops. So the conclusion of this slide is: the devil is hidden behind the underlined numbers. (That slide was there in case my demo did not work, so never mind.)

Now I'm going to talk about logging. We saw that HAProxy is able to report a lot of statistics at runtime; HAProxy is also able to report a lot of information in the logs. The logs of HAProxy are very verbose; remember, it stands between your client and your server. We have some customers who disable them, and you can do that; logging takes around 10% of CPU. But sometimes, when we want to troubleshoot, we really need the logs, and one of the first replies you will get on the mailing list is "give us your logs". You report a bug or a weird behavior, and we are going to tell you: please, we need the logs. So enable them.

The logs are configured in two places. In the global section of HAProxy, you configure where your log servers are located: IP address and UDP port for now; TCP will come later. And in the defaults and backend sections, you configure what type of information to log. Usually everybody uses the default HTTP log format, but you can also define your own format using the directive whose name is "log-format"; makes sense. When you define your own log format, try to stay close to the HTTP log format, because, as we will see later, we have some tools that can report a lot of information about your live traffic from it.

So, quickly: in the global section we have a log server; in the defaults section we have "option httplog". The advantage of putting it in the defaults section is that you don't have to repeat it everywhere: as soon as it's in a defaults section, it applies to all the frontends and backends up to the next defaults section, and you can have as many defaults sections as you want. Just put your logging option there. Usually we also put the timeouts there; that way we are sure everything will work smoothly.

We can also split the logs, and this is usually what we do: traffic logs on one side, event logs on the other. What we mean by traffic is any traffic passing through HAProxy; events are things like a health check failing. Why? Because we have customers with hundreds of thousands of requests per second, and if you have one server failing, finding that event is like looking for a single piece of information in a multi-gigabyte file. So it's definitely better to put those lines in a dedicated file.
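A minimal sketch of that basic logging setup, with an illustrative syslog address and timeout values of my own choosing:

    global
        log 127.0.0.1:514 local0      # syslog server: IP address, UDP port, facility

    defaults
        log global                    # inherit the log servers from the global section
        mode http
        option httplog                # the standard verbose HTTP log format
        timeout connect 5s
        timeout client 30s
        timeout server 30s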
When we split the logs like this, you configure your syslog server to write local0 to one file and local1 to another. We can also decide to log only errors; there is "option dontlog-normal" for that. We can also decide not to log, you know, the new browser feature, pre-connect, which opens a connection to your application server without sending anything, so you have resources used for nothing. HAProxy reports a 408 for those, and we had users complaining on the mailing list that their boss complains because they have a lot of errors in the statistics which are not errors, because it's only a connection which has been established without any traffic. So there is "option http-ignore-probes" that you can enable, and these errors will disappear from your graphs, and your boss will be happy because you won't have any errors in your application anymore.

You also have an ACL-based way to decide what you want to log. I put a simple example: I set the log level of each request to "silent" (sorry, I'm French) unless the end of the URL is .php, for example. That allows you to log only your dynamic traffic, and you can apply this to any content you can extract from the HTTP traffic.

What are we going to find in the log line? The main information: the path inside HAProxy (the path meaning which frontend, which backend, which server in the backend), client information, the number of bytes read, stuff like this. The most important part is the termination status. The termination status lets you know whether a problem occurred or not, and if a problem occurred: was it on the client side or the server side, was it because of a timeout, was it a network connection issue, was it during the header phase or the data phase, blah blah blah; a lot of information. That's why, when you have a weird behavior, the first thing we ask is: send us a log line. From the log line we can tell you exactly what happened.

Using the log-format directive we can also log any kind of information from the SSL/TLS traffic, and we often use it to run audits on your TLS. You know that IPv4 exhaustion is coming, or has already occurred, so now many people want to put multiple TLS certificates on the same IP address. You can only do that if all the clients of these certificates can send the SNI extension. HAProxy can be used to run this type of audit: you ask HAProxy to log the SNI and the User-Agent, and you will know exactly which user agents send the SNI and which don't, and whether you can safely mix multiple certificates on the same IP or not.

So we are going to make HAProxy log a bit, inject some traffic, and see what happens. The purpose is just to show you a log line. Hop, demo 2. OK, I'm going to generate a bit of traffic. My syslog server is here; this part is not interesting for now, it's a test from this morning. I have a GET, I can see a few requests flowing through the server, and this is my HAProxy log line. It's very important. Most of you don't understand anything in it, I guess, but my brain, I would say, is now formatted to this log line, and if there is an error in it I can spot it and tell you exactly what happened.
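A sketch of those splitting, filtering and audit tricks. Addresses and names are my assumptions; the local0/local1 split relies on the optional syslog level filter, and the last frontend shows a custom log-format for the SNI audit:

    global
        log 127.0.0.1:514 local0           # traffic logs (info level and up)
        log 127.0.0.1:514 local1 notice    # events only (server up/down and more severe)

    frontend fe_www
        bind :80
        option http-ignore-probes          # don't log 408s caused by browser pre-connects (1.6+)
        # option dontlog-normal            # alternative: log only anomalous requests
        http-request set-log-level silent unless { path_end .php }   # keep only dynamic traffic
        default_backend be_app             # backend definition omitted in this sketch

    frontend fe_tls
        bind :443 ssl crt /etc/haproxy/site.pem
        capture request header User-Agent len 128
        log-format "%ci:%cp [%t] sni=%[ssl_fc_sni] hdrs=%hr"   # who sends SNI, with which user agent
        default_backend be_app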
So basically, the termination status I was mentioning is here. You can log any kind of headers; everything gets integrated here. This log line is very important. One thing I did not mention earlier: within this log line you have the TCP connection time between HAProxy and the server, you know how long it took for the client to send the HTTP request, how long it took for the server to process the request and start sending the response; you know all of this. We also know the number of connections on the server at that time, all that stuff. This matters because when you contact us saying "oh, I have a problem, my application is slow", well, we are going to tell you exactly where, when and why.

OK, so let's run some queries again, the same type of traffic as earlier. Let's have a look. We can see a lot of queries coming. My inject is using random URLs, and my PHP server is configured to answer positively to random URLs. We can see here the TCP connection time; my network seems to work properly, on my loopback, fortunately. And here, how long it took for the server to process the response. I'm cheating a bit: this is not a real application, it's a small PHP script I have written myself, and it computes square roots. Why square roots? Because one square root is very easy to compute, but when you have thousands to compute in parallel, it becomes very hard for my CPU, and that's why the second test is very interesting.

The second test is when I open my 500 connections and send one query over each connection. We see that we still have the errors, OK, but now we have HAProxy logging, and we can see the time it took HAProxy to establish the TCP connection to the server: basically, the server was saturated, more or less. Here we can see how long it took the server to process the response: 863 ms. What happens is that my application performs well when I test in queries per second, which is fine, but it does not scale at all when I send that amount of queries in parallel. So when you test your application, queries per second is interesting, but this other test will kill your application if you don't run it; and if I were an attacker, this is the type of attack I would run. I'm not.

We can also have a look at the stats page. The stats page says we have some traffic, and we still have some errors, same as before. Now, let's assume it's a capacity problem. Since it's a capacity problem and we are on OpenStack, we just ask for more servers to join the game. I have another demo where I have four servers, and I am going to run exactly the same test, so that I'm not cheating. My script has started four servers; if I go back, I now have four servers in my farm, and I am going to run my httperf test. Let's see what happens. Oh, the errors have disappeared, wonderful! So it means it was a capacity problem. Well, that's what you think. Now, I am a fellow HAProxy user, I read the mailing list, I read the blog and whatever, and I have heard one day that HAProxy has a magic feature whose name is maxconn. So I have prepared another lab (I'll explain the maxconn parameter in a later slide, I don't want to reveal any secret yet). I create a new lab where I have only one server, and I am going to allow HAProxy to start queuing requests at its own layer.
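A minimal sketch of what that lab's backend could look like. The names and address are illustrative, and the per-server limit of 8 matches the number that shows up on the stats page in a moment:

    backend be_app
        balance roundrobin
        timeout queue 10s                          # how long a request may wait in HAProxy's queue
        server app1 10.0.0.11:80 check maxconn 8   # at most 8 concurrent requests on the server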
The purpose of queuing at HAProxy's layer is to avoid saturating the server with too high a number of connections, because what we just saw is that our application does not scale in terms of number of parallel connections. So I resend exactly the same traffic, my 500 connections. Now I have only one server, I have queuing configured at the HAProxy layer, and my server does not report any errors anymore. So basically it was not a scalability issue; it was more about protecting or limiting the resources used on the server.

Now, this is not magic. We don't mind that it works, but it's not magic, sorry. Let me explain again. Remember, HAProxy has one connection with the client and one connection with the server, and its purpose is to be your public-facing network stack. First, we have to configure the kernel's network stack properly (we can propose some sysctls and so on), but HAProxy itself is purposely written to accept and handle a very high number of connections per second. This is what we do when we open the 500 connections in an instant and send one query on each of them. Your application server is not made for this; whatever application you are using, it's not made for this. HAProxy will accept all the connections, and since we told it to queue the requests, it opens only a small number of connections to your server. What happens on your server? It no longer processes a lot of queries in parallel, so it processes each query faster, and HAProxy can keep pushing queries through, many in total but only a few at a time.

If I go back here: you can see a limit at 8 connections, so HAProxy has prevented the server from processing more than 8 requests in parallel. And we can see here that we had up to 492 queries in the queue. The fact that we queued the traffic allowed us to protect the server and let it answer all the queries we sent. That's very important: no errors anymore.

Another point which is very important: here we are on a LAN, so the client is very fast, but HAProxy is also able to do what we call TCP and HTTP buffering. What is it? HAProxy reads at the speed of the server and delivers at the speed of the client. More and more of your traffic is mobile now, so the latency on the client side is from 30 ms to 200 ms, maybe more, while the latency on the server side, once the server has answered, is a few ms. You don't want to block a server connection for 300 ms on the client side when the server has already processed the response. So HAProxy will get the response from the server, store it in a buffer, recycle the server connection for another client, and during that time push the response to the first client. So HAProxy is, I don't know, like Squid but more modern and performing very well: its job is to manage the TCP connections and the buffers between all the connections, and it does it very, very well.

So we've seen that HAProxy is able to improve things and to protect our application server. We're also going to see how we can monitor the response time and find out where exactly the problem is in our application. Let's go with demo number 3. Oops, that's not this one. I still have a few servers, just to play with. No, that was not this test. So I'm just injecting some traffic like before; performance is a bit slower.
And with this tool I can now sort my servers by response time. We can see that they have more or less the same response time; some of them are a bit slower, but that's fine, it's not that big. And with this tool I can sort the URLs by response time. We can see that some of them are quite slow, and some, the static ones, are very fast, for example. So we know it's not a problem on the server itself, because the static content is delivered very quickly; it's more or less a problem within the application. Yep, next one; we don't need that. I'm keeping the next demo just to split the traffic into static and dynamic and show the difference. The idea is that when we split the traffic between dynamic and static, we can see that the response time of the application is, for example, much slower on application server 9. So when some clients complain about small latencies, this is where we should look. For those who use Elasticsearch and Kibana, the ELK stack, you can even graph URL response time per server, per backend, per whatever; you collect the statistics and everything. We have some videos on YouTube about that.

Now, since the topic is monitoring performance and improving performance, I'm going to show how I can improve performance. I now have my application server running in a KVM VM. Why a KVM VM? Because that way I have some real networking between the two, and in the VM I have purposely made a weak network stack, and we are going to see what happens. So here I'm starting the VM, and I'm starting HAProxy as well. If I go back, if I refresh: my VM is down. That's the demo effect. Let's do that again and restart. OK, the VM is starting normally, coming up; we have beaten the demo effect. So I now have my server, and what I'm going to do is send my 1,000 connections at the server directly, and then send the same traffic through HAProxy, and we'll see what happens. The name of the server is server1. (It was not supposed to do that.) OK, that's it: I opened 1,000 connections to my server, and it took the server 3.2 seconds to answer all 1,000 queries. Now I do the same through HAProxy: for the exact same amount of traffic, thanks to the queuing mechanism, the response time is faster, 1 second and something.

The purpose of this small application is, as I said, just to saturate a server. In a real case it could be more, it could be less, it could be another bottleneck. I don't have any database behind my PHP script, but there is always a bottleneck somewhere in your application: it could be your PHP code, it could be your database server, it could be the network connection between HAProxy and PHP or between PHP and your database, it could be the file system behind the database, it could be whatever. HAProxy will be able to protect your application, make the response time faster, and protect your weak spots. I mean, as an attacker it's easy to browse your website and say "oh, I see that /search takes 10 times longer to respond than the other URLs"; so if I want to attack your website, I'm going to use this URL, send 1,000 queries, and kill it. So at customers we sometimes set up a dedicated backend to protect those weak URLs: you route the weak URLs to it and apply a queuing mechanism there to protect your application. So yeah, some quick conclusions: HAProxy can protect your application, it can improve the response time, and yes, there is some magic in queuing. (That slide was there in case my demo did not work, I think.)
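As a sketch of that dedicated protection backend for weak URLs; the path, names and limits are assumptions of mine:

    frontend fe_www
        bind :80
        acl weak_url path_beg /search        # the expensive, attackable URL
        use_backend be_slow if weak_url
        default_backend be_app

    backend be_slow
        timeout queue 15s
        server app1 10.0.0.11:80 maxconn 4   # very low concurrency on the weak URL

    backend be_app
        server app1 10.0.0.11:80 maxconn 50  # the same server, with normal limits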
There is also something important I did not mention: the value of the logs. What we usually explain to customers is that we won't fix the problem in your application. That said, when you have a problem somewhere, the longest part is not fixing the problem, it's troubleshooting it: you take one hour to troubleshoot and five minutes to fix. HAProxy will reduce this troubleshooting part to a few servers, a few URLs, a few whatever, and once you have narrowed it down, you can ask the network engineer, the system engineer, the developer, the DBA, whoever, to look at the pattern that has been reported and fix it.

All these magic features are available in the community version. As a company, even if we do business with HAProxy open source, we have a commitment to the community that all the basic features related to load balancing, high availability and HTTP processing will always be in the community version. Another point: HAProxy doesn't lie. We usually use it at customers to settle arguments, because sometimes there is a fight between the developers and the network engineering team, everybody saying the problem comes from the other side; that's quite common. So we usually say: don't worry, don't fight, let's have a beer, let's have a look at the logs, and HAProxy will tell you what happened. It doesn't care whether it's the network or the application; it will tell you exactly what happened. And it's free and open source. And once again, this is a non-exhaustive presentation: we can do much more than that.

So, is there any question? Nope; so it was exhaustive! Yes? So the question is, more or less, what type of ciphers we recommend for SSL performance. Normally your CPU, and whether it's a VM or not does not matter, now has the AES-NI instructions, and your hypervisor is supposed to expose them in your VM. So you can run HAProxy in a VM, which is good, because thanks to that you can do hardware acceleration; marketing-wise, that's excellent. When you have the AES-NI instructions, the best is to use the AES-GCM ciphers, because the GCM ciphers tell the OpenSSL library to use the AES-NI instructions. In our lab we can encrypt around 5 Gbps of traffic using AES-256-GCM with one core of HAProxy.

There is also a new development, a contribution from Cisco. You know that RSA keys are very expensive to compute on the server side, and now there is ECDSA, which is getting quite popular. The advantage of ECDSA is that the key is computed half by the client and half by the server, and at the HAProxy layer it means you can compute about 15 times more keys than with RSA. Not all clients are compatible, but the advantage of HAProxy is that you can install your certificate and key in RSA and your certificate and key in ECDSA, and HAProxy will automatically detect whether the client can only speak RSA, in which case it delivers the RSA certificate, or can speak ECDSA, in which case it delivers the ECDSA certificate. (Oops, that's 40 minutes.) So you fall back to the RSA certificate only when needed and get many more key computations per second.

One last question then, because it's been 40 minutes. It's a small one: by default HAProxy logs everything in one log file for all the frontends, right? Can we set it up to log to a different file for each frontend? So, we don't log to a file, we log to a syslog server. That said, you can add the log directive with another IP address, and you can force one frontend to go to one particular log server if you want. You can do that per frontend.
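A sketch pulling together the TLS points from this Q&A. The cipher list and file names are my assumptions, and the RSA plus ECDSA pair relies on the 1.7-era certificate bundle mechanism (site.pem.rsa and site.pem.ecdsa side by side on disk):

    global
        # prefer AES-GCM ciphers so OpenSSL uses the CPU's AES-NI instructions
        ssl-default-bind-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256

    frontend fe_tls
        # with site.pem.rsa and site.pem.ecdsa on disk, HAProxy serves the
        # certificate matching what the client advertises
        bind :443 ssl crt /etc/haproxy/site.pem
        log 127.0.0.1:514 local2             # per-frontend log target, as in the last question
        default_backend be_app               # backend definition omitted in this sketch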
Yes? Any other question? OK, otherwise I have to go, sorry, but if you have any questions we can discuss now.