I've been in open source for more than a decade, and in my latest avatar I'm a co-founder of e2v networks, where I run operations. Today I'm going to share a brief overview of HAProxy, the various deployment scenarios people have built with it, some best practices, and what has been scaled with these tools. I'll cover basic deployment, some best practices, and one of the major grouses of today's web deployments: everyone wants a shared-nothing architecture, but unfortunately it's not very easy to achieve. And we'll look at some of the add-ons you can use along with HAProxy.

The line speed HAProxy can give you is awesome. HAProxy was one of the tools developed to solve the C10K problem, ten thousand concurrent connections on one machine. Of course, that's a moot point these days, everyone knows how to do 10K, but seven or eight years back, 10K on a single machine, without resorting to DNS load balancing and such tricks, was considered a big challenge. And HAProxy is probably the only tool I have seen in my experience that doesn't need a restart to fix things. Most tools, your Apache, your MongoDB, have to be restarted at some point to bring them back to a clean state. HAProxy is probably the only tool that doesn't, because it uses so few resources and is extremely fast. The only thing it really benefits from is plenty of CPU cores; other than that it is very light on system requirements, it could probably run even on your mobile devices, and it has hardly any bottlenecks.
If you have a well-tuned TCP stack, that's all it takes, and you can go ahead and scale basically anything. Connection management in HAProxy is very good, in the sense that you can control the rate of ingress and egress on any frontend and backend. One way this helps: you can have a controlled slowdown for a few of your customers rather than a catastrophic outage for the whole site. So 10% of your users might see a sluggish response, but HAProxy will ensure the other 90% get a consistent response, because it can control how much each machine serves.

One of the major advantages of HAProxy is that it can intelligently query the state of its backends. For example, in a tiered architecture where HAProxy sits in front of a caching layer and a web layer, it's very difficult to know the state of the web layer at the proxy level. A lot of people complain about seeing 502 and 503 errors with many load balancers, Amazon's included: gateway timeouts, service unavailable. All of that happens simply because the proxy or load balancer isn't able to intelligently query whether the backend is up; it just keeps sending requests on a predefined ratio, which is very bad, and users start reporting these errors. HAProxy can do health checks at multiple layers. It can do an L3 check, it can do an L4 check to find out if a port is open, and it can even do an application-layer (L7) check, loading a page and determining that the application is actually responding, instead of taking for granted that if a port is up, the application is up.
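The layered checks described above look roughly like this in an HAProxy configuration. This is a minimal sketch: the backend name, server addresses, and the /health URL are illustrative assumptions, not from the talk, and on 1.4-era versions `http-check expect` may be unavailable (plain `option httpchk` then accepts any 2xx/3xx status):

```
backend web_tier
    mode http
    balance roundrobin
    # L7 check: actually fetch a page instead of trusting an open port
    option httpchk GET /health
    http-check expect status 200
    # 'check' alone would be a plain L4 port probe; combined with
    # httpchk it becomes an application-level check
    server web1 10.0.0.11:80 check inter 2s fall 3 rise 2
    server web2 10.0.0.12:80 check inter 2s fall 3 rise 2
```

The `fall 3 rise 2` parameters mean a server is marked down after three failed checks and brought back only after two consecutive successes, which avoids flapping.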
HAProxy doesn't take it that way. One big problem is SSL termination: if you do not terminate SSL at the load balancer, you cannot parse the HTTP headers there. With a plain TCP load balancer you lose all that intelligence, because the very ground rule of SSL is that you cannot do deep packet inspection or parse the payload; you can at most see the L2 and L3 headers. But with SSL termination in HAProxy itself, you can apply all the HTTP-header intelligence you would on a non-SSL connection, and you can indicate the original scheme to the backend. You could potentially convert any non-HTTPS tool into an HTTPS service. Your backends could be very simple nginx servers; you're offloading the SSL processing to the proxy node, so your application doesn't have to worry about how to do SSL or how to configure it.

The simplest deployment scenario, shown here as a rough diagram, is an HAProxy in front of two web servers with a simple load-balancing ratio, depending on what kind of boxes you have. HTTP mode is the preferred mode for HAProxy: it understands and parses the HTTP headers, and you can write a simple URL check to find out whether a particular backend is responding. For example, if Apache runs out of threads it will go into a waiting state, or if there are too many TIME_WAIT sockets in the TCP stack, your proxy will keep retrying the request via redispatch without getting a response, and eventually it will time out with a 502 or a 503.
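A termination setup like the one described might be sketched as below. The certificate path, names, and header are illustrative; note also that native SSL termination arrived in HAProxy 1.5, and older deployments put stunnel in front to get the same effect:

```
frontend https_in
    # Terminate SSL at the proxy; backends stay plain HTTP
    bind :443 ssl crt /etc/haproxy/certs/site.pem
    mode http
    # Signal the original scheme so the backend can emit https:// links
    http-request set-header X-Forwarded-Proto https
    default_backend web_tier

backend web_tier
    mode http
    server web1 10.0.0.11:80 check
```

The backend application then inspects `X-Forwarded-Proto` to decide whether to generate HTTPS URLs, which is the mixed-content fix described below for Tomcat.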
So what you can do is quickly check a URL and decide: if this doesn't load within, say, two milliseconds, mark that particular backend as down and redispatch the request to the other web servers, so that the response to your customers is much more seamless. There is X-Forwarded-For support. You can also offload keep-alive to the proxy: generally it's the application's job to handle keep-alives, but you can hand that to HAProxy itself. It logs in an Apache-like format, so you can do very detailed stats with HAProxy logs. For instance, if you have five web servers and you want to know how well each is performing, or whether one of them is error-prone and giving a lot more errors than the others, you can find all of that out from the logs.

SSL termination is very simple: you can convert any non-SSL application into an SSL service. The support HAProxy provides here is that you can signal to the backend Apache, via a variable the application can read, that the request came in over SSL, and your application can use that to detect HTTPS. Why is this required? A lot of times, with Tomcat containers for example, if the incoming request is HTTP, the application assumes the connection is HTTP and emits HTTP URLs for all its relative links. If you indicate with a flag that the traffic was actually SSL, your Tomcat can emit all the relative URLs as HTTPS, so the browser doesn't complain about a mix of SSL and non-SSL content; that kind of issue you can avoid.

This can also work in TCP mode, which is layer 4, so you can potentially put any application behind it and load balance at L4. For instance, you could put multiple MySQL masters behind HAProxy and get true write scaling, instead of doing complicated things like sharding with JetPants and all that.
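The TCP-mode MySQL fan-out described above can be sketched like this. Addresses and the check user are assumptions; `option mysql-check` needs a dedicated user provisioned on the MySQL servers so the check can complete a real handshake:

```
listen mysql_read
    bind :3306
    mode tcp
    balance leastconn
    # Log in and verify the MySQL handshake rather than just the port
    option mysql-check user haproxy_check
    server db1 10.0.0.21:3306 check
    server db2 10.0.0.22:3306 check
    server db3 10.0.0.23:3306 check
```

`leastconn` suits long-lived database connections better than round-robin, since it steers new connections to the least-loaded slave.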
You could run a very simple master-master setup and use HAProxy to load balance it. Or you could have a lot of slave nodes, say ten slaves, and use TCP mode to distribute traffic to them on whatever simple ratio you want. You can also add health checks to find out if one of the slaves stops responding. For example, say one of the slaves has locked itself up and exhausted the maximum open connections it can have. If HAProxy keeps sending it requests, you will eventually get something like a communications-link failure on the DB side. What you can do instead is build a web check using a simple xinetd service that emulates a query against MySQL and returns a success message; HAProxy will then stop sending requests to that particular box.

Now, shared-nothing architecture. Why is this so important? Because the moment you start talking about shared network storage, even with provisioned IOPS on Amazon for instance, we have seen that the performance can be really bad. So what we want is independent web servers, with anything that has to be shared intelligently synced using lsyncd. For example, if you're running workloads like Magento or even WordPress, with their admin areas and uploads directories, you can trap all the POST requests in HAProxy, send them to the first node, and then set up an lsyncd mechanism, which is based on rsync, declaring that those folders need to be mirrored across all locations. This shared-nothing architecture really helps with IOPS, because if IOPS are constrained on the web server, that alone leads to queues building up and processor load increasing drastically.
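Trapping the write traffic on one node, as just described, can be expressed with a simple ACL. A sketch with hypothetical names and addresses; lsyncd would then mirror the uploads directory from web1 out to the rest:

```
frontend web_in
    bind :80
    mode http
    # Send all POSTs (uploads, admin writes) to a single designated node
    acl is_post method POST
    use_backend uploads_node if is_post
    default_backend web_tier

backend uploads_node
    mode http
    server web1 10.0.0.11:80 check

backend web_tier
    mode http
    balance roundrobin
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check
```

Reads are spread across the whole tier while writes land on one box, so replication only ever has to flow in one direction.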
And you can also use HAProxy to pin special URLs to particular servers. One use case: a print company had a special script that only worked on one server. You can write custom regexes in HAProxy, parse the HTTP headers, and say that this URL goes only to that particular server. That way you control where your special nodes are and where your resources need to be concentrated, and the rest can be commodity boxes where you just sync all your images or code using a simple lsyncd daemon. lsyncd uses the inotify library to watch for all the changes that happen in your file system and then intelligently syncs them to the other boxes with a custom rsync command. That way all the web servers stay in sync, and HAProxy doesn't need to worry about the state of a box, whether it is fresh or not, or about the staleness of its content.

With this, I think I'm done. I'd like to thank everybody, and I'd appreciate any comments and questions.

Q: Hi. Very interesting; I was also exploring HAProxy, and I support a web-hosting environment in my company, so I have a couple of questions. First, how does the SSL termination happen? Does it understand SNI? We have name-based virtual hosting, so we can't have different IPs for different SSL certificates.

A: Currently it doesn't support SNI; it can just serve a single site. It doesn't support SNI for now.

Q: Another thing: if I automatically scale the number of backend servers, can they be added to HAProxy automatically?

A: Yes, HAProxy has a graceful reload mechanism. You don't have to bring down the entire HAProxy to add a node.
You could write simple bash scripts to add the nodes and then reload HAProxy, so the connections are recycled and the new nodes are picked up seamlessly. There is also a socat command-line interface through which you can control how nodes are added to the system and so on.

Q: And this is the last one. The backends I have mainly speak uWSGI and AJP. Does it understand something like HTTP and then redirect to uWSGI?

A: Yes, it can do that. You can potentially parse any of the HTTP headers if you are working in HTTP mode. In TCP mode it will rely only on simple things like port numbers, but in pure HTTP mode you can parse any of the HTTP headers and decide what you want to do with that particular packet. It's almost like deep packet inspection at the application layer.

Q: Okay, thanks so much.

Q: Hi. One comment you made was that it runs a lot of threads. What is that about?

A: What I meant was that if you want a really scalable HAProxy solution, you need a lot of CPU cores on the box; that helps with faster responses. HAProxy doesn't need that many resources in terms of RAM or anything, and you can even run it on commodity boxes with one GB of RAM. But the more cores the Linux kernel has to schedule across, the faster the responses. So if you are running HAProxy at web scale, it's better to have a machine with a lot of CPU cores available to it.

Q: I'm worried about it taking up thousands of threads specifically.

A: No, it doesn't take thousands of threads.

Q: How does it compare against AWS round-robin DNS?

A: Round-robin DNS is rather simpler than this, in the sense that it takes a very simplistic view of up and down and then sends requests accordingly.
There is no connection management there: AWS round-robin DNS, for example, cannot stop sending connections to a server once it reaches a certain limit. Every web server has a threshold, right? Let's say, theoretically, that this web server cannot take more than 1,000 connections; the moment you send it the 1,001st connection, the web server dies, to put it very simplistically. That facility to control the ingress to each backend just isn't there. It's an evolving interface, but the fairer comparison, I would say, is the Elastic Load Balancer, and there are some slow cache-warmup problems there. My colleague Ashidjit here has run some huge fantasy-game workloads on it, and his analysis is that a lot of times, when you add all the web servers and the players start hitting them, the cache warms up very slowly in the load balancer. But here we are not talking about caching at all, only about plain connections; the caching layers can all be offloaded, and we are concerned only with load balancing. Elastic Load Balancing tries to be a little more intelligent than it should be. HAProxy doesn't claim all that. It says: I'm good at managing connections, let me do that.

Any other questions? Okay. Thanks a lot.