Hey everybody, it's Brian. This is not a tutorial; it's a follow-up to a tutorial I've done, and also a personal hobby of mine. I did a six-part series called High Performance TCP Server Design, and I wanted to revisit it because it's kind of a hobby, a personal challenge to see how far I can take it. I've always been intrigued by TCP server design; don't ask me why, but maybe I should be employed by Microsoft or Apache or something. Who knows?

Anyway, we really stress tested this thing, and on my Linux box at the time it was hard-capped at about 1,024 connections, because that was my ulimit. Let me bring up a little... for the sake of doing this, I changed my soft limit, which is what was really stopping me, to 20,000, so I can open 20,000 file descriptors in Linux. Whenever you open a file, a socket, or whatever, you're essentially opening (you guessed it) a file, and that's capped by the ulimit. I set it to 20,000 so I could potentially open roughly 10,000 connections: remember, the other program has to open 10,000 and I open 10,000, plus whatever other files are already open, so we won't actually get all the way there, but it'll get close.

Let me back up. I spent pretty much the bulk of my weekend rewriting this thing from the ground up. We've got a TCP server that inherits QTcpServer; we're overriding listen() and doing some socket managing in the background, and I'm using QThreadPool. The reason I'm using QThreadPool is that it's very streamlined, it's very easy to work with a pool of threads, and I read somewhere, I don't remember where, that it actually leverages multiple cores and is very scalable.
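For reference, the soft-limit change I mentioned looks something like this in a shell (20,000 is just the value I used; raising the soft limit only works if it stays at or below the hard limit):

```shell
# Show the current soft and hard limits on open file descriptors
ulimit -Sn
ulimit -Hn

# Raise the soft limit for this shell session to 20,000;
# this fails if the hard limit shown above is lower
ulimit -Sn 20000
ulimit -Sn
```

Note this only affects the current shell session and whatever it launches; a permanent change goes through /etc/security/limits.conf or your distro's equivalent.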
Now, I'm not sure that claim about leveraging multiple cores is 100% accurate. I swear I read it somewhere, and I'm sitting here going: it's 2016, programs should leverage multiple cores. I know that under the hood the operating system really determines how many cores your program actually uses, et cetera, but I kept that in the back of my mind during the design.

So here's what this does. When listen() is called, it loops from i = 0 up to the max thread count of the global QThreadPool instance, creating a new pool on each pass. It's not a thread pool, though; it's a socket pool, or a connection pool, which is this class up here. So if QThreadPool has a max of 8, it creates 8 TCP connection pools and puts one in each thread, basically. The pool class is a QRunnable, so it gets run through the thread pool, and it's also a QObject, so I can bind into the signals and slots. What it holds is essentially a list of TCP connections, where a TCP connection is pretty much just a wrapper around QTcpSocket.

In short, when a new connection is made... I'll actually just show you. Oops, hit the wrong button there. Compile and run this. Wow, that is really large for some reason; let me resize it. The max my system's QThreadPool will give me is eight threads, and you can see how it starts each one of those QRunnables. We'll just do a telnet session, because why not. 127.0.0.1... I cannot type tonight. We'll connect, and you see all it does is load a connection. When a connection comes in, the server checks the number of connections each one of these pools has and asks which one is lowest; let's say this guy is the lowest, it throws the connection in there, and when the next one comes in it does this round robin again, over and over, trying to balance the connections out between the threads.
Now, as connections are made and disconnected, the pools rise and fall in volume depending on how many they get. This was a bit tricky, because obviously we're working with pointers, which is always an adventure in awesomeness, and then you have to do all the memory cleanup. Instead of doing a direct delete... let me move this off to the side... I actually went in and just used deleteLater(). Where was it... I think I lost it... oh yeah, started, pending, blah blah blah. I played around with both the straight delete and deleteLater(), and there's really no noticeable difference in performance. I'm just curious what you guys use in practice. I tend to use deleteLater(), because I think it's a little safer than deleting the object outright when I'm not 100% certain we're done using it.

So we've got our little server here with eight of these pools spawned. I'm actually going to close and restart it so the interface looks nice and neat: eight little connection pools in there. Now we'll load up a couple of things, if my mouse will work with us: a little program called Siege, which you probably remember, and the system monitor so we can see the CPUs. You can see my computer has eight CPUs, or eight cores if you will, and there's really not a whole lot going on. That just spiked up for whatever reason; there's some silliness going on, but "spiked" is a relative term, since they're all around zero to three percent. Core seven spiked for whatever reason; core seven is probably what's running the app.

So we're just going to throw a thousand connections at this with Siege, and you can see it just goes and goes, kind of going crazy in the background here. Our network usage spikes up; that's all the connections being made right off the bat. The CPU usage starts going, and you can see each CPU is sharing in the work, or I should say each core is sharing in the work. If we Ctrl+C that... you might see down here, this might move a little bit (it'll be more obvious when we start ramping the connections up), you may see connections being deleted: that's deleteLater() being called. The CPUs are doing their work and all the cleanup is happening. So: 100% availability, concurrency about 521, which is really not bad; longest transaction was 17.22, which for a thousand concurrent wasn't bad; elapsed time 20 seconds.

What I like to do is take things to their breaking point. Notice how we're not stopping the application; we're leaving it in place. This is good practice when you're testing to make sure you don't have memory leaks. Let's see if we can find it... this Qt Creator process stub is currently using 88K of memory. I'm going to hit the sucker with 10,000 concurrent transactions; let's watch our cores. I should note that this will start breaking, but you'll notice the software itself doesn't crash, and it's going to take a few seconds to spin up 10,000 connections. I honestly don't think Siege is capable of spinning up 10,000 connections; I think this server I've written is probably going beyond what Siege is really going to be able to help us test with, and I may have to write my own test program. We'll just let that hammer for a while, but you can see memory usage is really just sitting there, not doing a whole lot. Siege isn't doing a whole lot either... well, actually 500 megs. That's one question I had for anybody out there: this Qt Creator process stub, this guy right here, that's not actually my program, is it? Because here, socket test 3, this is the actual name of the program, and it's pushing 40 megs now. That's something that really piqued my interest here, as you can
see that Siege is using something like 500 megs and my program is only using about 40, so there's a big memory difference there that really had me curious. And if you've been watching, you can see we're getting socket errors, disconnects, and all sorts of timeouts; that's because we're really just hammering the heck out of this, and it's really apparent looking at the CPUs, which are going ballistic. So we're probably hitting the upper limit of what this computer is physically capable of doing, or I've hit the limit of what Siege is capable of doing. We'll Ctrl+C, and... Siege actually crashed: segmentation fault, core dumped. So we actually broke Siege in our little test here, which is kind of scary. My program didn't crash.

This is just something I've been playing around with. Let's close that. I kind of feel bad that we crashed Siege; sorry, Siege, you've been nice to me. I wanted to play around with this a little more, because it's no fun unless you play with it, right? So we're going to uncomment this and make a hundred threads, which is complete and total overkill. I wanted to address this, because somebody inevitably is going to say, "the more threads you have, the more performance you get." No, it's actually quite the opposite: threads are a very expensive operation, and cross-thread operations are very expensive because you're dealing with two different threads and you have to sync them. This code actually does that. Where is the count function... connection pool... yeah, count(): I'm using a QMutexLocker there, so you have to lock that entire section of code just to read the variable. It gets a little nutty. Anyway, we're going to create a hundred threads and see if the performance improves. I doubt it will; it'll actually probably go down, because we're using a hundred threads instead of eight.

So we're using a hundred threads, and you can see it actually created a connection pool a hundred times, so we've got a hundred QRunnables in memory here. Get my system monitor back up, get Siege back up, and let's try not to crash Siege this time: 5,000 concurrent connections, and we'll just see what the memory and CPU profiling look like. Yeah, you can see they're really starting to grind away at it.

So I guess the question I'd throw out there is: is there anybody out there who's knowledgeable about TCP server design, and is this the preferred way of doing it? Not the 100 threads, but using a thread pool and then having multiple sockets on each thread in the pool. And we're going to start crashing Siege here; I can already see it happening, Siege is starting to buckle, so we'll just cancel that. You can see we had 87% availability: not great, not bad. We had 12,000 hits and about 1,800 of those failed, most probably due to timeouts, just from sitting in the connection queue waiting. And of course, if we lower that number we get much better results, because we're doing fewer operations: say, 100% availability.

So, some of my questions for anybody out there who knows more than me on the subject: is thread pooling the way to go? Are multiple threads even the way to go? I know sockets themselves are intrinsically asynchronous under the hood, so using a thread is kind of redundant overkill, especially when you build a server that's one thread per socket; that's just very bad server design.

That's pretty much all for this. I just wanted to get you guys's feedback. Let me know via personal mail, or preferably join the VoidRealms Facebook group; I'll be posting this video there and trying to start an open discussion about this. There are pushing 470 folks in there, some of them extremely bright, smarter than me, which kind of scares me. I'm not sure I'm comfortable with them being in the group when they're that smart. But let me know.