Hello everybody. This talk is about how to design and implement a good TCP server with the Tornado framework. We're not going too deeply into network programming here; we focus more on how to use an existing library, in this case Tornado, to achieve a scalable and high-performance TCP server.

My talk is divided into four parts. I will start with the design aspects, where we try to find out what we want to achieve and discuss some techniques we should consider when we design a server. Then we go through the system architecture and the system flow. In the third part, we look at the implementation, where I will introduce some tips and tricks we use in the server. The presentation ends with a small demo of an instant messaging server.

Okay, the first part is the design aspect. Before we go on, I would like to give you some background information about the TCP/IP protocol. We all know that it's a connection-oriented protocol, which means a connection needs to be established first and stays open until the applications on both sides finish exchanging messages. How about a TCP server? A TCP server is bound to a specific local port and starts listening for incoming connections. When a client connects to the port, the server accepts the connection and waits for incoming data. Data usually comes in chunks, which means the full logical package may not arrive all at once. The server may or may not respond to the client, and either the client or the server can close the connection.

Another thing I would like to discuss before we move on is performance versus scalability. When people talk about scale, they often talk about performance. They are similar, but they are not the same. Performance refers to the ability of a server to serve a volume of requests or a number of users. Scalability refers to the characteristic of a system to increase its performance by adding more resources.
Very often, people think that if we need to serve more people, we should just add more servers. Let me give one example. Say we have an Apache server that can serve 1,000 transactions per second. If each transaction is short, say one second per transaction, then we have around 1,000 concurrent connections, and Apache handles that quite well. But if transactions last longer, say 10 seconds per transaction, then at the same 1,000 TPS we have around 10,000 concurrent connections, and at that point Apache's performance drops a lot.

Okay, let's say you upgrade the hardware to double the processor speed. What happens? We get double the performance at the beginning, but the server still can't handle 10,000 concurrent connections. Let's keep doubling: 8 times the performance, 16 times the performance. It's great, but it still can't handle 10,000 concurrent connections. So performance is not the same as scalability.

Right, let's move to the desired targets. What do we want from our TCP server? I picked out the most important points here. We want high performance, but we also want scalability. It must be stable and robust. And more than that, it must be very easy for developers to add their business logic.

So we want a scalable server. But how? The example with the Apache server is actually known as the C10K problem. For over a decade, engineers have tackled the question of how to handle more than 10,000 concurrent connections, and there are good solutions for it. When you have more connections, the kernel needs to work through a larger list of sockets or threads, and these days the kernel is much better at this: it can do that lookup in constant time. The real problem here is thread scaling, which still doesn't scale. In Apache, each new connection creates a new thread. But modern servers like nginx or Node.js scale using event-driven I/O.
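The event-driven style these servers use can be sketched with a toy callback loop. This is plain stdlib Python, purely illustrative, not any real server's code: instead of blocking on a read, you register a callback and keep doing other work until the loop runs it.

```python
import collections

# Minimal illustration of the callback style: instead of blocking on a
# read, we register a callback and let a tiny event loop invoke it later.
class ToyLoop:
    def __init__(self):
        self._ready = collections.deque()

    def add_callback(self, fn, *args):
        # "When the data arrives, call this function."
        self._ready.append((fn, args))

    def run(self):
        while self._ready:
            fn, args = self._ready.popleft()
            fn(*args)

events = []

def on_data(data):
    events.append("got " + data)

loop = ToyLoop()
loop.add_callback(on_data, "hello")   # schedule the callback
events.append("doing other work")     # we are not blocked meanwhile
loop.run()                            # the callback fires here
```

The point is that the "other work" happens before the callback runs: one thread serves many pending operations instead of one thread per connection.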
That is, an asynchronous programming model. Let me say a bit more about what an asynchronous programming model means. What's the difference from the traditional way? A synchronous operation blocks until the operation completes. With asynchronous, you basically tell the system: read the data from here, and when it arrives, call this function. Meanwhile, you can keep working on other stuff. Because the I/O operations don't block, you don't need to spawn a lot of threads, one per connection.

But asynchronous code comes with disadvantages. It's not easy to implement, and it's not easy to read. Error handling is much more complicated, and response times are unpredictable.

So, to scale up, we use an asynchronous programming model. What are the options out there? I can name the three most popular asynchronous network libraries we have in Python: Tornado, gevent, and Twisted.

Tornado is the first candidate, and it's the one we chose for our TCP server. Tornado has become very popular among Python asynchronous libraries. There's actually a PyCon talk about the Tornado IOLoop; you can take your time and watch it for the details. In this slide, I just summarize some key points. Tornado is promoted as a great framework if you need to handle a large number of connections: it can scale up to 10,000 open connections, which makes it ideal for applications that require long-lived connections. It's simple and easy to use, with good documentation; the code is beautiful and clean; and it has a lot of library support and works with alternative Python interpreters like PyPy.

That sounds good, but there are many other Python libraries out there, so why is Tornado better? Let's make some comparisons. First, Twisted. It's also a very good candidate, and it's actually very similar to Tornado.
Both use the callback style and have a built-in, pure-Python event loop powered by epoll on Linux. The only problem is that Twisted seems much more complicated than Tornado. It doesn't only offer callbacks; it also offers Deferreds, which I think are a bit like goto in C: you get a lot of problems when programming or testing, and when other programmers read your code, they just want to kill you, because it's very complicated.

The other one is gevent. gevent is a fairly new library. It's a coroutine-based Python networking library that uses greenlets to provide a high-level synchronous API on top of an event loop. So basically it abstracts away the callbacks: it helps you write synchronous code and still get good performance and good scalability. However, we ran into quite a number of troubles with gevent, because not many libraries are compatible with it. And it's tied to the CPython implementation; it doesn't support others like PyPy or Jython.

So, let's look back at our desired targets. To make our server scalable, we use Tornado for the network component. To get high performance, the idea is to distribute the tasks across multiple threads or processes. The problem is that asynchronous code, as I mentioned, gets complicated. So to make sure the server is stable and robust, and easy to extend with business logic, we try to find a way to let developers write their business logic as synchronous code.

Okay, to make you more excited: at the end of the presentation, what I can show you is that with our TCP server, you can implement a simple instant messaging server with only about 100 lines of code, and the code is written in synchronous style. How about the performance? For example, take a server with 2GB of memory.
Actually, only about 1GB of it is available, with a dual-core processor. It can easily support something like 100K connections and 1,000 TPS.

Okay, so from the desired targets and the basic techniques I just showed you, this leads me to my next point: the system architecture. Here is the architecture. There are two main modules. The first one is the network module. This network component runs in a single process and contains two servers listening on different ports. The first is the connection server: a non-blocking, single-threaded Tornado socket listener. It listens for and accepts client connections, and when it receives a request package, it adds it to the waiting task queue. The other is the worker server. Again, it's a Tornado socket listener, and it maintains a list of idle worker processors. Its job is to pick up tasks from the waiting task queue and assign them to idle workers.

The second module in this architecture is the worker processors. A worker processor is a TCP client. It can be a thread or a process, and it can run on the same machine as the network component or on a different one. It connects to the worker server to get requests. After processing a request, it sends the response data back to the network module, and the network module sends the data back to the client.

So the main idea here is that we use the network module asynchronously, to efficiently manage the client connections, and we turn the asynchronous flow into plain synchronous code inside the worker processors to get high performance.

Okay, to give you a better understanding of the architecture, let's go through the basic flow. First, each worker processor connects to the worker server. When a worker processor comes up, there are of course no tasks for it at the beginning, so it's added to the idle worker list.
When a client makes a connection to the connection server, the network module generates a unique key based on the client IP, the client port, and a running number, and it keeps the mapping in a dictionary so it can easily look up which client to send data back to later.

When the client sends a package to the connection server, the connection server modifies the package: it adds in a command and the client ID before forwarding the package to a worker processor. Here we have three main types of commands. The first one is the connect command: when the client makes a connection, the network module auto-generates a connect package, which is useful if, for example, you want to initialize some session data for a new connection. Similarly, for the disconnect command, when the client connection is closed, the network module tells the worker so it can clean up the session data. For a normal package from the client, the command is the relay command. After a package is received, it is either added to the waiting task queue if there is no available worker processor, or passed directly to one of the worker processors to handle.

After the worker processor finishes processing a request, it sends the data back to the worker server. Based on the client ID in the package, the network module can find the corresponding client connection and return the data to the client. One thing to note here: there are two types of commands when the worker processor sends data back. The notify command means: I'm still processing the data, but I need to notify some other client. The worker is not yet free; it's midway through processing. If the worker has finished the request, it uses the relay command. When the network module sees a relay, it sends the data back to the client and also adds the worker back to the idle worker list so it can later pick up a new task.
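A small sketch of how this bookkeeping might look: the command constants for the flow above, and the unique client key built from IP, port, and a running number. All names and values here are my own illustrations, not the actual implementation.

```python
import itertools

# Hypothetical command constants matching the flow described above.
CMD_CONNECT, CMD_DISCONNECT, CMD_RELAY, CMD_NOTIFY = range(4)

_seq = itertools.count()   # running number keeps reused (ip, port) pairs unique
connections = {}           # client_id -> connection object

def register_client(ip, port, conn):
    """Build the unique key and remember the mapping for replies."""
    client_id = "%s:%d:%d" % (ip, port, next(_seq))
    connections[client_id] = conn
    return client_id

# Even if a client reconnects from the same address, the key differs:
a = register_client("10.0.0.1", 40001, object())
b = register_client("10.0.0.1", 40001, object())
```

The running number matters because a client port can be reused after a disconnect; without it, a stale mapping could route a reply to the wrong connection.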
The network module and the worker processors can run on the same machine, but we can also make it a cluster, which means we can deploy multiple machines running worker processors. Most of the time we just need one network module, because Tornado is a very capable library; it can easily handle 10,000 open connections. In this picture I show one of the clusters we have: two network modules and a few machines for the worker processors.

Okay, so we have our architecture. Let's move on to the next point: the implementation. First, let's talk about the network module. As I mentioned, the network module uses Tornado, and we all know Tornado is an asynchronous library. So the question is: will it be very complicated to write the network module? After all, I just told you that asynchronous code is not easy to implement. Actually, it's quite easy, as long as you don't end up with callbacks inside callbacks. It's very straightforward to write the network module in our TCP server, because we just need to fill in a few event handlers: when a client makes a connection, we create a connect package and send it to the worker server; when a client closes the connection, we handle that; and when we receive a package from a client, we handle that.

Another point I would like to highlight is a restriction in CPython. In CPython we have something called the global interpreter lock, or GIL. It's a mutex that prevents multiple threads from executing Python bytecode at the same time. Note that blocking operations like I/O, and some C extension work such as image processing, release the GIL; it's only CPU-bound Python code that holds it. So if, for example, thread 1 requires a lot of CPU, it will not release the GIL, and thread 2 cannot acquire it: effectively everything runs on a single core. That's the idea behind why you may have to implement the workers as multiple processes if your tasks are CPU-bound.
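The three event hooks just described (connection made, data received, connection closed) are the whole surface a developer has to fill in. Since this is only a sketch, I use the stdlib's asyncio.Protocol to show the shape in a self-contained way; in Tornado you would get the same hooks by overriding TCPServer.handle_stream and reading from the IOStream. The class and comments are illustrative, not the real server.

```python
import asyncio

class NetworkModuleSketch(asyncio.Protocol):
    """Illustrative: record the three events the network module reacts to."""
    events = []

    def connection_made(self, transport):
        # Real server: generate a "connect" package for the worker side.
        NetworkModuleSketch.events.append("connect")
        self.transport = transport

    def data_received(self, data):
        # Real server: frame the bytes and relay them to a worker.
        NetworkModuleSketch.events.append("data:" + data.decode())
        self.transport.write(b"ack")

    def connection_lost(self, exc):
        # Real server: send a "disconnect" package so session data is cleared.
        NetworkModuleSketch.events.append("disconnect")

async def demo():
    loop = asyncio.get_running_loop()
    server = await loop.create_server(NetworkModuleSketch, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hi")
    await reader.readexactly(3)        # wait for the b"ack" reply
    writer.close()
    await writer.wait_closed()
    await asyncio.sleep(0.1)           # let connection_lost fire server-side
    server.close()
    await server.wait_closed()

asyncio.run(demo())
```

Everything above runs on one event loop in one thread, which is exactly why the CPU-heavy work must be pushed out to the worker processors.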
Another thing in the implementation I want to highlight is partial data transmission. As I mentioned at the beginning about TCP connections, the full logical package may not arrive all at once. The technique most frequently used is to add a header: a 4-byte integer at the beginning of the package that indicates the size of the whole package. On the receiving side, you first read those 4 bytes to learn the full size, and you collect the full package before passing it up to the higher layer.

Besides partial data transmission, here's another tip I want to share: if the worker processors and the network module run on the same machine, you don't need to use a TCP/IP socket. You can use a Unix domain socket, which is much lighter and faster. Why? Because a Unix domain socket knows both endpoints are on the same machine, so it can skip checks and operations like routing the packets.

Another highlight here is keepalive. Keepalive is useful when you need to detect that a peer died before it could notify you. Most of the time you want to enable keepalive, but the key point is to be mindful about the options: you don't want to find out only after the peer has been gone for one or two hours, but you also don't want to set the interval so short that you generate a lot of redundant traffic.

That's about what I wanted to say about the network module. Now let's zoom out and talk about the message transmission between the client and the worker processor. Each package from the client has this format: the size of the package, the size of the header, the header buffer, and the body. The package size solves the partial data transmission problem I just mentioned.
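The length-prefix technique just mentioned can be sketched like this. It's a simplified decoder: the real package also carries a header size, which this toy version skips.

```python
import struct

def frame(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte big-endian length."""
    return struct.pack("!I", len(payload)) + payload

class Unframer:
    """Reassemble length-prefixed packages from arbitrary TCP chunks."""
    def __init__(self):
        self.buf = b""

    def feed(self, data: bytes):
        self.buf += data
        packages = []
        while len(self.buf) >= 4:
            (size,) = struct.unpack("!I", self.buf[:4])
            if len(self.buf) < 4 + size:
                break              # full package has not arrived yet
            packages.append(self.buf[4:4 + size])
            self.buf = self.buf[4 + size:]
        return packages

u = Unframer()
msg = frame(b"hello")
first = u.feed(msg[:3])            # partial chunk: nothing complete yet
rest = u.feed(msg[3:])             # remainder arrives: one full package
two = u.feed(frame(b"a") + frame(b"bc"))   # two packages in one chunk
```

Note the decoder handles both directions of the mismatch: a package split across chunks, and several packages arriving in one chunk.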
The size of the header tells you how large the header is; you deserialize the header, and the header structure we have contains a few attributes: ID, version, command, result, and timestamp. The most important one is the command, because based on the command you know which structure to use to deserialize the body. For example, if the command is the login command, you use the login structure to deserialize the body; or it might be a chat structure, or an update structure.

Okay, so each worker processor has something called a worker processor manager class. It maintains a dictionary of processor handlers, and each processor handler has to register its action together with a request structure and a reply structure. When a package comes in, the worker processor manager first reads the header of the client package and gets the command. Based on the command, it finds the corresponding handler. Because the handler knows its request structure and reply structure, the manager can deserialize the body and pass it to the handler function. The function processes the data and returns a result, and the manager constructs a relay package and returns it to the client.

The nice thing here is that with the worker processor manager, all the worker code can be written in synchronous mode. Here's one example. If you want to write a login handler, the easy way is to register the function for the login command with the processor manager, indicating that the incoming structure is the login request and the return package is the login reply. And inside it, you write plain synchronous code.

Okay, well, finally we get to the demo. For this demo I would like to show you a very simple instant messaging server. A client can register a username and send messages to another username.
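The handler-registration pattern described above can be sketched as follows. All names here (register_handler, LoginRequest, LoginReply) are hypothetical stand-ins for the real Protobuf structures.

```python
HANDLERS = {}   # command -> (handler function, request class, reply class)

def register_handler(command, request_cls, reply_cls):
    """Decorator: register a synchronous handler for one command."""
    def wrap(fn):
        HANDLERS[command] = (fn, request_cls, reply_cls)
        return fn
    return wrap

class LoginRequest:
    def __init__(self, body: bytes):
        self.username = body.decode()

class LoginReply:
    def __init__(self, ok: bool):
        self.ok = ok
    def serialize(self) -> bytes:
        return b"OK" if self.ok else b"FAIL"

@register_handler("login", LoginRequest, LoginReply)
def handle_login(req):
    # Plain synchronous business logic -- no callbacks in sight.
    return LoginReply(ok=bool(req.username))

def dispatch(command: str, body: bytes) -> bytes:
    """What the manager does: look up the handler by the command in the
    header, deserialize the body, run the handler, serialize the reply."""
    fn, request_cls, _reply_cls = HANDLERS[command]
    return fn(request_cls(body)).serialize()

result = dispatch("login", b"alice")
empty = dispatch("login", b"")
```

The developer only ever writes functions like handle_login; the asynchronous plumbing stays hidden inside the network module.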
And through this demo, what I would like to show you is how easily we can use our TCP server to write the business logic, while still getting very good performance. Okay, let's see the demo. First, let me bring up my server here. Maybe I should change the display settings to make it easier to see. Okay, we have one server here. This is not really a powerful server; it just has two cores and one gigabyte of RAM available.

This is our folder structure for the instant messaging server. Let me use Vim so it's easy to see. On the left-hand side we have GTCP; this folder contains all the core library files, which make up the network module I just showed you. And all the IM server structure is here: we have a config file where you set things like which port to listen on for clients and which port for workers, and some settings like keepalive options and timeout counts.

For the client protocol, we use Google Protobuf. In here you define the commands from the client as a command list, you define the result constants, and here is the package header I explained in the slides. For example, for user registration, this is the structure of the request package: the client needs to send the username. For message sending, this is how the package looks: you can indicate a list of targets and the message. When the server receives the message, it finds the receiver and forwards the package, and this is the structure for the received package.

All the logic is actually just in these two functions here. For user registration, this is how you write the code. It's very simple, just synchronous code: you add an entry into an in-memory cache, mapping the username to the client connection. And here's how you send a message.

Okay, let's go back to the demo. First, let's do some stress testing and put some pressure on the server.
Let's make a few thousand connections. I have 10 threads, and each thread tries to make 1,000 connections. Okay, the CPU usage is high. Let's see how many connections we have. The server is listening on its port, and we have 10,000 connections here. So the server holds 10,000 connections, and you can see that's no big problem for the server right now.

Let's bring up a client. I bring one client and register a username like P1. Okay, the server returns: you registered successfully. I bring up another client, and I register as P2. Let's try to send a message to P1. Yes, P1 receives it.

Okay, let's put more stress on the server. The server now has 10,000 concurrent connections and everything is still okay. In this test, we have a few senders that try to send as many messages as possible to the server, and one receiver, to see how many packages the server can handle. Let's have 10 senders, each sending 600. Okay, they registered successfully. Let's wait a while and see. Okay, we can handle around 500 messages a second at the receiver. And look at the server: the CPU usage is just slightly higher; it's still very good. We can still maintain the 10,000 connections, and actually it can maintain up to 100K. And if we use a lighter command over the internal network, we can get around 1,000 TPS.

Okay, so that brings me to the end of my presentation. Thanks. I'm open for any questions. No questions? If you have further questions, you can send me an email, or you can check out the demo code on my GitHub, shown here. Okay, thanks.