So hi, I'm a developer at Eastvision Systems, a company from Romania. We do a lot of things there, but we mainly concern ourselves with ad tech: big data, video technologies, and real-time bidding. For those who don't know what that means: it means we must handle a lot of user events every second, and these events must be continuously persisted into the database. So what is an event? Simply put, an event is a simple GET request, sent from the browser of a user who is viewing advertising, to the event processing system, which must analyze the data and push it into the database. The amount of data that can be generated by a single user is quite big: a single user viewing a single ad sends at least about ten events. These events are not all sent at the same time, which means the user keeps a continuous connection with the server, and that means you have a lot of users sending a lot of events from a lot of concurrent clients. This problem was recognized long before this talk; it is called C10k, and it basically asks: how can you handle ten thousand concurrent connections on the same machine? During my time at the company we averaged about twelve million requests per minute. That is about two hundred thousand requests per second that must be handled by a single server cluster. So the question is: how can you handle such traffic, and moreover, how can you handle it using as few resources as possible?
The naive solution would be to implement a system that responds to each request and sends it directly to the database. This works for a small amount of data and is quite easy and fast to implement, but it is ultimately unmaintainable and consumes a lot of resources. Another alternative would be to use Apache Kafka, Storm, and ZooKeeper, or some alternatives to those. However, configuring and tuning those systems into a coherent whole takes time, and it's often non-Pythonic, although some work has been put into making it a lot more Pythonic by the people at Parse.ly. So kudos to them. Initially, when we had to implement such a system, we had to ship it, and that meant we implemented a simple, naive solution which handled streams of events and simply sent them to the database. To check for data consistency, though, to be sure we hadn't dropped any event that reached the server but was not inserted into the database, we checked the access logs of the web server and verified that they corresponded with the data in the database, because all the events you need are there in the access log. This led to a simple idea: why not use the access log as a simple queue for an event processing system? The idea was that when requests reach the machine, they are received by the nginx web server, and nginx solves the C10k problem, so that would solve a lot of problems. Then, between the access log and the database, there had to be another service which would take the data.
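The core idea, following the access log like tail -f and treating it as a queue, can be sketched roughly like this. This is a minimal illustration, not our actual implementation; the file path and the from_end switch are just for the example:

```python
import time

def follow(path, from_end=True):
    """Yield lines as they are appended to the file, like `tail -f`."""
    with open(path, "r") as f:
        if from_end:
            f.seek(0, 2)          # skip existing content, start at EOF
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.1)   # nothing new yet; back off briefly
                continue
            yield line

# Usage (illustrative): every line nginx appends becomes an event.
# for line in follow("/var/log/nginx/access.log"):
#     handle_event(line)
```

Every event that reaches the machine ends up as one line in this file, so the web server itself acts as the durable front of the queue.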
It would analyze it, transform it, and then push it into the database. So we began to think about the implementation of such a project, and whether it would be resilient and feasible enough for us to do it. After some prototypes and some new ideas, we came up with a clean structure and a service that we called LogBunker. Now, this is a data flow diagram showing a simplified schema of the data flow through a single virtual machine in the cloud. As I said, a single HTTP request is sent by the user's browser, through a load balancer, and then to the nginx web server. To ensure that the data was easily swappable between virtual machines, we used the Amazon Web Services EBS (Elastic Block Store) service to store the access log data. The access log is then read by our service, LogBunker, and upserted into the database. Inside the LogBunker service there are actually three processes working at the same time. Two of these processes are the parser and the upserter. The parser reads the access log continuously, just like tail in Unix, analyzes the events, and caches them into an internal cache structure. This internal cache structure uses the Python standard types; they're actually quite beautiful and very simple to use. After it has cached this data for a fixed period of seconds (configurable, of course), it takes the data and pushes it into a multiprocessing queue that makes the connection with the upserter. The upserter pops the queue, takes the data, and pushes it into the database. Besides pushing it into the database, the upserter also logs into a special file, which we call the binlog file, every event it has actually pushed into the database, together with the offset in the access log. That way, in the event of a crash, or when you want to reboot or restart the system, the service knows the last point that was inserted into the database and can restart from that point on. The third and final process is the admin process. This process periodically checks whether the other two processes are active; if they are not, it shuts down the whole system. Now, why did we do that? Why didn't we just reboot the whole service? Well, there is a tiny probability that whatever can crash the data processing service can also corrupt data. If you have data that does not reach the database but is somewhere in the system, that's bad; but if you insert corrupt data into the database, that's much worse. So we try to avoid that at all costs. Another function of the admin process is to fsync the data files: the access log and the binlog file. fsync is a function which synchronizes the in-memory state of a file with the actual on-disk content, which means the file is persisted after you make the fsync call. That is extremely important if you want data persistence. And the last important function is to serve status data on a configurable port. That was done with a single simple protocol: it just accepts every request that comes in and serves the JSON status data.
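A minimal sketch of such a status endpoint follows. The talk does not say which protocol the real service speaks, so plain HTTP and the fields in STATUS are assumptions for the example; the real service fills its status from the other processes:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative status payload; field names are invented for this sketch.
STATUS = {"parser_alive": True, "upserter_alive": True, "cached_events": 0}

class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every incoming request with the current status as JSON.
        body = json.dumps(STATUS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# Usage (illustrative): serve on a configurable port.
# HTTPServer(("", 8440), StatusHandler).serve_forever()
```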
This status data is collected from the other two processes through some shared virtual memory that is guarded by a multiprocessing lock, a simple read/write lock. So, one of the first issues we thought about was how stable the service would be while tailing the access log. It turns out it's very easy to implement and completely stable. But there is one problem: the offsets are hard to calculate efficiently, and like I said, we need those offsets to put them in the binlog file. Buffered text files are almost useless for this, because with buffered text files you cannot know the exact offset of a single line of text. Unbuffered text files are really slow and could not help us. So it's actually easier to open the file in byte mode: just keep a running count of bytes, add the number of bytes contained in each line, and there you go, you have the offset. Between the parser and the upserter process we have a multiprocessing queue. The queue uses a pipe in the background, a Unix pipe, and that pipe never corrupted data; we never had problems with it. But there is a problem with the data transfer speed. When you insert data into the queue, it actually spawns a thread, which gradually moves the data from its internal buffer into the pipe. If this thread does not get the GIL, you are going to have data sitting in the buffer that is not actually inserted into the pipe, so the connection between the parser and the upserter is broken. There are ways to minimize the damage, and I will talk about them in a later slide. So, how could a catastrophic crash be handled securely and efficiently?
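Going back to the offsets for a moment: reading the log in binary mode while tracking the byte offset of every line can be sketched like this. It is an illustration under the same idea as our parser, not the exact code:

```python
def lines_with_offsets(path):
    """Yield (offset, line) pairs, where offset is the byte position
    of the start of each line in the file."""
    offset = 0
    with open(path, "rb") as f:   # binary mode: offsets are exact bytes
        for line in f:
            yield offset, line
            offset += len(line)   # advance by the line's byte length

# On restart, the binlog gives the last persisted offset, and the
# reader can seek() straight to it instead of re-parsing everything.
```

With a buffered text-mode file you would have to call tell() around every readline, which either lies (buffering) or is slow (unbuffered); counting bytes yourself avoids both problems.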
If a catastrophic crash happens, there are two essential files which can be corrupted or incomplete: the access log and the binlog. You can manage that somewhat by fsyncing them as often as you can; then, if the machine restarts or crashes, you can recover, pick up the data that is there, and go on. Okay, so: is Python fast enough to ingest this data? Well, yes, actually, with some performance optimizations. CPython could get about 20,000 requests per second on a c4.xlarge Amazon Web Services virtual machine; that means you have this processing power on a machine with four virtual cores and eight gigabytes of RAM. However, after we put it into production, we re-implemented the data processing system in Cython. That was quite easy to do. It took just a week, excluding testing, without any prior Cython experience, and it doubled the performance. So Cython: way to go. There is another problem, though. Like I said, we call fsync on the two essential data files, and we use the network file storage offered by AWS. fsync does affect read performance on those files, and at some point we had a problem where, periodically, once every two days or so, the network lagged on Amazon Web Services, and the fsync call went from under 0.1 seconds to almost 25 seconds, which meant the service was blocked for that time. But it was actually pretty rare and did not affect the system too much. During one of the testing phases we observed a strange behavior with the upserter. The parser was reading continuously from the access log file.
It cached data happily, the cache was full, and yet the upserter was not as efficient as it would be under normal conditions. What happened was that the parser thread was starving the feeder thread, the thread whose purpose is to take the data and insert it into the pipe. The only way we have managed so far to avoid this problem is to force the parser thread to sleep periodically. It could not be fully fixed; it's just that sort of problem. Right, then: how easy is it to maintain such a system? A single machine that answers this amount of requests, like I said, 20,000 requests per second, contains only two essential processes: the nginx service, which takes the requests and just needs a configuration file, and the Python daemon, LogBunker, which also needs a single configuration file. Those machines are small, with only four cores each, and they are quite cheap commodity hardware. You can serve 12 million requests per minute with a server cluster of just 15 such virtual machines. That also means there is no single point of failure: when your machines are this small but you still have enough processing power, losing one machine does not hurt the system nearly as much as losing, say, a gigantic 32-core virtual machine with God knows how many gigabytes of RAM. Another thing to say about maintenance is that these machines are not driven to a hundred percent; they run at a maximum of 50 percent in a normal situation and, let's say, 75 percent at peaks. This reduces the possibility of low availability and hardware failure, because if you run a machine at a hundred percent, you will get hardware failures. And even if a peak reaches 100% of system capacity, you have an event queue, represented by the access log file. That access log file will just keep being appended to by the nginx web server. nginx will happily continue to serve the requests, and if the Python process cannot keep pace with nginx, it doesn't really have to. There will be a delay, say a couple of minutes, from the point when the data reaches the virtual machine to when it is inserted into the database, but it will not be critical. You will never lose data; you will just continue pushing it into the database once the peak is gone. The same thing applies to the database connection: the database can actually lag quite a lot. If at some point you have problems with the database, say you have a cluster of database servers and one or several of them crash, then you will have less write performance, and if the database connection lags, the upserter will just lag in getting data from the queue. What that means is that the data accumulates in the cache; the cache just keeps growing, and after the database is restored to its full power, the service continues inserting and catches back up. Okay, so that was it. Thank you for your attention. If there are questions, please.

Q: One question, actually probably two. The first one is: why, between your parser and your upserter, do you not use RabbitMQ or any message queuing, even ZeroMQ, which would solve the queue problem that you have in the multiprocessing connection between the parser and the upserter?
A: Well, firstly, because it was easier to implement, and because it did not affect performance as much as we thought. Even with that performance penalty we still reached 20,000 requests per second, and the limit of the nginx web server on an Amazon c4.xlarge instance is about 20,000 to 24,000 requests per second, so we did not reach that limit. Like I said, we kept our servers serving about 12,000 requests per second to keep the possibility of hardware failure low. We could have implemented a message queuing service; we thought about implementing something with Redis, and we may do that in the future, we don't know, but for now it's holding up quite well.

Q: Well, ZeroMQ is just five lines, and it solves your problem. That's it.

A: Oh, okay. Thanks.

Q: Hey, I just want to ask about this: you talked about how even if your LogBunker process crashes or doesn't work, you're not going to lose any logs, because they're stored in the access log, and you have a separate nginx on each virtual machine. If a virtual machine crashes or goes down, what happens to the log messages in the access log on that machine which have yet to be stored in the database? Are they gone?

A: They're not gone forever, because they are kept in the access log, and that access log lives on the EBS network partition. What that means is that, temporarily, while the virtual machine is down and you cannot access that EBS partition, you will have some data that does not reach the database. But if your support team is ready, they can reattach that EBS volume to another LogBunker instance, maybe, or to another tool, and just re-insert the data.

Q: Thanks. I knew I had a second question. Why are you not using things like Logstash, or an equivalent, to do your parsing?
A: Logstash between what?

Q: Basically, to parse your logs. What you do manually, by re-implementing what those utilities do, is pick up the information directly from the log; there are dedicated utilities for that precise job, many already existing solutions. I mean, I'm no expert in Logstash, but I've heard many things about logging and log parsing, and to me it seems that Logstash or something equivalent would be a very good solution to extract meaningful information from your logs, because it's structured information, and you can probably tell Logstash how to read the text.

A: So that would mean we'd have a cluster of servers with nginx on each of the machines, and...

Q: But it's the same thing. You read your access log somehow, and instead of re-implementing something in Python, you use a standard utility that is well maintained by professionals.

A: That would mean the data gets redirected to another cluster of servers which actually runs Logstash, right?

Q: Well, you can install Logstash, I guess, on the same machines where you run your LogBunker.

A: Yes, that is true. However, I'm not really sure what the performance of those services is. We could implement this really quickly, and it had really good performance, so we thought we could use it. Ultimately, I don't know how the performance would compare. Any other questions? One more question? Okay, thank you.