Welcome back. Our next session is on which messaging layer to use if you want to build a loosely coupled distributed Python app, by Mr. Maira Hari. He is a developer who loves building large-scale messaging and low-latency applications using Java and Python. He is going to talk about messaging layers and which one to use to build a loosely coupled distributed Python app. Thank you.
Thank you very much. I hope everybody is able to hear me clearly. My name is Hari, and you might have noticed that in that one single line there are a lot of big words: messaging, loosely coupled, distributed and all that. Don't worry about that. I would say that is the single heaviest piece of this presentation. After this it will be very light. It won't be very confusing or anything.
So briefly about myself. I am a Python enthusiast. I am learning web development in Python. While learning I have built a website which works, and I keep adding features to it and expanding my knowledge. On my job I work with messaging systems. So anything to do with server-side messaging, spreading work across a bunch of worker nodes and worker queues, is what interests me. Hence this talk.
This talk will run for about 25 to 30 minutes if everything goes according to plan. It will be mostly talk. I don't have any code examples here, but I will try to give as many examples as possible from my everyday work, something which you might find easy to understand, and at the end we will see if you have any questions on messaging or anything in Python which we use for messaging. So with that, let's get started.
So what is messaging? When I say messaging we could think of it in various ways, but the messaging which I am going to talk about now is between applications. All of us work in various software companies and we have a lot of internal systems. Some of them are large, monolithic systems which we built over a period of years.
There will be one big application which is doing everything. Or we will have a bunch of applications: we do some work in one system and pass it on to the other system in some way. So if you want all these applications to speak with each other, to communicate and collaborate in some way, then you need some sort of messaging in between. Similarly, if you have a big organization which has one big application, and it is not scaling up to the requirements and it's going very slow, then one of the ideas is to break it into pieces. Each piece does a manageable part of the work and runs as its own instance. Then you would find that these pieces interact with one another and work faster than what a single system can do.
So the first slide here is an example of an architecture where maybe there was some organization, maybe a web app outfit, maybe some sort of corporate entity, which had various in-house systems that had to communicate with each other. And everybody built a simple ad hoc system. The payroll system needs a list of all the employees. So HR said, we will just make a CSV file out of it and FTP the file to you. They did that. Each month 15 people join. They make the file again. They send it again. Something like that. Very rudimentary. Maybe another application requires something else. They built a file interface. Maybe somebody said we will build an HTTP interface. Each system has its own interface, exposed for a particular functionality. Everybody calls everybody else. So you have a lot of calls going across the systems, and it maybe sort of works, but it's a mess. Suddenly there is a problem. You don't know if it's because of the HR system calling payroll which is wrong, or payroll calling something else which is wrong. There is nothing which will help you find out what the hell is going on there.
So if you apply a little bit of logic to this and apply the principles of messaging, then you would end up with an architecture which looks something like this. Here we still have all those systems in the company, or all those parts of the application. They want to speak to one another, and they speak through a common message queue. You will have heard names like RabbitMQ, Redis, or any AMQP broker; all these are message brokers and message queues which are available. They have APIs in various languages including Python. Very simple to use, very simple to integrate into your applications. So if you build a proper messaging system, this is how it would look.
This is only one of the ways in which you can build a messaging system. This is the scenario where there are a large number of applications in your organization which want to speak with one another. That's the case which is being solved here. But I also mentioned another case, where you have an application which is not able to scale and you want to remove some work from the main processing thread, put it on the side, and work on it asynchronously. Think of uploading a video to YouTube. The moment the file is uploaded you get a message saying your video is being processed, please come back after some time. Then you come back after half an hour or so, your video is processed and it's available. That's asynchronous processing. The same thing happens with any large application.
So if you build something like that then it would look like this. You would have the main master application which was doing all this work, but it wants to offload the work so that it can be picked up by the workers. So it will create individual items. I upload a picture, you upload a picture, you upload a video.
Each of them, instead of being processed one after the other in the same application, is put on a separate queue, and then you have a bunch of worker processes which come and pick items up from there. They work on an item and finish it. Sometimes, once a worker is finished, it will have a bunch of results which it has to share back with the application. Or in the case of a website, let's say you upload a picture to Facebook: the picture gets uploaded, it gets appropriately resized or something, and then it has to be put on the feed for your friends. There is no result there; it will just find out who your friends are and put a task on a queue, that task will get picked up, and that particular friend will start seeing an item in his news feed saying so-and-so uploaded a picture. Then somebody can comment on or like your picture. So this is how your application would look if you built it properly on messaging. This can be done in any language, and we are going to talk about a few message brokers which you can use from Python.
Now before we do that, what was the main point of this? You have an application which is one big single block, or very highly coupled across all its various parts, and you want to build a loosely coupled app. Why do you want to do that? Like I mentioned, you either want to make it fast, or you want to make it easy to manage: you want to be able to add and remove parts of the application easily, for example to change how you process pictures in your web application without impacting the entire application.
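The master-and-workers arrangement described above can be sketched in a few lines. This is only an illustration, not code from the talk: it assumes the standard library's in-memory `queue.Queue` and threads as a stand-in for a real broker and separate worker processes, and the names `resize`, `worker` and `process_uploads` are hypothetical.

```python
import queue
import threading


def resize(item):
    # Hypothetical stand-in for real work such as resizing a picture.
    return f"thumbnail-of-{item}"


def worker(tasks, results):
    # Each worker keeps picking items off the shared queue until it
    # sees the None sentinel, which means there is no more work.
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        results.put((item, resize(item)))
        tasks.task_done()


def process_uploads(uploads, num_workers=3):
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    # The "master" only enqueues work; it does not process anything itself.
    for upload in uploads:
        tasks.put(upload)
    # One sentinel per worker so that every worker shuts down.
    for _ in threads:
        tasks.put(None)
    tasks.join()
    for t in threads:
        t.join()
    # Collect whatever the workers shared back with the application.
    collected = {}
    while not results.empty():
        name, thumbnail = results.get()
        collected[name] = thumbnail
    return collected
```

Calling `process_uploads(["a.jpg", "b.jpg"])` hands each upload to whichever worker is free. With a real broker the queue would live outside the process, so the workers could run on other machines.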
So you will have a bunch of reasons why you feel it will help your application. You might think that once you know all that, you just go and write some code and your application is ready. But once you have figured out that this is what you want to do, and before you go and write the code, you need to make a bunch of choices, because if you just pick some messaging library and write some code on top of it, it might not work for you. You might not get all the results you are expecting. In one of this morning's sessions, Anand was saying that they processed something with pure Python code and then they used a library, but it became 80 times slower than what was originally working, or maybe 80% slower than what they did in Python. So if you want your application to work faster, or to work properly, then you need to make a bunch of choices. Over the next five or six slides we will go through what you need to think about and how you make a choice, and once you have made the choice you can go and write your code.
So, the basics. Will the application benefit at all? Let's say I have a web app running in Flask or Django and the user uploads a picture. If, within the user session, I try to check that the picture is under 5 MB, resize it to thumbnail size, put the thumbnail somewhere, upload the picture to the back end and finish everything, then between the user hitting upload and getting back a response he will get bored and stop using your website. You will not be able to sell your website to people, you won't get users and all that. So that one is obvious. You should not process a large image in the same thread where the user session is being managed. You need to move it out. Maybe it's something else, like sending an email as soon as the user signs up. You need to send him an email with a link for activating his login.
Now that does not make sense to put in a separate thread; you could just send the email and finish. It will go in half a minute. So you need to think about your use case, whether it really benefits, and where to split it.
So now you have already made the choice that you want to split it up, and you need to think about where to split it. Which part can live as its own container? Which part has to stay part of the application? Maybe you made four parts of your application, and two of them need to access the same ID, the same object, same row, same table, whatever you call it. Then there is no point in making them separate. So that is the second point: what are the components and how are they related to each other?
Do we need more than one instance? If you are building a very popular website, some components of your website will require more processing, more throughput. People upload more pictures, maybe people send more messages to each other. So you would find that in order to make your application handle the extra load, you need to have multiple instances. If you are running a website that gets a million page views or more per day, obviously you won't be able to run with one instance of your web server. Similarly, if you have a component which is being hit more and you are separating it out into a completely independent process, then you need more instances of that.
And then sometimes when you split something into multiple pieces, you suddenly come up with new use cases which you didn't have before. Say I have an application which has one piece of logic dealing with photo uploads and another sending a message to the user once the photo is processed. While it was a single application, the first part would finish, make a function call to the second, and it would work. Nothing can go wrong. Both of them are running in the same process at the same time. No way it can miss.
But now I have two separate processes. The user uploaded the picture and you processed it. You put a message on the messaging system saying send him an email, but that message got lost. He never gets the email. He doesn't know if the photo is uploaded or not. He can come back and check, but that's not what you want. So now you have another use case: there should be something, maybe a monitoring script, which looks at this queue and sees whether there are pending messages, whether messages are being processed or not, whether messages are being lost and all that. So you get a new use case there.
So these are the basic things which you should think about first. This will help you draw that box diagram where you have A, B, C, D and lines between them, and you know that this is my flow. Yes, this looks good. That is something your architect will probably sign off on: yes, write the code for it now. So that's the first thing you need to do.
Then you have the patterns. Whenever you talk about messaging, there is one kind where you make a request, you put a message on a queue, and then it will get picked up. In the other kind you make a request, you put a message on a queue, but you wait until you get back a response. Request-reply. The first one was queuing: you just put it on a queue. The second one was request-reply. That is almost synchronous; the only advantage is that you are moving the processing out of your main thread. And the third one is publish-subscribe. You just publish a message to all the interested parties, and they subscribe. You don't have to worry about whether they got the message or not. Your job is to send "user logged in". All the other components will pick up "user logged in" and log him in to chat, to the news feed or whatever. So you need to identify which one suits your application. For most applications the queuing mechanism works, because if you have a transactional system, like you uploaded something, you want it to finish.
You don't want it to get lost. So you put it on a queue. That message will wait. A worker will pick it up, work on it, finish it. Now let's say you published instead, and nobody was there on the other end. Your message is lost. So most systems require a queuing system. Some go for request-response depending on how critical it is. I work in a bank. I cannot go for publish-subscribe for all the systems. Most of my work is based on queuing. For some I can do publish-subscribe.
Then you have messages of various kinds. Let's say you have uploaded 100,000 photos. Everything is sitting on a queue. One of the photos is corrupted. So your worker thread picked it up, tried to process it, and hung. It is not moving. One option is to restart the worker. But it will pick up the same photo again, so it will fail again. Now suppose you had some sort of control message: you put another message on the queue and you bounce the worker. The worker comes up, looks at the control message, which has a higher priority, and it says: I need to ignore this picture. Ignore that, continue with everything else. So you might need some sort of command-and-control messages alongside your own application messages. That's something which you need to worry about.
Then the basic things, like whether you should send messages in a particular format. You can send JSON, which can have any fields you want. Or if your message is very specific to your line of business, you might have an XML format, or, what do you call it, I think it's called electronic data interchange, EDI, something like that. EDI formats. You could send in that. Basic things.
And sometimes you'll have some components which talk a lot. One component sends a lot more messages into the system than the other components. So maybe you want to separate them out so that one of them does not flood the other, and both of them are able to work. So you need to identify those things.
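The corrupted-photo scenario above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: it assumes an in-memory `queue.PriorityQueue` in place of a real broker, and the message fields (`command`, `photo`) and the helper names `put` and `run_worker` are made up for the example.

```python
import itertools
import queue

CONTROL, NORMAL = 0, 1      # lower number is served first
_sequence = itertools.count()  # tie-breaker: keeps FIFO order within a priority


def put(q, priority, message):
    # The sequence number guarantees the dict payloads are never compared.
    q.put((priority, next(_sequence), message))


def run_worker(q):
    """Drain the queue, honouring control messages before application ones."""
    skip = set()
    processed = []
    while not q.empty():
        priority, _, message = q.get()
        if priority == CONTROL:
            # Hypothetical control command: ignore one specific photo.
            if message["command"] == "skip":
                skip.add(message["photo"])
            continue
        if message["photo"] in skip:
            continue  # the poisoned message is dropped instead of retried
        processed.append(message["photo"])
    return processed
```

Even though the control message is enqueued last, the restarted worker sees it first because of its priority, so it never touches the corrupted photo again.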
Latency, routing and priority. I've been working in this for many years and not even once have I heard anyone say, I've published a message but I don't care what happens to it. Everybody says we need that message. So once a message is published, it has to get to the other end. Whether it gets there in 5 minutes, or 15 minutes, or 50 milliseconds, that is the problem you are trying to solve. So once you've taken an application and split it into components A and B, you need to worry about how fast messages are getting from A to B. If A is the publisher, you need to see whether B is able to keep up with it or not.
Sometimes you get a message saying such-and-such event happened, but it is valid only for one hour. After one hour there is no point in processing it. For example, you get a message saying user logged in, but you have not been able to process it for half an hour, and the user was logged in for 5 minutes. What's the point of processing that message? So you might need to add something like: if I send a message, it is valid only for the next 15 minutes. After that, if you get the message, please don't worry about processing it. Ignore it.
And like I was saying in the previous case, some messages require the highest priority. These are the control messages. You have to tell your workers, please stop, I am going to perform a software update and restart all the workers. So you send a message. The worker sees the message and will not pick up anything more from the queue. You finish your software update, bounce all the workers, and they start working again. Those kinds of things. These are use cases which you see only when you have multiple moving parts in an application. With one big application you will never see this, because it's very simple. Not many problems there.
Build, manage, and support. So all of us are programmers.
We write code, we put it into production and we let users use it. But then somebody has to support it. Some of us do support work ourselves. Some of us have a bunch of people doing support work. And if the application is so bad that it fails every few hours, then the support people will call us and shout at us. They won't let you sleep at night. So you have to build an application which is easy to manage. It should have very good monitoring. It should be easy to provision and manage. Let's say one of the components fails. It should be very simple to bring it up again, to restart it. There should be no problem like: before restarting it was working on something, and now it has been restarted but it has no clue what to do. It's just waiting there. What do I do now? It should not behave like that. There should be guaranteed message delivery. This is applicable to transactional systems.
Crash recovery: like I was saying, let's say you are processing something and you fail in between. When you start back up, you should be able to pick up from where you left off. I was saying that writing code for a messaging system is easy, but you need to make the important decisions properly. So you were processing something and you failed. Now you need to restart. What do you do? You need to have recorded that this component was processing this particular message. Some of the messaging systems which you will see provide a means of sending an acknowledgement back to the queue. Until you make the acknowledgement, the message stays on the queue. So even if you fail, restart and come back, you will find the message there. But if that is not the case, your code has to handle this. In Python you can do it in various ways. To quote a session from this morning again, Anand was saying that by caching the page they were able to do it faster, writing it into a file. Something like that. You could just write it into a file.
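For a messaging layer without acknowledgements, the "write it into a file" idea above might look like the sketch below. It is one possible approach under stated assumptions, not a method from the talk: the `InFlightJournal` class, the JSON format and the file name are all hypothetical, and it tracks a single in-progress message per worker.

```python
import json
import os
import tempfile


class InFlightJournal:
    """Remembers which message a worker is processing, across restarts."""

    def __init__(self, path):
        self.path = path

    def start(self, message):
        # Write to a temp file and rename it, so a crash in the middle
        # never leaves a half-written record behind.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(message, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)

    def done(self):
        # Called only after the message has been fully processed.
        if os.path.exists(self.path):
            os.remove(self.path)

    def pending(self):
        # On start-up: return the message that was in progress, if any.
        if not os.path.exists(self.path):
            return None
        with open(self.path) as f:
            return json.load(f)


# Usage: record the message, "crash", then recover from a fresh instance.
workdir = tempfile.mkdtemp()
journal = InFlightJournal(os.path.join(workdir, "inflight.json"))
journal.start({"id": 42, "photo": "a.jpg"})
recovered = InFlightJournal(journal.path).pending()
```

A broker with acknowledgements makes this unnecessary, because the unacknowledged message simply stays on the queue.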
And the same thing applies to the last point, which is: in case of issues, how do you clean up and recover from wherever you were? So these are the things. Of the last few slides, the basics and the patterns are about how you write the code, and this one is about whether you suffer for your code or not. If you write your code properly and handle things like this, you will have fewer issues and you can concentrate on the core business of building a user base, making money, whatever it is.
So what are the options you have? In Python, these are the options which I find most useful. In my job I use a lot of other things which I can't use with Python, but these are the ones I have used in Python.
RabbitMQ. This is a full-fledged AMQP message broker. As in, there is a centralized message broker which is responsible for routing messages from the publishers to the subscribers and providing guarantees in between. It follows the AMQP model, so you have exchanges, fan-in, fan-out, routing, queues and everything. A full-fledged messaging system. So if you need all the guarantees, you don't want to write your own code to do all these things, and you want something which is rock solid and which you can trust, you should go for something like this.
The next one is ZeroMQ. This one is similar to RabbitMQ. I think it is something which was forked off from AMQP, around version 0.8 or 0.9; I don't remember exactly. It does not have any message broker. Each client starts an instance of ZeroMQ and publishes on that. It's like a raw socket connection with some extra functionality on top. So you would not have a broker. The publisher would instantiate ZeroMQ and publish. A subscriber would instantiate ZeroMQ and subscribe, and they would work without a central coordinating mechanism. This is also good.
It also provides all the messaging semantics like queues, pub-sub, request-reply, everything.
The next one is Redis. Redis is not really a messaging system at all. It's just a key-value store. I mentioned a queue. What is a queue? It's just putting something in a list. Within the Python standard library, if you're using the multiprocessing or threading libraries, there's a class called Queue which works in memory, and you can pass messages between two threads using that. One thread takes a queue and puts a message on it. It will appear in the other thread, which will process it. If you think about that on a bigger scale, in Redis you can create a list which acts like a queue. One process goes and puts a message in it. The other process pulls it and picks it up. That also works. Redis actually has a pub-sub mechanism as well. It internally simulates publish-subscribe in a way, so you could publish messages onto that and they will be available. So your publisher puts messages into Redis, they get stored as Redis key-value pairs, Redis crashes or you restart it, the data is retained based on how you configure it, and then the subscriber comes up and finds the message. That also works.
Now these three are the ones which have some form of messaging built into them. The next one says you can use an RDBMS for messaging. That's possible because you can create a simple model which contains a message ID and a message, and you put your messages in that. Message ID, message, processed yes or no. You put messages in with simple insert statements: the publisher only inserts, and the subscriber does a select from messages where processed is equal to no. It will get a bunch of messages, work through them, mark them as processed, yes, finished, and go and run the SQL again. That also works fine, but you would have to write a lot of code yourself. There are a few libraries for this: if you have ever used Celery, it has an underlying messaging library which actually supports this. So you can use that.
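The table-as-queue idea described above (message ID, message, processed yes or no) can be sketched with the standard library's `sqlite3` module. This is a minimal illustration that assumes a single subscriber; the table layout and the function names `create_queue`, `publish` and `consume` are made up for the example, and a multi-consumer setup would need row locking or a claim column on top of this.

```python
import sqlite3


def create_queue(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "body TEXT NOT NULL, "
        "processed INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()


def publish(conn, body):
    # The publisher only ever inserts.
    with conn:
        conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))


def consume(conn, handler):
    """Process every unprocessed message in order; return how many were handled."""
    rows = conn.execute(
        "SELECT id, body FROM messages WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for message_id, body in rows:
        handler(body)
        # Mark as processed only after the handler has finished, so a
        # crash inside the handler leaves the message for the next run.
        with conn:
            conn.execute(
                "UPDATE messages SET processed = 1 WHERE id = ?", (message_id,)
            )
    return len(rows)
```

The subscriber would call `consume` in a loop, which corresponds to "go and run the SQL again" in the description above.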
That's a separate project now, so you can use it.
The last one is RPC, remote procedure call. This is when you say, I don't trust any of these, I will build my own messaging system. You build your own message broker, your own means of queuing and everything. If you are so confident that that's the only way to solve your problem, you can. We have built this in a few cases, but we have never used it for critical things, because we don't trust that things will work. For all the other options, if something goes wrong, there will be a community, documentation, help, support and everything. But if you build something on your own and it fails in a big critical situation, that's not a very good position to be in. So you need to watch out for that. These are the few options which I have used, and they work well. I haven't had any major issues with them.
Now, future growth. There is a little story I want to tell about that picture of the Tokyo Metro. I used to live there a few years ago, and that line is the biggest and most popular line. Lots of people get onto it to go from one end of Tokyo to the other very fast. Every day the line gets so full near midnight that you cannot get into a train, and people push you in. You get in the train because people have pushed you, you stand like that, you get to your station, people push you out, you get out. Now, the reason I put it there is that each compartment, which is meant for let's say 100 people, will accommodate 150-plus people for one or two hours every day. So it's able to handle that load and then come back without any accident. So when you're building a system, one of the things you should consider is future growth. You have 100 million messages today. What if it becomes 150 million tomorrow because you have the most popular website on the internet? You need to worry about it. What if there is a sudden spike of 15%? Will you be able to handle it?
What if your business grows very well and, until you can buy a new server and put your stuff there, you need to handle this load for 24 hours? Can your code handle that? Or rather, when you write your code, what can you put in there so that you can handle this load and come out safe? Obviously that comes with other things like storage cost: where do you put all the messages, where do you put your system logs and everything.
And last but not least, let's say you started with RabbitMQ and RabbitMQ is not working for you. You need to move to something else which is also an AMQP broker. How easy is it? If you have coded everything according to the API, according to the spec, and it's just a matter of replacing a config URL for RabbitMQ with a config URL for something else, it should work. Those kinds of things. So those are the things which you would need to worry about.
And once you have considered all these things, then you can go and write some code. Now you've answered all the questions which you would face once your application goes live in production. If you write your code considering all of this, the chances are that you will get the performance you're looking for. The system will be stable and it will work without causing any problems. So that's all I had. Any questions?
I have two questions. Does the broker always keep open connections with the publisher and subscriber, or does it open a new connection with the subscriber whenever a message comes in?
It will always have an open connection. If you have a broker, the clients are always connected to the broker. There's an open socket.
Is it the same with ZeroMQ as well?
Yes.
But you said ZeroMQ will open a new raw socket.
ZeroMQ opens a socket and starts publishing. So let's say you have the publisher. In ZeroMQ the logic is that clients can come up before the publisher. Clients have opened a socket and are waiting, listening. There's no publisher. Then the publisher can come up and publish messages.
But if the publisher comes up first and the client comes up afterwards, then it will not get the messages published before it connected, because at that point it was not connected. Now if you use something like RabbitMQ, because of the broker, you can configure the broker in such a way that the message will be stored on the queue. With ZeroMQ that will not be there. So the connection has to be open in order to get the message.
Again, the subscriber will have a worker which I think is a thread pool.
Yes.
So will all the subscribers be using a single thread pool? Or will each one have its own pool that will be assigned a task?
Each subscriber works like this. If you use a callback, each subscriber will start a thread and it will have a callback function. It can be a single thread or multiple threads in a pool like you said, and it will pick a message and invoke the callback. If all the subscribers are in the same process, in the same program, then they might be sharing the pool. But if each worker is a separate process then obviously they have their own pools.
Hello. So my question is about the other messaging systems that you talked about. If I have put my messages on the queue and there are tens of thousands of messages and something goes bad, maybe the server crashes, is there persistence of the messages? You mentioned Redis has persistence. What about the other protocols?
RabbitMQ is a proper message broker, so you have options in there to configure persistence. You could say that each time a message is received it has to be written to a file, and only then will the subscriber get the message. Or you could say send the message but write it to a file asynchronously. That is possible in RabbitMQ. It's possible for Redis because, although it's a cache, you can configure persistence to disk for recovery. With ZeroMQ that is not possible; you will have to write that logic on your own.
Similarly, if you do it using some form of RDBMS, because it's all transactional, it will be there.
I have a couple of questions. With ZeroMQ there is no message broker in between, right? So how do the client and server discover each other? The client has to discover the server.
ZeroMQ, as far as I remember, will discover using broadcast messages. When you start the publisher you start it on a port, and then you start the subscriber using that same port, so it will ping messages between them and then figure it out. I don't know the exact logic right now.
Does it use multicasting or something?
Yeah, it uses a bit of multicasting. I think it's a mix of UDP and TCP. The exact handshake I don't remember right now.
The other question is about Redis. You said it also supports the publish-subscribe mechanism. In that case does it use the message broker style?
Redis publish-subscribe is, you could say, similar to a broker. It's a similar implementation, but you won't find things like, for example, RabbitMQ's concept of an exchange, where you can say any message coming into this exchange has to fan out to 4 or 5 receivers. That's not possible with something like Redis. You just say this is a simple subject and I am publishing on that. If I have 4 subscribers listening on it, they will get the message; otherwise it will be lost.
Any questions?
I have a question on Redis. Does it support a transactional mode? Say I have a message and I want to process it, but it might take some time for me to process it. What I want is to keep it inside Redis and not remove it, but if another consumer comes up I don't want it served to that consumer either until I say it's complete. Something like that is possible in RabbitMQ, I guess, but is it possible with Redis?
No.
In Redis you have a way of popping a message completely from the queue, or you can do a peek. I think what you are trying to do is something like a peek, but when you do a peek there is no way of letting the other application know. If you want something like this, you could build another element, like a last-peeked message ID or something: when you peek a message you update this value, and any client which wants to peek will first go there, look at it, find the next message and then pick it up. So you will have to build that guarantee in yourself.
So what would you say is your favorite messaging system, and what would you suggest somebody use in production?
Okay. Production. From this list, if you wanted to do something locally on your laptop, I would suggest you start with Redis, unless you have a very strong requirement for things like an exchange, routing and all those things; but even then there are libraries which simulate AMQP on top of it. If you look at the website for Celery, I forget the name of the underlying library, but there is a messaging layer in it, built by the same author, that provides some of those things. And for production you have to use something like RabbitMQ, because there you need message guarantees. But if your production data is something like logs, if you are just monitoring your website traffic and doing some sort of big data thing, and you don't mind losing one log entry here and there, then you can use maybe Redis, maybe Apache Kafka or anything for those things.
Some people are talking about it and it's not on the list, so what does it give?
Apache Kafka I have not used, that's why it is not on this list. But as far as I know it's like a distributed log. When you insert something into an RDBMS, it writes a transaction log in the back end. Apache Kafka is something like that, but spread over many servers. So if you have a large amount of information, like the website traffic I was talking about, then in that case it makes sense to use Apache Kafka, because it goes into a file, so if you want to you can open the file in some other application and work with it.
All right, thank you very much.
Thank you all. Thanks to all of you for such a wonderful session.