Thank you for coming. My name is Braulio, and I work for ActBlue Technical Services. I would like to start with a question: how many of you have used ActBlue to make a donation? Wow, quite a few. How many of you tipped? Not so many, but for the ones who tipped, thank you. I stole the title of this presentation from Bernie Sanders, who was saying that we need a political revolution. Sanders made small-dollar donations popular, but he's only one of the more than 17,000 organizations that have used the service over the last 13 years. And it's not only political: we also provide the service for nonprofits. In fact, ActBlue is itself a nonprofit. In the first quarter of this year alone, we had 3,000 organizations using our platform. And our Rails application is 12 years old.

So how does it work? Let's say that Jason, in the back, who works here, wants to run for city council. I'm sure he would be a good city councilor, but he needs money to promote his campaign, so he would like to get donations. Of course, he cannot process credit cards by himself. So what he will do is go to ActBlue and set up a page, and from that point on, we'll process the credit cards coming through that page. Once a week, he will get a check. We also take care of the legal part, which is very complicated, and we do the compliance: there are multiple reports that have to be sent when you are doing political or nonprofit fundraising. We also provide additional tools for the campaigns: statistics, A/B tests. Someone who donates can also save their card information on the website, and the next time they donate, to the same organization or a different one, they don't have to enter anything; it will be a single-click donation. We have 3.8 million ActBlue Express users. So far, we have raised $1.6 billion across 31 million contributions in these 13 years. We like to see ourselves as empowering small-dollar donors.
How many of you don't know what Citizens United is? Ah, all of you know, okay. Well, in case someone who didn't raise their hand doesn't know: Citizens United is a ruling by the Supreme Court that allows unlimited amounts of money to be spent promoting a political candidate. So a few people with lots of money can have a lot of power in the political process. We also serve nonprofits, but the political side is how we started. We would like to have lots of small dollars, which means a lot of people with little money having the same power.

This is how the contribution page looks. This one is for Jon Ossoff. The person here has never visited the website before, so it's a multi-step process: it gets the amount first, and in the next step we get the name, address, credit card, et cetera. This next one, by the way, is a nonprofit, and in this case the donor is an ActBlue Express user. It says, "Hi, Braulio," so it recognized that it's me, and if I click any of those buttons, the donation will be processed right away; I don't have to enter a card number. That's what we call single click.

In February of last year, Bernie won the primary in New Hampshire. That was the second state with a primary election, after Iowa, and he won big. He gave a victory speech that night, and I'm going to show a little clip from it: "Right here, right now, across America. My request is, please go to berniesanders.com and contribute." Right away, we felt the Bern. The first graph is requests per minute: about 330,000 per minute. The second one is contributions, credit card payments: about 40 per second at the peak. That's a lot, because as you'll see, credit card payments are expensive to process. The reason I show these graphs is to stress the fact that improving performance is a continuous process; it's not something that happens from one day to the next. We were able to handle this spike pretty well.
Some donors didn't see the thank-you page; they only saw the spinner, and after donating they never got to the next page. But we never stopped receiving contributions, and you can see there is no gap, so the service was never down. And that's because for all of these 13 years, every time we have had high traffic for any reason, we have been analyzing it: why do we have that? Is there a bottleneck? Can we improve it? So this presentation is about all the experience we have gathered during these years.

The first thing we have to do is define what we're going to optimize, and that will depend on the business; in every case it will be different. For an e-commerce website, for example, it's very likely to be the response time when browsing the catalog. In our case it's very simple: it's a contribution form, and we have to optimize two things. One is how we load it, and the other is how we process it. Loading the contribution form is no secret; it's what you would expect from loading a form, very simple. But processing is a little different. In the center I have our servers, and around them are all the web service calls I have to make in order to execute a payment. We have a vault for the credit card numbers, outside; the vault is the only place where we have the numbers. Outside of the vault, it's all tokens. So the first thing we have to do is get a token; we have to tokenize the card. That's number one: a POST and a response. Then I have a fraud score, with an external service that provides it for me. Then, with the bank, I actually have two steps. All credit cards are processed this way: you first make what is called an authorization, which is again a POST, and the bank will respond with whether it's approved — in that case it gives me an authorization number — or it will say declined. But there is no money transferred at that point.
I have to do a second step, another POST, sending the number I got (if it was approved, of course), and the bank will respond with a confirmation. There is also an email receipt I want to send, and most organizations want to know right away when they get a contribution, so they also want to be informed. The whole thing looks like this. Do you remember those? How many can you do in a minute? There are too many young people here who have never seen this; only the old people remember. Okay, so this is high-volume traffic. Because we have so many donations, we have a scaling challenge: we have to be able to process them fast and efficiently. What I'm going to do now is present one approach at a time — I will show several. For each one I will explain how it works and how it is implemented, maybe with some code, and there will always be a cost, and I'll tell you how to deal with it.

The first one: metrics. Can you see the graph, or no? I thought that might be the case. How about that? Well, we have dozens of these; I'm going to show only a few, the most important ones. This is contributions per minute: on the X axis is the time, on the Y axis is the number, and something happened there. What happened there was that Bernie won Indiana, and there was a spike — we call this a Bernie moment, by the way. Now I have another one here, which is traffic. When you have metrics, numbers are not enough; you need graphs like this, and with graphs you can correlate. Between traffic and contributions there is a correlation. But this next one is the number of contributions currently being processed: because we have so many web services to touch, there will always be a certain number that are in the middle of that process, so I'm counting those.
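The flow described above — tokenize, score for fraud, authorize, settle, then notify — could be sketched roughly like this. Everything here is an illustrative stand-in, not ActBlue's actual code: each method represents one of the external web service calls.

```ruby
# Toy sketch of the payment pipeline: every method stands in for an
# external web service call described in the talk. Names are hypothetical.
class PaymentPipeline
  attr_reader :log

  def initialize
    @log = []
  end

  def tokenize(card_number)
    @log << :tokenize        # POST to the vault; only the vault ever sees the number
    "tok_#{card_number.hash.abs % 1000}"
  end

  def fraud_score(token)
    @log << :fraud_score     # external fraud-scoring service
    0.1
  end

  def authorize(token)
    @log << :authorize       # bank approves (auth number) or declines; no money moves yet
    "auth_42"
  end

  def settle(auth_number)
    @log << :settle          # second POST to the bank actually transfers the money
    :confirmed
  end

  def notify(auth_number)
    @log << :notify          # email receipt, plus informing the organization
  end

  def process(card_number)
    token = tokenize(card_number)
    raise "suspicious" if fraud_score(token) > 0.9
    auth = authorize(token)
    settle(auth)
    notify(auth)
  end
end
```

In the real system, as the talk explains later, the settlement and notification steps are deferred to background workers rather than run inline like this.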
And if you look at these two, the contributions and the pending, there is no correlation, which is great; that's how it should be. If for some reason pending were going up the same way contributions are going up, it would mean the service is saturated: I cannot process them as fast as I receive them. In this case it's wonderful. That's how it should always be — sometimes it wasn't, but that's the goal. The last one is latency, which is a time interval: how long it takes between when I create a contribution and when I receive an authorization from the bank. That's also an important number. In this case it's about two to three seconds.

This is how I do metrics in Ruby. There is a gem called statsd-ruby. I take the Statsd class and create a new object, passing the host name — I will have multiple hosts, so I need to know where this is happening. The second instruction is a gauge. The gauge generates a data point, which is an integer: in this case, how many pending authorizations I have. And then there's the timing method; both gauge and timing are StatsD methods. I have a time interval which, as I mentioned before, is the distance between when the contribution was created and when it was approved. Very simple, but if you have lots of these, you will be able to build those graphs. To render the graphs, there is something called Graphite, and there are all sorts of other tools. You also want to measure CPU, memory, and disk; there is something called collectd that has plugins to gather that information easily. I mentioned logs: they are not really metrics, but they are very important, don't forget them.

Good, we covered the first one. Now, multiple servers. If you start with one host, which is normally the case, even the fastest computer in the world won't be able to handle all the load; you will have to add a second, a third, et cetera. This graph shows, on the right, three machines running.
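What a StatsD client like the statsd-ruby gem does is very small: it formats a metric and fires a UDP packet at the collector, without ever blocking the app. A minimal stdlib-only sketch (metric names and the port are illustrative; 8125 is the conventional StatsD port):

```ruby
require "socket"

# Minimal StatsD-style client, standard library only, to show what gems
# like statsd-ruby do under the hood.
class TinyStatsd
  def initialize(host = "localhost", port = 8125)
    @host = host
    @port = port
    @socket = UDPSocket.new
  end

  # A gauge is a single integer data point, e.g. pending authorizations.
  def gauge(name, value)
    send_packet("#{name}:#{value}|g")
  end

  # A timing is an interval in milliseconds, e.g. created-to-approved latency.
  def timing(name, ms)
    send_packet("#{name}:#{ms}|ms")
  end

  private

  def send_packet(data)
    @socket.send(data, 0, @host, @port)  # fire-and-forget UDP; never blocks the app
    data
  end
end

statsd = TinyStatsd.new("localhost")
statsd.gauge("authorizations.pending", 42)
statsd.timing("authorizations.latency", 2500)
```

With every host emitting packets like these, a collector plus a renderer such as Graphite turns them into the graphs shown in the slides.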
Inside each machine I have little circles representing threads, so I can have a web server running in each thread. Multi-threading is simple, but the important part here is that I have different computers, and because of that I need this piece in the middle: a load balancer. The load balancer can be a piece of software or hardware — well, in the end everything is software, but don't get too technical with me. The DNS will resolve to the IP address of the load balancer, the request will get there, and the load balancer will pick one host and pass the request to it. How you implement this will depend on the hosting company. What I have here is called the poor man's load balancer, because it's free: it's done with NGINX. I can configure NGINX to do load balancing, and you see two blocks. The first block defines the IP addresses of the three hosts. The second block, the server block, says that I will be listening on port 80 and all requests should be passed to the backend block. There is an algorithm that defines how hosts are picked — sequential, random, round robin; you can define it if you want, but in this case it doesn't matter.

There are costs involved when you do this. The first one: if you have used Heroku, the first surprise when you start is that there is no persistent file system, and this is why. You upload a file in the browser, and it lands on one host. Later, a different request goes to a different computer, you try to see the file on that computer, and it's not there. So you need to provide a mechanism to fix that. One way is to use Amazon Web Services S3, which is basically shared disk. Another option is something called sticky sessions, where the load balancer always picks the same host for a given user and sends those requests there — but that's really for the second problem; I got a little ahead of myself.
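The two-block NGINX configuration described above looks roughly like this (the addresses are made up; round robin is NGINX's default algorithm):

```nginx
# Hypothetical "poor man's load balancer": three backend hosts behind NGINX.
upstream backend {
    server 10.0.0.1;
    server 10.0.0.2;
    server 10.0.0.3;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;   # every request is handed to one of the hosts above
    }
}
```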
I'm talking about persistence — you can also replicate the files; you can do that. Then there's shared state: the hosts don't share memory, and I could use the sticky sessions I was talking about, but Rails is very good here. Out of the box you have Rails sessions; that's how you share state. You can also use Redis if you want your own data store. The third problem you're going to have: because you have more servers running, all of them connected to the same database, you're going to start running out of connections — the database has a limit on how many clients can connect. What we do is, in Postgres, you can easily define replication, which means there will be copies of the database. The data on those databases will be a little behind, but not too much. They're read-only, but I can still use them, and if there is a host that doesn't need read-write access, that one doesn't have to connect to the main database; it connects to the replica. The last one: it doesn't matter if all the hosts are up, and it doesn't matter how many I have — if the load balancer is down, I have a problem. In our case, the solution is a combination of our CDN — I will explain what the CDN is — and the load balancer provided by the hosting company.

Good, we did two. Next one: caching. Caching is the most popular one. Every time you hear about performance, you will hear, hey, you have to do caching. Caching is basically making a copy — keeping a copy somewhere to save time. There is a cache in the browser, there is a cache in the web server, and there will be caches in between as well. And if you have money, you can hire a caching service and have something up and running very quickly; that's why I rate the effort as high, but viable.
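The read-replica idea above boils down to routing: writes must go to the primary, while reads can go to a slightly stale replica. A toy illustration of that routing decision (names and the SELECT heuristic are hypothetical; real frameworks decide per request or per model, not by parsing SQL):

```ruby
# Toy read/write splitter: reads may hit any replica, writes hit the primary.
class ConnectionRouter
  def initialize(primary:, replicas:)
    @primary = primary
    @replicas = replicas
  end

  def connection_for(query)
    if query =~ /\A\s*select/i
      @replicas.sample   # replicas are read-only and may lag a little
    else
      @primary           # anything that writes must see the source of truth
    end
  end
end
```

The payoff is connection pressure: hosts that only ever read never consume a connection slot on the primary.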
We use something called Fastly, which is a content delivery network. There are several: Akamai is another very popular one, and Cloudflare. I will explain how it works in the next slide, but this is something that works very well, and the loading part of the form is the part that gets all the benefit. One time we had a distributed denial-of-service attack, and we handled it well only because the CDN was there for us; we couldn't have handled it with our own servers.

This is how it works. I have a browser in Boston, and the boxes on the right are POPs, points of presence, which belong to the CDN, not to ActBlue. I have a POP in New England, so the browser in Boston — if you follow the numbers, you follow the sequence — will make a GET. Because this is the first time this document is requested, the POP will make another GET to our own server, the ActBlue host, to get the document. We respond; that's number three. We're adding two headers there, which I'll explain in a moment, and the POP will in turn respond to the browser with the document. There is another header in that last response. Later, someone in Providence — a city less than 100 miles from Boston — will also go to the same POP and make a GET. But because the POP has the copy, it won't request it from us, so we never see a second GET; the POP responds with the copy it has. There are other POPs distributed all over the world. For example, I'm showing one on the West Coast, so if someone in LA is browsing ActBlue, they will go there and that POP will have its own copy. This is a map where the red dots represent POPs, from the Fastly dashboard, and the size represents how many hits I have. A hit means the POP has the copy — a cache hit. The biggest are in the US, but there are also red dots in Europe, Asia, and Australia.
The gauge on the left indicates a 97% hit rate, which means only 3% of requests get to my server. Keep in mind that all requests go to the CDN first; 97% of them stay there and never touch my web server, which is great.

How do you control the cache? You need to control two things: how long the copy will live in the cache, and how you purge — purge means you force a refresh. You specify how long the copy will live, but sometimes you want to refresh right away; you don't want to wait the whole time. You do this with headers, and I'm showing a few here. Cache-Control is the first one, and the most popular. Here it says max-age=400 — I will put the slides online, by the way, so you don't have to worry about writing this down. The 400 seconds apply everywhere: the copy can live 400 seconds in the browser, in anything in between, and in the CDN as well. Surrogate-Control is longer here, 3,600 seconds; it's like Cache-Control, but only for the CDN. There is a specification for how CDNs work, called the Edge Architecture specification, and that's where this header is defined. Vary is another one. Surrogate-Key: for each document I can define something like a tag, which is great, because at some point I might say, hey, I want to force a refresh on all these pages — if all the pages have the same key, I can do it in a single call.

There is another thing called Varnish Configuration Language, VCL, which is a script. The script has access to the whole request, including the URL. I want to show something: the VCL script runs in the place where the POP is getting the request from the browser, and it also runs where the POP is getting the response from the host. And this is an example of what I can do with VCL: I can check the URL — in this case, whether it starts with /videos.
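Put together, the headers discussed above look roughly like this (the values mirror the slide; the key name is illustrative). A minimal Ruby sketch of the response headers a cached page might carry:

```ruby
# Sketch of the cache-related response headers from the talk.
# Values are illustrative, matching the slide's examples.
def cache_headers
  {
    # Browsers and any cache in between may keep the copy for ~400 seconds.
    "Cache-Control"     => "public, max-age=400",
    # CDN-only TTL, defined by the Edge Architecture specification;
    # the CDN strips it before the browser sees the response.
    "Surrogate-Control" => "max-age=3600",
    # Cache a separate copy per encoding.
    "Vary"              => "Accept-Encoding",
    # Tag: lets you purge every page sharing this key in a single API call.
    "Surrogate-Key"     => "contribution-pages"
  }
end
```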
In that case, I would like to use a specific backend for videos, and that's the name of the backend — fvideo, a load balancer. This is what I'm trying to avoid: I don't want to mix videos and contributions. I can also respond right away; I can say, hey, return 400, for example, without touching the servers. There is also an API, and with the API you can purge one page or all of them.

The costs here: first, if you want fine control over how the copies are kept and purged, it gets complicated. Second, if you are doing SSL — in our case we always do SSL — it's complicated, because all the POPs need to have a copy of the certificate and also the private keys, and I need to maintain that. The other thing: if you remember from the earlier slide, it says, "Hi, Braulio." I would bet that checkout pages on regular websites are not cached, because of this personalization — if you cache this page and my neighbor sees "Hi, Braulio," it doesn't work. So I need to handle that. In our case, we cannot follow the don't-cache approach; we have to cache, because it's our most important form. So we use JavaScript: we cache everything except those little pieces, and they are filled in with JavaScript.

Great, we have covered three, and we're on time. Separation of concerns, also known as SOA or microservices. It's a very simple idea: I have different applications to handle different parts of the system. The first example here is the tokenizer — the vault — which is an application written in Node, completely separate; it's even at a different hosting company. I can also have multiple copies of the database that way. And one of the advantages of separation of concerns is compliance: the fact that we have this vault means that if I don't have access to card numbers, I don't need to comply — I don't need to have an antivirus on my laptop.
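The kind of routing described above might look roughly like this in VCL. This is a sketch, not ActBlue's actual configuration: the backend name and URL patterns are hypothetical, and the exact syntax for naming backends varies by CDN.

```vcl
# Sketch of edge routing: run at the POP, before the request reaches origin.
sub vcl_recv {
  if (req.url ~ "^/videos") {
    set req.backend = F_video;      # hypothetical dedicated backend for video traffic,
                                    # so videos never compete with contribution pages
  }
  if (req.url ~ "^/blocked") {
    error 400 "Bad request";        # respond at the edge, never touching our servers
  }
}
```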
So that's why I have never seen a credit card number: because I don't want to have an antivirus on my laptop. The cost, as anyone who has done microservices or SOA knows, is that it's very difficult to implement and very difficult to test.

Great — one, two, three, four; we have covered four. Now: deferred tasks. In our case, we're going to talk about tasks that are slow. I don't want the web server to be doing something slow, because that would hold it for a long time, and that server wouldn't be able to handle other requests. If you remember slide number nine, all of those external calls are slow. So let's say I'm talking to the bank: that shouldn't be done by a regular web server, because it will take several seconds. What I do instead is save that job for later, and if I want to do it later, I need to save it somewhere — so I need a queue. In our case, almost everything becomes a deferred task. An extra benefit of doing this is isolation. If the bank is down, for example, I cannot process authorizations or settlements, but because it's a deferred task, it doesn't matter: customers will still be able to donate. They won't know yet whether the payment is approved, but they will get the thank-you page, and they might even get an email saying thank you for the donation; we'll tell them later whether it was approved. The other advantage is increased reliability: if for some reason something failed during the authorization, for example, I have all the information saved and I can re-run it. The third item, the batch system, I put there because it was a big gain for us. We decided that the contribution was going to be considered paid at the authorization point — even though we don't have the money yet, because we haven't done the settlement, we consider it paid right away. If we do that, we can do the settlement deferred, always, and we can do it in batches.
Instead of one settlement per POST, we can send one POST with 400 settlements — a big gain. This is how a queue system looks. On the right, I have the processes doing the work; they are called workers: authorization, settlements, sending email. I have the queue in the middle, where I save the jobs. On the left, I have the web servers putting jobs into the queue. We use Sidekiq, and there are two blocks here. The first defines a class for the worker — in this case, the one doing the settlement. You define a method called perform; Settlement.find takes the id of the Settlement model, and the method that talks to the bank is settle! (with an exclamation point). So that first block is who does the job. The second, the line in the middle, is how you put the job in the queue: I use the perform_async method, which is a Sidekiq method, and I give it the id of the settlement record. I put it in the queue; a worker will process it. Rails 4.2 has Active Job incorporated, and ActionMailer has it integrated out of the box, so if you want to send an email asynchronously, you say deliver_later — the line at the bottom. Very simple.

If you're doing deferred tasks, you have costs. Queueing systems are unreliable — except Sidekiq, of course, because Mike, who wrote Sidekiq, is in the room. Just in case. But really: a job can die, the computer can die, the bank can have a problem, or the connection can drop — all those things. If you've deferred a task and the job didn't run, what are you going to do? Sidekiq will retry automatically, but maybe it runs all the retries it's supposed to and never succeeds — what do you do then? Say it's a $100 settlement and you never settle: you're going to lose that money, because you never transferred it. What else? Coordination: you cannot do the authorization after the settlement — that will fail.
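The pattern on the slide — web servers enqueue, workers dequeue and perform — can be shown with a toy in-memory queue. The real system uses Sidekiq, which keeps the jobs in Redis and runs workers in separate processes; this stand-in only mimics its perform_async / perform shape:

```ruby
# Toy in-memory version of the queue pattern (the real app uses Sidekiq,
# where the queue lives in Redis and workers are separate processes).
QUEUE = Queue.new

class SettlementWorker
  # The web server only enqueues and returns immediately.
  def self.perform_async(settlement_id)
    QUEUE << [self, settlement_id]
  end

  # A worker runs this later. The real worker would do something like:
  #   Settlement.find(settlement_id).settle!
  def perform(settlement_id)
    "settled #{settlement_id}"
  end
end

# A worker loop: pop jobs off the queue and perform them.
def drain(queue)
  results = []
  until queue.empty?
    worker_class, arg = queue.pop
    results << worker_class.new.perform(arg)
  end
  results
end
```

Note that only the record's id crosses the queue, not the record itself — the worker reloads fresh data when it runs, which matters when jobs can be retried minutes later.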
And it's difficult to debug. It's kind of crazy, because things happen at any time, and since we have multiple hosts, they happen anywhere. Remember I told you about the logs? This is where they are important; they're the only way to know what happened.

Great, we're doing great. We have covered everything but the last one: scalable architecture. This is the most important one, and I put it at the end because I am a developer, and as developers we always overlook architecture — but it shouldn't be like that. The idea is that when I'm writing software, I have to be thinking: it has to be fast. I'm sure all of you write fast software, but that's not enough. You also have to think about how you are going to scale it in the future. If you don't think this way, you might make a mistake that's going to be difficult to fix, because you'll have a whole system written that way.

I'm going to give you two examples. You saw the suggested amounts on the first contribution form. I might say, hey, I would like a process that calculates on the fly the best amounts to show, depending on the organization and on the user. So I say, okay, let's start developing this, and I'm going to use machine learning. Great. And right away I say, you know what, if we had a central system to do this, things would be easy. So we do it that way. That doesn't scale: at some point I will have so much load on this system that I will want two of them, and I can't, because it's central — I cannot have two. The other example is the batches for the settlements. It was an architectural decision: we are going to consider the contribution processed at the end of the authorization. And that is huge, because I don't need to do the other part right away; after that step I can say, hey, we're done.
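The batching win mentioned above is easy to see in code. A minimal sketch, with a hypothetical endpoint and the batch size from the talk:

```ruby
# Instead of one POST per settlement, group them: one POST carries up to
# 400 settlements (endpoint and request shape are hypothetical).
def batch_requests(settlement_ids, batch_size = 400)
  settlement_ids.each_slice(batch_size).map do |batch|
    { post: "/settlements/batch", body: batch }
  end
end
```

For 1,000 pending settlements this produces 3 requests instead of 1,000 — and since the contribution already counts as paid at authorization time, nobody is waiting on these requests.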
And that's the list, with scalable architecture at the top. That's all I have. If you are interested in what we do, come talk to us — we have stickers somewhere over there, and those are my colleagues, by the way. But we have time, six or seven minutes, for questions. Anyone?

Okay, the question is: when you have multiple servers, how do you handle the logs, since you have many computers generating them? We use Papertrail, and it works very well. With Papertrail, the systems generating the logs connect to their service, and through a web interface you see everything on a single page. You can filter if you want, of course. That's the way to go.

Okay — the question was: how do we simulate load, or how do we prepare for a future record like this one? We could do that, but we don't have a system to simulate load on purpose, because we have something called recurring contributions. People can say: I make this contribution and I want to make it every month, or every week. We have lots of them, and every day at four in the morning we run them all together. That's a lab in itself: it's pretty close to reality, and we can analyze there how the system handles the load. In fact, in some cases the bank cannot handle it, because it all comes at one single time, so we have to throttle it and spread it out a little. We also have the end of quarter: every end of quarter, the organizations have goals and they all push until midnight, and after midnight all the traffic goes down. So we use that too. Basically, we don't run our own simulations, but we are very careful to study those cases. Another question? Great. Thank you so much.