So I guess we'll start. It's my first time at RailsConf. And it's the first time I'm seeing snow in my life. So this is the warmest set of clothes that I could find, because back where I'm from, even if you don't have a t-shirt, it's fine, because it's hot and humid. I'm from India, by the way. So just a shout-out to my company and my team back home, Red Panthers. We are a Ruby on Rails shop based in Cochin, India. We have clients all over the world, and we help companies scale their dev team when they need it and also help build their MVP. So if you need any help, just come talk to me. This is me. Everywhere on the web, I'm known as coderhs. So GitHub, Twitter, anywhere. And if you find some place where coderhs is not registered, please tweet at me so I can be coderhs there as well. I organize a lot of Ruby events back home in my state. Recently I organized Rails Girls Cochin. Thank you. I'm a Ruby developer mainly, but I do flirt with Elixir, and I'm one of the crazy people who have a Crystal application in production. So if you're into learning something new, try out Crystal; you can definitely run it in production. Another thing you'll notice about me is that I love stickers. This is my old laptop and this is my new laptop, so you have to help me fill this one as well. I'll give you free branding across the world; I'm going to Taiwan after this. And if you need stickers, I have a few with me, so please meet me after the talk. This is where I'm from. You can see a lot of elephants there, and that's a real picture; we have them every year. That's my city, a coastal city towards the south of India. And if you ever come to India and happen to be around this place, just tweet at me, and you'll have a lot of fun, friendly Ruby developers to hang out with.
A fun fact about India that I always like mentioning: we have 22 official languages and 1,653 spoken languages with 50,000 dialects, and I know three of them. So if you see two Indians talking to each other in English, don't be surprised. My talk is actually a story about a newbie developer, which is what I was, who wrote the first lines of code of a web application. Everyone was saying it was just going to be a thousand rows of data, ten users probably, and it grew so big that we are now processing gigabytes of data every hour. And this was me in 2014, so this talk is about the last four years of my life, which I'm giving to you for free. All these things helped me, and they helped our client as well, because we didn't have to invest a lot in hardware; we didn't have that much money. We are all craftsmen, and we can't do anything without our tools, so I have to give due credit to my tools. This is what I call my tools: the Avengers. And they're really good if they all work together. Ruby is Captain America, Rails is Iron Man, and the database is the Hulk. The thing is, Hulk can smash, right? The problem is we make him carry our suitcases. That's the cost of everything. We have a Hulk, and we shouldn't be afraid to use it. Now, this talk is about how we can offload a lot of things that we do in our Rails code to our database, so that we're actually making full use of the stack that we have. And I've done all of this in production, so you don't have to feel scared to do it. We're going to talk about a lot of concepts related to the database; some of this might be familiar to you, but it will be a refresher, I guess. We're going to talk about the query planner, indexing, attribute preloading, views, generating JSON, and synchronous commits. So let's start with the brain of the database. A database, as we all know, is general-purpose software. It wasn't built for the use of a single industry. It's built for healthcare.
It's built for aviation, for banks. It was even used by Twitter. So the question is, how does it handle all those scenarios? A general-purpose software, to which we do nothing more than define a schema, is supposed to take care of the data of every industry. And the truth is, it doesn't. The DB doesn't know what scenarios it's going to be put under, and it's up to us to optimize for them. SQL, that syntax most of you are afraid of, doesn't define how we should get the data; it just defines what the data should have. At no point are we actually telling it more than "I need the user's email ID and name, that's it." I'm not telling it how to find the data, how fast to find it, and I'm not even telling it that if it doesn't find it in time, I'm going to lose my job. So where are the decisions made? We are programmers; we learn about algorithms, we learn about data structures. So where are the decisions made? That's what the query planner is: the brain of your database. We need to understand how it works to optimize our system; we need to know how it thinks in order to manipulate it. Now, the plan is created by the database whenever we run a query, and the plan is literally just the cost of running the query. The problem here is the last line that I'd like to mention: the planner thinks that it has the perfect plan. It assumes that there is no better plan. But we as humans know stuff better, so we need to see what the query planner knows in order to change it. And Active Record, I believe from 4.2 or so, came with the .explain method, which gives us the plan. These are two examples from my live system, and as you can see, the cost of one is more than the cost of the other. It also mentions a sequential scan and an index scan. It has assumed that the best way is a sequential scan rather than an index scan, because there is no index. So we check the query plan to find where we are slowing down, right?
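In Active Record this looks something like the following sketch; the `User` model, the column, and the plan shown in the comments are hypothetical, and real output will vary with your schema and data:

```ruby
# Any Active Record relation (Rails 4.2+) responds to .explain,
# which asks the database for its query plan instead of the rows.
User.where(email: "hari@example.com").explain
# EXPLAIN for: SELECT "users".* FROM "users" WHERE "users"."email" = $1
#   Seq Scan on users  (cost=0.00..1567.00 rows=1 width=120)
#     Filter: ((email)::text = 'hari@example.com'::text)
```

A "Seq Scan" on a large table in this output is the hint that the planner found no usable index for the lookup.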
And we fix them and make the planner choose our faster methods. That's basically it. Sounds simple, right? So let's see how we can do it. Oh, and a tip, by the way: what you saw was SQL-style output. We can get the same plan in YAML, XML, or JSON, whichever you're comfortable with. And the golden rule is: let's index everything, so that we get index scans. But I found out that it's not better to index everything. An index, as a refresher, is a special lookup table that the database search engine can use to speed up getting the data. An index entry is like a pointer that points to the actual row where our data is. But you know something? This is something that I found out: databases are smart, really smart. Even if you have indexes, they might not actually use them, because they may find that the cost of doing a sequential scan is actually lower. Imagine that you have 100 million rows, and the values aren't unique; say there are 100 rows with the same value. Then for the database it's often better to do the sequential scan. So if you have 10,000 rows whose values are just short, medium, and long, and you've indexed that column, the database will still go for the sequential scan because it finds it to be faster. So the things that we should index are the primary key and the foreign keys. The primary key is indexed by Rails by default. Index all the columns that we're going to put into a where clause, and index the join columns. Like me, if you're joining not just on the foreign key and the primary key but on some search terms as well, then you need to index those too. Index the date columns, because that will help speed up your reports. And if you have a condition in your where clause, like "I want everything with a price above $90" or "everything less than $30," we can build something called a partial index. So anything in the where clause, feel free to index.
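As a sketch, the indexing rules above might look like this in a Rails migration; the table and column names here are made up for illustration:

```ruby
class AddReportingIndexes < ActiveRecord::Migration[5.2]
  def change
    # Foreign key used in joins
    add_index :orders, :user_id
    # Date column that reports filter and sort on
    add_index :orders, :created_at
    # Composite index for joins that also filter on a search term
    add_index :orders, [:user_id, :status]
    # Partial index: only rows matching the condition are stored,
    # so queries like Order.where("price > 90") stay fast while
    # the index stays small (PostgreSQL only)
    add_index :orders, :price, where: "price > 90"
  end
end
```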
Do not index columns that you're going to write to a lot. Do not index tables that you know are going to remain small forever, like a settings table. And do not index columns whose values you're going to manipulate a lot; so don't try to index a huge block of text that you know you're going to edit again and again and again. So that's indexing, and it helps us. Next is attribute preloading. Most databases come with arrays nowadays, and Postgres arrays are widely used. Now I'm going to talk about how we'd do things the Rails way, where we need to find all the tasks that include a tag, which runs two queries and gets us an object for every tag. What I'm going to suggest is that we write a bit of SQL and load everything into an array, so that we end up with a single object per task and a query that runs three times faster. Imagine that you're actually populating a view. It doesn't make sense for you to have a hundred different objects, because you're not going to work on them; you're not going to run a method on all of those objects. You just need data to populate your view. If you have reports, if you're printing PDFs, more objects is not good, especially if you're working with a huge database. So we load everything inside an array, so that we get a single object, and then we loop through it. This is one good use of arrays for making your system faster. The next thing I want to talk about is views. And we all know what a view is, because the thing that attracted everyone to Rails is that the Model-View-Controller design pattern was easily implemented. I remember back when I started, we learned about Model-View-Controller and I was like, no, it's too much. Then Rails came along: oh, it's all taken care of. You just put stuff here, and you're automatically following the pattern. That's good. And the thing about views is that you don't run your business logic in the view. If you are doing that, then don't, right?
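The array trick can be sketched with PostgreSQL's array_agg; the tasks/tags schema here is an assumption:

```sql
-- Instead of one row per (task, tag) pair, aggregate the tags
-- into a Postgres array so each task comes back as a single row.
SELECT tasks.id,
       tasks.name,
       array_agg(tags.name) AS tag_names
FROM tasks
JOIN taggings ON taggings.task_id = tasks.id
JOIN tags     ON tags.id = taggings.tag_id
GROUP BY tasks.id, tasks.name;
```

In Rails you could hang the same SQL on a `select` clause, so each returned record exposes a `tag_names` array attribute instead of Active Record instantiating a Tag object per row.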
So the view is meant to just display what we want. Imagine that you have a view, a report of all the transactions and sales that happened in a day. To generate that view, you need to run three different queries: one for the orders, then you need to get the products, then the deliveries, and all those things for your manager. So every time you go to that particular page, you're running all of these. In databases we have something similar: if you have a frequently run query, you create a view, so that you don't have to write the query again and again. It's like our HTML views; we do it because it makes our life easier. We could do it the other way, but keeping everything separated makes our life easier. So if you have something like this, where you have a companies table or users table and you are selecting the ID, name, and email of everyone with the role of manager, that's how you find all the managers in your company. You just create a view for it, so that next time you want the managers, you select star from company_managers. Now, the thing about views is that they don't take physical space on the hard disk; they live in the RAM of your database server. The speed that you gain here is that when you have a view defined, the system doesn't have to calculate the cost again. The plan to run it is already taken care of, right? So all the pre-steps that the database does to run the query are taken care of, and it can directly execute the plan and get you the data. And since they are virtual tables, they're called pseudo tables; the data doesn't really exist anywhere. Which felt like a real loss, because we are computing this data again and again. So people started thinking, and they came up with materialized views, in which the data is stored in a physical space on your hard disk. It was first introduced by Oracle, and it's now in Postgres, Microsoft SQL Server, IBM DB2, everywhere. MySQL has a lot of extensions that let you create them.
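The manager example can be sketched in SQL like this; the users table and role column are assumptions:

```sql
-- A plain view: a saved query, no data stored.
CREATE VIEW company_managers AS
  SELECT id, name, email
  FROM users
  WHERE role = 'manager';

-- Now "SELECT * FROM company_managers" replaces the full query.

-- A materialized view: same idea, but the result set is
-- physically stored and must be refreshed to pick up changes.
CREATE MATERIALIZED VIEW company_managers_mat AS
  SELECT id, name, email
  FROM users
  WHERE role = 'manager';
```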
Now the question is, how do you use this in Ruby? I know that I've come to RailsConf and I'm talking more about databases than actual Ruby code, so let's come back. Thanks to Active Record, we don't need a fancy gem or anything to create such views. We do need to write a bit of SQL, because we need to write the migration that creates the view. Then we create a model in which we just set the table name to the name of our pseudo table, and that's it. You can do everything else that you normally do with a model: you can do select, you can do where, you can do find. Everything will now happen on our pseudo table, which has the data from our view. So if you have a report and you want to run filters, it's done; the system is much faster. Now you don't have to go through eight queries to find who made the maximum number of sales in a day; it's all taken care of. first, last, find, everything can work. You do need to define the ID of this table, because a view doesn't have an ID, and first, last, and everything else work on the ID, so you need to fix that. And this is a benchmark with a view and without a view, and as you can see, it's freaking fast. It's a thousand times faster. The takeaways here are that it's faster to fetch data, you can capture the commonly used joins and filters, and you can push data-intensive processing from Ruby to the database. We'd been doing a lot of the filtering and all that on the Ruby side; now it's all offloaded to the database side. And we get fast, live filtering of complex associations. And we can index this; that's another thing. These materialized views can be indexed, because they now actually take physical space on your hard disk, so it's even faster. The loss here is that it takes more RAM and storage, which is fine nowadays because you can easily pay for it. Another sad thing is that the data it stores is a snapshot: you have to refresh it, in our case every day.
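Put together, the migration and the model might look like this sketch; the names are assumptions, and the `execute`d statements are plain Postgres SQL:

```ruby
# Migration: create the materialized view with raw SQL.
class CreateCompanyManagers < ActiveRecord::Migration[5.2]
  def up
    execute <<~SQL
      CREATE MATERIALIZED VIEW company_managers AS
        SELECT id, name, email FROM users WHERE role = 'manager';
    SQL
  end

  def down
    execute "DROP MATERIALIZED VIEW company_managers;"
  end
end

# Model: point Active Record at the view and tell it which
# column to treat as the primary key, so first/last/find work.
class CompanyManager < ApplicationRecord
  self.table_name  = "company_managers"
  self.primary_key = :id

  # The view is not writable; make that explicit.
  def readonly?
    true
  end
end

# Then the usual relation API works against the view:
#   CompanyManager.where("email LIKE ?", "%@example.com").first
```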
And in older versions of Postgres, when you refreshed it, the table got locked; no one else could read from it. But in the latest versions it can be done concurrently, so while it's being refreshed, people can still access it, which makes it much better. Now, the best place where you can use this is for history, because you know the history is not going to change; it's the future that's going to change. That's where we used it, because we use our history a lot: sales data seven days ago, sales data one month ago, sales data since the dawn of time. Everything is in views. Then, when you need live data, you just work on that small subset of data, and your system is fast again. Everyone is happy. You get live graphs and everything, and you didn't have to use NoSQL. Now, this is a really important thing that really saved my life, and if you learn one thing from my talk, I hope this would be it: we moved our APIs to the database. Yes, we did it. JSON is really important, because it's now acting as the glue between your front end and your back end. All modern JS frameworks and SPAs are happy to work with JSON. So it's natural for the evolution of databases to support such a format; the world is moving towards more and more formats, and databases even support stuff like XML now. You can even create your own data types if you want. So if you want to convert a single row in your table to JSON, you just do row_to_json, and it converts. This is what we used to replace our serializers and everything. There's one interesting thing when it comes to serializers: Active Record breaks up a query, creates a tree, then forms the SQL and sends it to the database, and the database breaks it up into a tree again and then gets you the data. It comes back, and Rails then converts it to JSON again and sends it to us.
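Refreshing is a single statement; the concurrent form needs a unique index on the view (view name assumed from the earlier example):

```sql
-- Plain refresh: locks the view against reads while it rebuilds.
REFRESH MATERIALIZED VIEW company_managers;

-- Concurrent refresh (PostgreSQL 9.4+): readers keep working,
-- but the view must have at least one UNIQUE index.
CREATE UNIQUE INDEX ON company_managers (id);
REFRESH MATERIALIZED VIEW CONCURRENTLY company_managers;
```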
So we have a whole process in which both systems are doing the same work, because the best way to do all this is with a tree, and we just eliminated the duplication: now the database returns the JSON to us, which we can literally just send to our end users. This is what it looks like if you convert an entire row to JSON, which is pretty much useless; I would be crazy to expose my password. For more practical use, we don't convert the actual row to JSON; we just convert our query to JSON, right? The response that we're supposed to give, the ID and email from users, we just give that. And we can do more complex stuff. Here, if you have a view, you can just pass in the view. So here users have projects, and I'm just sending all the projects that the users have. And we did data preloading here, so everything goes in a single step, instead of having to run a separate query after the first one and all those things. Now imagine that you don't have the thing you need in the database to create this JSON response. That's where what I said earlier comes in: we can just create arbitrary stuff right now, even if it doesn't exist. json_build_object will help you create something dummy, let's say foo bar, anything; it doesn't have to exist in our schema or in our database. Again, this is the question that most of you have in your mind: where is Ruby? There are several gems that do this. This is a gem that we used called Surus. It adds a find_json method to your models and helps you get the data. But for more complex stuff, you will end up writing SQL, and there is no escaping from it. No matter who says that we can escape from it, I don't believe so; we will write SQL. Because back when we started, I mean, when I started, I was a PHP programmer first, like many of you. And the thing that sold Rails to me was that I didn't have to write SQL.
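Roughly, the progression looks like this in Postgres; the tables and columns are assumptions:

```sql
-- 1. Whole row to JSON: includes everything, even the password.
SELECT row_to_json(users) FROM users;

-- 2. Only the fields the API response actually needs:
SELECT row_to_json(t)
FROM (SELECT id, email FROM users) t;

-- 3. Build the object by hand, including keys that don't exist
--    in the schema, and preload an association in the same query:
SELECT json_build_object(
  'id',       users.id,
  'email',    users.email,
  'foo',      'bar',
  'projects', (SELECT json_agg(p)
               FROM (SELECT id, name FROM projects
                     WHERE projects.user_id = users.id) p)
)
FROM users;
```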
You do User.where(name: "Haris") and that's it. Life was so much easier, and it was mind-blowing. That's what got me into Rails; that's literally what made me say, okay, I'm not writing PHP from this day forth, I'm going to be a Rails developer. But as the system grows, our complexity grows, and we have to find alternative ways to solve things. That's when we end up writing SQL for the more complicated stuff. But like me, if you want to keep as much as possible in Ruby, then create a view for it, so that you can just pass that in and use the gem to generate the JSON. And the benchmarks: we brought our API requests down from about two seconds to 200 milliseconds, which is what everyone wants. If you take the throughput benchmarks, 25 requests per second became 467 requests per second. Everything shot up. Now, the last thing that I want to talk about is synchronous commits. So we had a system with a lot of writes; a lot of data was coming into our system. We persist what we want into the database, and the rest we just leave to something like Kafka or the logs to take care of. But let's say that you have to store everything in the database itself. The thing about databases is that they're built to be reliable. The data has to be stored no matter what; that's the underlying principle that all databases try to follow. Because the thing is, you can't lose data if it's a financial transaction; you can't tell people, "I've lost your money and I have no idea where it is." So it's really important that everything happens synchronously, that when something is committed, we make sure that it's done. So a database like Postgres sacrifices speed for durability and reliability, which is not always what we need. PostgreSQL is known for slower writes and faster reads because of this. The slower writes are because it waits for the confirmation from the hard disk.
And if you're using SSDs, that will make your system much faster, but it still has to wait for that response that says it's done. You can disable this confirmation check to speed up your system, if you do not care about the confirmation of whether something is done, but you have a lot of writes and you just want to keep on writing, and you don't mind losing stuff. It could be raw data, it could be user events, it could be anything that you find is not valuable enough. In my case, it was a lot of IoT devices sending me a lot of junk data, and the fact that I've seen the data is more important to me than exactly what I saw. So we ended up disabling synchronous commit inside a transaction, not for an entire table or anything; we just disabled it in parts of our transactions, and it makes that stuff run faster. The only issue now is that if your DB crashes, the data that has not been saved to the hard disk is gone. It won't corrupt your data, but you will lose data, and in some use cases that's fine. It could be ten megabytes of data or six megabytes of data; it's basically bounded by the size of the write-ahead log that you decide to keep. If I say that my write-ahead log is one gigabyte, I might lose up to one gigabyte of data. That's the thing. And this is the configuration of our database, which we store along with our Rails app, in case you're interested. There are a lot of configuration options in your system that you just choose to ignore, right? But if you keep a tab on them, they can make your systems faster, and they can offload a lot of things that you'd otherwise have to take the pain of writing, and that would still be slower. So, just summarizing all the things that I spoke about: indexes are good; arrays are good; simplify all the frequent reports and queries using a view; and move your complicated, I won't say move all your APIs to the database like I did, but move the more complicated parts of your system to the database.
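A sketch of disabling it per transaction from Rails; the model here is hypothetical, and SET LOCAL scopes the change to this one transaction only:

```ruby
# synchronous_commit = off means COMMIT returns without waiting
# for the write-ahead log to be flushed to disk. A crash can lose
# the last few commits, but it never corrupts the database.
SensorReading.transaction do
  SensorReading.connection.execute(
    "SET LOCAL synchronous_commit TO OFF"
  )
  readings.each { |attrs| SensorReading.create!(attrs) }
end
```

Because it's SET LOCAL inside a transaction, every other write in the app keeps the default durable behavior.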
Disable synchronous commits where it's fine to do so. The most important takeaway that I want to give you is: know your tech stack. You are Rails developers, but you're not just using Rails; you're using a lot of tools along with Rails. Get to know them. You should have control of all your moving parts; if not, your system will hit a bottleneck at some point. SQL has been around for 40 years and it will continue to be here, so take some time to get to know it. And there is no golden rule. What worked for me might not work for you, but these are things you can do to scale a monolith. I blogged about all of this in more detail on our company blog, so if you want, you can read about most of it there. Thank you. These are the 22 languages. Thank you. Thank you so much!