 I'm just going to turn it. That's a tall one on this side. Ooh, yes, thank you. Okay, cool. Should I stay here on that side? I still haven't decided. It's better there. Perfect. I know, for your side, yes. Okay, so hi everybody. Sorry, I'm still a little bit sick. So my voice might go down as we go. I have some hot to drink. So I know the name is a little bit harsh to RM or not to RM, but just stay with me. We'll talk about this. So most of us, I assume, are doing web development, and web development is mostly about building crowds, right? We have our logic, we have our users, some inputs, we have our database, this interaction between the data and whatever is our inputs. So ORM is, as you might know or not know, object-relational mapping, and it allows us to build this nice abstract layer between our application and our objects and the database itself. So we're going to talk about Active Record as one of the implementations of ORMs, and this one is actually a name of a pattern, so Active Record is actually implementing Active Record pattern. This one was first introduced in the book Patterns of Enterprise Application Architecture in 2003 by Martin Fowler. Here's the link to the website where he talks about it briefly. So the rough explanation is that it's an object that wraps a row in a database table or view and it encapsulates the database axis. It also adds domain logic on that data, right? So we're going to be mainly concentrating on the database axis and less talking about adding domain logic. We all know how to build Fed models, so I don't think we need to talk about that one. Okay, so Active Record. In a nutshell, one object represents one row from the table, okay? And each column will have a property on that object, reader and writer. But what if we need to get values from multiple tables? So this is going to be the main highlight of this topic. What should we do if we are actually talking about selecting something from multiple tables instead of one? Can we use ORM for this? How can we use it? In which cases it's a good idea, in which cases it's not a good idea? So we're going to be querying multiple tables and we're going to do some Ruby coding. The code is very, very simple and it has to be simple for the sake of presentation. We're going to do some SQL, but this is not a SQL talk and I'm definitely not a SQL expert. So just a bare minimum to show you some of the ideas. And we're going to do some real-world tasks as I'll highlight in the next slides. So I work in this company called Pocket Math and it's a mobile advertising platform. And we have a lot of data in our system and I decided to use it as an example of what you can do when you're querying multiple tables. So I'm just going to quickly show you the workflow of the example that I have chosen. So we have multiple customers who can create advertising campaigns. Our platform receives traffic, so the possibility to actually show this ad to someone on the Internet. And we have an auction. So our bidding system makes decision whether to bid or not to bid and sets a price for it. If our bid won, then we track the impression, the actual fact that our ad was displayed somewhere. And if we are lucky, we also track if someone actually clicked on our ad and got interested. So without further ado, let's jump to the database schema. So these are campaigns which belong to customer. It has a name, total budget, and country code. It's a very, very simplified version. Our system is way more complicated. And this is the core table that we will be looking at today. So hopefully you can, guys, spend some time looking at it. Not memorizing, of course, but at least. So each stat is just one role where you have information about what happened to your advertising campaign. So it has a link to a campaign, campaign ID. It has our APOC, which I'll explain in the next slide. And then we have four possible metrics which we want to track. How much have we spent on this advertisement campaign? How many impressions actually happened? How many clicks? What we found is how many times we actually tried to bid, but haven't actually won. So we aggregate our stats hourly, right? So this is where our APOC comes in place. Our APOC is just a number of hours since UNIX APOC. I'm sure you all have been working with timestamps, which are usually seconds from the UNIX APOC. This one is just same, but for hours. So our task is very, very simple. We want to build the API endpoint, which would return campaign details plus aggregated stats. So I want to collect, for every hour that this campaign was active, I want to collect how many clicks, how many conversions, how many bid requests, how many spent, et cetera. So this is the example of the output that we are trying to achieve. The name, the total budget, how much have we spent, how many clicks we have received. And on the right, I highlighted the tables where this information actually is available. So the first two are just straight from the campaign, very easy. The last two are from the stats. So this is our starting point. We just have two completely empty pretty much active record models, where we have campaign that belongs to customer and has many stats. Stats belongs to customers, two campaigns, sorry. So the first implementation, we're going to go as simple as it's possible. We're just going to throw some methods on our model itself. So we're saying, okay, every time we're going to talk to campaign, we can also ask campaign about its spent. And we will just go through all of the stats and just sum up all of the spent, right? Okay, cool. We are ready to go. This is how we can use it. We just select the campaign and we created JSON representation by calling all these methods on the campaign. Good. So then suddenly a new requirement comes in. And with every new requirement, we are like, whoa, whoa, whoa, what's going on? But the requirement is actually very, very simple. Instead of selecting one campaign, we want to select multiple campaigns. But the output should be pretty much similar. So yeah, why not just reuse the previous implementation, right? All we need to do is just instead of doing this on one campaign, we go through all of the campaigns, just map over them and do exactly the same thing. Right? Simple, job's done, but is it? Maybe we are doing something weird. Like, are we actually doing something weird? We might. So yeah, here we just introduced a very, very stupid thing, right? First of all, for every campaign, we will load into memory all of the possible stats. Second of all, we'll do this multiple times. And third of all, it's just an n plus one query because we'll do this for every campaign separately. So we all know how to deal with this, right? I'm sure I'm not showing anything new to you guys. We can just use includes, which would just create these two nice SQL queries for us. So instead of doing n plus one, we're just doing two. Awesome. So can we make it actually better and use one query? Of course we can. We all heard about joins. What can be easier than just to join two tables? So we just add joins, but look at this. Is it okay? Is it actually going to help us? Turns out not because if you looked at the select statement, even though we joined the stats by default, ActiveRecord would only select attributes for the main column. Sorry, for the main table. So if we're calling this in campaigns, it will only select all attributes of the campaigns. So how can we fix it? Well, we can just provide our own select, right? We can always redefine and say exactly what we want to select. So we can say, yes, I want everything from campaigns and everything from stats. Boom. Unfortunately, this produces a pretty big result where we have the intersection of all the possible campaigns with all the possible stats. So still no luck, right? We still need to select all the values and then somehow manually join them, sum them up or whatever. So not good. So again, nothing very new here. We can use some and we can use group by to fix this issue. So instead of selecting all of the fields for simplicity, I'm selecting just one bit count where I sum it and then I group by campaign ID, meaning that as the result, if I have only two campaigns, this joint would actually produce two results and it would automatically create a bit count field. So as you can see in the end, you have to put something like as bit count or whatever arbitrary name that you want to use so that ActiveRecord knows how to map this and create the appropriate reader and writer for this attribute on the model. If you don't do this, it's going to be tricky to actually get access to the bit count field. So the query is pretty nice, but maybe we can move it to scope so that we don't have to type all of this and we can reuse it in multiple places. So this is how this can be done using SQL or if you're not a big fan, you can always switch to RL and in the end we'll discuss whether it's a good idea or not, but pretty much these two are completely identical and will produce you identical results. So achievements so far. We are able to select from two tables and technically you can use this to select from as many tables as you want. All the heavy lifting is done on the database side. We don't actually select too many information you don't have to sum up the values on our site and we are combining raw SQL with ActiveRecord DSL, which means we can chain other scopes or we can do whatever we want. It's still going to be ActiveRecord objects and are we satisfied with this one? Well, actually there are some additional things that we can do. For example, as you all might know, ActiveRecord is a pretty heavy library. It's pretty big, it's not very fast, it uses a lot of memory. So in tasks where you actually have to select a lot of records, it's not a good idea to create instances of ActiveRecord, especially if you don't actually have any methods that you are planning to call on them, if you're just selecting them to throw into JSON. So we all know how to do this, right? It's very easy, we can specify which columns we want to select and it's going to return us the pure values. It's not going to instantiate ActiveRecord, it's faster, less memory, boom. So let's do that. As you can see, I'm using the scope that I created in the previous slide and based on that I'm just going to pluck and I'm specifying all the attributes that I want to pluck. Unfortunately, again, we see there's something wrong with the query that's generated by this. As you can see the spent in the select statement, it doesn't actually correspond to any of the tables, right? Because it's a virtual attribute, I've just created it and give it a name. So unfortunately we cannot do this. So how can we help with this? Well, again, we have to actually specify what are we selecting. So instead of just saying spent or bit count, I have to specify that I'm selecting the sum of those attributes. As you can also see, there's no longer a need to say as spent or whatever would be the name of the attribute because we are no longer creating the ActiveRecord object so there's no mapping happening. We're just going to select the numbers itself. So now as we can do this, we just go through the campaigns. We call the FAT stats, which does the query and return us an array of arrays where we can just simply use those values. Very simple, very fast, nice. So let's recap again. We still can do queries from two tables. Everything is still handled by the database, which is awesome. We can do queries to return ActiveRecord objects, which might be very, very useful. And we can also do queries using pluck to return the lighter responses which would consume less memory. So are we actually satisfied now? We're pretty close, right? We're pretty much able to do everything we used to do. But can we do this querying actually simpler? Right now we are creating custom scopes. They're pretty big. We're manually specifying all the selects or we're manually specifying all the attributes that we want to select in pluck, which is verbose and what if the name of the attribute changes or we want to add a new one, it's going to be pretty painful. So what we can do is a very, very simple thing. We can just create a view, right? So the view is kind of like a table. We can treat it as a table, but it doesn't actually correspond to any value. So it's going to run this query every time we are fetching anything from this view. So I did exactly the same things that we did previously in our scope, but now I created a view. I gave it a name, which means now I can just create a corresponding actor record model. It doesn't have to know that it's not a real table. The behavior is still completely the same. I won't be able to do any inserts or updates on that view. And now I can use a very, very simple interface that we are all used to. I can just do where, I can do pluck, and I don't need to do any magical stuff anymore. So our final solution, we use database, we use join and group as usual. We can manipulate with the data from multiple tables. We use actor record because it helps us to build queries. Yes, we did a lot of raw SQL, but still in many cases we just had to specify the aggregation function or the name of the field. Rest of it was still done by actor record, which is great. We can use a rail or we can use the raw SQL for this. If needed, we can use pluck, which is again great. So the summary is very, very simple. We sometimes keep forgetting that SQL is a very, very powerful thing, and we try to use rails and active record too much to try and fix some issues that we might have. So let's figure out what's the right tool for each job. If the right tool is SQL, don't be afraid, just use it. Prefer pure SQL over a rail, especially in a complicated case. So I've worked with different teams and every team would have a different approach regarding this. Some people would say, like, never use raw SQL because we might migrate to a different database and what would happen then. Well, I was lucky enough never to work for any company that would actually successfully migrate to a different database. Because if you're actually using your database, you will for sure want to use some of the specific features of that database, even if you don't think about it. Like if you're using MySQL and you're doing grouping, believe me, in most cases if you switch, let's say, to Postgre, your grouping will fail because the rules for that are completely different. So if you need to, of course, stay with a rail but otherwise prefer a pure SQL. Another reason for this is that a rail is private API. And I know in Ruby, saying private API is kind of tricky because you can do whatever you want, right? You can get access to anything. There's nothing private and nothing hidden in Ruby. But the definition of private API for any rail stuff is you just go to apirubionrails.org and try looking it up. If you don't see it there, it means it wasn't been documented, which means it's technically a private API and Rails Team can change this whenever they want. Actually, this happened multiple times. If you are using a rail for a longer period of time starting from Rails 3, you are doing upgrades, you already know all of this. So try not to use it. So ORAM is very good because like in our infrastructure, right? In our ecosystem, we have so many gems already which are using ActiveRecord. Sometimes it's a huge pain, but sometimes it's a big win because it's so easy to find a library that can do whatever you need and you should just start using it. It's so simple, right? So anything that returns ActiveRecord relation is usually better than just returning the raw result because you can use gems, for example, for pagination or something like PaperTrail or other soft-delete libraries combining with your own scopes, et cetera, whatever you want, right? So for this, at least for this reason, ORAM is very, very useful even if you are doing a very, very hacky query where you have a bunch of raw SQL, it still kind of makes sense to wrap it with ORAM. Thank you. The fact that gems like Caminari or Wood Function 8 use ORAM isn't actually the opposite because they use ORAM, you cannot use them with, I don't know, SQL or some other ORAM. Yeah, I completely agree. Yeah, exactly. Yeah, it has two sides, of course. It completely has two sides. Some of the gems would actually allow you to paginate any objects as long as you just have a particular interface. But yeah, in most cases, like the whole Rails community is just around ActiveRecord, which is, again, very, very great and it's just so easy to find libraries that would work together seamlessly. But yes, if you don't want to use ActiveRecord, you're kind of stuck and you would have to build a lot of infrastructure on your own. Yes? How do you migrate that? Yes, it's a very good question. So by default, if you do the DB schema load and DB schema dump, views are not included in that, right? So maintaining that is kind of painful. I know there's a gem that helps with that. On top of that, you can always use the SQL for your schema. So yes, when you do your schema, it can be Ruby code or it can be SQL. If you do SQL, it would actually include all of the views. This is a very, very good question because actually maintaining views is kind of painful, right? Because as you progress, you cannot just easily change the view. You have to re-create it every time. Let's say I now want to add another column to it, right? You would have to completely re-create it. Usually you can do this on-fly in production, but then migration is kind of different than with the regular table and holding the schema is a bit tricky. Yes? Does an ActiveRecord relation and the same API is error or is it too different? No, it's too different. So technically, ActiveRecord is built on top of error. From what I know Rails 5.1 is supposed to digest those two separately. I'm not sure what's happening there, but yeah, it's two different APIs. It's a way lower-level API. So for example, when you say some columns, right, you can actually specify the error action that needs to happen. You can do like if null, divisions, multiplications, whatever. So anything that can be expressed in SQL would have a representation in Errol. But it's usually a way lower-level than ActiveRecord. So how do you get from... I mean, maybe you showed that in the talk. How do you access Errol from the relation? Usually you don't. So Errol methods are available if you are in the context of the model. So if you do this in the scope, you can just say Errol table and you have access to the Errol representation of the current table that you're dealing with. Or you can say name of the class, let's say stats.ErrolTable and you get access to Errol of any of the tables. So you can call it outside or you can call it internally from the model. But once you use ActiveRecord without Errol, you can't mix it or something? You can still mix it. So usually what you do, you build part of your query and then you put that Errol part of your query into the ActiveRecord method. So you can build the Errol thing and put it to join or put it to where? Yeah, yeah. You can look up... There's a lot of articles where people go to the lower level of Errol because they want to do something fancy and they don't want to write the raw SQL. Because yes, officially the documentation is no good. So you can find it, all of it. You can always just read the source. It's not too complicated, but yeah. I don't actually think it's a good idea. Like, SQL is not that hard. It's easier to just stick with SQL. Yes? Well, usually you just store like a reference to a file. You don't actually store the whole file. So the integration with ActiveRecord is just a simplification of how you deal with the attributes. Your form would have an attribute for the file. So in your controller, you just instantiate the ActiveRecord object and you pass the file. It's not actually going to be saved in the database or anything. So it's just for convenience, the author of that jam decided to integrate it with ActiveRecord, which can be very easy to use, but it also can be problematic because now you're having some issues with the database performance. I don't think it's actually database performance. It's just your object becomes bigger and bigger, because it has to deal with all the additional gems. I just hope you don't store that file in the database. It's not a best idea. If you don't, then yeah, it's not a database issue. Don't use the libraries that integrate with ActiveRecord if they don't have a good reason to do that.