My name is Robert Marble. I'm going to try to talk to you from over here. I'm with Ruby Focus, a consulting and recruiting firm for Rails. I'm going to talk to you today about a plugin that I've developed, acts_as_most_popular. It's a caching solution for most-popular lists.

What's a most-popular list? It's a list that tracks user activity, and we see it everywhere in social networks. For example, this one is a social network that I'm building with a couple of business associates, about entertainment talent and the entertainment business coming together. People like to know whose content has been viewed the most, for example, so they see this list here. It typically involves something like an upload, some entity that's viewed, like a user profile, a video, or an image. That's the thing that you want to have ranked. And then it's ranked by some user activity, which is typically the number of viewings, or comments, or ratings, or something to that effect.

So this is a common problem in social networks and web applications. It's a problem because it typically involves a join between this viewed entity, which could be the user profile, the image, or something else, and this activity. These are usually two different tables, and you have to join them together to come up with the list of statistics, the analytics of it. Why is that a problem? Because it is slow. When people hit this tracker page or analytics page, or whatever you call it, you need to access your database, and if you do that every single time a person hits the page, your database will go to its knees very quickly. Joins are expensive. So I've been trying to figure out... well, here's an example query, actually. So this is a query, for example.
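The slide with the query is not preserved in this transcript. As a minimal sketch of what such a query might look like, assume two hypothetical tables, `uploads` and `viewings`, linked by a `viewings.upload_id` foreign key. The table names, column names, and the helper `most_popular_rows` are illustrative assumptions, not taken from the talk. The plain-Ruby method mirrors the join, group, and order steps over in-memory rows so the logic can be checked without a database.

```ruby
# Hypothetical schema (not from the talk's slides):
#   uploads(id, title), viewings(id, upload_id)
MOST_POPULAR_SQL = <<~SQL
  SELECT uploads.*, COUNT(viewings.id) AS activity_count
  FROM uploads
  INNER JOIN viewings ON viewings.upload_id = uploads.id
  GROUP BY uploads.id
  ORDER BY activity_count DESC
  LIMIT 10
SQL

UploadRow  = Struct.new(:id, :title)
ViewingRow = Struct.new(:id, :upload_id)

# Plain-Ruby equivalent of the join, group, and order steps.
# Returns [upload, activity_count] pairs, most viewed first.
def most_popular_rows(uploads, viewings, limit)
  counts = viewings.group_by(&:upload_id).transform_values(&:size)
  uploads
    .select { |u| counts.key?(u.id) }        # inner join: skip unviewed uploads
    .sort_by { |u| [-counts[u.id], u.id] }   # highest count first, id breaks ties
    .first(limit)
    .map { |u| [u, counts[u.id]] }
end
```

Every call to a query like this scans and groups the activity table, which is the cost the rest of the talk sets out to avoid.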
Upload is the thing that you're viewing, and you're joining it on the viewings. Typically this involves a join with the viewings table, which is the example I'm going to use here. It calculates the number of rows in the viewings table that refer back to this upload, and then you order by that and group by the viewed entity, and SQL will return you a list of uploads ordered by the most popular one. Hence the name.

So this needs optimizing. Whether or not you should be optimizing this depends on the application; the 80-20 rule applies here. For me this is a cool concept. Later we'll see a talk by New Relic, a great tool that you can use to find out whether or not you should be optimizing something so that it performs properly. In this case I've already determined that I have the need for this, because it's slow and I want to optimize it.

So the solution that I'm trying to get to is how to make this scale better. Enter acts_as_most_popular. This is the plan. The most-popular list comes from cache, so I want this somehow in cache. I also want the cache populated from the database only once. I want this pretty automatic, so I want this query to be run at the initialization of the cache, and then every subsequent access I want to come from the cache. Meanwhile, as the users are doing activities on the site, viewing the object or commenting on it or something, I want those things to be automatically updated in the cache so that it doesn't have to go and regenerate the... [inaudible aside] Read access to the database, obviously, would defeat the cache. And keeping it simple: it should be something that hopefully I don't have to rebuild from scratch.
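The first two requirements above (serve the list from cache, and hit the database only once to populate it) can be sketched as follows. This is a minimal illustration under stated assumptions: a plain Hash stands in for the cache server, the block stands in for the expensive query, and the class name `PrimedListCache` is hypothetical. Updating the cached list as activity happens is not covered here.

```ruby
# Hash-backed stand-in for a cache client; all names are hypothetical.
class PrimedListCache
  attr_reader :loads

  # The block plays the role of the expensive database query.
  def initialize(&loader)
    @loader = loader
    @store  = {}   # stand-in for the cache server
    @loads  = 0    # how many times the "database" was queried
  end

  # The first call runs the loader and stores the result; later calls
  # are answered from the store without touching the loader.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @loads += 1
    @store[key] = @loader.call
  end
end
```

Calling `fetch` twice with the same key leaves `loads` at 1, which is the "populated from the database only once" property.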
The whole idea here was to use something that is already based on existing components and to create a plugin solution, so that if you want to scale a most-popular list, no matter whether it's comments or viewings or ratings, you can use the same thing again and don't have to come up with a custom solution.

I'm curious how many people here are familiar with caching frameworks. Who has used them? Show of hands. Who doesn't know what they are in the first place? So maybe a third of the room. Interesting.

So I was interested in memcached-based caching frameworks, and there are a couple of them out there. One that I found recently is called Cache Money. It's developed by Nick Kallen, who is now with Twitter, and I understand that Cache Money was extracted from the Twitter core and is one of the things that holds Twitter together. It works with Active Record. It's a very clever tool. I think of it as the second generation of caching frameworks. The previous generation was a solution called cache_fu that came out a couple of years ago, and cache_fu made the design choice to have your caching be very explicit. You have to call modelobject.cache key, blah, and then you get your cached entry back, and if you know that it's in the cache you can do that. If you don't know that it's in the cache, you have to say modelobject.find, blah, and it goes to the database. So in your code you always have to distinguish manually between what is in the cache and what is not. Do I go to the database or do I go through the cache?
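The difference between the two styles can be illustrated with toy objects. Nothing below is the API of cache_fu or Cache Money; `FakeDatabase`, `explicit_lookup`, and `TransparentFinder` are hypothetical names, and a Hash stands in for the cache. The sketch only shows where the "cache or database?" decision lives in each style.

```ruby
# Toy stand-ins; none of these belong to cache_fu or Cache Money.
class FakeDatabase
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = 0
  end

  def find(id)
    @queries += 1
    @rows.fetch(id)
  end
end

# Explicit style: the calling code decides whether to ask the cache
# or the database, and must store the result itself.
def explicit_lookup(cache, db, id)
  cached = cache[id]
  return cached if cached
  record = db.find(id)
  cache[id] = record
  record
end

# Transparent style: one finder hides the cache behind the same call
# the application would make anyway.
class TransparentFinder
  def initialize(cache, db)
    @cache = cache
    @db = db
  end

  def find(id)
    @cache.fetch(id) { @cache[id] = @db.find(id) }
  end
end
```

In the explicit style the branching logic is repeated wherever a record is looked up; in the transparent style it sits in one place and callers just say `find`.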
So the code became quite ugly, and Nick actually wrote a blog post about this where he illustrates the syntax and the awkwardness of the semantics of explicit caching. I think this is a larger pattern in technology itself: the evolution of things is such that initially we like things to be very explicit, kind of like manual transmissions, which I'm sure some of us in here have. As technology evolves, people become more comfortable and their focus shifts, and things become more automatic and transparent behind the scenes, such as the automatic transmission. I still love manual, but I rented a car coming down to LA and it's automatic, and it's great.

So then you don't have to use the cache explicitly. You can say User.find with an ID and it is handled for you. It's a super powerful tool, and if all you ever do is single-model, single-object finds, you have a drop-in solution that will cache the entire application automatically, which hasn't been done before. I was super stoked to find this, and Nick is a genius for doing this.

In detail, here's the way this works. You make a call such as find or update or create or insert or something, and Cache Money sits in the middle. It uses a get to fetch from the cache; if it doesn't find the entry in the cache, it goes to a select and brings the result back into the cache. Cache Money abstracts that for you. The indices are automatic, meaning you specify what's indexed. The primary key, usually the ID, is automatically indexed for you, but if you have other things that you look up by, like a name, say you look up the user by name and do a find with conditions name equals blah, then you can index that too, and Cache Money will maintain the index for you. It also monitors deletions, and the key handling is automatic [inaudible]. It also handles expiration for you, so it maintains the index and you don't have to worry about it. The one thing that Cache Money does not do is joins, and this is where my plugin comes in, for a
very narrow, specific solution: a join that is common in social networks, for the most-popular thing. The class and instance methods that Cache Money provides are what I'm going to piggyback on. Every model now gets, in addition to the find and create methods that Active Record already gives it, the get, set, and repository methods. Get and set actually access the cache, which normally you shouldn't need to do, because if you just do find with an ID, Cache Money handles it for you. But in addition to that, at the lower layer you now have the opportunity to go to the cache and control what the data is. Each model comes equipped with these methods now, and the key handling is quite automatic as well. You can just say User.set and provide your key, and internally it turns into a key that includes the model class name and also a version. In this case, at the very bottom there, it's version 1. These versions are used so you can track migrations of the project.

With these tools in place from the Cache Money framework, the solution is next: what I've done with the plugin, acts_as_most_popular. Now, this is not Windows, but I'm having trouble with the graphics here. [inaudible]

I took the Cache Money framework, and I needed to create an additional index. I want to keep my uploads sorted, which Cache Money does not do. Cache Money has the ability to order by the primary key, but it cannot do that for any additional secondary key; you cannot keep that sorted. Actually, I'm quite curious whether it's possible to add that to the framework. I didn't fully understand the problem behind it or why the limitation exists, so if you have any ideas I'd be grateful for comments and feedback. So I created an additional index which keeps the activity count in sorted order. It gets primed on the first access to it, and then it is maintained every time you add a viewing to your image or your video or your upload,
which is tracked through the after_add and after_remove callbacks of the Active Record has_many association. When you plug it all together, it looks like this. Your class Upload is the viewable entity, like your video, for example. It has many viewings, which is the child class that tracks the activity. And here's the plugin: acts_as_most_popular, followed by the activity class that you're monitoring. You have to specify a limit. Typically you don't want the most-popular list to include 500 entries, but a certain limit that your design will tell you, and you can do pagination on it as well, but 5 or 10 is a good number. The last one is the db finder argument, and that is used to prime the cache, so it will run the find on viewings with these arguments. One thing you have to have in there is the activity count. If you're not familiar with the select attribute of the finder arguments, basically it tells Active Record what to return, and in this case the attributes don't actually have to correspond to database columns; it just makes up stuff on the fly. If you run the finder with these parameters, then your result objects contain an entry called activity_count, and you can then say result.activity_count, which is your number, the count from SQL. That is then stored in the index, which goes into the cache. Once you do that, you're basically done. All you need to do then is call Upload.most_popular, which is the method, and you get the results, and everything else is automatic.

That's all I'm really going to say about it. It works beautifully. There's a reference which you can read, it's transparent, and the plugin will be on GitHub. That concludes my talk. [inaudible] We have a small consulting firm that does consulting and recruiting; that's who we are. If you are curious about scalability or about best practices, or if you want to have a conversation, come find me. If you want a job in
Rails, or you offer jobs in Rails, come find me as well. Thank you so much.
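As a recap of the mechanism described in the talk, the activity index (primed once from a query result, then adjusted whenever an activity row is added or removed) can be sketched in plain Ruby. This is not the plugin's code: the class `ActivityIndex` and its method names are hypothetical, the index lives in process memory rather than in a cache server, and for brevity it sorts on read instead of maintaining a sorted structure.

```ruby
# Hypothetical in-memory version of an activity index.
class ActivityIndex
  def initialize(limit)
    @limit  = limit
    @counts = Hash.new(0)   # entity id => activity count
    @primed = false
  end

  # Prime once from [id, activity_count] pairs, e.g. a query result.
  # Later calls are ignored so the database is consulted only once.
  def prime(pairs)
    return if @primed
    pairs.each { |id, count| @counts[id] = count }
    @primed = true
  end

  # Would be called from an after_add-style association hook.
  def activity_added(id)
    @counts[id] += 1
  end

  # Would be called from an after_remove-style association hook.
  def activity_removed(id)
    @counts[id] -= 1 if @counts[id] > 0
    @counts.delete(id) if @counts[id].zero?
  end

  # Ids ranked by activity count, highest first, cut to the limit.
  def most_popular
    @counts.sort_by { |id, count| [-count, id] }.first(@limit).map(&:first)
  end
end
```

With this shape, reading the ranked list never needs the join; only the one-time priming step does.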