My name is Kane Baccigalupi, and I'm doing Aqua, which you know, or you wouldn't be here. I'm actually at Cohuman. That's where I work. It's Alex Chaffee's new startup, and he's my front-row, or second-row, heckler. It's a really small company, and we're actually working in Sinatra with all sorts of weirdness, and it's awesome, because we're doing object-oriented programming all the time. So it might be a good fit for Aqua in the future; right now it's doing other things.
But before I was at Cohuman, I was unemployed for a couple of months, and I had this insane notion that I should start this talk by saying, "Hello, my name is Kane, and I'm a Rubyist," because it's like this weird addiction, right? I mean that in the positive sense of the word. As soon as I was unemployed I knew exactly what I wanted to do: learn about CouchDB. I'd been lurking on the list for over a year. Every once in a while I'd go into Futon and experiment: create some records, do a map/reduce, see what happened. And I didn't really learn anything in that year except basic big-picture stuff. So during this period, which I was going to try to stretch out as long as I could, I wanted to figure out what I could learn about CouchDB.
And the reason is that it's awesome, right? For those of you who aren't CouchDB aficionados, and most people aren't, it's schemaless. It shares a lot of the qualities of the MongoDB database that was talked about yesterday. The thing that appealed to me, though, was being able to store data that belongs in a domain object within that domain object, instead of normalizing it out into other tables and then repackaging it.
I just looked at that whole scheme, and the truth of the matter is I love relational databases, and I love ORMs. ORMs are great. I use them in my real work all the time, and I use them because I've done the alternative, which is to roll your own: open a socket, start packaging up SQL syntax, and eventually it becomes an object-ish thing, regardless of what language you're working in. These are much better. They do a lot of work for you, so I'm not dissing ORMs at all. I've used both of these, and I love them both. But you can see here: has_many legs, or has n legs, right? What's the likelihood that you're ever going to see a leg outside of an animal, or need to check the toenail polish on a leg? Ultimately, legs for a mammal or a bird are part of the animal and shouldn't be separated out, and that was the appeal of document-oriented databases for me.
So I started working with CouchRest, which is a pretty phenomenal project. You can see the property legs, and it's a collection, and you can make it any kind of collection. It could be a hash, so you have a right leg and a left leg, or front right and front left; or it can be an array, just four legs hanging out. It's ultimately a really powerful thing. What was really puzzling about CouchRest at first is that they made this really interesting performance decision: the models are hashes, so they convert very easily to JSON and back again. So there's not the delay you get with ActiveRecord and DataMapper.
They have this segregated area where your real data is, and the whole object acts to convert that little space out to your database and back again. This is the opposite: the object is one big wrapper, and that's what gets sent back and forth. The interesting implication is that instance variables, which I've come to think of as storing state, are discarded in the process, because you're serializing JSON out of your hash, so anything that's not in a hash key is gone once you save it. That started me thinking: what is the actual state of a Ruby object, and if you were going to persist a Ruby object, what would you do? The fundamental nature of an object, how it looks at a particular moment, is really in its instance variables and also in its fundamental data parts. An array has various values at each index, a hash has its keys and values, and there are all sorts of other fundamental parts.
But there's a problem with database abstraction in that you're always focused on the database. I've heard the argument that ActiveRecord made a really huge design decision that was wrong. I mean, I know; but truthfully, if you think about ActiveRecord and then you think about building your own socket and stuff, I really love it, even though I have issues with it. But seriously, the inheritance model: the argument I've heard is that it shouldn't have been inheritance, it should have been a mixin, like DataMapper did, because inheritance is saying to the object, "I am a type of this."
So in this case, the bird is a type of ActiveRecord::Base, and the design argument was that that's really incorrect. I actually argue the opposite: what we're seeing here is a type of database abstraction, and that particular abstraction is ActiveRecord::Base, or DataMapper::Resource, but it isn't actually about the domain object. Your main concern when you're working with these things is packaging your data in a way that the database behind the scenes can understand.
So Aqua's goal is object commit. You create classes, and then you can just save them. This isn't an unusual concept. Languages like Smalltalk did it, and they're porting it over to Ruby with MagLev. What inspired me is that I wanted it now, I wanted to see if I could do it, and also this scales: it goes to any implementation of Ruby that you want, because right now it's just using Ruby to package things up, send them to a back-end store, and get them back again. And while I was originally inspired by CouchDB, and it's currently using CouchDB, I built an abstraction layer for the storage, and you can attach anything that can store a hash. It's the responsibility of the front end of Aqua to say this object is going to become a hash, and it's the responsibility of the storage engine to say what happens with that hash. Right now I've just got the CouchDB storage engine. And why would you want to do this? Because Ruby is awesome, right?
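To make the split between the front end and the storage engine concrete, here is a minimal sketch. Every name in it (`SketchMemoryStore`, `SketchFrontEnd`, `commit`, `load`) is an assumption made for illustration, not Aqua's actual API, and an in-memory store stands in for CouchDB.

```ruby
require 'json'

# Hypothetical storage engine: its only job is to keep a hash and hand it back.
class SketchMemoryStore
  def initialize
    @documents = {}
    @next_id = 0
  end

  # Stores the hash as JSON text, as a JSON-speaking database would,
  # and returns the id it was stored under.
  def commit(hash)
    id = (@next_id += 1).to_s
    @documents[id] = JSON.generate(hash)
    id
  end

  def load(id)
    JSON.parse(@documents.fetch(id))
  end
end

# Hypothetical front end: its only job is to turn an object into a hash.
module SketchFrontEnd
  def self.to_hash(object)
    ivars = {}
    object.instance_variables.each do |name|
      ivars[name.to_s] = object.instance_variable_get(name)
    end
    { 'class' => object.class.name, 'ivars' => ivars }
  end

  # The store is passed in, so any engine with the same two methods would do.
  def self.commit(object, store)
    store.commit(to_hash(object))
  end
end

class Bird
  def initialize(name, legs)
    @name = name
    @legs = legs
  end
end

store = SketchMemoryStore.new
id = SketchFrontEnd.commit(Bird.new('heron', ['left', 'right']), store)
p store.load(id)
```

Because the front end only ever hands over a hash, swapping the engine means writing another class with the same `commit` and `load` methods.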
Yesterday at lunch there was a group of people talking about improving the design skills of coders, because a lot of people come in as hackers and don't get the value of object-oriented programming. I think there's a good reason for that: if you're saving anything, you have to work with one of these data storage objects, and that's fundamentally not working with real Ruby objects. You're working with a data abstraction. And actually, at work we had Brian Merritt come in, and he was talking about way back in the day, when he was advocating test-first development and everybody said, "That's too hard." He tried to start this little movement, but it failed incredibly badly, so when there was this resurgence in the Ruby community, he pooh-poohed it: that is never going to work. But the culture ended up changing, and it was a good thing for everyone. I think it's possible to move in this direction of better design, too. There was a lot of negativity at the table, like, you can't really teach design. But I think shifting the paradigm toward using real Ruby objects most of the time, rather than a database abstraction layer that has its own peculiarities, is a step toward making coders better designers in the object-oriented world.
So here's the reason that I thought Ruby was really awesome; I mean, just one of the reasons. For those of you who know ranges really well, this is kind of a not-real case, and I would like to change that, too, because ranges were conceived of as throwaway objects. You initialize them, you use them, they disappear, right?
There's no way to replace the start and the end, and there's no way to re-initialize the whole thing without creating a new object. So this would be a way to create it and never change it again. I mean, you could use it, but you'd never change the date values again. That's never been an issue that people brought up, and I think it's because people aren't extending some of these fundamental base classes: all the time you're working in the domain space, if you're saving objects, you're using a data persistence layer. So it's hard to see how awesome something like this would be, because you're always delegating to a range and then packing it up. Maybe your database is saving the start date and the end date, and when you need it again, you repackage it into a range.
So how does Aqua work? I tried to make it as simple as possible, and that ended up opening up Object a tiny bit and adding a little method, which I know people hate, and you could probably turn that off if you wanted to monkey-patch it. Basically, what Aqua does is insert a module, and that module includes a whole bunch of other stuff. After you've created your class, you can call new, do some stuff to the object, and commit. You can commit with the exclamation point if you want exceptions raised; if you want exceptions swallowed and false returned, you commit without it. Behind the scenes, the object is made into a serialized hash, and it's kind of sane. There shouldn't be any real surprises. It's just the description of the object.
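The commit behavior described here can be sketched in a few lines. This is an illustration under stated assumptions, not Aqua's code: the module name, the `pack` method, the error class, and passing the store as an argument are all made up to keep the example self-contained.

```ruby
require 'json'

# Hypothetical error a storage engine might raise.
class SketchStorageError < StandardError; end

# Hypothetical mixin showing the bang / non-bang convention from the talk.
module SketchCommittable
  # Describe the object: its class name plus its instance variables.
  def pack
    ivars = {}
    instance_variables.each do |name|
      ivars[name.to_s] = instance_variable_get(name)
    end
    { 'class' => self.class.name, 'ivars' => ivars }
  end

  # Lets any storage exception propagate to the caller.
  def commit!(store)
    store.save(JSON.generate(pack))
    self
  end

  # Swallows the storage exception and returns false instead.
  def commit(store)
    commit!(store)
  rescue SketchStorageError
    false
  end
end

class WorkingStore
  attr_reader :saved

  def initialize
    @saved = []
  end

  def save(json)
    @saved << json
  end
end

class BrokenStore
  def save(_json)
    raise SketchStorageError, 'database unreachable'
  end
end

class User
  include SketchCommittable

  def initialize(username, email)
    @username = username
    @email = email
  end
end

user = User.new('kane', 'kane@example.com')
working = WorkingStore.new
user.commit!(working)
puts working.saved.first                 # the serialized description of the object
puts user.commit(BrokenStore.new).inspect
```

The serialized document is only a class name and a hash of instance variables, which is why it maps so directly onto JSON.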
Class, ivars. The other thing you'll sometimes see in these hashes is an init, which is to say that an array or a hash or a time needs to be initialized not in an empty state; they start with state that's in their intrinsic values. So in some cases you'll see an init in there, too, but that's pretty much an object. And because CouchDB speaks JSON, you can just post that data over to the database. The reason this works is that objects are essentially documents. They're Ruby documents, right? Documents, in the sense of document-oriented databases, are arbitrarily deeply nested things. And you can see that YAML knew a long time ago that objects were nested document structures.
Sometimes you don't want your state hanging around. I haven't seen an authentication package that hasn't done something safer with the password: I get the password and then send it off to be encrypted, because I don't actually want to save the password. So this would be a use case where you'd want to hide the password attribute, and that takes a comma-delimited set of symbols, so you can just keep adding to it: I don't want to save the state of these instance variables to the database, but I need them to hang around so that multiple methods can see them in a given lifespan. The interesting thing is that Aqua itself uses a lot of little instance variables to hold the package as it's being built, and the same mechanism hides those from being saved.
Of course, it's not really that impressive to create a fairly flat, table-like data structure, save it to a document, and reconstitute it. The whole purpose is to recursively embed objects.
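A rough sketch of the three ideas in this passage (an `init` value for built-in collections, hidden attributes, and recursive embedding) might look like the following. The names `SketchAquatic`, `hide_attributes`, and `pack_value` are assumptions for illustration; only strings, numbers, booleans, nil, arrays, hashes, and plain objects are handled.

```ruby
# Hypothetical mixin; not Aqua's real implementation.
module SketchAquatic
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Takes a comma-delimited list of symbols naming instance variables to skip.
    def hide_attributes(*names)
      hidden_ivars.concat(names.map { |name| "@#{name}" })
    end

    def hidden_ivars
      @hidden_ivars ||= []
    end
  end

  def pack
    SketchAquatic.pack_value(self)
  end

  # Recursively describes a value as class + ivars, or class + init for
  # collections whose state lives in intrinsic values rather than ivars.
  def self.pack_value(value)
    case value
    when String, Numeric, TrueClass, FalseClass, NilClass
      value
    when Array
      { 'class' => value.class.name, 'init' => value.map { |item| pack_value(item) } }
    when Hash
      init = {}
      value.each { |key, item| init[key.to_s] = pack_value(item) }
      { 'class' => value.class.name, 'init' => init }
    else
      klass = value.class
      hidden = klass.respond_to?(:hidden_ivars) ? klass.hidden_ivars : []
      ivars = {}
      (value.instance_variables.map(&:to_s) - hidden).each do |name|
        ivars[name] = pack_value(value.instance_variable_get(name))
      end
      { 'class' => klass.name, 'ivars' => ivars }
    end
  end
end

# A plain object that does not include the mixin; it is embedded as-is.
class Address
  def initialize(city)
    @city = city
  end
end

class Account
  include SketchAquatic
  hide_attributes :password

  def initialize(username, password, addresses)
    @username = username
    @password = password
    @addresses = addresses
  end
end

account = Account.new('kane', 'secret', [Address.new('San Francisco')])
packed = account.pack
p packed
```

The password stays available on the live object but never reaches the packed hash, while the array of addresses is nested inside the account's own document.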
Non-aquatic objects just go in with any initializations they need, which means you can start descending from arrays and hashes, or any other data object that takes an initialization, and it should just work. Then you can add it to another object; ultimately it's an aquatic object that's responsible for saving. So as soon as you declare address... Are we back? Okay.
So this is what the serialization looks like. Addresses is just an array containing other objects, and those objects are not aquatic, so they're just nested in there in their own form. You can see that the class Array has the initialization value I was talking about, and the initialization value is an array in JSON, and within it is another object.
As soon as you declare another object, like address, as aquatic, it becomes responsible for saving itself. So it gets embedded by reference, and the reference is actually a different kind of class. When you send this to the database and get it back again, the friend object is no longer going to be an actual Friend. It's going to be an Aqua stub, which is essentially a lazy delegate. Initially the stubs just hang out being little object stubs, and as soon as one gets a method it doesn't recognize, it goes and gets the real object back from the database. This also means you can start to cache methods that you know you're going to use a lot. The job I had before Cohuman had this massive movie SQL kind of nest, and it ended up working very, very fast. But one of the things we had to do was start caching these embedded objects, like countries. At some point we have to search and know that these countries are part of this movie.
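The lazy-delegate idea can be sketched with `method_missing`. This is a guess at the shape, not Aqua's stub class: `SketchStub`, its constructor arguments, and the `load_count` counter are hypothetical, and a hash stands in for the database.

```ruby
# Hypothetical lazy delegate: answers cached methods itself and only loads
# the real object the first time it is asked for something else.
class SketchStub
  attr_reader :load_count

  def initialize(id, cached_methods = {}, &loader)
    @id = id
    @cached_methods = cached_methods # e.g. { username: 'alex' }
    @loader = loader                 # called with the id on the first cache miss
    @delegate = nil
    @load_count = 0
  end

  def method_missing(name, *args, &block)
    if args.empty? && @cached_methods.key?(name)
      @cached_methods[name]
    else
      load_delegate.public_send(name, *args, &block)
    end
  end

  def respond_to_missing?(name, include_private = false)
    @cached_methods.key?(name) || load_delegate.respond_to?(name, include_private)
  end

  private

  # Fetches the real object once and remembers it.
  def load_delegate
    @delegate ||= begin
      @load_count += 1
      @loader.call(@id)
    end
  end
end

Friend = Struct.new(:username, :email)
database = { 'friend-1' => Friend.new('alex', 'alex@example.com') }

stub = SketchStub.new('friend-1', { username: 'alex' }) { |id| database.fetch(id) }

puts stub.class       # the stub reports its own class, not Friend
puts stub.username    # served from the cache
puts stub.load_count  # still no database access
puts stub.email       # not cached, so the real object is loaded
puts stub.load_count
```

Caching a value on the stub trades freshness for speed, so some way of invalidating cached methods is still needed when the real object changes.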
But ultimately, when we're displaying the movie, we just want the string. So you can start to do some of this here. If you know that pretty often you're going to need the user's username as soon as you bring the object back, you can cache it. You're going to have to develop some way of invalidating this cache, but ultimately you can cache the methods you know you're going to use the most, and those get initialized into the Aqua stub object. You can see here the method username is "alex", and this is the syntax for stubbing a method. The stub declaration can also take an array of symbols, so if you want to stub multiple methods, you just tack them onto the end of the array and they get added in. Then you reload, and that makes the friend object an Aqua stub instead of a Friend. You can ask, what's the class? And it says it's an Aqua stub. The username is "alex", and because that method was cached, it never takes a database hit; but as soon as you ask for something that isn't cached, it does. Okay.
The other thing you need to do a lot is files, right? Attachments. There are all sorts of schemes in the Ruby world for taking a file and saving it to your local file system. But CouchDB has the luxury of attaching actual attachments to your document. You can have a list of files attached to a document in CouchDB, and I decided that would be a useful thing. So anything that's a File in your Aqua object gets made into an attachment in CouchDB land, and back again. The ID is the file name.
So this is where things start to disintegrate a little. I had put about 10 days into this project when I wrote the RubyConf proposal, and it was really, "I want to see if this is possible, because I would rather be working this way."
It turns out you can make it happen, but it had to be completely rewritten for querying. I can talk about that a little later if people are interested. Essentially, as soon as you get beyond the packing and unpacking, things are a little sketchy, and part of the reason is that I'd really like it to be usable with multiple storage engines, and as soon as you do that, they all have different query syntax. You can say to any document store, "Here's a hash, do whatever you will with it," but it's harder to marry the different querying techniques. I'm still thinking about it, but the basics here work. If you reload, you get the object back with the things I've talked about: externally saved objects become stubs, or attachment stubs, file stubs, but otherwise it's the same object. User.load works in the same way.
Then we start to get into indexing, which goes back to CouchDB and how it works. People get very afraid of the whole map/reduce thing, and it's not that scary. Essentially the map creates an index, and the really cool thing is that the index doesn't have to be directly related to a field in the document. Say you had a user name that was a first name and a last name in an array of strings, plus a Mr. or Mrs., and what you wanted to search on is the full name: Mr. blah blah blah. CouchDB can do that in an index, and the index is an ordered array of values. You essentially have to create an index to search effectively. So to get this working with CouchDB, as a first step I wanted to create these indexes.
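The point that a map-built index need not correspond to a stored field can be illustrated with a small pure-Ruby simulation. CouchDB's own map functions are written in JavaScript and call `emit(key, value)`; the lambdas, document layout, and helper names below only imitate that shape and are assumptions made for the example.

```ruby
# Sample documents; none of them has a "full name" field.
docs = [
  { '_id' => 'u1', 'title' => 'Mrs.', 'name' => ['Pat', 'Lee'] },
  { '_id' => 'u2', 'title' => 'Mr.',  'name' => ['Sam', 'Jones'] },
  { '_id' => 'u3', 'title' => 'Dr.',  'name' => ['Lee', 'Wong'] }
]

# The "map function": it derives a key from several fields and emits it
# together with the document id.
full_name_map = lambda do |doc, emit|
  emit.call(([doc['title']] + doc['name']).join(' '), doc['_id'])
end

# Runs the map over every document and keeps the emitted rows ordered by key.
def build_index(docs, map)
  rows = []
  emit = lambda { |key, value| rows << [key, value] }
  docs.each { |doc| map.call(doc, emit) }
  rows.sort_by { |key, _value| key }
end

# Finds the values emitted under an exact key.
def lookup(index, key)
  index.select { |row_key, _value| row_key == key }.map { |_key, value| value }
end

index = build_index(docs, full_name_map)
index.each { |key, id| puts "#{key} -> #{id}" }
puts lookup(index, 'Mr. Sam Jones').inspect
```

A real view keeps this ordered structure on disk and updates it incrementally; the linear scan in `lookup` is only there to keep the sketch short.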
When you do index on username, it looks up that particular instance variable and creates a map function for it, and that map function, behind the scenes in CouchDB, in a place you'll never see, creates an index that you can now search. That means you can call User.query with username as the index and "kane" as the value you're looking for, and it returns an array of all the users with that username.
So there's a lot of what's-next, because as I said, this was a toy project when I submitted it, and I'm starting to get a lot more serious about it, because it does seem possible and it seems fun, and more fun than depending on MagLev to be there, well maintained, and affordable. For querying there's a lot to think about. Alex and I have been working on a criteria pattern for a search object at our job, and I've been thinking about that for this: how would you automatically generate certain criteria and pass them in? Because I haven't seen a good DSL for searching in Ruby, period. I think Ambition was really cool, but unfortunately it's not going to translate over to 1.9, which is important to me. It also goes into the parse tree and does all sorts of craziness, which seems a little heavy. I haven't tried it, but it seems like it could be a disaster pretty easily, so I'm still trying to come up with the right DSL. And of course, right now you can only search on first-level instance variables, which is not very useful. It has to go much, much deeper than that.
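As a rough picture of the index-then-query flow described at the start of this passage, here is a sketch limited to first-level instance variables, as the talk says the current version is. The names `SketchIndexing`, `index_on`, `query`, and the in-memory `store` are assumptions for illustration; the real thing would generate a CouchDB map function rather than a Ruby lambda.

```ruby
# Hypothetical mixin; each class keeps its own documents and key extractors.
module SketchIndexing
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Packed documents, standing in for the database.
    def store
      @store ||= []
    end

    def indexes
      @indexes ||= {}
    end

    # Registers a key extractor for one first-level instance variable.
    def index_on(name)
      indexes[name] = lambda { |doc| doc['ivars']["@#{name}"] }
    end

    # Returns every stored document whose indexed key equals the value.
    def query(name, value)
      key_for = indexes.fetch(name)
      store.select { |doc| key_for.call(doc) == value }
    end
  end
end

class Member
  include SketchIndexing
  index_on :username
end

Member.store << { 'class' => 'Member', 'ivars' => { '@username' => 'kane', '@email' => 'kane@example.com' } }
Member.store << { 'class' => 'Member', 'ivars' => { '@username' => 'alex', '@email' => 'alex@example.com' } }

p Member.query(:username, 'kane')
```

Querying on an index that was never declared raises a `KeyError` here, which mirrors the talk's point that an index has to exist before it can be searched.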
And then there's a whole other layer on top. You've got the storage layer, then this Aqua interface, and on top of that the possibility for all sorts of other mixins to do the things people have come to depend on from their ORMs, or whatever your database abstraction is. One of those is validations. An interesting thing I noticed about validations when I started using CouchRest is that when you're initializing something deeply nested, you know right then and there that something's gone wrong: this email address does not look like an email address. But there's no way to throw the validation flag at that point, so when you get to the end and you're ready to validate and save, you have to go through the deeply nested structure again, even though you already did that work. So in CouchRest I made a little couch express plug-in, and part of that was to solve this problem with staged errors. I think this could use the Validatable library and add to it, so that when you already know you've got a problem you can throw a staged error, and the staged error gets consumed by the validate process at the end.
The other thing is that there are lots of issues with these collection objects. You set up an instance variable and say, "I expect this to be an array," but what happens when you make a coding mistake and it's no longer an array, because instead of using your little angle brackets you just assigned with the equals sign? That can be a real pain to debug. So I'm conceiving, too, of collection classes that would do that work, along with the lazy loading I was talking about: as soon as one of the objects throws up an error and says it just got a method
missing, it takes the whole collection and replaces it, not with delegate objects but with the actual objects, and then you can just start using them. I think that would be a lot more efficient than hitting the database for each one, which would be a pain. The other thing is that you may want only some type of object in a collection, or you might want it constrained to a certain size. It's kind of up to the community at some point what's going to be useful, and I want to make these layers of niceness fairly easy to drop in.
The other thing I've been thinking about is an ActiveRecord conversion, because almost everybody who is saving stuff to databases seems to be using ActiveRecord, and if we can't make it easy for people to switch to these object designs, it's not really going to happen. So I made a conscious effort not to use save, and not to use the other things like find, all, first, because I think those can be added as a layer of sugar on top. It also allows an object to be both aquatic and ActiveRecord during a conversion, as an interim stage. Ultimately, though, I don't think that'll just work, because ActiveRecord isn't actually creating instance variables for each field, so there's more work to be done to get into that little enclave of protected data and convert it to an Aqua object. Some of that is design, like you're going to have to start adding these attributes to your design, and some of it is that you can create it and save it.
Hello? So the thing that initially got me really excited about this idea was not just serializing objects but also... Except I'm loud. So, like I was saying, the serializing of code is really interesting. Last year at RubyConf there was a presentation
about JobQ that essentially used Ruby2Ruby to serialize processes, then took them off of a list and ran them, and I think that's really interesting. But I also think, for object completeness, if you're able in Ruby to add singleton methods to an object, you should get them back when you retrieve that object from an object database, so that was really important to me. And that opens a whole interesting range of possibilities. Classes are objects in Ruby, so you can save a class, and that means you can serve software and not just data.
The Rails way is that you've got all these applications, but you're going to pretend it's one application with one database, even though your user data is really scaling differently than your CMS data. An administrative interface uses the same database as your consumer-facing application. So what if one of those applications was the one responsible for managing a particular class, and the other one just grabbed it from the database? That solves a lot of problems I've had with sharing code. Do I make a gem out of it so they both have the same user model? Do I have to use some prepackaged program? What if you're using objects that aren't part of a prepackaged gem; how do you share them between code bases? I think this solves that problem, because you save a class and then you can retrieve it.
The other thing you can start to do: if you have two databases, and one is a public-facing database that is your API and serves serialized Aqua objects, some of those could be classes. So when people use your API, they could request not just the data but the actual object, and that would cut down a lot on their coding time. They'd have
exactly the object that you intended them to have. So those are things I'd like to do.
The other thing that's come up at work is that we're instantiating a huge number of objects. (This is a day of bad audio.) On their own they're not taking up a lot of space, but the whole process of newing them and then garbage collecting them is taking a huge amount of time, more than I could have imagined. What we want to do is insert something between the database and the application that stores not memcached-style key-value pairs but actual objects, so that we don't have to new and delete all the time. I think that would be really possible, because of the way it's written, by adding a hook into commit: if commit knows there's a repository, it passes the object to the repository, and the repository decides whether to commit it or not. And vice versa, a find could know it's been configured with a repository and go get the object from there.
That's a summary of what I've got so far. I think I've said enough that this is uber, uber alpha stuff. I had hoped to have an application running on it by now, but with the way the pack mechanism was working, the way serialization was happening, it wasn't possible to query because of those externals and attachments, so I had to rejigger it all. So this is uber, uber alpha, but I think really fun. I think that's all I have, so let me take your questions.
Is there any danger of a name collision with other instances?
I did a lot of underscoring and double-underscoring things just to avoid that. And in this latest iteration, where I refactored everything, I ended up moving everything into a translator object, so there are actually very few of those left, and the ones that are left just have to be cleaned up so that they're in the translator. So I was worried about that, and I've made some efforts to avoid it.
Well, my goal is the actual state of an object, which is its instance variables and its initialization value, which is not really an initialization. I mean, the core values of an array, say, are not in instance variables; they're buried in some C thing. But my goal is that that's the state of the object, because it kind of is, right? My goal was to have it act like Ruby.
I did not actually use YAML or JSON directly. I thought about it and tried it, and it wasn't working, mainly because, while they all convert objects to hashes, you would have to dig into internals that might change to get the hash. The hash wasn't immediately accessible; what they want to give you is the string, so you'd have to do some crazy stuff. And once I really stopped and thought about it, serializing objects wasn't that hard. It breaks down to three things that are needed to serialize an object at this point (I think there are going to be singleton methods at some point): you need the class; you need the instance variables, which can be deeply nested; and you need anything required to initialize the object. With a time, for example, you need some way to know that it's a particular time and not just 1970 or whatever the initial value for time is.
Anyone else? Going once... Okay.
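The three ingredients named in that last answer (class, instance variables, and an initialization value) are enough for a round trip. The sketch below is an illustration under stated assumptions, not Aqua's translator: `sketch_pack` and `sketch_unpack` are hypothetical helpers, only primitives, `Time`, and plain objects are handled, and `Time` is stored as seconds since the epoch.

```ruby
# Turns a value into class + ivars, or class + init for Time.
def sketch_pack(object)
  case object
  when String, Numeric, TrueClass, FalseClass, NilClass
    object
  when Time
    # A time has no instance variables; its state is its intrinsic value.
    { 'class' => 'Time', 'init' => object.to_f }
  else
    ivars = {}
    object.instance_variables.each do |name|
      ivars[name.to_s] = sketch_pack(object.instance_variable_get(name))
    end
    { 'class' => object.class.name, 'ivars' => ivars }
  end
end

# Rebuilds the value from its description.
def sketch_unpack(data)
  return data unless data.is_a?(Hash) && data.key?('class')

  klass = Object.const_get(data['class'])
  return Time.at(data['init']) if klass == Time

  # allocate skips initialize, so the saved state is restored untouched.
  object = klass.allocate
  (data['ivars'] || {}).each do |name, value|
    object.instance_variable_set(name, sketch_unpack(value))
  end
  object
end

class Event
  attr_reader :name, :at

  def initialize(name, at)
    @name = name
    @at = at
  end
end

original = Event.new('launch', Time.at(1_000_000))
packed = sketch_pack(original)
restored = sketch_unpack(packed)

p packed
puts restored.name
puts restored.at == original.at
```

Resolving a class from a stored name with `Object.const_get` assumes the stored documents are trusted; a real system would want to restrict which classes can be revived.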