Hello. Hi. So I'll introduce myself: I'm Stewart, and I work for a company called Rackspace — big giant company, you've probably seen the logo; brilliant. I work on the Drizzle project. Drizzle is what we refer to as a database for the cloud. It's a relational database system. We originally forked off the now-defunct MySQL 6.0 tree, although it looks a lot different now, so it's best to think of it as an independent database with common heritage rather than as a branch of MySQL. Rackspace sponsors a couple of developers, but we actually have lots of developers across a bunch of companies, plus students and so on. So it's not a giant monoculture; it just happens that someone is willing to pay me money to hack on stuff I love.

So I wanted to talk first about some of the goals of the project: to have a modern relational database system for the cloud. What do we think about when we talk about Drizzle?

One of the goals was to make it much more pluggable: to have the database server be highly modular, so you can switch in the bits you actually want and take out the bits you don't. And if you don't like the implementation of, say, authentication, you can easily write your own. This was also about reducing the core size of the code base. If you ever look at the MySQL server code base, it is massive. In Drizzle we've now managed to get the kernel — the core of the database server — down to something a single human could understand. It's conceivable that one person could understand the entire core of the code, and improvements in code quality make that a lot easier.

We wanted to make it infrastructure-aware. As soon as you have people running more than one database server — well, one, two, three is quite easy for a single human to keep track of, and if you have different users you can just add them everywhere, so you don't really have huge amounts of infrastructure. But with larger installs, or when you have lots and lots of users, especially in a multi-tenanted environment, you end up with all these modules and bits of software that need different kinds of auth and different logging, and it starts to be a real pain in the ass. So we wanted Drizzle to be infrastructure-aware: if you have existing authentication systems, just plug into those; if you have existing logging systems, just plug into those, and actually become part of the general infrastructure of large systems instead of having to fit everything around the database server. It's a part of your infrastructure that does not dictate it. It is not the one great big database that everyone must worship; it is part of everything else.

A big thing we wanted was to be community developed. We didn't want a single hidden group of developers somewhere dictating everything; we wanted a large community around development and decision making. One of the interesting things when an RDBMS is used in cloud and large-scale environments is that you really want to listen to the users and the people administering these things — there's a grey area between developers and DevOps here. You really can care a lot more about what someone who's actually running this on a thousand machines is saying, compared to what I'm saying when I spend all my time in the code.
So there's really a point where you want to listen to what people are saying, and when they say "actually, if you just made these three changes it would save us a week or a month", that's something you really want to hear.

We really wanted to pay attention to multicore. It turns out that buying a single-core CPU is really hard now, while buying a box in a couple of rack units with a couple of hundred CPU cores isn't too hard at all. The obvious trend is that the commodity, fairly crappy systems you assemble in large numbers into your cloud and higher-level stuff are all multicore, so we need to use those processors sensibly. Our current big idea is that you do want to spread lots of different I/O across many CPUs, but you run a single connection on basically a single CPU, and you get your parallelism from having lots of simultaneous connections. This tends to work out fine: people generally aren't running a website where a query only running on a single CPU is a problem — CPUs are fast enough that that's okay. It also makes a whole bunch of locking a lot easier, because if you don't need mutexes around all the individual parts of executing a query, guess what: you never have lock contention there.

We also wanted to do concurrency properly. The fact is that a thousand connections to the database isn't that many; a thousand connections is common. So we will always take the option that makes a thousand concurrent connections faster instead of the one that makes sixteen faster. If you only have sixteen concurrent connections to the database, you're either not fully using your machine, or you're doing more analytic stuff — and that's fun too, but we have a web focus. We are not constructing the world's best analytic database engine for running a query that takes three days. We'd sure love to enable others to do that — and "enable others" doesn't mean we'll never accept anyone's patches; sure, write patches that turn it into a giant analytics engine — but we definitely have a web focus: modern web apps.

This is a nice one: it turns out that other people have already written linked-list implementations and hashing libraries and UUID libraries and command-line argument parsing, and we didn't need to have our own separate buggy implementations. Also, if the common code base was broken or performed badly, we could fix that, and then everybody wins. It wouldn't just be "hey, we've made our command-line option parsing really fast with this one extra feature" — we could do that for everything. So we make great reuse of existing libraries. This does mean we have some build dependencies, but, you know, that's what package management is for. If you're running OS X, that's your own fault; if you're running something Debian-based then it's really easy — apt-get build-dep, and oh my god, it works. For example, we're using the Google protocol buffer library for some lovely data structures that we serialize and deserialize — including for the replication streams — so it's really nice and portable and good fun.
But we also use Boost, and we modernized the code base. The MySQL code base was best described as written in "C plus" — sort of taking the worst aspects of C and C++ and combining them into one code base — which made sense back when no two compilers ever parsed the same thing. But guess what: C++ compilers are actually okay now. So we had to go one of two directions: do we make this an entirely C code base or a C++ code base? It turned out the easiest way to go was C++, and there were some benefits with that as well. There are some great reasons to use C and C++, and there are some horrible brain-damaged, stupid-ass things about them, but on the whole it's worked out fairly well. Even the Boost libraries are a fairly nice collection of stuff — there's some really useful, easy-to-use material in there; we're even using the command-line parsing, which is pretty neat. Of course C++ compilers are slower, but that doesn't affect end users; it just makes me complain, and it turns out the economic thing to do is to keep all your users happy.

So, some values of the team. We're very much into open and documented interfaces: we want network protocols to be documented and people to be able to write separate implementations. That being said, our network protocol client library is BSD-licensed, so no more worrying about whether you have to GPL the thing inside your client apps — just a BSD-licensed client library. Very easy, because no one wants another licensing discussion.

We wanted saner interfaces inside the server: to make it very easy to whip up extra things you might need, and very easy for ourselves to experiment when someone says "hey, what if it did this?" — and also to be able to argue that parts of the code are correct by having obviously correct code. Any step in that direction is great.

Transparency: IRC, mailing lists, everything on Launchpad. Having it all transparent and open means anyone can report bugs, problems, and suggestions, and it's a really nice place to be. Also, because fun is utterly important — important enough to get its own bullet point — I wanted to make sure it was a pretty decent community to hang around in, and it is. Collaboration with people: we of course want everyone to talk to each other and make it easy for people to contribute and participate, not a closed, gated-off community. And we really wanted to make it possible for people to build businesses around Drizzle — not just in whatever form we dictate, but to create extra bits on top of it, use it as a server, run database-as-a-service offerings, deploy it in clouds, and build businesses around it.

So why did we start Drizzle in the first place? We wanted to rethink everything: how do we do databases, how do we implement them, how do we manage this kind of thing? We didn't want to play catch-up with the rest of the world; we wanted to leap forward. So, for example, we do not take the option of "oh, but that will be slow on 32-bit systems" — about the only 32-bit system you can still buy is a phone, and funnily enough people aren't building clouds out of phones. Lots of RAM: we can use RAM for a lot of things instead of ever hitting disk. UTF-8: we do not do EBCDIC, we do not do weird-ass character encodings. Once it hits the database it is UTF-8, and it is valid UTF-8.
And that's up to 4-byte UTF-8 currently, which is pretty neat. So basically data is either binary, or it is text, and text is UTF-8. Everybody wins.

Modular, of course, was our thing: lots of different pluggable bits. That means we have the same pluggable storage engine interface, and I have done a lot of work making it a lot more sane, so that bugs are a lot less likely and much more likely to be spotted. We have some features developing around that. One of the things I'm working towards is transactional DDL, which will be über-cool: being able to do multiple DDL operations and roll them back, including improving the interface so you could actually have crash-safe DDL. MySQL does not have crash-safe DDL. I've fixed it so you can actually pull the plug at certain points during DDL operations and you are guaranteed to come back. That's a thing engines can provide, which is kind of cool.

SQL functions are very pluggable — it's been like that in MySQL for a while too, but we've switched it so it's the same API whether a function is built in or a module, so you can easily choose whether to compile functions in or load them as a module. We don't do runtime module loading; it's restart-the-database-server. It turns out that the locking structures around which plugins are loaded and what's being executed get really easy if you only ever load at startup or shutdown. So we save locks.

Very modular replication system. The big thing is that we're now nearing what we would call a GA release, a stable release — I'll say next month is the timescale for that; we've been in beta for a while. One of the things we did early on was examine the replication code and ask: do we want to keep this? We'd been ripping out a whole bunch of stuff that didn't quite work properly, and we looked at the replication code and went "no". We could start from scratch and implement something better on top. So we do row-based replication, which means we do not log SQL statements and apply them as replication. The naive way is statement-based replication: you have this bunch of SQL statements that you ran on the master, you save off that list, you run them on the slave, and you get the same data. No. It turns out it is actually impossible to get that right — like, really impossible. You sort of make it work by doing a bunch of tricks: the replication log has to record the random number generator seed wherever a random number was generated, it has to record what date and time everything ran at, you have to make sure the SQL mode is right and that things execute in the right order, and then you actually have to do different, stronger kinds of locking inside the database just so statement-based replication works. That's wholly inefficient: we don't want more locking, we want fewer opportunities for lock contention. So row-based replication it is. This is obviously correct.

It's a very modular architecture, in that the code that writes the transaction log just hooks into common points inside the server: after a row has been written, updated, or deleted, call all the modules that want to hear about that, and they can do something with it.
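To pin down that non-determinism problem from a moment ago, here's a minimal sketch — the table and values are invented for illustration:

```sql
-- Statement-based replication replays this SQL text on the slave, but
-- RAND() and NOW() evaluate differently there, so the data diverges
-- unless the log also ships the RNG seed, timestamp, SQL mode, etc.
INSERT INTO tokens (id, value, created_at)
VALUES (1, RAND(), NOW());

-- Row-based replication instead logs the row that was actually written,
-- e.g. (1, 0.7391, '2011-01-25 10:14:02'), which replays identically
-- on any slave, no tricks required.
```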
So you could write a plugin that, I don't know, instantly tweeted each row. I'm sure the Twitter guys would love that. Someone once wrote a Java class that did logging, but to Twitter, and people were following it from their phones; according to the Twitter guys it was something like 84% of their traffic at one point — people getting paged with log messages. So you could do that.

The stuff that writes the replication log uses a common API, and it's a very easily parsable replication log. There is no dependence on drizzle-to-drizzle replication: you could quite easily replicate to any other system you want. A simple way is to transform the log back into SQL statements — not necessarily the original SQL statements, but ones that would have the same effect. You could replicate to Drizzle using some kind of native mechanism, or quite easily transform it into something you could replicate to Postgres, or in fact to a system that does not speak SQL at all. As long as you can make the target look vaguely like a relational database — as in, you can work out where to put tables and rows — you can replicate to it, which is kind of neat. Even CREATE TABLE currently works in the log, and DROP. We're going to have ALTER in there as this wonderful big data structure, because it turns out that with ALTER TABLE operations you think "oh, that's easy, just store the before and after description of the table". Yeah, that doesn't work. There are ALTER TABLE operations in SQL that break that: in a single ALTER TABLE statement you can drop a column and add a column with the same name. If all you stored was before and after, did the column change type and preserve the data, or was it dropped and re-added? I'll sketch the concrete case in a second. So you really want to be careful around stuff like that. Modular replication is going to be very cool.

Logging: of course, one of the earliest plugins we implemented was log-to-syslog. Oh my god, logging to syslog — it just makes sense. There's also stuff to log to Gearman, which is a job queue system, or to log anywhere else you want, which is pretty neat. Or to standard error — the really simple one, which I think actually took longer than the syslog one.

Authentication: why would you want yet another copy of your entire user database inside the database server? Just link to your existing one. There's auth_http, which actually does HTTP auth requests to decide whether you can log into the database — kind of cool, and exactly what someone wanted. There is of course auth_pam, so if you just want to authenticate people to the database via LDAP you can use PAM and you're done — or use the login password on your laptop, right? Yeah, I hear both people who know how to configure PAM properly are really into it. That's an area that's not fully fleshed out with many plugins yet, but it's certainly going to get there. We ripped out the whole system initially and said: we're not sure exactly what we want here, but we're pretty sure we want it to be pluggable, and not what's there now — because otherwise people end up with users replicated everywhere.

One of the big things we're doing is multi-tenancy as well, which I shall talk about soon. And much of the protocol is pluggable: there's no reason someone couldn't write an HTTP/JSON REST interface. We currently have a MySQL-wire-compatible protocol and our in-development Drizzle network protocol, and we'll probably see memcached-style ones and HTTP-style ones as well, all accessing the same database, which is kind of cool.
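Here's that ALTER TABLE trap concretely, with a hypothetical table:

```sql
-- One statement: drop a column and add a new column with the same name.
-- Before: score VARCHAR(10).  After: score INT.
ALTER TABLE players
  DROP COLUMN score,
  ADD COLUMN score INT;

-- A before/after diff of the table definition looks like a type change
-- (convert and keep the data), but what actually happened was a drop
-- plus an add (the data is gone). So the replication log has to record
-- the operations themselves, not just the end state.
```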
You can plug into certain bits of the parser, which is kind of cool.

Virtual tables: information schema and data dictionary. We have two databases that don't really exist that you can query about what tables are stored and so on. INFORMATION_SCHEMA is the SQL-standard information schema. If you write a query in Drizzle against the information_schema schema — or database, to avoid repeating the word too many times — it will be able to run anywhere that implements the SQL standard. We have no extra stuff in the information_schema database: if you write a query that only touches information_schema, that is a portable query. Oh my god. DATA_DICTIONARY is where you get all the internals and all the extra information we have, so you pretty much know that if your query touches data_dictionary it is dependent on Drizzle, and we could change it in a future version. But you get all kinds of cool internal things there: internal transaction records, what locks are being held by what, all that lovely deep-dive stuff where you go "oh, the lock's being held at that source line — brilliant". I'll show a quick pair of queries in a moment.

We also have a plugin point called Relevance, which gets at this really cool idea: you may not want to store your blobs inside the table on that machine. Maybe you want to store all the blobs off in a cloud storage system. There's actually a blob streaming plugin that works this way — it'd be great if people tested it more — where it hooks in before the row is written to the database, rips out the blob, and replaces it with a URL, so you can easily store blobs off in other places, or modify them, or do whatever evil stuff you want. Plugin points, not hard-coded features, right?
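Here's that portability split in two sketch queries — note the data_dictionary table name is from memory and may differ in a given Drizzle version:

```sql
-- Portable: touches only the SQL-standard INFORMATION_SCHEMA, so it
-- should run on any conforming database, not just Drizzle.
SELECT table_name
  FROM information_schema.tables
 WHERE table_schema = 'my_app';

-- Drizzle-specific: DATA_DICTIONARY exposes server internals and can
-- change between versions. (Table name here is illustrative.)
SELECT * FROM data_dictionary.global_status;
```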
So what are the key features? The big one: of course, we are transactional by default. If you type CREATE TABLE, you get a transactional table. We keep MyISAM as a temporary-only engine currently, so you can create session-only temporary tables — mainly because the SQL executor will create temporary tables to answer some queries, which we will hopefully make into a plugin point sooner or later. If anyone is really interested in the deep-down details: you'd think it was implemented the way you'd normally implement this, with a function called create-table where you specify what you want, which is called internally as well — the parser feeds into it, and if a query needs a temporary table you construct the data structure and call the same thing. Yeah, that's not how it works. It constructs the data structures for the open table directly, and it's really hairy to pull apart, so getting rid of MyISAM completely has taken some time. But: transactional by default, which means the InnoDB engine.

Q: How far along is the row-based replication? A: No, that's all solid. Our QA guy can now no longer crash it or get different results; we're looking for new and interesting ways to corrupt the replication log. Our big thing, before we built huge slave infrastructure and fancy stuff, was that we should be able to validate it. We have this tool called Randgen, a random query generator: you specify a lex-like grammar of possible SQL queries, then the tool runs a whole bunch of processes — like 64 — and randomly generates queries that would be parsed by that grammar. When the guy who originally wrote it first ran it against the MySQL server, someone asked "are you going to verify the results of the inserts and selects?", and his reply was "when I can stop crashing the server within five SQL queries, I'll think about it". So you can construct horrible things with it. And we're actually at the point where, doing random operations — more than anyone would ever run by hand — you can have this running for hours and there's no difference between the replication log and what's actually in the database. So we're pretty solid on that. It does transaction boundaries properly, it does savepoints properly, it does rollback of individual statements properly, and all sorts of fun stuff like that, including huge transactions — and without consuming all your memory if you decide to update two gigabytes worth of rows. There were some really good, fun, subtle bugs along the way.

So we use InnoDB as the transactional engine, and we have other transactional engines in the tree — PBXT is in there as well — so there are options. Plugins work, of course. Logging: I mentioned this — anyone who has tailed MySQL's logs understands why it's annoying. Replication: I've mentioned lots of good plugin points for that; we have a simple one that goes to a file, and stuff that can send replication messages directly to something else. Currently we're working through making the Tungsten Replicator work — it's a replicator, sort of a filtered replicator, whose idea is that it replicates from any database to any database; the Drizzle part of that is probably the simplest one of all, since it doesn't involve parsing the MySQL binary log format or anything silly. So that's kind of fun. Auth: mentioned PAM.

Strict mode by default: we don't have an SQL variable called strict mode; it is strict mode by default. We do not accept bad data.
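A quick sketch of what those defaults add up to — behavior as described in the talk; the exact error messages are illustrative:

```sql
-- Transactional by default: no ENGINE clause needed, you get InnoDB.
CREATE TABLE t (n INT, e ENUM('a','b'));

-- MyISAM survives only for session-local temporary tables:
CREATE TEMPORARY TABLE scratch (n INT) ENGINE=MyISAM;  -- OK
-- CREATE TABLE bad (n INT) ENGINE=MyISAM;             -- rejected

-- Strict by default: bad data errors out instead of being mangled.
INSERT INTO t (n) VALUES (5000000000);  -- error: out of range for 32-bit INT
INSERT INTO t (e) VALUES ('c');         -- error: not a member of the enum
```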
You cannot insert invalid UTF-8 into a, you know, VARCHAR field — you get an error. You can't insert something into an ENUM field that is not in the enum. You cannot have truncated data: it's not like inserting, say, 5 billion into a 32-bit integer column just silently goes ahead — it throws you an error. So that's kind of cool.

One of the cool features we also have is the filesystem engine, which was actually a Google Summer of Code project. It's basically designed to be what the CSV engine really should be: you have some files that are structured, kind of, ish, that you'd like to read as a database table — because it's easy to run SQL queries across 100 machines and collate the data. It reads files given what rows are separated by, what columns are separated by, and whether there are titles. You can read /proc/meminfo as a table; you can read /etc/passwd as a table, because that'll be great; any of the /proc things you can read out as database tables instead of having yet another way to gather stats — if you're already gathering stats through your database system, you can just read those. I even use it for analytics stuff and the CSV files people send me, because spreadsheets scare me. There's a sketch of what defining one of these tables looks like below. If you wanted to turn such a file into a normal table, you can easily create two tables — one backed by the CSV-ish file and one in the other format — and just INSERT ... SELECT between them. If you wanted to run this engine inside MySQL, that would require code porting and hacking, because the interfaces around that part are actually quite different. But it's pretty easy to use.

You can also just run it without actually starting up a server. The network protocols are pluggable, and we have one called console which doesn't start a network server port at all: it just starts an interactive session on the console for one user. You point it at a data directory and run stuff — yeah, exactly like SQLite does, except you have a much more heavyweight database that you're not really using. Kind of interesting, but handy. I've been meaning to write just a little shell script that shortens what you have to type, because I get sick of it. We also recently got the MySQL Unix socket protocol working as well, just for giggles.

We have data migration tools. drizzledump is like the turbo-accelerated version of mysqldump: it can actually suck data out of a MySQL server and translate the column types we don't have into something we do support — appropriate time types, for example. The MySQL TIME type supports negative time, which is not in the SQL standard; our TIME type actually follows the SQL standard, so we translate into integers and things that don't eat your data. drizzledump will suck data out of a MySQL server and transfer it directly into a Drizzle server without having to dump it to disk first, which is kind of nice. So data migration from MySQL is pretty easy.
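Here's the filesystem-engine sketch promised above. I'm paraphrasing the table options — the actual plugin's option names may be spelled differently:

```sql
-- Expose /etc/passwd as a read-only relational table.
CREATE TABLE passwd (
  username VARCHAR(64),
  passwd   VARCHAR(8),
  uid      INT,
  gid      INT,
  gecos    VARCHAR(255),
  home     VARCHAR(255),
  shell    VARCHAR(255)
) ENGINE=FILESYSTEM
  FILE_PATH = '/etc/passwd'   -- option names illustrative
  SEPARATOR = ':';

-- Now ordinary SQL works against the file:
SELECT username, shell FROM passwd WHERE uid >= 1000;
```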
Multi-tenancy is the cool new thing we're working on. You may have heard that people like to have many users with independent databases on the same MySQL install — shared hosting is one case, and if you're doing cloud-like stuff you'll have a lot of customers who don't use the capacity of a machine, they use a tiny amount of it, or who only want to grow in small increments. So there is a thing in the SQL standard called catalogs, which is basically a separate namespace above the ones you're used to — databases and tables. A catalog is one level above: each catalog has its own set of schemas, and each schema has its own set of tables. So if you give someone their own catalog, they can write CREATE DATABASE or CREATE SCHEMA to their heart's content (sketched below). Compare the current way: if you're doing multi-tenancy with MySQL and you let people create new databases, you have to build some new abstraction on top that prefixes names with their username or something, and make sure they can't screw with you — or you say "you get one schema" and hope all the apps they want to run don't have conflicting table names. So we have catalogs as a layer across all of that. You don't run SQL queries across catalogs — you can't join between catalogs; they're largely independent.

We're looking at stuff, for example — in the next couple of months there's probably something I'll have to implement — to allocate each catalog a certain amount of the buffer pool. So as an offering you could say "this user is guaranteed 100 MB of cache" instead of having them fall off the end because there's one user on the box who uses a lot; you get more guarantees around that. Same with CPU time. And, for example, you have a catalog and you can choose whether you want 10, 100, or 1000 simultaneous connections supported, and price it accordingly — or know when someone needs a bigger one, so you migrate them to a bigger machine, or multiple machines, that kind of thing. We're getting all the basic infrastructure in place now, and then it's adding on these really cool things; but there's a lot of base work to get done to do this properly, and to specify how the protocols handle it nicely — whether catalogs all run on different ports, or something easy like that.

Another feature which I love: oh my god, testing. We're really insistent about not killing performance with subsequent commits, so we have Hudson building things and running benchmarks on dedicated benchmark machines and graphing the results over time. We build everything with -Werror and a whole bunch of compiler warnings turned on — the flag list is huge. We want as much testing as possible. We're cleaning up a bunch of APIs, and one of my favourite pastimes is writing new storage engines that just return weird stuff to the upper layer, to exercise specific error conditions in other parts of the upper layer — we want to test error conditions to make sure the error handling works. Plus a larger regression suite so you don't reintroduce bugs, more code coverage, that kind of thing, as well as running the random query generator, which is this crazy awesome piece of work, alongside the standard "run these five commands and see what breaks" stuff. Open source database testing — the current state of the art we have is probably at least ten years behind Microsoft's; we're doing now what they were doing at the end of the '90s. All open source databases are a bit behind here, but we're trying to catch up and work smarter, not harder: all automated systems, because there isn't an infinite number of people.
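Here's the catalogs sketch. The syntax is illustrative — catalog support is still being built, so take this as the shape of the idea rather than final syntax:

```sql
-- Give a tenant their own catalog: a private namespace of schemas.
CREATE CATALOG customer_42;

-- Within their catalog the tenant creates schemas and tables freely,
-- never colliding with any other tenant's names:
CREATE SCHEMA app;
CREATE TABLE app.users (id INT PRIMARY KEY, name VARCHAR(100));

-- Deliberately no cross-catalog queries: you cannot join between
-- catalogs; they are largely independent.
```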
We have some pretty cool stuff coming in the future — catalog support, that kind of thing — and a lot more, cooler interfaces to make it nice to work with. But we're getting to the point where it's actually pretty solid: I would say you can run this and it won't be a problem. The replication log is getting solid, which means we can start making all the slave-apply stuff really, really nice, and move towards things like transactional DDL. It's pretty solid now, and we're going to have a GA release where we say: yes, you can run this in production and you're not completely insane. We've managed to do tarball releases every couple of weeks, people have been using them, and that's been fairly solid — not a huge number of regressions coming in, and the bug inflow has been fairly steady, which is pretty nice. So it's kind of cool that we have something coming up that will actually be pretty solid, and the future looks bright, we hope. Cool — do we have any questions?

Q: In terms of those pluggable protocols, are you looking at anything to make it easier to write drivers? I think Jeff's doing a talk later about Node.js, which is this new thing where one of the problems is they have to write their own drivers, because Node is completely asynchronous and the standard MySQL driver is not, so they can't just plug it in. Would it be easier to have a new protocol, so that when the next new thing comes along it's not hard to create a driver for it?

A: We solved the driver problem one way by also speaking the MySQL protocol, so everyone else has already written drivers for us. Doing more advanced things is the next step, which is part of the work on the Drizzle protocol: being able to do things more asynchronously, which you can already start to do with the client library — the client library is a lot faster and more advanced. There are things planned for the future that will require people to sit down and do fancy work, but for the moment we've kind of got it covered; in fact, speaking someone else's protocol already makes it easier, because the software already exists. But yes, there will be stuff in the future that's interesting for async, and if someone writes an HTTP protocol, that will be interesting too.

Q: At an SQL level, how compatible is Drizzle with MySQL? Will your average MySQL app that's not doing anything particularly tricky just work, or is it going to need some hacking?

A: We don't do SQL triggers, views, or stored procedures — that's the big set of things in MySQL that we don't have. If you're not using those, and you're not, for example, relying on data being truncated and stuff like that, then you're okay; it's basically the DDL that has to be migrated, and the drizzledump tool can do that for you. The main differences are around DDL, but we've tried very hard to have SQL-level compatibility — except where MySQL is wrong, in which case we will throw an error rather than truncating your data; that's the main difference. We've seen, for example, WordPress: take a MySQL WordPress install, run the drizzledump migration, and WordPress will still run against it. If it starts issuing CREATE and ALTER TABLE statements, that can get a bit more interesting, but it will run like that. One of our guys runs his blog that way — I'm not nearly game enough to screw with something that works. So we're fairly compatible at that level.
Q: Does Oracle have much to do with the day-to-day development of Drizzle?

A: Not at all. One or two people worked for Oracle for a very short amount of time, but Oracle has nothing to do with Drizzle.

Q: Are there plans — I know Rackspace is already doing some stuff with offering software as a service — to do the same for databases? With this whole cloud thing it would be really nice not to have to run your own databases: just have a little console that says "I want one of these", or X number of these.

A: That would go a great way towards justifying my salary. There are a number of things around that which Rackspace is looking at, including using Drizzle internally and for running large clouds and stuff like that. So there are several things that will come out in time — it's awesome.

That's probably it. Thank you.