Just a quick plug for my employer, who is very kindly paying me to be here today. I work at VMware, and we're doing very, very cool stuff. Anybody that wants to learn about this cool stuff that we're doing before August, please come talk to me afterwards and maybe we can see about getting you on our team. What I'm working on at VMware is not this. It's not even closely related to this, but it's still pretty cool stuff.

So I'd like to talk about this. I don't know if you can see it, but there's an elephant there riding on a parrot. The woman who is now my wife, actually as of last Saturday, made this image for me and I really appreciate it. Thank you, Sarah. We have here a blue elephant that's Postgres-ish and a parrot, and we're trying to bring them together.

Normally you don't think of having bleeding-edge features on a relational database management system. It's just not how we normally think of these things. We think of them as sort of stodgy and old-fashioned and maybe outmoded, but this is Postgres. This is not stodgy and old-fashioned. So we do have these cutting-edge features, and I'd like to start by thanking the people to blame for this new feature.

That guy in the middle, Jonathan Leto, has been blazing a path through the Parrot space, and he was the person that turned my wacky idea of embedding Parrot in PostgreSQL into actual code that actually executes. This is the really important difference between the visionary, that's me, and actually having something happen. So he's the person that we really owe the credit or the blame, whichever way you want to look at it.

This guy, Josh Tolley, anybody familiar with him? No? Okay, so he is the person responsible for embedding LOLCODE inside Postgres. You can now run stored procedures in LOLCODE inside Postgres. Is anybody familiar with LOLCODE? I see a few hands. Does anybody hunt over? Nobody's copping to it.
The guy on the right is David Wheeler, my colleague at PostgreSQL Experts, a consulting company in which I hold a stake. He did the pgTAP tests, which should be familiar to those of you that write Perl. Anybody familiar with TAP? That's a few hands. So TAP is the Test Anything Protocol. It turns out it's not Perl-specific, and it's handy for lots and lots of different, let's say, test-centered methods of development. I really wouldn't call any method of development test-driven. See, tests don't up and come up with a set of requirements and start writing software. So tests don't drive development and never will, but you can have tests, they're handy to use when you're developing software, and you should never omit them.

Down there in the middle, that sort of dark picture is of me. I'd also like to thank Daniel Arbelo Arrocha, whose picture I was not able to get in time for this talk.

So Parrot is a virtual machine for dynamic languages. It's register-based and it's really, really cool. It's pluggable, it's interoperable, it's dynamic.

PostgreSQL is, strangely enough, the world's most pluggable database. You know why that is? So the answer was that you can incorporate Perl scripts in PostgreSQL stored procedures, which is true, but what I had in mind was the fact that Postgres hung out at the University of California at Berkeley for many years, being the host of people's PhD theses. So it was a place where you could test out your new ideas at every level of a database management system, however strange. The result of this was an architecture which is pluggable at every level, at least in theory. There are a few parts that have been welded together in the interest of performance, or just out of a lack of interest in actually plugging something in there. But that's how it got to be so pluggable.

It's also extremely standards compliant. As far as I know, no database management system in production is more compliant with SQL:2008 than Postgres.
That includes DB2, DB2 being sort of the place where SQL really got started. And of course PostgreSQL is why I'm here, because if it weren't for Postgres, none of you guys would have heard of me. As it is, only some of you have.

So why are we doing this? Well, when you're going to embed a language in Postgres, the process is a pain in the neck. A really, really serious pain in the neck. There's a lot of repetitive work that goes on. There's a lot of cut-and-paste duplicate code that goes on. And just generally, as a lazy person, it offends my sensibilities that all this has to happen every time someone wants to plug a language into Postgres. The process should not be so arduous.

One goal I'm aiming for with this project is for you to come up with a language design that you're interested in in the morning. Well, okay, so maybe you took longer than that to design the language, but you choose it in the morning, you look at it over the course of the day, you plug it into the Parrot system, and by the time you log out for the evening, it's done. That's what I want.

In order to get this, we're making a PL toolkit. That's what PL/Parrot is intended to be. It's probably not going to be as fast as, say, a PL/LLVM; you could imagine such an embedding, and maybe that's actually a better idea for the high-performance kind of applications. But it's meant for rapid prototyping in the real sense of actually being able to knock out a prototype quickly and have it work.

So for just a moment, I'd like to digress into something a little philosophical here. What we're actually doing is what I call the anti-cloud. In the cloud model, you disperse your data and you disperse the computational resources that you apply to the data, all around a network. What's happening here is what we used to call active databases.
This was one of the great debates of database management back in the 70s and 80s. On the passive side, you basically had the idea that the data store was to do nothing except store data. On the active side, you had this idea that you were going to have this enormous hunk of data, and then right close to it you were going to put computational resources and actions and things that went on with it. One of the first outcomes of this was a performant foreign key implementation. Another thing that came out of the active database idea was triggers, which we now take for granted as being something that you can do. Stored procedures were another thing that we now take for granted that you can just do in databases. Nobody really thinks of that as revolutionary or innovative, but it did come from this philosophical split.

Okay, one more little bit of philosophy. This one is a little gem from the Ruby community. Anybody want to guess what that is? Don't repeat yourself. Don't repeat yourself. I just can't emphasize enough how much of a pain in the rump it is to make a PL in Postgres right now. The documents themselves say something along the lines of "insert several thousand lines of code right here and then you'll have a new PL." Well, that's not the right answer.

Another thing we'd like to be able to do is write things in PL/Perl 6 and then call them from PL/Python. Does anybody notice anything strange about that second language? No? Okay, it doesn't have a capital U at the end of it, because right now the embeddable Python implementation is one that can only be written by the Postgres superuser. This is because the Python that we're using to embed in Postgres can't be limited in its extent. It can't be prevented from opening file handles or pipes or other things as the Postgres system user, and sometimes you want a little bit more of a sandbox than that provides.
So what we'd like to do is sandbox the virtual machine in which all these operations occur, and then pretty much by magic we get a trusted PL/Python.

Okay, so how do we get there from here? First things first: we're going to do a PL/Parrot Intermediate Representation. Anybody familiar with PIR? Okay, that's one more than usual. PIR is sort of the assembly language for the Parrot virtual machine. Only, unlike an actual assembly language, you can do some pretty high-level stuff in it, and it's really cool. It's actually kind of fun to program in; it brings you back to the time when you may have gotten really close to the machine. In this case you're getting close to the virtual machine, and the virtual machine does a little more, so being close to it lets you do powerful stuff.

So our first challenge is embedding Parrot. Right now the Parrot embedding is a little bit in flux. You need Parrot master and PostgreSQL master. Well, not so much anymore: it'll actually run on PostgreSQL 9.0, so that's a good thing. But if you really want some of the more cutting-edge features, we'll probably be tracking Parrot master and PostgreSQL master.

This is one of the very strange things about working in open source or free software (yeah, I know they're not exactly the same thing, but close enough): you get to work with cutting-edge projects, and this has upsides and downsides. I'll talk about the upsides right now. In the process of getting this embedding done, we found three or four bugs in Postgres, fixed them, and got the fixes into the Postgres code. So that's the upside. The projects can play off each other and improve each other. There are downsides. I'll talk about those later.

Okay, so that was the embedding thing. We did manage to get PIR embedded, so now there's an assembly-language-like thing that you can use to write stored procedures in.
Well, okay, as cool as PIR is, most people will not want to write code in it, so we need a high-level language. Some of the high-level languages that run on top of Parrot right now would be Tcl, Python, Ruby. There's a project that's supposed to be doing Perl 5. There's Perl 6, which is what Parrot was originally designed for. So those are some of the HLLs that we should be able to get in relatively simply. The HLL API is a little bit rough, but we do actually have one. Well, actually it's PL/Rakudo, not PL/Perl 6. Does anybody here care about the difference? No.

Okay, so we have HLLs. The next thing we need to think about is how to marshal data in and out of Postgres. Now this is a little bit more complicated than it seems, because PostgreSQL has this amazing type system. I know of no type system in relational database management that comes close to the flexibility and the power of Postgres, but what that means is that there isn't just a list of types that you have to support if you're really going to have Postgres support. You have to have some way of creating new types in your language binding. As of now, we support the basic data types of text and the numeric ones. We're working on the time types, and we need something to fall through to, which is bytea, or a blob of bits, which has to be the fallback mechanism for all of these types. It's at this point that I would like to beg for your help. If you're interested in this sort of thing, this is a great place to get started on PL/Parrot.

Okay, that's the marshaling thing. Well, I've gone way too long here already without showing you some actual code, so I'm going to do that right now. This is PL/PIR. What we do is we create this function. It has a name. It has a type input. Oops, it's back there. Okay, so it has an input type and an output type. It has a language. Then we say AS, and then we transition from one language to another language.
This is one of the things that I think is really, really interesting about Parrot: this idea of formalizing the transition from one language to a different language. Anybody that's written an interpolated double-quoted string has sort of done the same thing. You're going from one language context, and that double quote mark is telling you that you're moving to a different language context. But it's so small and so simple that it's easy to miss that this is what's going on. Parrot has made this explicit, and I think there are some interesting possibilities as to what kinds of things one could do with this in the future. Anyway, I'll talk a little bit more about that later.

So we have .param num x, add five, .return x. Not super exciting code, but you can imagine what the result is going to be there. Here's another one that just handles strings directly. Now, what I dimly recall of assembly, which is what this really is, didn't really have a concept of strings per se. Maybe I wasn't quite thinking of it the right way. Okay, so that's PL/PIR.

Now, there aren't too many small children in the room, so that's good. I've got a little scary thing here. Can anybody decipher that? That is Perl 6. And it's really important to comment code that looks like this if you absolutely have to write it. Don't do what I have just done.

Okay, so those are the languages we have. That's what they look like in practice at the moment. PL/Perl 6 is a little bit broken, because the Parrot project was running at full speed and the Rakudo project was running at full speed and they weren't quite synced up together. So if you try to create this function, that'll work fine right now. But if you try to execute it, it will peg the CPU and eventually crash your Postgres backend. You won't lose data or anything, but it's kind of annoying. That's one of the things that I would like to get your help fixing.
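The add-five function read out above does not survive in the transcript itself. As a rough reconstruction only, assuming the language is registered under the name `plpir`, that the handler wraps the body in a PIR sub, that `float8` maps to a PIR `num`, and using the made-up function name `add_five`, it might have looked something like this:

```sql
-- Hypothetical reconstruction of the slide; the function name and the
-- exact type mapping are assumptions, not taken from the talk.
CREATE FUNCTION add_five(float8) RETURNS float8
LANGUAGE plpir
AS $$
    .param num x      # the SQL argument arrives as a PIR num
    x += 5            # in-place add, the "add five" from the talk
    .return(x)        # hand the result back to Postgres
$$;

-- Calling it is ordinary SQL:
SELECT add_five(37);
```

Everything between the `$$` markers is PIR rather than SQL, which is the explicit language transition described above. Running it needs a server with PL/Parrot installed.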
Another thing we need to work on is access control. We have some idea of what sort of controls we want to put on access, and we have at least the ability to deny direct access to the file system. So at least that kind of attack is not launchable through a trusted PL/Parrot implementation. We would like configurable controls, some more control over the network, such as opening up network pipes, and of course some sorts of tests for this kind of access control.

We have, as a project, actual PLs, TAP tests from pgTAP, a Git repo, and an issue tracker. Maybe someday Postgres will have one of these. And a Freenode IRC channel; thanks, Freenode. We also have packages for Debian and Fedora, and I don't know what other OSes, but there are packages, so you can just use the packages and that's it. We also have lots of enthusiasm. This is a very important part of any ambitious project. If you're bored with it, there's no way it can ever succeed.

So, here's what we plan to do soon. We plan to get things back working again. Some better argument passing, some more data type marshaling, just to cover a few more of the built-in types, and lots and lots more tests.

What we'd like: some better sandboxing than we have now. It's a little bit ad hoc, and ad hoc sandboxes are great for playing in, but they're not so good for access control. More HLLs: if you have a favorite language that you want to see in Postgres and just want to make a name for yourself, this is your opportunity. That of course leads us to more developers, and of course users, because unless the thing is out in the world it's kind of a toy.

Let's see here. How can you help now? Well, I like to put in something really concrete, and I found one just this morning while I was reading over the source code again. Basically, there is a thing in the source tree that's called plparrot.c.
It contains two languages which kind of have to be loaded together, and that's a modularity violation that would not be hard to fix. It would give you an entry point into the source code of plparrot. Another thing we could use some help with right now is ensuring, through the Postgres dependency system, that Parrot-based PLs explicitly depend on PL/PIR and cannot be loaded without it. Then I'd like to see about expanding PL/PIR's scope, so that when you start to build things on top of it you're not finding yourself needing to build extra features in order to get it to actually work with the database. Of course, you can go to GitHub and take a look at our issue tracker.

Into the future. This is where we begin to go into things that are a little speculative. I'd like to see about tighter Parrot integration in Postgres. There's some little licensing issue, but I think we could get over that. One of the things that Postgres has is an amazing SQL engine. It's so amazing that we actually managed to break Bison at one point because the grammar was too big. For a while there we actually had to use a patched version of Bison in order to build the grammar. I think this is maybe a sign that yacc is starting to be a little too small for what we're doing. I'd like to see a way to have the SQL handling done through Parrot, because Parrot is really built for this sort of thing. You have a large grammar, and you can pretty much get an executable out of it very quickly with Parrot.

Another thing that I would like to do is have this transition between SQL and PLs, and among PLs. This is the kind of thing that I think Parrot would actually be very good at, as far as handling that transition in and out, cleaning up after itself, and doing all those fun things.
Then of course there's the all-important stuff you create, because I can't think of everything, and the more people we have contributing and creating and criticizing and just generally making the thing better, the better it'll be. You can find more info at pl.parrot.org, and it's at this point that I'd like to open up the floor for questions.

Yes, hang on a second, let's get the mic. It's on? It is.

You need people to work on date time support. I try to do that around Postgres. Can it be done with SQL's insane time zone system?

Can it be done with SQL's insane time zone system? Well, you know, calendaring systems as a class are some of the hardest things that we attempt as a species. If you don't believe this, have a look at some of the calendaring systems that we've built, like Stonehenge. These are things that are not trivial, and I don't think it's SQL's time zone system that makes it so. What happens is that we have the confluence of physical phenomena, like the orbital times of various astronomical bodies such as the Earth and the Moon and the Sun, possibly some other ones. And then we have this thing laid on top of it, which is a crazy-quilt structure of law and custom, not all of which is meant to fit together at all. So on the question of handling dates and times, I think if it were to handle things the way Postgres does, it would be sane enough for practical purposes.

Next. Alrighty, well, I'll be around for the rest of today if you have questions, comments, brickbats. And I'd like to thank you very much.