 So, I won't start with the obvious question, who's heard of PostgreSQL? But I will ask, who's ever written a patch for PostgreSQL anywhere on your laptop? It doesn't need to have been submitted. Who's ever edited the PostgreSQL source code for any reason? Come on, Peter, I know you have. One, two, three, four, is that? Is that time? Did you get started? Cool. OK, my name's Simon Riggs, and I'm going to talk to you today about PostgreSQL's development. The PostgreSQL project is an advanced database server. I hope you got that bit. It's open source licensed, which means a couple of things. It means you can use the code, but it also means you can contribute the code back. And that's one of the things that I'm going to talk to you about today is contributing back. So, it's been around for 25-plus years. Different people count the start date from different times, just to confuse us. But it's been around for more than 20 years, right, so a long time. Now, sometime in the past, we only had a couple of hundred thousand lines of code, but it's actually been going up by quite a considerable number of lines of code every release, so that now we have more than a million lines of code in the PostgreSQL repository. It is written in highly portable C. Now, it's also very structured C, so if you're used to some modern languages where you're used to lots of libraries and structure, then that's exactly what you're going to get from PostgreSQL. There's an open engineering process, but it's also a very carefully managed development process. This is a slide that I'll return to a couple of times. New contributors are very welcome. The purpose of me giving this talk today is to encourage you, all of you, to contribute in some way to the development of PostgreSQL. Why? Because software that is not maintained is dead software. So everybody needs to contribute to that to make sure that it stays alive. 
Now, you might think that there are so many contributors and so many developers that you personally couldn't possibly make a difference. Well, Peter over there has been contributing longer than I have. I've been contributing for 12 years and it's been mostly a very happy process. I've got a lot out of it personally. What I would point out is that not all of the people that have been contributing contribute for that length of time. In fact, it's a bit like the NFL in that there's an awful lot of people passing through and only a couple of guys staying there for like a decade. So we definitely need your support. There are some senior people that have stopped contributing over time and you need to get involved. I can admit to being slightly over 50. So I think you can do the math on that and work out how many years it's going to be before I stop contributing. So there needs to be people coming in at the bottom end to make that work. Peter's sitting there thinking, that's fine. I've got like 80 years left before I'm that old. So one of the things that I'm going to talk about is what the development process is. And you may have heard or there may be sort of stories in the bar about it being very difficult to get a patch accepted into postgres. And well, you know, yes, it is. But that's because people don't really understand that it's a complex process. And it's not easy to just simply contribute to that. It's the same thing as saying how many people have submitted a successful law through Congress. It's not many. But does that mean that new laws are not worth trying to pass? Well, perhaps a different subject, but hopefully contributing to postgres is a lot easier than getting a bill through Congress. If you are just interested in being a developer, that's easy. Because all you need to do is type git clone and then the address. And then you can clone that onto your laptop. 
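For anyone reading this afterwards, the clone-and-build dance can be sketched roughly as below. This is only a sketch under assumptions: the repository address is on the slide, not in this transcript, so it is passed in as an argument; `build_postgres`, `run` and `DRY_RUN` are names made up for this example; and `--prefix` is the standard configure option for choosing an install directory.

```shell
#!/bin/sh
# Sketch: clone the source, then configure, make, make install.
# build_postgres, run and DRY_RUN are hypothetical names for this example.
set -eu

# run CMD...: execute a command, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_postgres REPO_URL PREFIX
build_postgres() {
    repo_url=$1   # the repository address (not given in this transcript)
    prefix=$2     # private install directory, so nothing system-wide is touched

    run git clone "$repo_url" postgresql
    run cd postgresql
    # ...this is the point where you would edit one of the source files...
    run ./configure "--prefix=$prefix"
    run make
    run make install
}

# Usage (deliberately not run automatically):
#   build_postgres "<repository address>" "$HOME/pg-dev"
```

Setting `DRY_RUN=1` before the call prints the commands instead of running them, which is a cheap way to check the sequence first.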
If you don't know what a command line is, then I think we're probably going to lose each other over the course of this talk somewhat. Then all you need to do is edit one of the files and do the normal dance of configure, make, make install. Now that makes you a developer. That does not make you a contributor. A contributor is somebody that actually takes something that they've done and gives it to the Postgres project. So there's a lot of people that are developers and not as many that are contributors. So what I'd like you to be is a contributor. Now, if you don't want to give back to the project, you can publish your work in some paper and submit it to some conference or other. And you can get a patent or just merely a publication. So there's lots of people in the world that contribute research, but don't actually publish that back to the PostgreSQL project. Now, there are also some that do. So the research projects on the left there, the AXLE, 4CaaSt and SP-GiST projects, are examples of research projects that were funded by governments and have put work back into the Postgres project. And then the ones on the right, and I'm not meaning to be petty about it, but just to say that there's a lot of research going on that doesn't necessarily flow directly back into the project, but it's contributing ideas. And it also means that Postgres is one of the most popular projects in the world for doing database research, because the source code is easily accessible and it actually makes sense. So one of the most recent ones was the AXLE project, analytics on extremely large data. We contributed features for security, performance and things like visual analytics. There's also a lot of development funded by companies, and that's all different types of companies. Obviously, I haven't listed every single company on there. Some are not there because I've just missed them off. Some are not there because they don't actually contribute.
And so what I would like to mention to you is please do check that the people that you're working with actually contribute to the project rather than just claim to be experts in the particular topic that we're discussing. Some companies like NTT, the largest telecommunications company in the world, contribute very significantly to postgresql development. And then there's lots of other different types of companies involved as well. Who controls the project? Well, some people might lead you to believe that they control the project or in fact it's controlled by a single company or entity. But the truth is it's a multi-company, multinational project where people literally from all over the world contribute. So if you go to Japan, there's a very large user group there with full-time staff members that they're employed by the user group to organize things. And actually, it's a bigger organization than it is in America or Europe even. So there's actually a very wide interest in postgresql wherever you go in the world, whether that be South America or Europe or lots of different places. One of the difficulties for people to understand is when you say it's an international project, some people just don't kind of get what that means. That means like, oh right, so the docs are in Italian or something, right? No, no, if you go to Italy, there's a huge user group in Italy and in Japan and there's lots of interest. There's people submitting translations, there's people submitting patches. There's a lot of activity all across the world. There are, of course, alternate versions of postgres and sometimes that work has been forked and sometimes that's been spooned, meaning that there's some sort of tracking release going on. I'm very pleased to note that there is not a major fork of postgres in an open-source sense. 
Some other open-source communities have managed to sort of split up over time and go lots of different ways, and actually I'm very happy to say that the Postgres project has always managed to avoid such difficulties. Even though there are sometimes difficulties and disagreements, there are major strides taken to keep the community together so that there is just a single PostgreSQL that everybody uses. And that actually is one of the greatest strengths, I think, that the project has: its community spirit and its teamwork. So how is the project organized? Well, almost, well, I'll say all of the commits to the code repository are managed by a small group of people, the committers. Now in some software projects, you get a very large number of committers involved, and in fact it's kind of a low-level badge of merit to become a committer in some projects. Now the Postgres project, unashamedly, does not do that. It does take a long time to become a committer, and to some people that is a negative aspect. Well, what we think is that it's difficult to write and maintain the database software in the shape it is now, and actually it requires significant experience, both in terms of working as a team and in terms of code contributions. So actually the status of committer is not granted easily to people. Now that does not mean that the people that contribute their time or their code to Postgres are not worthy of merit, and actually there's a separate merit system in Postgres where contributors are mentioned on a specific web page, and we go to great lengths to put individual credits in particular places. So if you do manage to get a patch accepted, it will have your name against it for all time in the release notes. So the project does go out of its way to recognise people.
Now that's a good point, because there's unfortunately always a factor involved in contributing, the thought that somebody is contributing just simply to get their name in lights. And if that's the thought that's motivating you, it's probably going to lead to quite a shallow contribution, and as a result it's probably going to lead to your work getting rejected. So if you make a full and deep contribution then it will more likely be accepted. In fact, I remember, and Peter's probably going to say, why do you keep mentioning my name? But with my first patch I thought, this is brilliant, it really works, it does everything. And then actually it was Peter that came along and said, well, you should design it like this and it should have all these bits hanging off it. And I was like, oh no, I can't possibly do that. So I spent a week sort of deciding whether to commit suicide or not, because my work had been criticised publicly and all this kind of stuff. And actually what you've got to realise is that it's absolutely nothing of that sort. There is always discussion about any idea that comes forth, and whatever you thought about your own ideas or contributions, there's always something somebody else can do to add to them to make them better. And it does require a little bit of patience, perhaps maturity, to see that that's the case, okay? And I don't mind saying there's nothing I've thought of in this project that hasn't been improved by somebody else making some comment. I've never ever contributed something and it just went straight through without improvement. Obviously when it went through it received many comments, but I'm saying I never had an idea that was not improved through feedback from somebody else, okay? Now that is a stark difference to the way that things work in companies, where almost all of the code that you write is never inspected by anybody. Your boss just says, did you finish it yet? And he doesn't bother to, like, check the code, or even, you know, hardly checks with the QA guy to see
whether it works. You know, it's just, did you finish it? Yeah. So the Postgres process is significantly different from most software development, and that's one of the reasons why it's harder to get your code accepted, because we've got very high quality standards. And if you're not used to that, and I have to say when I arrived I was not used to that, then you will falter, you're going to have a big problem with the process. But you need to understand that it's a peer review process, where people are actually going to look at every single line of code that you wrote and comment on it, and they've got good comments because these are clever people. Now why are they clever people? If you're a Postgres developer, does that make you clever? No, that's not true at all. It's just that all across the world there's a lot of people involved, and so just by chance, by randomness, you're going to get a load of people, and many of them will be cleverer than you, okay? That's the way the world works. So don't be put off when you get lots of comments back on your code. It's part of the process and it is well intended, okay? And if you play with that correctly, it's going to end up both a good thing for you and a good thing for the project. So let's have a look more at the way that the project's organized, because if you understand the way the project's organized then you're going to stand a better chance of getting your contributions accepted, and I'll say again, we definitely want your contributions. So there's a release team, a security team, a server monitoring and management team, and a core team that provides oversight. Now the main process for code contributions is something called a commit fest, which I will talk about in terms of the actual release process. So the current stable release is 9.5.2, and the current release that's in development is 9.6. We work to approximately an annual release cycle, and each release takes just slightly longer than that because we overlap the releases slightly. There's a
maintenance release every three months, and maintenance releases may be triggered by extreme bugs. So if you're developing something, you need to develop it in a way that follows the process, but also in a way that follows the timing of the release itself. If you submit something a week after freeze, it's going to wait a whole year, okay? And as some people do, they think, oh well, that's fine, I'll submit it the week before freeze, and then it gets rejected and then they have to wait a whole year, okay? So if you're too clever with the process then you're going to wait around a long time. So the thing to do is engage with the process, work with it in its timing, and then you'll be successful. So the development schedule is, we tend to have a meeting once a year where we chew through a bunch of stuff, and then we have these things called commit fests that are incremental phases within the development cycle. So each year we'll get this cycle happening, and each year we tweak it slightly, but it's kind of roughly this each time. So there's four commit fests and then we move through the main part of the release process. Each commit fest has a separate commit fest manager, and then there's a release management team that guides us through the remainder of the process. Some of these things are new, some of these things have been around for a while. So what the hell is a commit fest? Well, what we used to do in the old days was there's a big queue of patches, and people took stuff out of the patch queue and committed it. And what we found was that, well, for whatever reason, your patch may have lingered on the patch queue for like two years without anyone looking at it. So what we decided was that we were going to have a process where we would collect all the patches, and then every commit fest we'd clear the whole patch queue, okay? Now that's almost successful, but there are some things that get pushed back from commit fest to commit fest where people can't really decide what to do. So we do our best to
clear things each commit fest. So everybody likes to develop their own code, and yet all code must be reviewed prior to commit. So obviously the math is that if everything needs to be reviewed, then every code contribution needs to be matched by one review contribution. So if you wish to be a code contributor, you need to understand that you probably need to commit as much review time as you do development time, okay? That's the math. Now some people get it, some people don't. Now early developers don't tend to get it, but I'm trying to get the message through to everybody that that's what you need to do. So if you submit a patch, you probably need to review somebody else's patch as well, okay? Quid pro quo. Now the commit fest manager will run the process smoothly, and they do so via a commit fest application, which is shown there. Now I think over time some developers move through to a greater number of reviews, and basically after quite a long period of doing just development, I've kind of realized that I needed to change my game, and I've started doing a lot more reviews now rather than a lot of personal development. Not because I wasn't successful at development, it's just that the situation is that we need reviews just as much as we need development, and there's unfortunately a lot of people that are focused on the development side of things. So we need more review. So the development statistics are both encouraging and discouraging. We have four commit fests a year, which means that we have around 250 features or patches in every release. Now you may say, well, how do I judge against that? Well, Oracle was heard to say that they had 300 features in their latest release. Now the good news is that if Oracle release every other year, that's about 150 features a year, and we're doing 250 features a year, so that's around 100 features a year more than Oracle does on average. So PostgreSQL is developing more features each year than Oracle is. So that's a good number
to take to people when you say, how is it that the PostgreSQL project can be doing what it's doing? It's because we've got a lot of people involved and they're contributing a lot of work. But that doesn't mean we don't need more, okay? So we could actually accelerate this process if we had more reviewers and more contributors. Now I am happy to say that the 9.6 release has been better than normal and we've had a lot more reviews, so we actually got more through in this release. Now the success rate for patches depends. We can discuss exactly what we mean by success, but what I can say to you is that if you submit a patch and you expect that patch to get committed as submitted, you have got approximately 0% chance of that being true. Almost every single patch goes through about 3 different versions, and if it's a big feature you can expect it to go through 10, 20, in some cases 30 versions of the patch before it finally gets committed. And some features have taken more than 5 years to get through the process completely. So some might say that's hard, some might say that's a hard quality bar, but the point is everybody appreciates Postgres' quality and its feature set, and it's hard to come up with something that is genuinely worthy of adding to that, okay? So please understand that. So I don't know if I got through to you before, but the project is looking for new contributors. We welcome contributors whether you work for companies or whether you're doing it in your spare time, it doesn't matter. A lot of people started doing stuff in their spare time and then went into it more professionally. There's an awful lot of people in the world that have got a few hours to spare and the skills to contribute. So if you're a Postgres user, you might consider helping out in one of the commit fests. Maybe you could persuade your boss to let you spend Friday afternoon doing some review work or some other kinds of contribution. So we're looking for open source contributors of all different kinds. One of the most valuable things you can do is
submit good bug reports. Now I would guess that about 10% of bug reports would be classified as good, good in the sense that it mentioned a bug in a way that is reproducible by the developers. If somebody said, I had this funny error message and then it went away, okay, then thank you for that. If you submit a script and all of the preconditions to recreate that bug, then that means we can run that script and we can fix that bug straight away. The analysis time for bug reports far outweighs the bug fix time in most cases. So actually, when you pay a company for support, you're mostly paying for the analysis time, not necessarily the bug fix time. Obviously you need skill to do the bug fix, but you need time to do the analysis. So if you submit a vague bug report to an open source list, you can be pretty sure that nobody's going to spend a lot of time on that. So people pay attention to people that submit good bug reports. So if you're the type of person that submits a very vague bug report and then you argue with the person that's trying to help you, you can be pretty much assured that you'll be ignored the next time you submit a bug report. So please contribute bug reports, they're particularly important. They're almost as important as new feature designs and requests, but new feature requests are useful and good if you're thinking about them in a particular way. I remember that somebody submitted a feature request once that they wanted an un-vacuum, and I remember we all laughed and there was lots of joking around on the thread. And then somebody else, not on that thread but on a separate thread, said what we'd actually like to be able to do is roll the database system back to a previous point in time so that we could recover missing data. And when it's phrased in that way, it's the same feature request, but one feature request got rounds of laughter and the other one came out with a, that's a good idea, we should think about doing that. So if you spend some time thinking about
what your feature request is, think about some of the problems, and then describe it neatly to people, then you'll actually get a lot more traction. Obviously the guy that submitted the un-vacuum request probably left thinking that, you know, the Postgres project was full of ignorant savages that laughed at his brilliant ideas. Who knows. Anyway, so the project structure is that there's a core to Postgres, but then there's lots of other projects surrounding it. So it may be that if you wanted to contribute, you could contribute via one of the drivers, or perhaps by writing an extension, and in many cases that's a lot easier to do. So some people decide that they want to be a core code contributor, and really that, in some sense, is the wrong way round. You need to have a feature that you'd like to build before you can become a contributor in a particular area. So you need to know where you're going to discuss things, because all of the open source development happens on mailing lists. Some of it happens person to person, but most of it is happening in the public eye on particular mailing lists, so you need to go to particular places. Now if you turn up on the hackers list asking for help with a SQL command, you will be mostly ignored and told to go somewhere else. If you come up and start discussing code, then people will be interested in what you're saying and will listen, and you can progress from there. So there's lots of different places to discuss what it is you want to talk about, so please try to get that right, in the sense that if you go to the wrong place and discuss a great idea you may well be ignored, whereas if you discuss it in the correct place people may well listen. All of the contributions to Postgres are tested on a build farm. So one of the things that you need to appreciate is that any code you do write will need to be portable, and if you say, well, I can't be bothered to make it work on Windows, then that's probably going to lead to an automatic rejection of your ideas. So you need to
be even-handed, open, and aware that things need to run in lots of different ways. The test coverage of Postgres is not all green, as you can see, so one of the contributions that you might make is submitting new tests that allow us to test different parts of the server. You might also learn from this which parts of the code are new, which types of things are worthy of being extended. Anyway, some further information. If you want to submit a patch, make sure you discuss it in the right place. Discuss it on hackers first. If you write a patch first and then submit it, that's usually considered the wrong way around and it's got more chance of rejection. It depends how you phrase it. If you say, I've put together this prototype of an idea, I wonder if anybody's interested in this direction of research and thinking, could you help me to make this into a full feature, then you may get some traction with that. You need basically to agree a way forwards with interested people, and that could be anybody. Some developers have been there a long time, and some people think that you need to get agreement from certain developers before you can move forwards. Obviously you need to get agreement from everybody before you can move forwards. It's what we call the consensus process. So it isn't agreement from a small number of people, you need to get agreement from everybody. Now obviously that means you need to approach things in an open way, but you also need to explain things to people. If you are incredibly terse in your descriptions of how your new feature works, or you just simply don't give enough information, then you'll get no interest and effectively your patch will be ignored. So assuming you get interest in your ideas and you show a willingness to develop those ideas, then you can write them and publish, and you will then need to wait for review. And as I've said there, you need to
be prepared to be very patient. Patience does not mean sitting there waiting for two years without a reply. What I mean is, if somebody appears, in your mind, to be hostile to your ideas, bear in mind that it may be just in your mind. Because somebody has written down in an email, that will not be possible, that does not mean, actually, I personally dislike you, I think most of what you've said is rubbish and I never want to see you again. It doesn't mean that at all. In many cases you could imagine that the person saying that has a kind of kindly interest in what you're doing, and he's kind of leaning over you and going, yeah, I like your ideas, but that's probably not going to work. And if you imagine that the responses you get are helpful and interested, you will then respond to them well and come forward with new versions. You can usually tell how old somebody is by the hostility of their reply, and sometimes, you know, even minor comments are perceived as negative, which is a shame. You will need to listen to what people say about your work, and the other way around is you need to give good review criticism. Now, you know, it's easier said than done, and sometimes people phrase things in a way that leads to hostility, but that's okay. You know, it's easy to apologize if you said the wrong thing. Some people find apology particularly difficult, but actually it's quite easy. You just say, I'm sorry, did we misunderstand each other there? Let's move on. Developing the changes is the easy bit. You can do that on planes, trains, and at home, so there's always time for development. If you're the sort of developer that considers compilation and testing to be, you know, a slight nuisance, actually I've hacked up a good patch but I haven't yet compiled it to see if it will work,
then obviously our relationship is not going to go far. Please think about validating your changes. You need to include documentation. Anybody that believes that documentation is the thing that you do last in a software project has not written good software. Because if you write down what the software is supposed to do, okay, that means that all of the people that you're working with can read that commentary, understand what it's doing, they can comment on the design, and they can also help you test it. If you haven't written anything down about your patch, it's extremely difficult to (a) work out what it's supposed to do, (b) how it does it, and (c) whether we can test it to check that it does that. Even though I say that and we all laugh, regularly, every commit fest, I read about five or ten patches where I'm trying to work out what the patch actually does and I can't work it out. You know, there's like pages of code. Sometimes there's comments in the code, you know, saying like, well, it's time to call that function, but there's nothing in there that says why did you do it this way. You know, why did you write the code like that? Why didn't you add it to the JDBC driver, for example? An explanation of why you added that code there, an explanation of what this thing actually does, why are you trying to do it. I know that it might be crystal clear to you, that you say like, oh well, this adds some catalog function or something. Yes, but why? You know, what is the purpose of what you're trying to achieve? So one of the biggest and most stupid ways that you can get your patch rejected is to design something, write it, but then just describe the design so poorly that we go about six months batting the patch backwards and forwards before somebody goes, oh, you're trying to do that, we already do that over here. And then your patch is rejected instantly on the basis of its duplicate functionality. Now that's just a stupid waste of your time, so please don't do that.
It's not because the Postgres project is particularly anti new contributions, far from it. We want that contribution, but you need to help other people understand that your contribution is a good one. So over the development of your patch, you will need to maintain different versions of that patch, and I recommend that you use some form of repository for that development, because very probably at some point you will extend your work and then go, oh no, that's rubbish, I have to backtrack, okay? So the likelihood that you're going to reuse some of the work that you did earlier in your project is quite high. So make sure you keep all the different versions of your code, and also make sure that other people have got access to those other versions, because somebody might see your new version and say, actually, now that you've done that, I can see the way you had it originally was actually the best way, please could you go back to version fourteen and use the code from there. Obviously other people are developing things at the same time, so you're going to need to rework your patch, rebasing it on the repo or changing it many times in response to other commits in the code base. So a lot of the work these days happens via the commit fest application, and you will see that you need to submit a patch, the patch has a particular status, and once it's been reviewed it may be passed back to you. So it'll go through needs review, back to waiting on author, and then if people finally agree, it will be set to ready for committer, where a committer will then swoop in. And in a lot of cases they don't commit the patch, because that may well be the first time
that one of the committers has seen that work, and they may find problems that earlier reviewers did not see. Being set to ready for committer is actually just a later stage in the process, but it's still no guarantee that it's going to be accepted. If your work does not have a reviewer, find one. Don't just sit there and wait for your work to be reviewed. You need to work with other developers, and it's a common thing to do to swap reviews. Obviously if you're going to rate each other's work golden and things like that, then that is going to be spotted and effectively those review comments will be ignored. So you need to work with somebody that's going to give you honest, critical reviews of the work that you're doing. And if you can't find somebody, you can write to people that you've not met and say, hello, I'm working on a patch, could you do this? Unfortunately I do get quite a lot of emails saying, could you mentor me as a developer, and I have to say, well, I'm sorry, I'm fairly busy, so probably no. The patch review process is available on the wiki. There's lots of information on the wiki, and you need to read it and understand it. There are various types of review that take place on a patch. So you may think that just because somebody's gone through your patch line by line and found no problems with it, that therefore it is ready for commit. One of the big things I find is that the design of things is actually fairly poor, and that we wouldn't want to accept that particular design. The code's great, the design is not. So there's various levels at which things could be accepted or rejected, so have a read of that and understand the different types of review. So overall, the development process is about achieving an end goal. We're interested in moving forward the Postgres project. It is not a demonstration of your technical coding prowess. This is not a step on your career ladder.
It might become so, but if that's the approach that you're going to take, then you're probably not going to be as successful as if you actually think about the needs of other people. So what I can tell you is, if you want to have something accepted, focus on real needs, so that you've got a good problem analysis, you've got a good design for how you can solve that problem, and then good project management of your own contribution. Make sure you respond to people in a timely manner, make sure you're responding in a polite and useful manner. So we're looking really for a brilliant execution. So this is not like some reality TV show, some cookery show where one person gets asked to leave the program after every commit fest. It's not a game, right? We're interested in improving the software, and if you want to be a contributor, we want you to be a contributor, and it's possible for everybody to get involved. So think about the project management aspects of this as well. So that might mean that you need to plan to spend every Monday night on it for three months or something in order to get it to flow correctly. So think about all of those aspects as well when you develop something. It's a database, and as you might expect, it's all about persistence. It will take some time to get your work accepted, but we want to see that work in Postgres, okay? I ask you very much to pay it forward. If you've used Postgres, if you've benefited from it in your company or your career, then do something to contribute back to the project that helped you in that way. And remember that the Postgres project is a team, and that we're interested in having everybody work together as a team in order to make solid contributions to PostgreSQL. So I'm going to leave you now, I'm going to pass over in a minute to
Peter, who's going to talk about testing. But just the final message is really that myself and many other people would like to see your contributions to the project, and if you can see your way clear to adding to the project in some way, shape or form, that would be very much appreciated. Thank you very much.