All right, good morning. As a lovely couple said to me in the lobby of my hotel room this morning: happy FOSDEM. I think it's been eight years since I was here last. It still looks the same. There's a little more water on the floor this year, the weather is a bit worse, but otherwise it feels pretty great.

So this is my first presentation on MariaDB, MySQL and so on in quite some time. What I initially envisioned was going to be a knockout 20 minutes, tightly compressed, of beautiful slides and great information. Yeah, well, let's set those expectations a little bit lower. Let's go with mediocre slides, information that many of you already know, and opinions from someone who's been out of this community for, let's say, about 10 years.

First, let's start by reworking the title: patterns and anti-patterns in OSS participation. To be more specific, really we're talking about free software and open source. We have a project and a company that I know is based purely on free software; MariaDB, as it is GPL licensed, fits that definition. And the ecosystem we're talking about isn't the broader open source, or even free software and open source, ecosystem. It's this mixed ecosystem that we live and work in, where we have corporate actors, non-profits, hobbyists, and lots of adjacent projects. It's a different space from the general space.

The other interesting thing that occurred to me as I worked on this deck was that it's not just any project. It's a project that's been running for a while. MySQL was initially Unireg, Monty's database-ish system, in '91, and it's evolved and forked and changed and split now.
There's Drizzle, which is sadly dead, and web-scale DB, which is sadly dead, and Percona Server and MariaDB itself: all these interesting variants that are left over, all complex, wonderful, interesting projects that have a lot of technical debt in some ways. I'll talk more about that in a minute.

Also, I realize that most of you probably won't know me. When I started working at MySQL in 2001, I was the first community manager, and I had the good luck of going to conferences and having people usually figure out who I was pretty quickly. I came to the conclusion that I must be popular and likable. But mostly it was that MySQL was popular and likable in those days, and I found later on, working for less important projects, that actually I was relatively ordinary. So here I am some years ago with Monty and a few other folks. This was a while back.

So this presentation has also been about diving back in to find out what's been happening lately, and it has been fascinating. I had no idea that there were some of the MySQL forks that there are. It's been neat to dig into those.

When I started this presentation, my working assumption was that it had gotten easier to contribute to MySQL and all of its various forks; that the position we had at MySQL back in the day, which was that it was too difficult to contribute to the core, which is why there were so few core contributions, had been proven false. And when I chatted with my new colleagues at the MariaDB Foundation, the early chats led me to think: we've done it, we've overcome it. Now, this, for reference, is pretty much the line that we had for a lot of years. We were saying it back in 2000; Mårten Mickos, former CEO of MySQL, was saying it in 2008: five years to get into the code of the server.

So when I started digging in a little bit, I saw: this is great. I ran a few naive commands in git, and I'm like, there are a hundred and seventy-four contributors to the GitHub MariaDB repository alone.
This is fantastic. But then I started digging in. A note: I'm just focusing on the server. It doesn't mean that the other contributions don't matter, but the server is an interesting point to dig into because it is complicated and because it was hard to contribute to. There are also the other incredibly valuable contributions, like documentation and storage engines and user-defined functions and providing support on forums and all sorts of other things.

So I started off, in the spirit of my initial presentation, with fancy graphics. Here I took all of the commits and graphed them out as a stream graph by committer over time, all the way back from the commits that were imported into GitHub from the early days of MySQL, starting in 2000 and moving all the way forward to the present date. I thought, well, I can't really tell much by looking at that. In fact, maybe I'm in a club somewhere in Amsterdam in the 1990s and there's music playing. Yeah, not useful.

So I used simpler tools and started digging in. Dear git, please tell me: in the last year, how many unique email addresses were there for the committers? 79. That's okay. I mean, there are, you know, 20-some-odd engineers at MariaDB Corporation and, say, five or ten engineers at the MariaDB Foundation, and that doesn't seem like a lot. So how many contributor domains are there? Git, please, please tell me. Hmm. There are 25 unique domains. That doesn't feel like a lot of contributors to me.

So what do those numbers actually look like over time? Well, for each of these years we end up having, you know, at the highest point in 2006, 386 contributors, and it's dwindled down over time. Of course, the earlier years were MySQL as a whole. And I thought, well, we have to dig in a little bit more to see, and it's time to go back to using a bit of a fancy graph instead of just fancy bits of command-line stuff. Let's dump the data into a form.
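A minimal sketch of the kind of counting described above, not the commands actually run for the talk. It assumes the history is read with `git log --date=format:%Y --format=%ad%x09%ae` (year, a tab, then the author email; `%ce` would give the committer email instead). The helper names `parse_log`, `unique_per_year`, `commits_by_domain` and `read_git_log` are hypothetical, made up for illustration.

```python
import subprocess
from collections import Counter, defaultdict


def parse_log(text):
    """Turn 'YEAR<TAB>email' lines into (year, email) pairs; skip malformed lines."""
    pairs = []
    for line in text.splitlines():
        year, sep, email = line.strip().partition("\t")
        if sep and year.isdigit() and "@" in email:
            pairs.append((int(year), email.lower()))
    return pairs


def domain_of(email):
    """Everything after the last '@', e.g. 'mariadb.org'."""
    return email.rsplit("@", 1)[1]


def unique_per_year(pairs):
    """Map year -> (distinct email addresses, distinct email domains)."""
    emails = defaultdict(set)
    for year, email in pairs:
        emails[year].add(email)
    return {
        year: (len(addrs), len({domain_of(a) for a in addrs}))
        for year, addrs in sorted(emails.items())
    }


def commits_by_domain(pairs):
    """Map year -> Counter(domain -> commits): the table a stream graph would plot."""
    table = defaultdict(Counter)
    for year, email in pairs:
        table[year][domain_of(email)] += 1
    return table


def read_git_log(repo="."):
    """Read the whole history of the repository at `repo` (assumes git is installed)."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--date=format:%Y", "--format=%ad%x09%ae"],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Calling `unique_per_year(read_git_log("path/to/server"))` on a local clone would give the per-year counts of addresses and domains. One address is not one person, since the same contributor can appear under several emails, so these numbers are only a rough proxy.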
We can work with it and then stick it into another stream graph. On the left-hand side we have the organizations, just done by taking the domain names of contributor email addresses and looking at them. So we can see here, over the last year, on that left side (and I'll go back so you can see the comparison), you've got all these domains. On the left-hand side, these are the folks that have committers with email addresses at mariadb.com, mariadb.org and so on. If we dive in a bit, we see things like, yeah, mariadb.com at the top by volume of commits, then below it mariadb.org, and then below that askmonty.org. What the hell? Well, there are still some legacy email addresses in use; committers haven't necessarily updated their git preferences.

So what people are these? We can go and look by domain, and then we can see: oh, yeah, Marko from the corporation, a lot of commits in 2018. Sergei Golubchik, consistent contributor from the early 2000s, perhaps needs to update his case. If we go through the list, there are probably a good number of names that many of us recognize. At least for me, there are a lot of former colleagues in here, especially in the higher volumes of contributions. And if we look down here at the lower volumes: Gmail, it's a bit generic. GitHub, that's weird. dbart.
Well, that's going to be another MariaDB Corporation employee. Monty Program: someone else needs to update their git configuration.

So, with my initial hypothesis being that we've got a lot of contributors now: actually, we have a moderate number of contributors. Many of them have been with the project for a long time, in some cases coming up on 20 years, and of course one over 20 years. And it got me thinking, as I looked through the commits, as I dug around a bit more, as I looked at Oracle's commits and so on: it seems that the hypothesis we had nearly 20 years ago, that it's hard to get into the server, is true. Chatting with the contributors who work on the server code now, they say things like, "Yeah, I'm four years in, and I hope one day to understand the code base completely." Maybe the fifth year will bring them enlightenment. It's interesting, and I think there are a lot of reasons for it. I'll get into this in just a minute.

I probably should have changed the order here. So let's quickly chat about how adoption fits in. We have this database engine ranking from db-engines.com. I think it's a huge ranking; it's super interesting. They track 343 different database systems; 140-something of them are now relational systems. I was blown away, because I didn't realize how many systems have come into being recently. You can see on this logarithmic chart (logarithmic on the left side) the MariaDB score over time. You can read up on their method for scoring on their site. And yeah, it's really growing quite quickly. At the top, you have Oracle, MySQL, and SQL Server, dominant in the space for a long time.
You have PostgreSQL being the other one that is trending upwards, and then you have all the others. SQLite is a nice benchmark in the middle; DB2 is relatively flat.

So MySQL and its variants, MariaDB and so on, have a lot of people using them. There's a large pool of available contributors. So it seems that this hypothesis, that it's tough to get into the code, must hold true, because if there are so many more people who could get into it, why aren't they?

I think some of it we can explain by looking here. This is a handy diagram that I stole from Realm. They make a mobile database of sorts that I have not looked at one bit, and they tried to graph out the introduction of new database technologies over a ten-year period. They missed a few things: I don't see Berkeley DB on there anywhere, and there are other systems that they missed, but it's accurate enough. There's this explosion at the end, where people need databases that are relatively specialized. But the effort of digging into a mature, complex, long-term code base like MySQL's, like MariaDB's, is pretty daunting. I think often it's easier to start something on your own or fork.
(There are, I think, three or four MySQL forks in that list.) Easier, that is, than it is to dive directly into a project. And we see this in many other projects. If you look at, say, PostgreSQL, there were five years in the early days of refactoring that code base to make it so that normal mortals could work on it. Or, well, abnormal mortals: I don't mean that in a pejorative sense, but you have to be pretty skilled to work on any database engine code base. The same thing with, say, the Netscape code base, which had to have heavy refactoring into Firefox before it could become reasonably accessible to a larger group.

There are some things that should further contribution, that should make it easier, in the MariaDB space at least. Recently there have been programs at the foundation to enable new contributors, to provide the kind of mentoring that's needed. If I look at all the names that are heavy contributors in those previous sets of graphs, there are people who had, for the early ones, direct access to Monty, so they could talk with someone who understood the internals very well. Later on, the contributors could talk with people like, say, Serg or Eric Herman, who also had deep insights from working with Monty and from working on the code base. There is a strong mentoring component. You need to have access to someone who can tell you why the code does something in a certain way, because it's not always obvious.

And this is, I think, one thing where we can start to make more of a difference. In the old days with MySQL, regardless of who the owner was, there was a corporation. You would hire people, and you would have some contributors from outside who usually worked for large organizations that pushed the database like hell and that needed to fix some specific issue, but might not generally work on the product. Nowadays, with MariaDB at least, there's the corporation on one side, which has that similar pattern.
They have developers they employ, and they do a tremendous amount of good work on the code, but they've got commercial motives, and they can't necessarily mentor committers who are less committed than, say, full-time employees. And that's where the foundation comes in. We are able to mentor people in a way that, say, a corporation can't, and build structures that help people get into the code base. And in all of this, as I've focused on the server, and as I've focused on the time ticking along, we can't forget that there are many other ways to contribute.

If we try to distill all of this down into, let's say, four big points, the patterns that work well over time: first, invest in people. I think with every complex project, when a developer gets into it, there's a larval phase where you eat, sleep, code, and that takes real support. You have to have the financial resources to do it, you have to be at the right stage in your life to do it, and it's a lot easier if you have help from someone who knows the code base. I think this applies to any project: investing in people who are in the right place to learn the code base and to help others, and helping them learn positive community norms, is critical.

I think the next thing is that you have to respect the technical debt of a project. Years ago at MySQL we came into the old Adabas code base, which became MaxDB, and that code base had a lot of technical debt. It had been developed for a very long time. It was a very forward-looking product, it was very advanced, but it was written in Pascal, and that Pascal was transpiled into C, and then that C was hacked further, and then it was turned into a finished set of binaries.
I could not ever manage to successfully build SAP DB, MaxDB, on my own after making any changes to anything, and the pitfalls were kind of comic. You needed one version of Python, not earlier, not later, to build it, period. One version. And I didn't realize this; I had the most up-to-date version. So for these projects, I think sometimes we have to say it's easier to stop and refactor, or it's easier to move on and do your own project until you understand something better. There are lots of great projects, SQLite for example, that begin with some exploration, with saying: hey, the existing stuff is too complicated, I don't get it, I want to try to re-implement it using what is more state-of-the-art now.

I think an important pattern is understanding the value exchange. In the early days of open source, people had this naive view, which I don't think any of us hold anymore, that people contribute just because they enjoy it. But enjoyment isn't enough. You have to be able to feed yourself, you have to enjoy the people that you work with; there are many things that fit in with these needs. And when you've got a complex product that demands a lot from you to work on it, you need to make sure that you're supported, or that you support the people who are doing this complex work. So this means, at times, in some non-profit spaces, people do things like give coders grants: the coders who are digging into something that requires a lot of focus get some money so they can spend their time working on it. Maybe we should look at the same thing with the foundation, and we do certainly employ a good number of developers to give them that time and space to focus on a complex code base.

And the thing that I've, of course, danced around the entire time is that contribution to the core is just one piece. In most projects, the highest value comes at the boundaries between projects, where you make things talk to each other: the client APIs, the storage engines, user-defined functions, the places where
you take two or more things that each have value on their own and you combine them in a way that creates more value than you would have had before. Good examples in the early days would have been things like PHP, where PHP plus MySQL, together with Apache and Linux, powered a whole bunch of people's careers and, say, 60% of the web at some point in history. It's the combinations that really bring out the best in open source, I think.

So it's easy to focus on the server, as in: oh, yeah, we have to make it easier to contribute. But we can also make it a lot easier to keep contributing at the boundaries, to make it so the places where people most often need to extend are easiest to extend. If we think about it just mathematically, the number of people who will want to make a product work with another product is probably much higher than the number of people who want to extend an existing product, because there are hundreds of programming languages, there are dozens and dozens of frameworks, and there are so many systems that work with databases that numerically they just vastly outweigh the others.

So with 30 seconds, well, 35 seconds left, I will say thank you and see if there are any questions.