Hi, so I'm here to talk to you about distributions from the view of a package: if you are the upstream package owner or package maintainer, what it is like to work with distributions. So who here works on a distribution? Wow. Who here works as an upstream package maintainer? Okay, so some of you possibly have similar experiences of working with these distributions.

For about six months I've been at the CTO office at Percona. Percona does all forms of MySQL and MongoDB, and we make all these tools that are 100% open source. Before that I was on the founding team of MariaDB Server. I left because they stopped making open source software.

So first and foremost, I'd like to say thank you very much to all the packagers, because you do a wonderful job. We make software, and if we don't get it into distributions (and distributions mean a lot of things nowadays, including Docker images and so forth), we don't actually get adoption. Adoption is crucial for us to continue wanting to make better software. Only then will we know if there are bugs, or if we did something silly, and so forth.

I'm going to talk a lot about my experiences with the MySQL ecosystem, because I've spent a fairly long time in it. In fact, this gentleman here, Lenz, has also spent quite a bit of time in the MySQL ecosystem, though he left us maybe five or seven years ago.

MySQL has been around now for 22 years. It's pretty old software. How many of you use MySQL? Okay, so the rest of you use it and don't want to admit you're using it, right? So MySQL is old. I mean, it started in 1986 with UNIREG, and it was just a text UI on top of an ISAM data store. You had very simple rows written to disk, and then you added indexes on top of that. So you have these MyISAM and FRM files; FRM stands for "form", as in a form to enter data. All of this is changing as MySQL 8 comes alive, right?
FRM files are going away. There'll be no more FRM files. There'll be SDI inside of, you know, InnoDB, and it's also JSON, so that's a huge positive as well. So, no more FRM files.

Percona Server has also been around for a fairly long time. As a company, Percona has been around ten years; the server is now nine years old. MariaDB is seven years old; February 1st was its first ever release. We've also seen things like WebScaleSQL crop up, and if you're interested in some of the people who made WebScaleSQL, they're in the MySQL dev room right after this talk. Not that I'm telling you to move. And WebScaleSQL doesn't really exist nowadays. The idea was to have a consortium of people working on making MySQL better for the likes of Google, Facebook, Twitter and so forth, because upstream, being Oracle, wasn't so kind in terms of accepting patches without signing the Oracle Contributor Agreement.

MySQL, for what it's worth, didn't start off as GPL software. It only became GPL in 2000. More often than not you may have some embedded device still working from those days, running 3.23 GA. It's still one of the most popular old distributions of MySQL around. You shouldn't use it, really. Then there are a whole bunch of 4.0 releases, 4.1 and so forth. But it is well worth noting that it was around the time MySQL became GPL that the license for the client connector changed from the LGPL, which is the Lesser General Public License, to the GPL, which meant that if you were going to embed it you'd now have to get a license exception. It took a while before we even got something known as the FOSS License Exception, and the reason the FOSS License Exception came about was largely that distributions didn't want to touch the new MySQL with a ten-foot pole. They were worried about linking, say, PHP or Perl against the GPL MySQL client, which would then possibly mean that those applications would
also have to become GPL. So the FOSS License Exception came about.

And of course MySQL grew to fame around the 90s, even during the web 1.0 days, largely because of replication and scale-out. So you'd have a famous website like Slashdot. Do people still read that? It still exists, yes, but I don't know if anyone still reads it. Slashdot was around 20 years ago, actually, and they were big MySQL users and probably still are.

So, to talk a little bit about licenses and license exceptions. I don't know how many of you remember Alan Cox: big guy in the Linux world, and still is, and he was also very adept with licensing. MySQL changing the client libraries away from the LGPL made linking MySQL with a lot of stuff like PHP infeasible. This could have hurt MySQL adoption had there not been a FOSS exception. Around this time the Fedora project was also starting, and Fedora was like, no, we don't really want to include the new MySQL. It was only then that you got a FOSS License Exception. And this is what Richard Stallman himself refers to as parallel licensing. Why does MySQL need to have this parallel license, or dual license?
Very simple. Back then, if you were going to embed the MySQL connector and you wanted to use it for commercial applications (say you were a hardware manufacturer making a router; back then there were even phone companies doing it, and even software like Photoshop), generally the idea was that you needed to pay a license fee. But FOSS software doesn't need to pay any license fee, hence the FOSS exception.

Did you have a question? (Audience: To be fair, the FOSS License Exception is different from the dual license.) Yes, of course. (Audience: ... the terms of the GPL, and makes it more compatible with the others.) Yes. But we also need to talk about the dual license, because we needed a way to make revenue, and that was really why there was a dual license. The FOSS License Exception allowed the MySQL connectors to finally go into the distributions.

And why is this important? If you want to move software forward, you don't really change the old releases; you keep on changing the new client libraries to make them better. And if the distributions don't ship the newer client libraries, you're actually hurt, because whatever cool stuff you have in the server that needs the new client library to access it, people can't use. So it was a crucial decision to make a license exception so that it gets shipped. But we still continue with the dual license; MySQL to this day has a dual license, for what it's worth. And the FOSS exception is still available on the website; it hasn't changed since 2012.

Distributions are also very interesting because every time you make a new release of a client library, you have to make sure the application binary interface doesn't break, because otherwise you then have to rebuild possibly hundreds of packages that link against it. This is an example from 2006, but the same thing happens in 2016, when Debian, for example, is replacing MySQL with MariaDB and everything
has to be rebuilt against libmariadbclient. In fact, this move is happening right now. So if you join the pkg-mysql-maint mailing list (is anybody on that?), it's a wonderful mailing list, because now it gets such high traffic. People are complaining that MariaDB is not drop-in, that MariaDB is breaking their software, and we're learning a lot of new things that we didn't know, because Debian is clearly diverse. So they don't like to rebuild hundreds of packages, and obviously the ABI should never break.

This came from the packagers mailing list. Now, the packagers mailing list for MySQL is unfortunately kind of dead, as is the internals mailing list. This is something that even distributions and people who are making packages complain about, because they need a place to discuss things with the upstream developers, and if the upstream developers don't provide that, this is bad from a distribution standpoint as well. MariaDB, for example, has its own packagers mailing list, where it tells you, hey, there's going to be a new release, and it's very similar with Percona. MySQL currently is more like, here, we're going to drop something on you. And this whole "we're going to drop something on you" usually happens when there are CVEs and so forth. That typically happens at least once every three months, because Oracle has what is known as a Critical Patch Update day once every three months, and then you get a new release. MySQL itself gets released once every two months.

Now, when it comes to speed of releases, I would say Fedora and Ubuntu really pioneered this nine-month distribution cycle. Maybe even six months; six to nine months.
They release very quickly. Now, database software is a little bit more complex than this. The idea was to also make nine-month releases, but the reality is we make 24-month releases, and this is across the board, from MySQL to MariaDB as well. The real problem we face as upstream is that if we make a release today and you ship it, you are now shipping something that is possibly outdated for a very long time. We make new features and fixes in the next release, which people don't get. Red Hat has found sort of a workaround for this: they have what is known as Software Collections, or SCL, which actually helps in this scenario. But the speed and velocity don't actually work out very well for upstream people.

Now, if you take a distribution like RHEL or Ubuntu, or even SUSE, which has long enterprise support, you now also have to support the software for the lifetime of the distribution. So imagine you've taken two-year-old software, packaged it, and now you tell us, hey, you've got to support this for another eight years. That's ten years of supporting software that we really didn't want to support for more than five years. And this is actually a real-life problem. So again, RHEL 7 ships MariaDB Server 5.5, and the reality is RHEL 7 will only end its first level of support,
I think, in quarter four of 2020. And 5.5 was released in, you know, 2014. That is longer than anyone wants to support. This is very true even from a Percona standpoint, where we need to start thinking about deprecating modules. If we want to deprecate even an engine, we have to think very carefully about how long this is going to take.

We do realize that what ships in releases matters. The amazing thing you get from statistics, whatever little statistics we get out of popcon, which Debian does provide, is that whatever is the default really does get used a lot. More often than not people just say, I want to install WordPress; WordPress pulls in MySQL, and people don't care that they're running MySQL, they just care that they're running WordPress. This is true even for Akonadi and so forth. You'll see that MySQL 5.1 is really popular. I believe I took this from Ubuntu, actually. Why is it popular? Because Ubuntu LTS releases are more popular than every other Ubuntu release that comes out. MySQL 5.1 has been end-of-life for a very long time, but a lot of people are using it. So they are using it with potentially lots of security holes, and they don't know it.

Distributions can make upstream's life a lot easier by giving us statistics. Distribution statistics are extremely sparse. If we look at Fedora and openSUSE, they used to have statistics several years ago, but now we don't get even simple things like mirror statistics of how many people pull downloads. We are getting some stats from things like Debian's and Ubuntu's popcon, but I'm sure no one in this room runs popcon, right? Yeah, you can't rely on popcon. It's like... yeah, but it's fake news.
Yes, but in the absence of real news, fake news will suffice. This is important for MySQL and Percona too, but it is extremely important for MariaDB Server, because it is a venture-backed company, and you've got to impress the venture capitalists every month. You've got to say, look, we had extra downloads, it's so good. Things like Docker and Juju actually give you statistics, which is kind of useful, but we need more stats.

And also, when you hear random quotes like this one, which I picked out from a MariaDB press release: they have a 12-million-strong community. But then if you look at the phone-home data (I see Lenz grimacing), because MariaDB Server actually includes a little phone-home thing to report what is being used in the server. It is off by default; no one turns it on. If you believe 12 million and you look at the phone-home data, which I pulled in December of 2016, you've only got a little over 12,000 users. So is it 12 million or 12,000 users? Hmm. That's a huge difference in numbers.
So again, possibilities of fake news.

And then of course we think about support tiers. The beauty of Linux is that we have so many distributions. But if you are upstream, you really do care more about bugs that are reported in, say, RHEL, Debian and Fedora, as opposed to bugs reported in something like GhostBSD. There are definitely tiers. We don't tell you, but we have them internally. Some distribution vendors also end up getting something like a level 3 support agreement with various upstreams to make sure that those bugs get fixed, and that's actually quite useful.

Database servers are like a distribution themselves, especially in the MySQL ecosystem, because we support storage engines and we support plugins. In fact, if you do a SHOW PLUGINS on, say, a random MariaDB, you might find 130 plugins or so available for you. And we run into things like this: when an engine requires a malloc library like jemalloc, at one stage it didn't actually work on FreeBSD. This takes a workaround, because otherwise the engine stops working on that platform, and we need to care about all platforms. libjudy is not provided in pretty much any distribution, and this prevents OQGRAPH, which is a graph engine for MariaDB and MySQL, from working. This now means we have to provide libjudy packages, which makes things extremely complex. Some people run POWER8, for example, and XtraBackup works wonderfully well on x86, but not so on POWER8. And there are a whole bunch of things like what happens when the server, because MariaDB Server now includes Galera Cluster, requires all these other tools like iproute and rsync. In some instances now it's pulling these in, and when people are updating from MariaDB Server 10.0 to 10.1 they go, why is MariaDB asking for all of this?
This is insane.

Another very important thing that would be nice to get fixed from a distribution standpoint: when we go out for consulting gigs, sometimes we encounter people who say, look, we need to have all these packages installed, but we will not allow you to get on the internet. Now, RPM can do dependency resolution. With dpkg, we find it really, really hard to deal with resolving dependencies when you're offline. This needs fixing. Even with CPAN you have minicpan, so you can take things offline. In the MySQL world we love CPAN and Perl a lot.

When it comes to bugs, we obviously encourage regular communication via mailing lists. But then what happens if your upstream is MySQL and the mailing list is quiet? Then you go to other mailing lists, like the MariaDB list or the Percona list, which actually gets a lot of traffic. Sometimes there are bug system CCs. That's the best thing we can get, right? Getting CC'd on bugs so that we know that there's a bug. If you don't tell us there's a bug, we will never know. There is no real clear dashboard. I think Launchpad was supposed to be this dashboard that amalgamated everything together. That was the dream, but the reality is we don't see that today. And we also really don't get the bugs that you see in distributions reported upstream, which is a real problem for us, because we don't know what we don't know.

We've mitigated all of these distribution problems by also making our own packages, and encouraging people to bypass distributions and use our packages. MySQL, for one, is very happy that this happened, because now they have stats on how many people are using their repositories, and it's a huge number, which they don't release. This is again very true of Percona Server. Percona Server doesn't even follow the mirror method, so we actually track 100% of all our downloads. So we can tell you, yes, we have three million downloads
because that came through our system. Of course, we don't know what comes via distributions, but we know what comes from our system. With MariaDB Server it's all through mirrors as well, so again, we don't have a really good number. And of course our packages tend to have additional things as well, right? Maybe more engines, more plugins. These are things that distributions sometimes just choose not to build, and there are good reasons why: maybe there's no reason to do it, or maybe they don't want to support other engines. That's okay.

Taking a quick look at the Debian patch set as well as the Fedora patch set: we still don't have a state of, you know, no patches in distributions. This is always going to happen. Inside Debian, for example, we see patches for the MySQL package to make it work on things like Hurd and kFreeBSD, which apparently does matter. But if I go looking at patches really closely, sometimes we get really silly patches that are just fixing English. There's no reason to fix English in your package. You should just report this upstream and get it fixed there. You don't have to carry a patch to fix a spelling error. I mean, we try to spell properly, but we fail sometimes.

We started MariaDB in 2009, and right up until 2016 I spent time focusing on replacing MySQL with MariaDB Server in distributions. The journey started in November of 2010, so maybe eight or nine months after our first release. It has progressed tremendously into enterprise distributions and so forth, and Debian has just started the journey for 10.1. That is why I said the Debian MySQL package maintainers mailing list is an awesome place to be now, because you can see lots of bugs and lots of people complaining. If you like that sort of drama in your life, it's well worth being there. But it's also
really a testament to the number of architectures and the number of packages Debian has. We never saw this kind of pushback from Fedora. We totally see this from Debian, because they'll tell you about stuff on, you know, random weird hardware that apparently doesn't even run Linux kernels.

So I would say this goal will probably be complete in 2017, so it will have taken seven years. Whereas if you look at LibreOffice, LibreOffice replaced OpenOffice in pretty much everything much quicker. That's probably because LibreOffice was actually getting developed and OpenOffice wasn't. In this case, all three servers are getting developed and they're all getting better. (Audience: And it's not infrastructure like a database.) That's true. It's end-user software, it is. Correct. (Audience: ... already shipping that.) Yes, it was much easier for them.

So this is a note from March 23rd, 2016, where the Debian release team decided to make MariaDB the new default. As I said, this is still an ongoing thing now, in February of 2017. But that's also a testament to how long it takes for Debian to make a release. It's improved though, I guess. And why did Debian choose to switch to MariaDB?
Back then it made a lot of sense, because there were things like test cases not being published, and Debian wanted security information published. Debian has actually ejected things like Elasticsearch from the repository as well, because they don't get security updates. At one stage there were man pages silently being relicensed away from the GPL, but this was actually just a bug that was fixed shortly thereafter. I guess the most important thing was security, security, security. Nowadays nobody really talks much about it, but reputedly MariaDB would be a little bit more open about security than, say, MySQL.

But this also poses a problem for users, a problem unlike what you'd get with LibreOffice and OpenOffice, in the sense that MariaDB Server is now not compatible with MySQL. It was compatible for quite some time, but it is not anymore, and this creates real-life problems that affect users. Users are saying things like, how come I can't use JSON functionality when I expect it to just work? I'm reading this article on the internet, but what claims to be MySQL, which is really MariaDB, doesn't provide the functionality. They get angry, and they're going to get angry at you, not upstream. So if you're a distribution, beware of this sort of thing. So yeah: a bunch of missing features, a bunch of different features, and the on-disk formats have changed.

Also, what happens when upstream says, hey, we don't have enough time to work on something? You can't just, you know, orphan our packages immediately.
We need to know, and so on.

So in conclusion, I'd like to say, yeah, distribution methods definitely evolve. Even someone like Mark Shuttleworth keeps on trying to make new evolutions, with snap for example. But having no distribution at all will definitely hurt a product's adoption potential. Oh, and if you happen to run MySQL 5.7 you will now see a warning like this, but this warning is the tip of the iceberg.

Thank you for listening. Do you have any questions? Okay, one question, if anyone has a question. Yes?

Oh yeah, the question is about the problems with offline installs. In some environments they say you can't get access to the internet, so you have to download all these packages yourself manually and then install them. With RPM you can do dependency resolution even when you're offline, with yum or whatever the new tool is. But you can't do that with dpkg; dpkg doesn't do dependency resolution. We have to install packages one by one in the correct order. (Audience: ...) Yeah, does that work offline now? If you have all the debs? Okay. Well, this is a common complaint we got from consultants. It's also not a very common scenario, where they tell you that you must work offline. Yes. Thank you.
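
To make the "one by one in the correct order" point concrete, here is a minimal sketch of how a consultant might work out an install order by hand when no resolver is available. It is only an illustration under stated assumptions: the dependency map is hypothetical and simplified (it is not real package metadata), the function names `install_order` and `missing_dependencies` are invented for this example, and it does not call dpkg or any other packaging tool. It uses Python's standard `graphlib` module (Python 3.9 or later) to do a topological sort.

```python
from graphlib import TopologicalSorter


def install_order(depends):
    """Order packages so that every package comes after its dependencies.

    `depends` maps a package name to the names it depends on. Dependencies
    that are not keys of `depends` (i.e. not downloaded) are ignored here
    and reported by `missing_dependencies` instead.
    """
    known = set(depends)
    graph = {
        pkg: {dep for dep in deps if dep in known}
        for pkg, deps in depends.items()
    }
    # static_order() yields dependencies before the packages that need them
    # and raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


def missing_dependencies(depends):
    """Return the dependency names for which no package was downloaded."""
    known = set(depends)
    return sorted(
        {dep for deps in depends.values() for dep in deps if dep not in known}
    )


if __name__ == "__main__":
    # Hypothetical, simplified dependency map for illustration only.
    downloaded = {
        "mariadb-server": ["galera", "rsync", "iproute", "libc6"],
        "galera": ["rsync"],
        "rsync": [],
        "iproute": [],
    }
    print("not downloaded:", missing_dependencies(downloaded))
    for name in install_order(downloaded):
        print("install", name)
```

The missing-dependency check is kept separate from the ordering on purpose: on an offline machine you want to know what you forgot to download before you start installing anything.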