Hi everybody, I am Matthew Miller, the Fedora Project Leader, and this is the special Director's Cut presentation of the State of Fedora talk at DevConf in Brno for 2016.

I'm going to start off with some beautiful quotes from the press over this last year, as they look back at Linux in 2015. They said some things about us. This one is from Phoronix, which is a Linux enthusiast hardware and gaming site that traditionally has been a little hard on us, I think. But recently, especially with the things we've been doing, they've come around and love us a lot. So these are some nice things they said here. And then this one is from The Register, which is a kind of IT tabloid that talks about technology and often tends towards snarkiness. I am actually really proud of this one, because when they did their year in review they were not snarky at all. They only had nice things to say about us. And further, it's actually some things I'm really glad to hear: not just that, wow, they did a good job of putting together another operating system release, but that the project altogether, with the efforts we've been doing in Fedora.next, is heading in a good direction. So I'm really happy and proud of the entire community for this quote here.

So the first part of my talk is kind of a Fedora by the numbers. Let's get started with that. First, we have some numbers about downloads and connections and things like that. And whenever I do this, I start with a scary dinosaur slide. This is because Stephen Smoogen, the Fedora infrastructure über-sysadmin who put together a lot of the stats that underlie this, really would like them to be shared with caveats. Particularly, the caveats have to do with the y-axis, the actual bare counts of things, and I'll go a little bit into the methodology of that. But basically, like tracking dinosaurs in the wild, we are not actually doing direct surveillance of people.
We're doing second-order metrics, where we observe things and try to extrapolate from there what we can find out. Unlike some operating systems, we don't have things which log every keystroke and send them back to us. We value privacy as an important part of Fedora. But we also have a pretty strong need to make sure that we're succeeding, because if you can't tell how well you're doing, how can you tell what you should be doing differently? So we do want to count people, but we want to do it in a way that's sensitive to people's privacy needs. As part of that, we can't really guarantee that the y-axis means anything in absolute terms, but we do think that it has some usefulness in relative terms.

So here is the first scary slide, and this is basically a raw data version. This goes back to May of 2007, all the way up to approximately now, and it shows the connection counts for our update server for every release, starting with Fedora Core 6 and going all the way up to the beginning spike of Fedora 23 there. And because this is the director's cut, I'll go into the geeky detail of what we're counting here. Fedora updates are distributed from volunteer mirrors all over the world. Thank you very much to all of our volunteer mirror admins, because it's a lot of bandwidth, and it would be really hard to get Fedora to people without that. When you, as a user, update your software, install new software, or check for updates, we have a thing called MirrorManager. The command you're running connects to it. You don't have to know about this when you do it, but it's what happens behind the scenes. It connects to us, and MirrorManager tells it where the closest mirror to get that software is. So what we're measuring here is connections to MirrorManager per IP address per day.
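As a rough illustration of the "connections per IP address per day" idea, here is a minimal Python sketch. It is not the actual Fedora infrastructure code: the record shape `(day, ip, release)`, the function name `daily_unique_ips`, and the sample addresses are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import date


def daily_unique_ips(connections):
    """Count distinct client IPs per (day, release).

    `connections` is an iterable of (day, ip, release) tuples.
    This record shape is hypothetical, chosen only for illustration.
    """
    seen = defaultdict(set)
    for day, ip, release in connections:
        # A set collapses repeat check-ins from the same IP on the same day.
        seen[(day, release)].add(ip)
    return {key: len(ips) for key, ips in seen.items()}


# Invented sample data using documentation-only IP ranges.
sample = [
    (date(2016, 2, 1), "192.0.2.10", "f23"),
    (date(2016, 2, 1), "192.0.2.10", "f23"),   # same IP, same day: counted once
    (date(2016, 2, 1), "198.51.100.7", "f23"),
    (date(2016, 2, 2), "192.0.2.10", "f23"),   # new day: counted again
    (date(2016, 2, 1), "203.0.113.5", "f22"),
]
counts = daily_unique_ips(sample)
```

Note that nothing in this count identifies a person or a machine, only an address seen on a given day, which is why the absolute numbers need the caveats above.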
And this is where you get into the problem of what this number exactly means. You can see it goes up to about 100,000 there, and we think there are more active Fedora users than that, and I'll tell you why in a little bit. One of the things that complicates this is network address translation, NAT. You're probably familiar with this from your home router. Basically you've got a bunch of IP addresses on the inside, but to the outside world it looks like just one. And these days, because the world actually ran out of IP address space a couple of years ago, and also for the privacy and security reasons that this gives you, large institutions often look like just one IP address or a handful of IP addresses. So if you're coming from Harvard University, there could be, you know, 20,000 people running Fedora there (hopefully there are), but this will count all of them as one. So that undercounts. On the other hand, if we have somebody who really likes coffee shops, but also likes to walk around between them, and goes to two, three, four, five, 20 coffee shops a day, connects to each network, and does something that causes updates to be checked, that person could be counted 20 times a day. We don't know, but I suspect that the first effect, the undercounting, is much bigger than the overcounting from people who connect from a lot of places.

It is also the case that this undercounts people who don't have broadband access, because they're less likely to be online and checking in with the update server. If you have to dial up to do it, you might decide to never do updates. And in fact, anybody who never does updates or installs new software is not counted. That probably means we're undercounting cloud computing, where you make a gold image and install it and it never checks in. Maybe you are running a million of them and never hitting the update server.
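The three distortions just described can be made concrete with a small Python sketch. Everything here is invented for illustration: the function `observed_count`, the machine names, and the addresses are assumptions, and the 20,000 figure is just the hypothetical campus from the example above.

```python
def observed_count(machine_ips_for_day):
    """Return how many distinct public IPs the update server would see.

    `machine_ips_for_day` maps a machine name to the list of public IPs
    it checked in from on one day (hypothetical structure).
    """
    unique_ips = set()
    for ips in machine_ips_for_day.values():
        unique_ips.update(ips)
    return len(unique_ips)


# 20,000 machines behind a single NAT address: seen as one.
campus = {f"campus-{i}": ["203.0.113.1"] for i in range(20000)}

# One laptop checking in from 20 different coffee-shop networks: seen as 20.
roamer = {"laptop": [f"198.51.100.{i}" for i in range(1, 21)]}

# A cloud instance built from a gold image that never checks in: seen as 0.
gold_image = {"cloud-vm": []}

campus_seen = observed_count(campus)
roamer_seen = observed_count(roamer)
cloud_seen = observed_count(gold_image)
```

In this toy setup 20,002 machines show up as 21 addresses, which matches the hunch that undercounting dominates.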
I hope that's true, but because we don't have invasive metrics, we can't really see those things.

Okay, so you may have been staring at the numbers on this graph and trying to decipher them. That's nice if you want to do that, but I've also simplified it so you don't have to. This is our geologic ages of Fedora. It's the same data, but I've stacked it so you can see the total amount of Fedora we've got out there overall, with the caveat that you should pretend the number over here is a magical y-axis number, not something that necessarily means anything directly. But again, I think the relative numbers are meaningful.

So this is the old releases back here. The light blue is Fedora 8, and, just flipping back here, you can see the peak. This was an oddly popular release. I mean, it's good that it was popular, but I don't have a really good explanation for why. One of our theories is that it was one of the early images available for Amazon EC2, the big public cloud provider. Not only was it available early on, it was also one of the only images. Then when we got to Fedora 9, we broke support, for good reasons, for Xen, the virtualization technology that Amazon uses, or used. So Fedora 9 didn't have that boost. That works again now, but there was a period where it didn't, and so Fedora 8 hung on for many, many years there. That's the theory of the Fedora 8 spike, but if you've got a different theory, I'd love to hear it. We don't really know.

Anyway, after that we have the purple age of Fedora here, where we had a nice upward trend. Basically that goes from Fedora 9 through Fedora 14. Things were going pretty well, although I think you can see it's kind of a leveling-off curve there.
So when we got to Fedora 15 here, this is a release where we had a disturbing thing happen. I'm going to flip back here again, where we had the peak there, and this is the upward trend, and then, uh-oh: Fedora 15, even at its peak, never exceeded the Fedora 14 installations. There are a number of "fingers" (that's "things" and "finger pointing" together), a bunch of possible reasons for that, and obviously we made a lot of changes in that release. GNOME 3 first hit there, and that was also our first release with systemd. There were other things going on at the time. So that was a lot of change, and one theory is, boy, that change scared a lot of people away, and that was bad. It could be something else, but we had this downward trend for a while, and that was a little bit disturbing.

So when we got to the 10 years of Fedora point, as you can see from my shirt, we decided: okay, we've had a pretty successful 10 years of Fedora. Downward trend notwithstanding, Fedora is pretty awesome, but we need to make sure that we have the right operating system for the next 10 years. What do we need to do to make it look like that? So for Fedora 20 we took one year to do that release rather than the normal six months, and I separated that one out as the yellow release. Fedora 20 was a really nice solid release, but it was kind of in the same mode as the earlier ones. We just retooled our infrastructure to do things differently. Then with Fedora 21 we have this new strategy of separating out the Fedora Workstation, Fedora Server, and Fedora Cloud editions. I can go on at great length about that strategy and why we chose it, but part of it is that by having these different targets, we can try to really appeal to a specific area and get things right for the people who want to use that, and then hopefully grow from there to world domination, which is obviously always the goal.
And we can see from that that this green is really going up at a nice steep line. If we can keep it going like that, we really will be at world domination in, I don't know, not many more slides on from this, so that's exciting. It's probably not sustainable, but I hope so. You can see that our peak now is higher than the previous peak back there, so we're on track.

Here I've made an even more simplified version of it. In this version, the green line is the top of the stack, total Fedora, and then for each individual release this is the point at which that release peaked. So here's the F8 weirdness over here. Good weirdness, but weirdness nonetheless. And then you can see back over here at this end Fedora 22 is very high. Now, I had to put a note here, because when I first showed this to my boss she freaked out: oh my goodness, what did we do wrong with Fedora 23? The answer is that we're not at the time for Fedora 24 yet, and it turns out that the peak for each release is basically the day or days right before a new release comes out. Fedora 22 hit its peak popularity the day before Fedora 23 came out.

Back to marketing, this has some interesting implications, and again, since this is the director's cut, I'll digress a little bit. We've traditionally structured our marketing cycle and activities around this six-month release cycle. We decide we have a new feature, and we try to push people to the new release. And we do a good job of that: people do get excited about the new releases, and they switch. But the fact is that Fedora user growth keeps happening throughout the life cycle of the release, at a pretty smooth upward curve, and only stops when we get to the new release. So it kind of indicates that we don't need the six-month releases to drive new users, and in fact when we had the year-long Fedora 20, that was still an upward curve that whole time.
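For readers who want to see how the two views on that simplified chart relate, here is a small Python sketch: the total line is the sum across releases for each day, and each release's marker is the day its own count was highest. The helper names `total_per_day` and `peak_day`, the dates, and all the numbers are invented for illustration and are not Fedora's real data.

```python
from datetime import date


def total_per_day(series):
    """Sum per-release daily counts into one 'total Fedora' line.

    `series` maps a release name to a {day: count} dict (hypothetical shape).
    """
    totals = {}
    for release_counts in series.values():
        for day, n in release_counts.items():
            totals[day] = totals.get(day, 0) + n
    return totals


def peak_day(release_counts):
    """Return the day on which a single release had its highest count."""
    return max(release_counts, key=release_counts.get)


# Invented numbers: the older release keeps growing until the newer one ships.
day_before = date(2016, 1, 1)
new_release_day = date(2016, 1, 2)
series = {
    "older": {date(2015, 12, 31): 100, day_before: 120, new_release_day: 90},
    "newer": {new_release_day: 40},
}

totals = total_per_day(series)
older_peak = peak_day(series["older"])
```

With these made-up values the older release peaks on the day before the newer one appears, while the total still rises, which is the pattern described on the slide.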
There are some good technical reasons we want to do six-month releases. There are a lot of upstream projects, like GNOME, that are on a six-month cycle, so working with that is beneficial, and it's nice to have an integration cadence that moves quickly. Well, two releases a year is not really "quickly" in the age of agile and continuous delivery, but that's another topic. Six months, or maybe faster, is sometimes useful from a technical point of view. From a marketing point of view it might not be doing all that much, so maybe we could try to focus our marketing on things that are not necessarily so tuned to the release cycle.

Okay, so this slide is a little bit of a diversion. This is some correlating data, really. I was saying that we are basically counting people by observing activity in the wild, and some of that activity is the update server connections. Here is another thing that we thought we could count. If you go to a coffee shop, or like at the university here, you connect to an open wireless network and then a portal page pops up. It has subverted your DNS, and you can't actually get to the internet until you type in some login or, if you're at an airport, watch an annoying advertisement, or something like that. So we have a feature which detects whether you are in that environment and then pops up a login screen. Basically all other operating systems today have comparable technology for captive portal detection. This is a long way of saying that in order to do this, it tries to load a certain URL on the Fedora web server, and if that comes back with literally the word "OK", you know that you can get to the internet. Otherwise it will come back with something else, and you know you're being tricked. So we can count those requests and see the number of people who are using Fedora on their desktop, and going around to coffee shops or not, and
this is the line. This is a feature that was added in Fedora 21, so obviously down here there were only a few people who were testing it, but as it started becoming the default you can see that line going up. This is kind of a correlation that tells us we're not crazy about what our numbers are measuring. The reason it's lower than the other one, and even the slope is lower, is that people who are upgrading from older releases don't necessarily get the new configuration unless they actually pull it in on purpose. So this is just interesting correlating data for now. But going back to the ages slide here, if you slice right along here, you can see that most people are actually on one of our recent releases. Compare that with the dark ages, as I kind of call this period here, when we had a lot of people who were clinging to the older releases and not upgrading. Right now, most of the people who are running Fedora are running one of the newer releases. That means as we look into the future, the red line is going to be closer to the green line, and we can use that for some interesting comparisons.

Another thing we can count is ISO downloads. This is when you go to getfedora.org and click on the download link there. Here again, this comes from mirrors, so what we're counting is not actual successful downloads but when people tried to start a download from the website. Smooge has done a lot of work to try to filter out bots and automated connections here, and we have reasonable confidence that these are humans downloading a Fedora installer. Of course, we don't know what they did with that installer once they downloaded it. They could have installed one system, they could have thrown it in the trash, or they could have done an install fest and installed 2200 systems
with it. We don't really know. And particularly with the cloud image, I think it's pretty likely that one is downloaded and then used many, many times. Anyway, the punchline here is that each of these blobs is about 1.2 million downloads, so that is quite a lot for each release. That's one of the reasons why, when our axis only goes up to 100,000 on the other charts, I think the real number is higher: we probably have better conversion than 9 out of 10 people throwing away the download after they've downloaded it. So the number is somewhere in between.

Another thing that is worth noting: you may have been staring at this and wondering, why are those peaks going down? I thought you were talking about peaks going up. Well, when you upgrade from one release of Fedora to another now, you don't need to download an ISO to do that. It's an online upgrade where you type some commands, so people who are upgrading aren't going to get the ISO. Now, I would still like that to be better. We'd be in better shape if it were actually going up, with more and more new users every release. So I'm not surprised to see this going down, but I hope that we can eventually turn that around as well.

This is another way of looking at the download numbers: what percentage of them are for which Fedora edition, Cloud, Server, or Workstation. We can see that of the editions, about 75% are Workstation downloads, and then Server is the orange. Again, the axis here is different. This just shows the last couple of years, since we've had the split into editions, and those are our primary downloads. So it's a pretty good chunk, 15% or so, for Server, and then a little bit lower for Cloud and Atomic, which is our new container-based operating system, currently cloud-focused as well. Those are smaller, up there at the top, but we think that people who are using cloud images are, hopefully, again, downloading one and installing a lot from it, and we don't really count that very well with this
metric. There's an interesting gray blob in the middle, and that is the network install. With a network install, you download a smaller ISO, and then when you actually do the installation it pulls everything off the network. That's useful for geeky power users, and it's also very common for institutional use. If you're installing at a university, or if Fedora is your corporate desktop, which is a thing that happens, then that's something people want. Previously we only had a network install for Server, because the thinking was that it's really server people who want this power use of the installer. But it turned out we had a lot of people saying, so where is this for Workstation? I really need this. So we added it back in for Fedora 22, in the middle here (I think that's when that was), and you can see that the demand was real, because that gray area is fairly significant.

In addition to the main editions, we also have a bunch of other things called spins and labs. Spins are alternate desktop environments. Fedora Workstation is based on GNOME, not because there's a secret agenda to promote GNOME above all things, but because when we said we want to make this Fedora Workstation and have it be something specifically dedicated to our target market, blah blah blah, Red Hat had a lot of people working on GNOME, Fedora has traditionally had a good, strong relationship with GNOME, those are the people who showed up to do it, and we chose it as the best option for that particular use. But these other desktops are of course very important and interesting, especially in the Linux enthusiast community, and we think it's pretty cool that we can offer all this diversity. We added a Cinnamon spin. I'll show some of the breakdown here if we go to this slide. Right, this one is not stacked, and it is just the representation of the spins, not the editions. So you can see KDE
is clearly our most popular. Cinnamon is rising up here. It's the yellow line, which is probably hard to see on the projector, but Cinnamon was added recently and jumped into popularity, and LXDE and Xfce are kind of fighting it out for second place there. A lot of sysadmins love Xfce.

So the spins are alternate desktops. We also have this thing called labs. Labs may have an alternate desktop, but they're mostly focused on being a purpose-built collection of software. We have some like the electronics spin, with tools for building electronics, and the games spin, obviously, which is preloaded with games, and that kind of stuff. Two of them I think are particularly interesting here. One of them, low at the bottom here, is the Fedora Security spin, but there's this cool little bump right there. The Security spin is penetration testing, forensics, all kinds of things around the field of computer security and, hopefully, white hat hacking. This is something that I know is used in schools, and I expect that this big jump right here was a class where somebody said, we're going to be using the Fedora Security spin for something we're doing here. So I think that's pretty cool. Another one that is a low line on the graph, barely downloaded at all, is the Fedora Robotics spin, which I think is disproportionately impactful for the number of downloads, because the people who make it and work on it also use it to routinely win robot world cup soccer, which is awesome. It's pretty cool that that's being done with Fedora, and that our Fedora spins and the technology we use to put together the operating system can be used for all sorts of neat things like that.

Architecture. This and the next slide are breakdowns of what kind of computer people are running things on. This big red area is 64-bit Intel, normal computers. Here is 32-bit, which is actually much higher than I expected it to be when I went to look at this. And up there at the top we've got
the yellow architecture, and that's all 32-bit ARM. 64-bit ARM is negligible on this, as are some of our other architectures. We have Fedora for s390: people who want to run Fedora on the mainframe, you are welcome to do it, and some people do. It's just that the number of mainframes out there is not enough to make a dent in our charts, but Fedora is really there. And the new PowerPC stuff is incredibly fast. We have Fedora for PowerPC as well, but it is also making only small dents in our chart. If you're interested in that, we've got it.

This is the same thing over a longer axis. The previous one was downloads, and this one is update server connections. You can see the previous graph basically corresponds to this part of the chart. One of the things I noticed is that ARM, even though it's still small, is a higher percentage here. I think that's because people who download the ARM images probably use them multiple times, since they have multiple systems for the one image they've downloaded. I think it's also interesting that 32-bit was on this big crash to zero and then, whoops, hit a floor at 20. One of the things that I think might be responsible for this is our new download page. You go to getfedora.org and go to the Fedora Workstation site, and you'll see at the top of the page there's a pretty banner that says "Download Now" and has a nice green button. If you're like me and you browse the web a lot, you might get in the habit of skipping past the top, where it's blah blah blah pretty stuff, looking for the meat of the page. And it happens that in our current design we've got an "Other Downloads" section over there, which lists things that aren't the main download, starting with the 32-bit download. So I have a bit of a concern that people are missing the thing that's supposed to be obvious (click this, ignore the rest) and are going to that other download link instead. So I'm going to do some A/B testing with the websites team so we can
actually experiment and see how that affects people. This is one of the things I think is fun about looking at the data: you say, wait a minute, this doesn't come out the way I expected, what could be causing it? And then we can do experiments. It's fun. I'm a huge nerd. Data is fun.

Okay, so this slide switches track a little bit. This is EPEL. EPEL is Extra Packages for Enterprise Linux, and it is basically Fedora packages, Fedora RPMs, built for CentOS and RHEL. This is hugely popular, and it continues to go up and up in popularity. You can actually see that somewhere around 2011 the number of people using the EPEL packages crossed over the number of people accessing the Fedora OS. That's one of the biggest services we provide, and I think that's a testament, of course, to the success of RHEL and CentOS, and to the really vast audience for those kinds of stable downstream platforms. So that's cool. But it's also a neat thing because it ties us into those communities: we have something that's kind of a bridge across to what's happening in CentOS. And I think it's neat that we have that in the Fedora Project and not isolated, because it brings in people who want to contribute and add things to the distribution at the upstream level that Fedora wants to be at.
So those were all numbers on Fedora the operating system, but Fedora as a project is more than just an operating system. In fact, I really like to think of it as mostly a project and a community, with an operating system as one of the things we're organizing around making. One of the questions people used to ask me a lot was, how big is that community you keep talking about? I would make up some numbers, and then I started thinking, stop making up numbers and try to figure out what those numbers actually are.

So the first question: how big is the Fedora contributor community? This takes some amount of explanation, both of where I got these numbers from and of how much we can trust them, just like with the big dinosaur. I don't have a dinosaur for this one, but I probably should put in a pterodactyl or something as a warning. We have this thing called fedmsg, the Fedora message bus, or actually now the Federated Message Bus, I think, because it is not used just by Fedora: Debian uses it, and I think CentOS is looking at it. Basically, any time somebody does something in the Fedora infrastructure, or, many times, when automated systems do things in the Fedora infrastructure, a little bit of data is put on the bus saying who did what, when, and what the result was. There's a lot of different activity that causes bus messages, and we use it for a lot of things, like connecting services together: if this happens, respond in this way. It's also the way you, as a contributor, can get notifications. If you want to know whenever a certain wiki page is edited, you can get a notification that way. You can get notifications about package changes. If there's a new version of a package, you can hook it up to our notifier. So there are a lot of different activities, and some of these have useful properties for counting contributors. For example, they're attached to a username, and specifically there are some
which are attached to a username and are initiated by someone who actually did something. So it's not just that something happened to that user, or that something automated happened related to the user, but that the user sat at a keyboard and did some action that we can count. And of those, there's a subset where the user is logged in, so we know that this is a certain user with a certain Fedora account that we can trust to be the same over time, and not just some made-up pseudonym. That's why I say "activity in three key areas that are also easily counted". I don't know that this is a representation of everything, and I actually have some reasons to believe that it's not.

Let me go through the areas first, though. Bodhi. Bodhi is basically our updates QA system. When a packager fixes a bug or a security problem, they put that fix on Bodhi, which is a website where anyone can try it out in testing and then give karma feedback, positive or negative: "this fixed my problem", or "oh my goodness, this is broken and terrible". The votes get counted up or down. That's an important process, because if you don't get enough feedback on your updates, it's risky to release them. In the future we will have awesome continuous integration and automated testing on these things; right now it depends on the community to do it. And it turns out we have a pretty large community of people doing that, over a thousand people in 2015. So that's Bodhi here.

This one, git, is the package changes themselves. Whenever somebody makes an update to a package and commits it to git, a message goes out saying they did that. It's a smaller number, but still a significant number of people are involved in that.

And then the wiki. You may think, oh yes, documentation, and then you would be horribly wrong, because the Fedora wiki is great for many things, and there is some documentation on it, but it
is not really a documentation site. What it is is an office whiteboard and various other office implements. It is a workspace that people in the project use for a lot of different things. In fact, QA uses it for validation testing, a lot of people have home-page kinds of things on there, and people put draft ideas there. There is also some project contributor documentation, but mostly this is contributor activity of all kinds. You can't say, okay, this is docs.

So those are the things that are easy to measure, because they generate fedmsg messages. There are some things which aren't measured very well by this, like documentation. Translations would be another big area that I don't have data on. Bugzilla uses different accounts than the Fedora account system, so I didn't include work there. Ask Fedora, ask.FedoraProject.org, also has different accounts, so I didn't include it right now. And the ambassadors tend to do a whole lot of work going to conferences, talking, and spreading the love for Fedora, and that doesn't necessarily generate any of this sort of activity. Maybe a few wiki edits, but not a huge amount. So those aren't counted. But of the things that are counted, we do have more than 2,000 contributors who did at least something in 2015, and quite a lot of those were active in multiple areas.

For this slide, I went and looked and said, okay, that's people who did at least one thing, but who's really active? Some people might do drive-by things. Do we have a core community that does a lot? So here I took the top 10% in each of these three areas and said, okay, we're going to count who was really putting in a lot of this work. It turns out there are about 300 people who in 2015 were in the top 10% of activity, and that is quite a lot of activity. One of the things I think is interesting is that there's a lot less overlap than I
expected. You see there's more overlap in this chart than there was in the wider one, which makes sense, but I kind of expected these to be almost entirely on top of each other, and they're not. To me that indicates that there are probably some other missing bubbles that would add another couple hundred people who could be considered part of the core but didn't make it onto this graph. But even 300 is a pretty respectably healthy project, I think. I would like to grow it more, that's definitely one of the goals, but I think we're in pretty good shape there.

So, time to break this down a little bit more. Since this is the director's edition, this is a super nerdy breakdown of those specific areas, and I thought it was really fascinating. It looks more complicated than it is, I think, so trust me here. These are the Bodhi update feedback numbers. What I did is count the number of people per week. So when it goes up to 150 in this week here, that means 150 distinct people gave at least one bit of feedback that week. This is a stacked graph, so the top basically shows the amount of activity that week, and it ranges from a low of around 30 here up to a high of 150 in some busy weeks. You see it's very spiky. I think that's because the number of updates varies quite a bit per week as well. The coloration is broken down into buckets based on how active that particular user was, not just in that week but in the quarter that the week falls in. And again, nerdy: it is a rolling quarter, so it's that week plus the six weeks before and after, because I didn't want it to be skewed by weird date breaks. So you can see it broken down there, and I have a more useful view for seeing it broken down by contributor activity. This is the exact same data, but instead of being a count, the graph
is a percentage, so it's a filled-in graph here. Basically that means that top blue up there is the bottom 50%: the bottom 50% of contributors in a quarter are doing about 10% of the work. So 10% of the feedback comes from drive-throughs, is basically what that means. And this is the 1% here, doing a pretty hefty chunk. Add that next slice, basically the top 10% in total, the yellow and the green together, and it works out to be about two thirds of the effort in giving this feedback. And actually, as I look through all these different areas, we see that same kind of pattern, 10% and two thirds, which I think is pretty good. It would be kind of nicer if we had a bigger long tail, more drive-throughs, more lightweight contributors, and I think that's one of the things we can work on, making it easier to give that kind of feedback and do that work. But there's a social media rule, the 90/10 rule, which is that 10% does 90% of the work, or maybe more than that, and you can see we're better off than that with two thirds. So I think this is actually pretty healthy here.
So this is that same slide, basically, for the package changes, the dist-git activity, and there are exciting dips here: Christmas, Christmas, and Christmas. I think that's kind of fun, because when you're doing something with data and you see a weird artifact and you're like, oh yeah, that is clearly explained by something in the real world, it helps you know that your data is actually connected to the real world in some way and it's not just abstract numbers. Red Hat shuts down for Christmas, and a lot of contributors outside of Red Hat are on vacation and doing family things, so we see this big dip in activity. It doesn't go to zero; some people are still slaving away over the break, which is important, especially where security is concerned. But there's definitely that big dip. So this is the same thing for the package changes, again with the percent per bucket, and this
one has some spikes that are not Christmas but going the other way. Here the Christmas actually flattens out so you can barely see it: here's a Christmas here, and the percentage stays about the same over that break. But these spikes are release engineering doing something called a mass rebuild, which is, as you might think, when you rebuild a whole chunk of packages, in fact all of them. We often rebuild the entire Fedora distribution as compiler changes happen, as new security features are added, and sometimes just to make sure that everything hasn't bit-rotted. That is all done basically by the release engineering team, which is a core handful of people who are very active already, and that makes these spikes of the 1% having done a bunch of extra work that week, drowning out the normal workflow for that week. And actually Dennis Gilmore, who's one of the key Fedora release engineering people, just yesterday, or the day before that, switched from using his personal Fedora account for doing this to using a new release engineering account, so in future versions of the graph I will be able to separate those out completely without also discounting their other work. I could have just blacklisted Dennis and Peter Robinson and a couple of people like that, Rex Dieter as well, but that wouldn't have been fair to them, to kick them out, so I didn't do that. But having a release engineering account will certainly help with that.
This is the same thing for the wiki. You probably have the point by now. One thing that is different from the other ones: the other ones are kind of flat in growth, and again I would like to see contributors going up, so that's something to work on. You've got to have the numbers, have the baseline, to know where you're working from, so it's good that we have this baseline now. But here we see the wiki activity, the number of contributors to the wiki, going down. I said that the wiki was a whiteboard, not a documentation site, and I think that's the inevitable thing: this is basically
wiki bloat that is killing the wiki's usefulness. The good news is that right now, on the Fedora docs mailing list, the docs team is working on replacing this wiki solution with a new AsciiDoc, Git-based workflow, which will be a lot more lightweight than what we have now and will not be confused with the wiki. I hope that once that's up we can basically put a banner at the top of the wiki so that when you're not logged in it will tell you: welcome to the Fedora Project wiki, this is a whiteboard for contributors; if you're looking for docs you're definitely in the wrong place, go to the docs thing over here; if you wanted a whiteboard, awesome, log in. Because we're not going to get rid of the wiki, it's way too useful, but we also don't want people going there thinking that they're going to get useful documentation, because it's a trap. Here's the breakdown by percent. I think the only thing worth noting here, after all that stuff, is that we're still following that same 10 percent, 60 percent pattern that the other stuff does. So that's kind of a pattern across the project, and since it works that way in three areas, I'm going to just go ahead and assume that it's basically that way for everything, because extrapolation works that way, right? Yeah.
So that is enough of the deep-dive analysis into the numbers. I also have in the numbers section some other interesting numbers that I have not explored so deeply but would like to in the future, some other areas to analyze, and I've got a couple of graphs that talk about different things. So this is Fedora Magazine. Fedora Magazine is something we started three or four years ago, but it really only started being active two years ago, and the team working on this is awesome. If you have not checked it out, it's fedoramagazine.org. This is user-focused documentation and articles on things of interest, ranging from highly technical to community pieces about people who are involved, and
this has just gone up and up and up in popularity. This is per month, and the red dots are months in which there was a Fedora release, and obviously that drives extra interest. But you can see that even outside those months, if you pretend the red doesn't exist and draw a line right there, that's a nice upward slope as well. So congratulations to the magazine team here, and this is becoming really important. I started this presentation off with some press, and we get great press, but the thing is, it is hard for a Linux distribution to get any press these days. Although I hope everybody listening to this presentation knows that Linux is cool, in the general populace it is not really a hot topic, and it's not something that sells magazines and drives general IT page views unless we have drama. We try to keep all of our drama positive, and who wants positive drama? It doesn't make good press. I talk to even friendly journalists and they're like, I love what you're doing, I don't have a story. So having the Fedora Magazine basically gives us an outlet to talk about ourselves, and people who are interested will show the press what's cool. Again, we're on a chart towards world domination here, and soon it will be bigger than Gizmodo.
OK, so this is some random stats gathering on the Fedora devel mailing list. You can kind of see the same style of graphing here. This is basically new users showing up to the Fedora devel mailing list over time, and I tried to break this down again into groups. The blue is new users who posted that month and then were never seen again, the drive-bys. The red is new users who posted that month and then posted in just one other month, so they're kind of drive-by as well. And then yellow is users who showed up that month and then stayed around. Obviously, because I'm counting that way, there's nobody here; that doesn't mean we didn't get any new users, I just can't tell if they stayed around yet
or not. So this is why this one gets relegated to the back of the presentation, because I'm not quite sure how best to present it, but I'm showing it to you anyway because I think it's kind of interesting. Obviously one of the things we want to be doing is getting more contributors every month, and getting ones that stay around, and you can see we haven't scared off everybody. We've definitely gotten new users over the last couple of years; new users come and new users do stick around, and that's something we obviously want to encourage and emphasize. Fedora mailing lists especially have had some reputation of being prone to getting into gigantic flame fests and being a little bit harsh, and in the last couple of years we've really tried to stop that, both by enforcing our code of conduct, telling people, you are out of line, remember to be excellent to each other, and also by doing the same sort of thing when a thread is just going off the rails and going in circles and productive discussion has stopped, even if it hasn't gotten acrimonious. Threads like that kill the mailing list, because you don't want just two people going back and forth about something. So we ask people gently: OK, we got it, now it's time to shut down the thread. And usually people listen to that, so that makes the mailing list more pleasant. I hope that we can actually see activity on these lists going up over the next couple of years, and more new users. I've got some other ways I'm thinking of analyzing the mailing list, so in future versions of this presentation: so much more mailing list analysis, it's going to be awesome.
OK, this is IRC meetings. This is one of our interesting stats, and this graph looks like noise, so I drew a red average line across here, which is sort of the important thing. This is IRC meetings completed every week, and this is approximately three meetings a day, something like that. This is over the last two years, so we've gone up to
basically 25 IRC meetings a week, and a lot of these are half an hour to an hour long, people chatting back and forth, making things go in the project. This is really the engine that drives Fedora development. The mailing list activity happens, but IRC is real-time and higher bandwidth, so people get a lot of work done this way; we go through tickets and crank things out. One of the things I'm concerned with is that this is iceberg activity. Although IRC is awesome, and open source people use it like crazy, and, you know, hacker script kiddies use it like crazy, the rest of the world does not use it, even though for some reason they're using Slack now, and Slack's not really anything different, it's just got a brand name on it. That's really a side track. Anyway, IRC as it is is hard to get into, and that's what Slack offers: an easy way to get into something like IRC. IRC is great, but it's really just below the surface, and especially as a new user getting into IRC you're like: what's a nick, or what's a chan, or why am I getting kicked out of it, what's getting kicked out of a channel, what's a channel, is this like CB radio? I don't know what's going on. So the technology is confusing, and then the culture is confusing, because people use all that text-speak stuff, the IRC stuff that spilled out into popular culture. It's true, I see my daughter on Minecraft doing all this IRC-style chat stuff: you'll be ready to be a hacker. But for most people it's really a big barrier to entry, and it's also something you don't see. You go to the Fedora Project website and you don't see that all this is going on. It looks like: is this project dead? No, we are not. We are very alive; we are just hiding in caves we've tunneled into an iceberg.
And again, these slides are random collections, so, segue to something completely different. This is people voting in the elections for the Fedora Engineering Steering Committee, FESCo, over time, and this is going all the way back to 2008, when I
think, I don't know if that's the first elections, but it's the first elections we have numbers for. You can see this is about 200 users. It jumps up and down a little bit, but it's kind of constant over time around there, and I think that's kind of concerningly low, given I just said we had more than 2,000 people active in these core technical areas in the project last year, and over 300 of them who are really active. People are not voting very much. So, as with elections in government, we have a problem with low voter turnout. Maybe it means everybody is super happy with the leadership; maybe it means people are heads-down hacking and don't want to deal with politics. Those aren't necessarily bad things, but it's interesting that that number is low.
Now, again, a random segue to a slide I think is also super interesting, and I did a lot of work on this one: does everybody at Fedora work for Red Hat? Obviously Fedora is sponsored by Red Hat, and Red Hat puts a huge amount of money and effort and all sorts of things into the project, and it is often a perception in the press that Fedora is a Red Hat joint. Fedora is a Red Hat joint collaboration, but it is not an entirely Red Hat thing. So I went and actually looked. I took those top users from the second Venn diagram, the active users in 2015, and I did some analysis to see: is this someone from Red Hat? 26% of those people use a redhat.com address, so those people were easy. Then I actually had to go through by hand, because we have a whole bunch of Red Hatters sneakily using other domains, which needs some explanation, because I say sneakily but there are actually a lot of good reasons people do that. A lot of these people were Fedora contributors long before they were Red Hatters and were hired into Red Hat, and they either had a personal identity established as a person on the internet, as a lot of people do, or, like me, I hope I have a personal identity
but more importantly, all of my email filters for dealing with the thousands of messages on these lists live on a server somewhere, and I was not about to port that into my Red Hat address and have that all mixed up. So my Fedora mail goes somewhere different from my Red Hat address, and that puts me in the sneaky group. There's not any lack of pride in being a Red Hatter, or wanting to disguise that involvement; it's just those kinds of things. So I say sneakily, but I'm saying that as a joke; I don't think anybody's actually being stealthy in that way. Anyway, the basic thing here is that it adds up to about a third of the core contributors being Red Hatters. I show a lot of people this and they're like, wow, that's actually surprising. A lot of people are worried that Red Hat is putting in a bigger share than that, and I think it would be unhealthy for the project if it were 90% Red Hat and then a few other poor souls tagging along. But we can see that Red Hat is a big stakeholder, Red Hatters are big stakeholders, but not the majority, so that's really interesting to see. If you slice this down to the top 1%, I think it gets a little bit bigger, but that's because it's hard to be in the top 1% of something like this if you're not fortunate enough for it to be your full-time job, which is also why many of these people got into this chunk: they were doing an awesome job, and Red Hat was like, we could use more of that.
OK, so that is the end of the statistics part of my talk.
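For readers who want to reproduce the weekly graphs described in this section, here is a minimal sketch of the bucketing, assuming the input is simply a list of (user, date) activity events. The function names, the exact bucket boundaries (top 1%, next 9%, next 40%, bottom 50%), and the rank-based tie handling are assumptions made for illustration; the talk only states that each week's contributors are colored by how active they were in a rolling quarter of that week plus six weeks on either side.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta

# Assumed bucket boundaries: cumulative share of contributors, most active first.
BUCKETS = [("top 1%", 0.01), ("next 9%", 0.10), ("next 40%", 0.50), ("bottom 50%", 1.00)]


def week_start(day):
    """Return the Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


def bucket_by_rank(counts):
    """Map each user to a bucket label based on activity rank within a window."""
    ranked = sorted(counts, key=lambda user: (-counts[user], user))
    total = len(ranked)
    labels = {}
    for position, user in enumerate(ranked, start=1):
        share = position / total
        for label, upper in BUCKETS:
            if share <= upper:
                labels[user] = label
                break
    return labels


def weekly_breakdown(events, half_window_weeks=6):
    """Summarize (user, date) events per week.

    Returns {week: {bucket_label: (distinct_people, actions)}}, where each
    user's bucket comes from the rolling window around that week.
    """
    per_week = defaultdict(Counter)
    for user, day in events:
        per_week[week_start(day)][user] += 1

    result = {}
    for week, this_week in per_week.items():
        # Rolling quarter: this week plus the weeks before and after it.
        window = Counter()
        for offset in range(-half_window_weeks, half_window_weeks + 1):
            window.update(per_week.get(week + timedelta(weeks=offset), Counter()))
        labels = bucket_by_rank(window)

        people, actions = Counter(), Counter()
        for user, count in this_week.items():
            people[labels[user]] += 1
            actions[labels[user]] += count
        result[week] = {label: (people[label], actions[label]) for label, _ in BUCKETS}
    return result
```

The people counts per bucket correspond to the stacked count graph, and dividing each bucket's actions by the week's total gives the percentage view. A dedicated release engineering account could be dropped from `events` before calling `weekly_breakdown`, which is one way to keep mass rebuilds from dominating a week.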