with the Wikimedia Foundation, and one of the things I'm currently trying to find out is how to measure activity and people in our technical communities. You probably know that Wikimedia is a large, large project. There are more than 900 websites, and there are many areas where you can contribute technically in different ways. We're currently trying to get an overview, and even that is hard. So it is a complex task, and in this talk I would like to quickly show you what we already have in place, what we want to get in place, and maybe also a little bit of the problems and the complexity. So it's more for your interest, or if you're curious to play with technical metrics, statistics, things like these. What we have currently is mostly about Git repositories, code repositories, and we mostly use Gerrit for code review. We have our own Gerrit instance at gerrit.wikimedia.org, and for this we've been having a platform called wikimedia.biterg.io. If you've seen an Elasticsearch and Kibana standard platform, this might be familiar to you. It is all free and open source. It's actually a Linux Foundation project. You can find it under chaoss.community, CHAOSS with a double S, and the code base is public on GitHub, so any other free and open source software project can also set this up for themselves. We have it hosted by Bitergia, but it is also possible to set this up yourself if you're interested in gathering statistics about your free and open source project. There's also a documentation page on MediaWiki.org, which is called Community metrics. I have screenshots here because I never trust the internet at conferences, but I could also show you live. So this is the GitHub page of the CHAOSS project by the Linux Foundation, where you can get the code. This is, I hope the zoom is sufficient, wikimedia.biterg.io. So this is the overview page.
You can see the navigation up here, and you get some basic statistics about the most active people in the Git repositories and which organizations we have. So here you can see Wikimedia Foundation, individuals, Hallo Welt!, Wikimedia Deutschland. So this is the contributor base we have by organization, by affiliation. And down here there are way more statistics: Git, Gerrit, mailing lists. We index a lot of things. We also index a little bit of our issue tracking system, which is Phabricator, and some edits on MediaWiki.org. And for example, if I now go to Gerrit and the overview page, because we use Gerrit for code review, there you have more specific statistics. And as it's Elasticsearch and Kibana based, you might know this if you've played with it: whenever you click on a certain value, you can filter by that value. So for example, if I use the pie chart here and only want to see the numbers for independent volunteer contributors, I click it, and you see the numbers now change. Obviously a bit lower, and you see up here that a filter has been applied. And you can continue with these things. Then you can also filter here by code repository. For example, the MediaWiki core repository. If I click on that one, it also filters for that value. And you can basically drill down into the statistics you want to gather here. And, as I only have 15 minutes, there are way more things you can find out here. Also, for example, who reviews patches in Gerrit, how long patches have been open, median time, all these things you might want to gather to find out how well we are doing as a project when it comes to both involving volunteers and giving them the feedback, when it comes to code review and engagement, that you would like to give. Or also areas for improvement.
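The drill-down just described (filter by affiliation, then by repository, then look at how long patches stay open) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names `repo`, `org` and `days_open`, and the repository `mediawiki/extensions/Foo` are hypothetical and are not the actual schema or data behind the dashboard.

```python
from statistics import median

# Hypothetical changeset records; field names and values are illustrative
# only and do not reflect the real index behind the dashboard.
changesets = [
    {"repo": "mediawiki/core", "org": "Wikimedia Foundation", "days_open": 2.0},
    {"repo": "mediawiki/core", "org": "Independent", "days_open": 9.0},
    {"repo": "mediawiki/core", "org": "Independent", "days_open": 5.0},
    {"repo": "mediawiki/extensions/Foo", "org": "Independent", "days_open": 1.0},
    {"repo": "mediawiki/core", "org": "Wikimedia Foundation", "days_open": 4.0},
]


def drill_down(records, **filters):
    """Keep only records whose fields equal every given filter value."""
    return [r for r in records if all(r.get(k) == v for k, v in filters.items())]


def median_days_open_by_org(records):
    """Group records by organization and return the median days open per group."""
    by_org = {}
    for r in records:
        by_org.setdefault(r["org"], []).append(r["days_open"])
    return {org: median(days) for org, days in by_org.items()}


# Clicking a repository, then an affiliation, corresponds to stacking filters.
core = drill_down(changesets, repo="mediawiki/core")
volunteers_core = drill_down(core, org="Independent")
medians = median_days_open_by_org(core)
```

Each call to `drill_down` plays the role of one click in the dashboard, so filters stack the same way they do there. Comparing the per-organization medians is one simple way to see whether volunteer patches wait longer than others.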
For example, at the Wikimedia Foundation, obviously we have engineering teams, and some of them maintain certain code repositories, so you can filter the view for certain code repositories, and then you realize, for example, that sometimes patches written by volunteers take longer to review than patches written by your co-workers. These are the kinds of things which you may have already assumed, but it's nice to actually have data. There are also a few caveats here. So, for example, I usually don't use the Git statistics, because Gerrit is where the code review happens, and once a patch proposed in Gerrit has been accepted and merged in the Git repository, you would also see that in the Git repository. But as all our software is open source, free software, we also, of course, pull in a lot of Git repositories from other upstream projects, because we use a lot of software invented and maintained somewhere else to run our servers. So the Git statistics also include activity that we've imported with the Git repositories from other companies, so that's kind of misleading. And there are a few more caveats, and I hope all of them are listed on the Community metrics page on MediaWiki.org, because at some point I had to create a section, "Behavior that might surprise you". That page also has some examples for the most common questions I get from interested people and also co-workers, for example when you want to publish an annual report and show how many volunteer contributors you have in the code bases, and these things. So that is what we have. These were the screenshots in case the Wi-Fi doesn't work. And now the section: what is patchwork? A spoiler: basically everything else. Because this was the look at Git repositories and Gerrit for code review, but there's way more going on when it comes to technical contributions and code in Wikimedia. There is GitHub.
So we have some projects, quite a few, that don't use Wikimedia Git and Wikimedia Gerrit, but prefer GitHub because it's a different contribution system or workflow. So we already track some of that, but we still have to improve, even in finding a way to find all the repositories related to Wikimedia development on GitHub, because they're not all under the same organization. When it comes to what I just showed you, wikimedia.biterg.io, we define what is being indexed in a public JSON file. This is also linked from the Community metrics page on MediaWiki.org, where we define basically what gets indexed, and it's a long list, as you can see, also some mailing lists. But there's actually a lot of code on the wikis, inside of wiki pages. So there are user scripts, there are gadgets, small JavaScript things that enhance functionality, and they're actually quite common. So for example, Wikimedia Commons or English or German Wikipedia have a lot of gadgets even enabled by default, which make some behavior easier. For example, on Commons, a common gadget is adding a category to a photo or image that has been uploaded. That's way easier if you use a gadget, which is enabled by default. There are Lua modules, and there are templates. For example, the infoboxes that you see on the side in many Wikipedia articles, for example if you look up a Wikipedia article about a person, these are all templates, and they're all stored on-wiki. So this is harder to track, to get a full overview of. And even some extension code: we have about 130 MediaWiki extensions deployed on Wikimedia servers. But if you take a look only at the extension home pages on MediaWiki.org, there are more than 2,000. So there's a lot of code out there. And sometimes this code is even stored just by copy and paste, putting it on a wiki page and saying, here, copy and paste this, and it should work.
Which might not be the best revision control system ever when it comes to maintaining code, but it's a quick and dirty way, so these things exist. And one other example: unknown code repository locations. We also have something called Toolforge. That's what some people call cloud services nowadays. So you can host your own little helper tools, which other people can then also use, on a cloud services platform called Toolforge that we offer. One example would be Pageviews. So if you want to see which pages are the most popular on some wiki, that's one example out of thousands of tools by now, actually. And though, of course, the rules are that you must publish the source code, it's sometimes really hard to make sure that this happens, and to know where it happens. So for most repositories we know, we have an index, but for some we actually don't know, which is also something to work out. So recently, to even get a number of things or get an idea, like, what can we measure? What do we have? How much do we have? I started to create a table, and even visualizing that was an interesting task. I'm still not sure if anybody understands this, but black basically means it doesn't exist: there's nothing to measure, to index. Green means, yes, we do measure this already. Yellow means it's tricky, but it's kind of possible via some scripts or using the API to get numbers out of the wikis in certain namespaces, for example the Module namespace. And red means it's very hard, but we'd like to get this data at some point. Plus also the complexities. So the numbers you see here are sometimes correct numbers, sometimes more of a ballpark, a vague figure about how many items, code repositories, projects we're actually talking about. And with some numbers, we're even wondering. For example, it says 270,000 modules and templates on the 900 sites, websites, we have on Wikimedia servers.
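As a rough illustration of the "via some scripts or using the API" approach for a single wiki, the sketch below counts the pages in one namespace through the MediaWiki Action API (`list=allpages` with `apnamespace`), following continuation until the listing ends. It is an assumption-laden sketch, not the method behind the numbers in the talk: the function names, the `fetch` parameter and the User-Agent string are made up for this example, and the count includes redirects and documentation subpages.

```python
import json
import urllib.parse
import urllib.request

MODULE_NAMESPACE = 828  # namespace number used for Lua modules (Scribunto)


def fetch_json(api_url, params):
    """Perform one GET request against a MediaWiki Action API endpoint."""
    url = api_url + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "namespace-count-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_pages(api_url, namespace, fetch=fetch_json):
    """Count pages in one namespace, following API continuation until done."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": "max",
        "format": "json",
    }
    total = 0
    while True:
        data = fetch(api_url, params)
        total += len(data.get("query", {}).get("allpages", []))
        if "continue" not in data:
            return total
        # Merge the continuation tokens into the next request.
        params = {**params, **data["continue"]}
```

A call such as `count_pages("https://commons.wikimedia.org/w/api.php", MODULE_NAMESPACE)` would page through the whole namespace on one wiki. Repeating that across roughly 900 sites means many requests, which is one reason a database query is the more practical route for an overall figure.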
And this is what the database query says on Hive, but we're not really trusting that number yet. So this is actually what we're going to be after over the next months, to have way better data and a way better overview of where our developers actually are. Because we know that in code repositories we have about 200 to 400 code contributors per month in Gerrit code review. And we now also know that we have about 500 to 600 people per year who work on user scripts and gadgets. But for many other things, we don't know yet. And that's what I'm trying to improve over the next months. Or maybe, realistically, years. Let's see. But yeah, that's basically it. I hope this was a bit interesting. If you have any comments or questions, feel free to catch me here. I'm sometimes around the table. Feel free to catch me after this talk. These are links with more information. Or if you don't manage to catch me, feel free to use the Community metrics page on MediaWiki.org, the first link. There is a discussion page, and there you can also bring up anything: ideas, questions. I watch that page and usually reply. Thank you.