Okay, great. Well, thank you very much for having me. This is a wonderfully diverse crowd, and so I hope that I will have a nice perspective for us to talk about. I was already given an introduction, but I always throw this slide at the beginning of my deck. So I'm a research scientist at the Wikimedia Foundation, but I'm also a Wikipedia editor, and I've been working on the technologies around Wikipedia and open knowledge for a long time. I actually started there before I went to graduate school. I started out as a computer scientist, and so I liked building things and that sort of stuff. But I didn't really realize that building things for Wikipedia could turn into a research agenda. So I was happily surprised when I got funding to look at conflict on the internet, and it turned out that there were technologies that we could explore building around conflict in Wikipedia. And so I could turn my hobby into my profession, which has turned into awful hours of work, but I am sure that everybody in the room can relate to that.

So there are three things that I wanna talk to you about today. These things must come in threes, so I have three. First, I wanna talk to you about Wikipedia and how I look at it as a socio-technical system. I wanna talk a little bit about systems thinking and use a biological metaphor to talk about how Wikipedia's subsystems work together. Next, I wanna talk to you about a critique that I raised in my PhD thesis at the University of Minnesota — a critique of how algorithmic quality control works in Wikipedia. I'll draw from standpoint epistemology and this idea about how we encode our ideologies in the technologies that we build, in order to discuss how Wikipedia got to this state and why it hasn't recovered. And then finally, I'll talk about an experimental program that I'm working on to try and build infrastructure to change this socio-technical fabric around quality control in Wikipedia. I'll introduce this term that I use, progress catalyst. And I'll also talk about how I draw from feminist theory to inspire the design of systems like this — specifically, hearing to speech as opposed to speaking to be heard, and the recent literature around the dangers of subjective algorithms in social spaces.

Without further ado, let's get on to this socio-technical view of Wikipedia. So I've gotta start with this question. Usually my audience has a mixture of knowledge about Wikipedia, and I'm not really gonna talk about what Wikipedia is, but I do wanna show you a few things that I think are fun about it. So it's really big. And of course, this raises a lot of questions about how it got to be this big and how it got to be such high quality. We have about five million articles in English Wikipedia, and that's bigger than any traditional encyclopedia ever. There are wonderful articles on Wikipedia. One of my favorites to show off is this article called the list of lists of lists, which is exactly what you might expect it to be. So for example, it's kind of hard to see on the screen here, but that top link on the list is the list of ancient kings. And if we click on that link, we will get to another list — a list of lists of ancient kings. Let's pick on lists of pharaohs, which is the top link there.
And then we can get to this wonderful article, which is actually a list article, but it has a lot of information about pharaohs too — how they progressed and that sort of stuff. So if we scroll down the page and get to the list, the first pharaoh that is talked about here is Pita. Now I've forgotten — I've actually given this talk before, and I've forgotten the introduction to Pita's article. But the thing that I really wanna share with you is that I know that you pronounce the P, because Wikipedians have put in the effort to make sure that there are pronunciation guides here. It's not a silent P, which is what I originally suspected when I came to this article. You actually do pronounce the P sound.

So Wikipedia is also a wiki, which is a bit of software. And this is the thing that, especially as technologists, people tend to be most familiar with. Anybody can edit, it's shared authorship, it's an online database. This is the thing that's very interesting about Wikipedia as a publishing medium: it flips the publication model. Anybody can edit, so you publish first and we'll worry about review after the fact. And this involves a lot of wonderful and concerning things about Wikipedia, which I'm going to get into later — how this review process actually happens.

It's also an online community. If we take all the language Wikipedias and put them together — of which there are a great number, many of which are not terribly active — and just look at the active editors in these communities, there are about 100,000. They focus their efforts in all sorts of different spaces. There are discussion forums, like these Village Pump things, that focus on technical concerns, policy concerns, that sort of stuff. And there are also WikiProjects that are subject-specific, such as WikiProject Video Games and WikiProject Medicine, two very active WikiProjects in the English Wikipedia.

The way that I like to look at Wikipedia, though, is as a system that converts available human attention into encyclopedia material. And like any system, of course, it has this input and output dynamic. It's of course limiting to think of Wikipedia like this, but it can help bridge the gap between a lot of thought that we have about efficiency and how socio-technical systems like Wikipedia work. And so when I give this talk, it's maybe important for you to know that I'm coming from this position of trying to figure out how this socio-technical system manages these inputs and outputs.

And when I say socio-technical, I'm really drawing from a literature that's kind of old — the field of computer-supported cooperative work. A long time ago, when computers started to become things that would actually enter social spaces, social and technical things were seen as entirely separate. Social things were things that would happen maybe next to a computer, but the computer wasn't really involved in the experience. After a while, we realized that we could develop technologies that would help us solve social problems, like communicating with each other through email or coordinating our activities through calendar applications. And so I see a partial overlap between social and technical in this sort of groupware period of thinking about social things and computing.
But then we ended up starting to explore this idea of a computer-mediated space — a space where most people are connecting through the computer, and the computer and the technologies that people are using affect everything. And that's really where I think Wikipedia operates, and this is why I wanted to use the term socio-technical in my introduction. I'll explain in just a moment how I think about the integration between these things.

So this is important, I gotta stand next to my arrow. I'm a technologist, and that's really where most of my training is, but for a moment I wanna talk to you as though I'm a technobiologist, and I'm gonna draw some correlations between biology and technical systems. So consider some nice living systems that we've probably encountered in our biology classes: the bacterium and the paramecium. And if there's anybody who has a good background in biology in the room, they should tell me that that bacterium is way too big. It is way too enormous. It should be like one of the little tiny specks that the paramecium is farting out right there. It should not be visible. And that will be important in just a moment.

So a bacterium is a very simple organism. It's really just like a sack of saltwater that contains some stuff that does some interesting things by chemical pathways — things running into each other. It's got some DNA, some ribosomes, some vacuoles, but they're mostly just floating around in space on the inside of the bacterium. Whereas a paramecium is a very complex organism. It has these specialized subsystems. For example, the endoplasmic reticulum is a specialized subsystem that allows you to translate RNA into proteins much faster and more effectively than just ribosomes floating around, as in the bacterium. It has specialized vacuoles, like the contractile vacuole, that allow the paramecium to operate in both fresh and saltwater. Anyway, these systems operate at entirely different scales.

And I think that this is a useful way to think about how populations of people can operate. So on the left side of the screen — it's kind of hard to see — beneath the bacterium, I have a photo of a fishing village, a small village of the kind that's very often studied for its social practices. One of the people who studied a lot of fishing villages was Dunbar. And Dunbar came up with this consistency that he started to see as he was studying these small, historically simple social structures around fishing villages: they never grew beyond about 150 people. And whenever they did, they encountered social problems and would very often split, or there would be something that would limit them to about this size.

So when I think about Wikipedia: Wikipedia is a collection of people, just like a fishing village. But there's all sorts of infrastructure and subsystems that allow Wikipedia to operate at a different scale. So I think of Wikipedia much more like a paramecium. The fishing villages are at about 150; Wikipedia, as I was saying earlier, is at about 110,000. But it's made up of these two things: the collection of people that write Wikipedia — and it's of course hard to see in this picture, but that's a photo from Wikimania 2014, which is our major conference for talking about wiki things — and the technologies that they use.
And when I say the word technology, I don't just mean digital technologies — the software and the hardware that the system runs on — but also the policies that they form, the processes that they use, and the external tools that are not MediaWiki but are developed by Wikipedians themselves to support their processes and make that work. And just like you would never study a paramecium without considering its organelles and its subsystems and how they interact and operate, you would consider these together. So when I think about socio and technical, and why we have these two words jammed together, I'm talking about why you study the paramecium as an entire thing — not the sack of saltwater and its membrane separate from its organelles and subparts. You have to study them in context, because they all interact with each other in order to make this larger system work. That's what I mean when I say socio-technical in the context of Wikipedia: you can't just understand the people, and you can't just look at the technology. If you wanna understand how the system works, you have to look at how the people interact with the technology and vice versa.

So I like to think about Wikipedia in this way, with the biological metaphor, because I think that just like this paramecium, Wikipedia is a system with specialized subsystems. So I wanna talk very briefly about what I think is a nice view of Wikipedia's specialized subsystems.

One of the problems that Wikipedia has to solve is how we're going to allocate work. Who's going to do the tasks? What tasks are there to do and what tasks are we gonna do first? How are we gonna prioritize those things? And so there are a few things that we know about this. We largely get this for free due to Linus's law. There are actually a couple of versions of Linus's law — I'm not talking about Linus Torvalds's own Linus's law; this is Eric Raymond's Linus's law, which is that given enough eyeballs, all bugs are shallow. So not that kind of bug — sorry, I never set up that joke, right? The idea is that, given a problem in a piece of software, if you have enough people look at that problem, there's gonna be somebody who has the insight and expertise to fix that problem easily. That's the idea of a bug being shallow, whereas a deep bug would be hard. The insight of this is that visibility is critical to open systems. This is one of the ways that we can take advantage of the open nature of a publishing platform or a software repository: a lot of people can see it, and we can get efficient contributions from people who might otherwise not have known, or might not have otherwise been available. And so a corollary for Wikipedia is that, given enough people seeing an incomplete article, all potential contributions to that article will be easy for somebody. And as a brief overview of the literature: it turns out this tends to work in practice. It's part of how people become Wikipedia editors. We can support this visibility with technologies that call attention to various things, and bad things happen when we take it away — we can actually see massive productivity losses when we make things less visible in Wikipedia.

Okay, the second subsystem I wanna talk to you about is regulation of behavior. If we're gonna have a big crowd of people in the same space, we need to figure out our norms, we need to propagate them, we need to enforce them, we need to deal with our problems, and we need to decide how we're gonna consistently work in the same direction.
This is probably one of the most well-studied questions about Wikipedia: how does it actually organize? So I went to Google Scholar — this was actually about a year ago — and I searched for Wikipedia governance, and of course Google gives me this absurd estimate that there are 37,400 papers written about Wikipedia governance. Of course there aren't that many, but there are a lot. And so I went to the 10th page of results, and you might notice, if you're familiar with this field, that a lot of those results on the 10th page are actually relevant. They are about governance patterns in Wikipedia. So I just wanna give you the impression that this is really well studied.

So I'm going to tell you the view of Wikipedia governance that I think would be most practical if you were starting as a Wikipedian today. There are two types of norms in Wikipedia. One is prescriptive, where we say: I think people should do this. And one is descriptive: people have been doing this and it seems to be working pretty well, so let's write that down so that we keep doing it that way and new people know that we do it that way. So that's prescriptive and descriptive. Let's say that you identify one of those. Then you would write an essay, which is just a long-form written document that you put on the encyclopedia. Once you had that essay ready, you would file a request for comments. There would be a discussion, and we would decide whether this is a norm that we want to adopt — maybe we wanna make some changes to it, maybe it's actually already covered by a past norm. If it passes that request for comments, it becomes something that we call a policy or a guideline, which is a formalized norm. If it doesn't pass, it just goes back to essay status, and it could go to another request for comments, although a lot of these things just remain as essays.

And so this really forms a divide between the described norms in Wikipedia as informal and formal. If a norm becomes formalized as a guideline or policy, it's sort of like it becoming law — it's much easier to enforce. Whereas if it stays as an essay, people might talk about it, and it might affect how people think about Wikipedia, but it's much harder to enforce an essay, because not everybody has agreed on it already. On the informal side, you get things like Wikipedia's essay on don't stuff beans up your nose, which is an essay that says: don't tell vandals not to do a thing they haven't already done yet, because maybe they haven't thought of it — with the insight that if you tell a child not to put beans in their nose, they might go, oh yeah, a bean would fit in my nose. And then on the formalized side of things, one of the most important policies, in my opinion, is verifiability: Wikipedia is not about truth; Wikipedia contains those things that are verifiable, because if we tried to reach for truth, we would never be able to solve arguments. There's too much debate about truth. What we need is something that we can verify and cite, and those are the things that are intended to be covered in Wikipedia. There are lots of interesting problems that go along with this that I'd love to talk about later.

So we can do some interesting analyses to figure out the dynamics of these norm formations in Wikipedia. In this graph, we're looking at the rise and decline of the growth rate of these formal and informal norms.
And so for the two types of formal norms, they mostly grew until about 2006 and then started to slow down rapidly. They're not growing at nearly the rate that they used to, whereas informal norms just exploded around the same time. So at the time when we decided that we were gonna have a formalized process around building these norms, there was a sudden explosion in the rate at which people were creating these essays that we couldn't figure out how to formalize — things we didn't wanna make universal rules, but that people really wanted to write about.

We can also look at the way that people cite these norms when they talk to each other and argue about what content belongs in Wikipedia or what behaviors are appropriate. I can't dig too much into this research, but one of the things that I think is very cool that this research found is that Wikipedia's governance structure is generally inclusionary. The norms are really well documented, and anybody who knows about them can cite them in an argument — this is basically the best way that you can win an argument in Wikipedia. Newcomers to Wikipedia may not know about these norms, but they learn about them very fast, because as soon as somebody cites one of them at you, now that's a weapon in your repertoire. A lot of work goes into citing these things at each other, and it turns out that the practice of citing these policies spreads from experienced people to newcomers relatively quickly.

Okay, the third subsystem I wanna talk to you about — and this is one I'm gonna dig into quite a bit later, so I'm gonna be a little bit cursory about it right now — is the quality control system. This is maybe the biggest question about Wikipedia. If you haven't seen any of the literature around Wikipedia's quality, the first question that you might ask is: how could you actually have a high-quality encyclopedia if anybody can contribute to it? We need to identify and remove damage and vandalism and falsehoods and that sort of stuff. The way that this works also benefits from Linus's law: for any given article, a mistake is likely to be noticed by somebody who's gonna see that it's wrong, and then they can take it upon themselves to fix it in a relatively efficient way. So any damage is shallow to somebody — probably shallow to lots of people, because vandals aren't terribly clever.

We also supplement this with two technologies that operate in sequence on Wikipedia. One is a fully automated quality control system that uses a machine learning model to detect vandalism that's saved in Wikipedia. This system is really fast — it'll revert vandalism about five seconds after it's saved. It requires no human effort, but because of limitations in natural language processing, it doesn't actually read the article. It doesn't know if you inserted a falsehood. It's really looking at whether you inserted a curse word, or did something that's kind of strange with the article. So it can only catch a small proportion of the vandalism. The stuff that makes it past that goes into our semi-automated systems. These are actually humans using an interface that very quickly shows them edits and asks them to make a judgment about whether each one is good or bad. This is still pretty fast — people using the system will revert vandalism about 30 seconds after it happens — and it's designed to minimize the human effort that goes into this quality control process, with the insight that humans catch most vandalism at a glance. They're much better at it than machines.
And it turns out that people watching articles, these semi-automated and fully automated quality control systems, and the administrators of Wikipedia with their noticeboard for catching vandals as they're causing problems, all operate in a nice distributed system that in a lot of ways mimics the innate and adaptive immune systems that we see in biology. The innate immune system is the inflammation you get on your skin when you get an infection — it's not what we generally refer to as your immune system; it's the immediate white blood cells attacking whatever is weird right there. That's like the people who are reverting vandalism at the boundary of Wikipedia. Whereas we also have this adaptive immune system, where our bodies learn about a specific pathogen and build an immunity to that pathogen. So Wikipedians will learn about vandals and figure out who's vandalizing, and from what IP address ranges, and then they'll figure out ways to ban these people and prevent them from vandalizing Wikipedia again. It's slow, but it's specific and it works globally. Once you've identified a vandal, you can ban them from the entire site, and we don't have to deal with their damage again.

Okay, two more subsystems that I wanna talk to you about. First, community management. Wikipedia is built by a large group of people: we need to train them, we need to welcome newcomers in, we need to make sure that they find work that they wanna do, we need to mediate disputes, and of course we need to do training not just for newcomers but for all sorts of things. Wikipedia gets a fire hose of newcomers — there are about 6,000 newcomers a day who join Wikipedia and edit something. And so we have some routing systems in place that split off the vandals from this group of newcomers and route the quote-unquote good faith newcomers — which is a Wikipedian bit of jargon for well-meaning newcomers — towards newcomer help spaces like our question-and-answer space, the Teahouse. I'm gonna get into a little bit of this later, so I don't wanna dig too deep into it now; we're gonna get into community management issues in a little bit.

The last subsystem that I think is really important, but often overlooked in this kind of space, is reflection and adaptation. Wikipedians own Wikipedia. I work for the Wikimedia Foundation, but we don't really set the direction of what Wikipedia is. This is a common misconception that people have about the organization. We're sort of like a flea riding a horse — you know, we can do minor things, but generally the horse goes where it wants to go. So how does Wikipedia figure out where it wants to go? Well, they have to ask these questions: where do we want to go, and how do we wanna get there? What should Wikipedia be? What is it right now? What are our problems? So, going back to those policies and norm formation, I like to think about these two types of norms — the informal and the formal — as reflection and adaptation. A lot of the essays, if you sit down and read them, will talk about: where are we going? Where should we go? How should we get there? But adaptation is when we actually formalize a plan for how we're gonna get there. Here's how we're gonna change our behavior; here's what belongs and what doesn't belong. So it's a very clear way to build adaptation into the system. But this plot might make you ask the question: why are we now having a whole bunch of reflection but not very much adaptation in this norm formalization space?
And I think that's a really good question that I don't have an answer to, but we might like to talk about it later.

So, a brief summary of the direction that I'm going with this biological metaphor and thinking about this as systems. This paramecium is a complex system of chemical interactions: you can't really look at one molecule and figure out what's going on in the system, because the system is complex and it has these exponential and stabilizing effects. In the same way, in Wikipedia it's not interactions between chemicals; it's interactions between people and the technologies and policies that they build. You can't look at a policy, you can't look at an individual person, you can't look at a technology and know what the system is doing — in the same way that you can't look at an individual molecule or an individual organelle in a paramecium and know what it's doing.

So, on to looking at this like a system. This brings me to an issue that I think is happening in the socio-technical system of Wikipedia. This is actually some work that I published around 2012. It was one of the cornerstones of my PhD dissertation. I don't have a lot of time to dig into the methods and that sort of stuff for this study, so I wanna tell you the story quickly — but know that if you have access to the slides, which I'm sure are online by now, there's an open access article that I wrote about this, called The Rise and Decline, and you can dig into that.

I'm gonna tell this story using this graph that shows the active editor population in English Wikipedia over time, in three phases. In the early phase of Wikipedia, there weren't that many people around. This is between 2001 and 2004. You mostly knew everybody, and if we think about Wikipedia's systems as a set of gears, the gears were mostly made of people. There was the MediaWiki software, but all of the infrastructures that people put in place to organize the system and make it work were human infrastructures. They were people who understood processes and implemented them.

Around 2004 to 2007, there was exponential growth in Wikipedia. This is the time when Wikipedia suddenly became a fire hose, and the people who were long-term editors in 2004 were vastly outnumbered by the people who were coming to the site, and so they had to deal with these problems of scaling. One of the ways that they dealt with this fire hose was by building these algorithmic quality control tools and building them into their socio-technical infrastructure. And so here I have gears that represent the fully automated quality control systems and the semi-automated quality control systems interacting with fewer humans — you just needed fewer people to do quality control work once these semi-automated and fully automated systems were put in place. And so this helped them deal with the growth that was happening. But after this period, we see an abrupt and steady decline in the population of Wikipedia. It actually took us a little while to figure this out.

Oh, I forgot — before we get into that, before I tell you what I think was happening in this decline, I wanna talk to you about how people thought about the technologies that they were building during this growth period, these semi-automated and fully automated quality control tools. And in order to do that, I wanna talk to you about these concepts of standpoints and objectivities.
And for this, I'm going to try my best to channel Donna Haraway, who's a philosophy of science scholar. In her studies, she was looking at scientists who were studying apes. She looked at two characteristic groups of scientists who were looking at ape behavior: male scientists — or male-dominated science labs — and female scientists. And I put an asterisk there, because when I say female scientists, I really mean groups informed by the new wave of feminism, which were maybe not dominated by female scientists, but at least were inspired by certain directions that feminism was taking. Both of these groups looked at the same real thing. They were taking observations of the same species of apes, the same types of behavior, the same context of those behaviors. But they ended up drawing different conclusions. The male-dominated science groups drew conclusions about reproductive competition and dominance, whereas the female scientists drew conclusions around communication and social grooming.

And so the idea that she came up with, in thinking about why this happened, was around these terms standpoint and objectivity. Your standpoint gives you a view of what's important and valuable. So are communication practices valuable to understand about ape behavior, or is dominance around sexual selection? And the objectivity that you construct is something that brings this value to life — say, the methodology that you apply. When you're observing ape behavior, do you look at grooming patterns, or do you look at how sexual partners are selected? And through the application of these things, you can come to different conclusions. But I think a really key insight here is that it's not an argument about which answer was more right. Both of these can be merged together and extended to form a more complete picture of ape behavior. If you were looking at a textbook of ape behavior right now, you would see all of these things. You would see an expanded standpoint of what's important to know about apes: you would read about reproductive competition, dominance, communication, and social grooming. So it's really important to note that we're not critiquing the old view that the male-dominated scientists had and throwing it away; we're extending it with this new standpoint.

So now, to discuss the standpoint of the growth-period editors in Wikipedia. What they saw was that Wikipedia is a fire hose. There's just a massive amount of new stuff coming in, and it shows no sign of stopping anytime soon. Bad edits need to be reverted. This is the main concern about Wikipedia — how can it actually have quality when anybody can edit it? We need to make sure that we don't let the bad edits in. And it would be great if we could minimize the effort that we waste doing quality control work, so we can spend time writing the encyclopedia instead. So they built this objectivity into tools that operate as filters on new contributions coming into Wikipedia. These tools really separate edits into good and bad, and operate in this sort of distributed quality control system accordingly. And for this standpoint, for these values that they came to, it was a massive success. Wikipedia is extremely high quality. Vandalism is reverted incredibly quickly. And I know that these tools are very effective, because we've had natural experiments from when they went down.
It takes an awfully long time to revert vandalism when we don't have these things. So as a technological intervention, this was a massive success. And so right now, this is a pretty good view of Wikipedia: you have the internet on one side and you have Wikipedia, with its recent changes feed, on the other side. And we have these systems in between that reduce the workload of reverting vandalism by about 90 to 95%.

But — these tools were generally designed around 2007. By 2009, we realized: oh no, that exponential growth that we were just taking for granted, it's not actually happening anymore. And it turns out that what's happening is that people are still coming to Wikipedia in droves, but the newcomers aren't sticking around. All the people who had been in Wikipedia for a while are still there, staying as long as they ever did, but the newcomers are not sticking. It took us a few years to actually figure out why. And here's the result of this work that I was telling you about from my thesis: we forgot to design for socializing the newcomers as they were coming in. So let me talk to you about how this updated our standpoint. And to say it very clearly: it seems that our quality control subsystem was squashing our community management subsystem. We were throwing out the baby with the bathwater — we were essentially having good faith newcomers get stuck in this quality control system and pushed out.

So, given this: Wikipedia is still a fire hose. We still have to make sure that the bad edits are reverted. We still want to minimize the effort wasted on doing quality control work. But we now know that it's super valuable — we have to socialize and train the newcomers. People who are new in Wikipedia need to make friends. They need to feel like the experience is rewarding. We can't just have a robot tell them when they've done something bad. This new standpoint was incorporated into a lot of conversations about how we organize stuff for newcomers in Wikipedia. More newcomers has become a major Wikimedia Foundation goal, and we've been working on a lot of initiatives in this space. And people in the Wikipedia community have been working on initiatives in this space too, such as mentoring spaces like the Co-op, which has a one-to-one mentoring pattern — you can sign up and get a mentor, and they'll help you edit Wikipedia — and the Teahouse, which is a question-and-answer system where you can file a question like: I was reverted and I don't understand why; can somebody take a look and let me know what's going on?

So this is a win. We extended our standpoint. We changed how Wikipedia works. And this is great — but there's something that didn't change. So I want to talk to you about one of the most dominant quality control tools, one of the ones that people use to interact with newcomers the most, just because of the temporal rhythms of these things. First I have to start with the sugar, because I generally give this talk to Wikipedians, and they need to know that I'm not coming from a place of pure critique. Huggle, which is one of these semi-automated systems, is an amazing piece of software. I have no doubt that it represents the state of the art in distributed quality control. This software and its users are responsible for critical work. Without it, Wikipedia probably would have crashed during the exponential growth period. Its developers and users are wonderful people.
Actually, a lot of my collaborators are the people who work on these tools. We owe them a lot for their thankless work. And so, thank you. I believe what I say: they're really cool.

Now the medicine. When we discovered that Wikipedia was declining, this is what Huggle looked like. In this interface — it's kind of hard to see, I'm sorry — this part of the interface is showing you the diff. It's showing you the actual change that was made to an article on Wikipedia. This part of the interface has the two buttons that you push to say this is good or this is bad. And there's a list on the side that's sorted by the probability that the edit is bad. They use some basic machine learning algorithms to try to predict: is this edit likely bad, or is it likely good? And note that when you click those buttons, a whole bunch of actions cascade. If you hit the bad button, the edit will be reverted. A warning message will be sent to the user who made that edit, telling them to stop vandalizing Wikipedia. And if they already have a couple of those warnings, then they'll automatically be reported to the administrators' noticeboard so that an administrator can go block them, and the adaptive immune system can take its pathway.

So this critique has been going on for a long time — like I said, I published this in 2012. Well, here's what Huggle looks like today. You can see there's still this diff pane. We still have the good and bad buttons. And there's still this list that sorts things by the probability that they were bad. The interaction between newcomers and quality control people hasn't changed. We're still cramming people into this idea that they're either good or bad. They can't be good faith and making a mistake — they're vandalizing Wikipedia and they need a warning that tells them to stop doing that. And — oh, to their credit, though, the developers did add a button that, if you click it, will send a welcome message to the newcomer. Regretfully, it seems that nobody really uses that welcome button, at least as of the last time that I did an analysis.

So in the end, it looks like quality control in Wikipedia is still not designed with newcomer socialization in mind. The conversation moved to developing these new social spaces for newcomers, but not to the design of the technical systems that people had put in place back in 2007. So newcomers — especially those who don't conform to the normal Wikipedian mindset and pattern of behavior — remain marginalized, because they're being hammered by these quality control tools for making mistakes or just not doing things the way that Wikipedians have done them in the past. And so we're still not seeing any real substantial gains in the retention of good faith newcomers, and we're still having this decline, because newcomers who are trying to contribute productively are being pushed away.

And so that raises the question: why? If we extended the conversation in some spaces, why didn't the technology change too? What is it about the technology that made it so stable that it didn't adapt? And so this brings me to the third part of the talk, which is the infrastructure for socio-technical change.

So the core part of all of these quality control tools that I've been talking to you about is the machine learning classifier. The basic way that this works is that you take a set of statistics about an edit. Is this user anonymous? That's a Boolean value. How many characters did they add? What's the longest token that they added to the article — because a lot of vandalism is just mashing the keys, and if you don't put in any spaces, that makes a really long token. And of course, how many bad words, from a list of curse words and that sort of stuff, did they add to the article? The machine learning algorithm does something that is mostly building up correlations, in some way that's difficult to explain and certainly difficult to describe after the fact, and then it makes some judgment about whether this edit was good or bad. If it was bad enough, with high enough confidence, then ClueBot will automatically revert it. But if it was somewhere between kind of bad and totally good, then it'll show up in this sorted list inside of Huggle, and people will review the probably-bad edits first.
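To make that pipeline concrete, here is a minimal sketch of the kind of feature extraction and two-tier triage being described. To be clear, this is not Huggle's or ClueBot's actual code: the feature names, the hand-rolled scoring function standing in for a trained model, and the thresholds are all illustrative.

```python
# A minimal sketch (not the actual ClueBot/Huggle code) of the
# feature-based triage pipeline described above. Feature names and
# thresholds are illustrative, not the real systems' values.

BAD_WORDS = {"stupid", "poop"}  # stand-in for a real curse-word list

def extract_features(edit):
    """Turn an edit (dict with 'user_is_anon' and 'added_text') into
    the kinds of statistics the talk describes."""
    tokens = edit["added_text"].split()
    return {
        "user_is_anon": edit["user_is_anon"],                  # Boolean
        "chars_added": len(edit["added_text"]),
        "longest_token": max((len(t) for t in tokens), default=0),
        "bad_word_count": sum(t.lower().strip(".,!") in BAD_WORDS
                              for t in tokens),
    }

def score(features):
    """Stand-in for the machine learning model: returns P(damaging).
    A real system would use a trained classifier here."""
    p = 0.05
    if features["user_is_anon"]:
        p += 0.15
    if features["longest_token"] > 25:   # keyboard mashing
        p += 0.35
    p += 0.25 * min(features["bad_word_count"], 2)
    return min(p, 1.0)

def triage(edit, auto_revert_at=0.95, review_at=0.50):
    """Mimics the two-tier system: very confident -> bot revert;
    somewhat confident -> sorted human review queue; else pass."""
    p = score(extract_features(edit))
    if p >= auto_revert_at:
        return ("auto_revert", p)      # bot-style, ~5 seconds
    elif p >= review_at:
        return ("human_review", p)     # Huggle-style sorted queue
    return ("pass", p)

edit = {"user_is_anon": True, "added_text": "POOP aslkdjfalskdjfalksdjfalskdjf"}
print(triage(edit))  # -> routed to the human review queue
```

The structural point is the two thresholds: one band of scores goes straight to a bot, another lands in a queue sorted for human reviewers.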
It turns out that the three dominant quality control tools in English Wikipedia all have their own machine learning classifiers, essentially solving the same problem in slightly different ways with similar fitness.

So let's say that you wanted to sit down and take this expanded view of the newcomers coming into Wikipedia and of quality control — of how we should deal with the fire hose — and do something different to incorporate this extended standpoint. The first thing that you're going to need to do is build a machine learning model that'll help you detect damage, so that you can deal with it efficiently like we have in the past. In order to do that effectively, you're gonna have to read about 20-plus research papers on Wikipedia damage detection. Machine classification is not part of a standard computer science degree — and even if it were, most tool developers don't have a standard computer science degree anyway. It's extremely labor intensive. There are a lot of performance considerations, not just for applying the model but for extracting the features that go into the model. So you're probably gonna pull your hair out and give up. And this is what I think has generally been happening as people consider what we can do in this space.

To talk about this, I wanna borrow this type of plot from chemistry. Just a quick show of hands: how many people are familiar with this plot on the screen, with no explanation? Okay, so maybe about a third or a quarter of the room has seen something like this before. Essentially, what we're looking at is how you fry an egg. This is a reaction pathway — how you have a chemical reaction. In this case, we're denaturing the proteins in the egg by applying heat, and we need a certain amount of heat, which is about 149 degrees Fahrenheit. It doesn't work exactly that way, because molecules bounce around and there's some randomness that has to do with heat, but generally, if you get an egg up to about 149 degrees, the transparent stuff turns white. And that's what we're looking at in this graph. The idea that I think is very interesting here is that there's this threshold: you have to cross this energy threshold in order to make it further along the reaction path. But you can introduce a catalyst. If we had an egg-white catalyst, then we could potentially reduce this temperature substantially. You might imagine a catalyst that could allow us to convert a transparent egg to a white egg at room temperature, if we had something that made that chemical reaction easier.
So when I think about the current state of the art that we have around quality control and newcomer socialization in Wikipedia, and where we would like to go, I see building this machine learning model as sort of an activation threshold: you have to put in a certain amount of work to even start working in a better direction. But it'd be great if we could knock that down — if we could dramatically reduce that threshold so people could experiment in the space much more easily. And so I see that as sort of a progress catalyst: if we wanna make progress in this space, we need to make progress easier, at least easier to get started, for somebody who's not like me, with a computer science background.

And so this is one of the systems that we're building in Wikipedia. The idea is that we take this machine learning model that's in all these different systems and we centralize it. We make one; we make it as efficient and effective as we can; we do public evaluation metrics; we make sure that it's really good. And we make sure that it has excellent uptime, so that we don't suffer from those times when the quality control tools go down and vandalism sticks in Wikipedia for a while. The cool thing about centralizing this is that we won't just benefit the current quality control tools; we'll benefit new tools that can take advantage of these things — tools that are aimed more towards socializing newcomers. HostBot is the one that routes newcomers to good faith newcomer spaces, and Snuggle is a side project that we can get into if you wanna talk about that. So if you have your own idea about how you would apply this extended standpoint, now you don't have to go through the effort of building up a machine learning model; you can just reap the benefits of having one available.

So how does this work? It's just a web address — a simple web API — and I'll show you how it works right now. You can actually bring this web address up in your browser and it'll do the classification, but you don't have to do that; I'm gonna walk through it real quick. In this URL, we're encoding a few things. enwiki is the database name for English Wikipedia — we're gonna look for some scores there. We wanna know: is this edit damaging or not? That's one of the models that we host — is it harming the article? And this number is an ID number for a particular edit in Wikipedia. Oh, and by the way — you're not quite gonna see this, but this particular edit, 638307884, corresponds to a good edit where somebody is fixing a link to The Old Woman, which goes to a page saying there are lots of things that are called The Old Woman — there are these plays, all these sorts of things; which one do you wanna go to? Here, they're changing the link to make sure that it goes to the play, because when we say The Old Woman in this article, we want somebody who clicks on that to go to the article about the play. So this is a good edit. It's called link disambiguation; Wikipedians do this quite a lot. So you give this URL to the system, and it says: I predict false. With high certainty, this is not a damaging edit to Wikipedia.

But let's take another edit: 642215410. In this edit, this editor — whose username, by the way, is blank123456789, already an indicator — removes a reference in the article and replaces it with, in all caps, llama grows on trees. Or llamas grow on trees; I should get the grammar right. So if we give this edit to the system, it says: yes, I predict with high certainty that this is in fact vandalism, and you should maybe go review it and see if it actually is damaging.

And this is how all these systems work. But the cool thing is that you can just go to this web address and it'll do these things for you. If you've ever written any code before, you'll know that making a request to an external service like this, to have it do something for you, is really, really easy. And if you recognize this output format, it's JSON — it's what all the cool kids on the internet are using these days, whether they have a computer science degree or not.
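To make the "really, really easy" claim concrete, here is a minimal sketch of calling a scoring service like this from code. The host name, URL pattern, and response shape below are assumptions based on the examples in the talk, not a guaranteed current API.

```python
# A minimal sketch of querying the central scoring service described
# above. The host, URL pattern, and response shape are assumptions
# drawn from the talk's examples, not a guaranteed current API.
import json
from urllib.request import urlopen

def damaging_score(rev_id, wiki="enwiki"):
    """Ask the service whether a saved revision looks damaging."""
    url = f"https://ores.wikimedia.org/scores/{wiki}/damaging/{rev_id}/"
    with urlopen(url) as response:
        return json.load(response)

# The link-disambiguation edit from the talk: predicted not damaging.
print(damaging_score(638307884))
# e.g. {"638307884": {"prediction": false,
#                     "probability": {"false": 0.97, "true": 0.03}}}

# The "llamas grow on trees" edit: predicted damaging.
print(damaging_score(642215410))
```

A handful of lines replaces the entire model-building effort, which is the whole point of centralizing the classifier.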
Okay, so the system is fast — we can score an edit in about 0.1 seconds, so theoretically we can be even faster than the automated quality control systems are right now. It's scalable and redundant — I have a computer science background, I've studied distributed systems, I can do this, and I have good support from engineers at the Wikimedia Foundation. And we're comparable to the state of the art, because I've dug into the research literature and I'm incorporating everything that I can into making the system work. And by the way, this is good for research too. If you're looking at Wikipedia and you wanna ask research questions where quality is part of your question, I think this would be very valuable — you should talk to me about that.

So what I'm hoping is that this system will act as a catalyst. In the example before, we were essentially reducing the activation threshold by about 70 degrees Fahrenheit. In this case, we're reducing the activation threshold by about 20,000 lines of code to get one of these systems working and running in real time — and one advanced degree in computer science. You no longer need those things to have this type of reaction take place. And so what we're hoping is that we'll get an explosion of ideas on how we incorporate this new standpoint and improve the type of work that we do. We've already seen some success with this: Huggle is now using our system to sort those edits on the side by the probability that they're damaging. And so we now get to have much more substantial conversations with the developers about how they're sorting those edits. We can actually solve some of the problems that I think are happening in Huggle by improving the algorithm and taking advantage of our better false positive rate.

So there's one more thing that I wanna talk to you about, but I wanted to separate it into its own section, because this is really my story. When I talk to technologists, this is the thing that I really want them to take away from a talk like this: what you can pull from the critiques that come from feminist theory, and how we can turn those critiques into design that actually makes the system work better. It's not just better for people; it's better by efficiency standards.

So one of them is subjective algorithms. There's been a lot of literature recently about how algorithms in social spaces can cause social problems. I'm quoting Zeynep Tufekci here: subjective algorithms are algorithms, often aided by big data, that now make decisions in subjective realms where there is no right decision and no anchor with which to judge outcomes. So for example: what is good? What is relevant? What's important? What's desirable and what's valuable? That's exactly what we're doing here.
So looking at this classifier, we're trying to predict what's good and bad, but of course we're not gonna do it perfectly. We're gonna predict "good" on some important bad stuff, and "bad" on some important good stuff. It's gonna make mistakes. There's this really long quote that I have from a Wikipedian, but I just want to get to the important part, which is: avoid encoding racism and other biases into the AI scheme. So for example, this editor is concerned that local writing styles — like saying "beautiful village," which is common in writing about villages in Pakistan — are targeted by people who are just opposed to that sort of thing being in Wikipedia. And if we trained our AI scheme based just on what's removed from Wikipedia, it might learn these sorts of things, and it would hide that — never mind that nobody can see how the decision is made, because it's hidden inside the black box of the AI model.

So let's say that we train this model on whether an edit is reverted or not in Wikipedia. It would probably, in a hidden way, down-weight anything that has "beautiful village" added to it. And so we wouldn't just predict "bad" with a random sample of good stuff mixed in; we would predict "bad" with a biased sample of edits to Pakistani villages mixed in. And what's worse, if we update the model and train it on new data as vandalism changes over time, we'll get into this feedback loop: as we target these edits to Pakistani villages, those will get reverted more, and then we'll train into reverting those things more, and eventually we're going to end up with a Pakistani-village classifier.

One more concern that I think is really important here is empowerment versus power-over. Here I'm trying my best to channel Nelle Morton, who talks about this idea of hearing to speech as opposed to speaking to be heard. The idea of hearing to speech is: I want to hear what you have to say, so I'm going to make a space for you to say it, and listen. Whereas speaking to be heard is more: I want to set the tone of our conversation by talking first. And when it comes to conversations about tools in Wikipedia, I'm pretty powerful. I was a software engineer for more than four years before I started working in the research space. I have a PhD in computer science. I have a substantial background in psychology, social science, systems theory, and human-computer design practices. And I'm a Wikimedia Foundation staff member. This gives me a lot of power. I'm in a very privileged position. So rather than saying, here's how we should change our behavior, I'm specifically targeting making it easier for other people to experiment in this space. I have done interventions where I've tried to make newcomer socialization better by building a specific tool, and what I've learned from success and failure in that area is that what people really want is for me to reduce the barriers so they can do their own experimentation in the space. So with this intervention we're actually targeting empowerment rather than power-over, and I think that this will actually end up being more effective — we'll have a better outcome because of it.

But consider even the basis of this conversation: I think quality control is important, and that we should have a conversation about socialization. But somebody else might say: actually, that might be right, Aaron, but I think that you're wrong — I think that we should also have this conversation about something else. So it would be great if the system that we're building could help them predict that something else that they want to get at. In order to do that, we're going to need to be able to train this system on what that something else is. So, in order to train this system, we've built a labeling system inside of Wikipedia where you can manually review edits and say, after the fact, whether they were actually damaging — not just whether they were reverted. So we have people who are keeping this bias in mind, manually re-reviewing edits to tell us whether they're damaging or not. And this helps us deal with the feedback loop: we have people who are on the lookout for the Pakistani-villages problem manually relabeling these things, so that the model doesn't just look at what was reverted in the past when it considers what's damaging now. And we're also working on the hearing-to-speech angle by making sure that the system is easily configurable, so that if somebody shows up and they want to make other predictions, we can load another sample of data into the system to be labeled, and they can configure a new type of form that asks a different question, so that we can build machine learning models that predict that.
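To make the difference between these two training signals concrete, here is a minimal sketch. The column names, feature set, and classifier choice are hypothetical — this is not the production system — but it shows the structural choice being described: fit to human judgments rather than to revert history.

```python
# A hypothetical sketch of the two training signals discussed above.
# Column names and the classifier choice are illustrative only.
from sklearn.ensemble import GradientBoostingClassifier

def train_damage_model(edits, use_human_labels=True):
    """edits: list of dicts with feature values plus two candidate
    labels: 'was_reverted' (cheap, but encodes past biases) and
    'human_says_damaging' (from the manual labeling campaign)."""
    X = [[e["chars_added"], e["longest_token"], e["bad_word_count"]]
         for e in edits]
    if use_human_labels:
        # Labelers review edits fresh, watching for problems like the
        # "beautiful village" / Pakistani-villages bias.
        y = [e["human_says_damaging"] for e in edits]
    else:
        # Training on revert history risks the feedback loop: the model
        # learns to flag whatever got reverted, which then gets
        # reverted even more on the next training round.
        y = [e["was_reverted"] for e in edits]
    return GradientBoostingClassifier().fit(X, y)
```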
And so, using the labeling system and machine learning together, I'm hoping that we can build lots of infrastructure that uses machine learning in a responsible way in this social space, to make Wikipedia more efficient and to affect a lot of the different aspects of the socio-technical fabric of Wikipedia. And one of the things that we're doing is targeting HostBot. This is really my next project: using this classification service to more efficiently detect good faith newcomers who are trying to work productively and running into trouble in Wikipedia, and to make sure that they get routed to our newcomer help spaces. Yes — make HostBot a machine learning system.

So, in summary: I talked to you about Wikipedia as a socio-technical system — we talked about systems and biological metaphors. We talked about this critique of automated quality control systems, how standpoint epistemology can help us think about why they were designed the way that they are and how they encoded a certain ideology. And then finally, I talked about infrastructure for socio-technical change: why I think that change in these technical systems is limited by the difficulty of actually implementing some of the technologies, and how we can build services as a sort of progress catalyst so that the conversation can finally move forward. I talked about how, using these ideas of hearing to speech as opposed to speaking to be heard, we can design these technologies to empower other people rather than just imposing my limited idea of what should happen next. And I talked about the dangers of subjective algorithms and how we can engineer systems to be aware of those things. Thank you.

So thank you very much, Aaron, for sharing all of that very fascinating and thought-provoking material with us. We have about 20 minutes or so for questions and discussion, so there's plenty of time for that. I will remind folks that questions are usually short statements that are interrogative in nature, often concluding with a raising of the pitch of your voice. And also, I imagine, Aaron, you may have a little bit of time afterwards if people have extended conversations they're really excited about exploring further.

I'll be here until about 4 p.m.

Cool. So, Aaron — thanks for that.
That was cool. I'm curious about how you approach the process by which users can propose their own things to include in your machine learning algorithm. Because there's a certain need for them to understand how to even phrase their proposal and ask: is this even possible? So how do you bridge that language and knowledge gap, with regard to what they could even think about doing with machine learning, to help create a more inclusive system?

Yeah, that's a good question. So I present this work a lot, in as many spaces as I can, and some of those spaces are where tool developers and Wikipedians congregate — we have conferences and hackathons and that sort of stuff. And when I give that presentation, I talk a lot more about how backlogs are pervasive in the work of Wikipedia, and the sorts of things that a machine learning algorithm can do to help sort and filter those backlogs. Reviewing new article creations, assessing articles for their quality level — there are all sorts of things like this that Wikipedians need to do, and that machines can help them target their time on. And so I present the idea that way: I want you to think about backlogs, and then tell me about your backlogs.

Another way that we do this: this system is actually up and running in 17 languages now — 17 language encyclopedias. In order to get it up and running in one of these local Wikipedias, which is a completely different social structure and that sort of thing, we need confederates from those communities to help us get it set up, help talk to people about why they might want it, and then help us get things translated so it'll actually work in those systems. And we talk quite a bit with those people who work with us about the problems that they see, because they're probably different from English Wikipedia's. And so we have a huge backlog of potential new classifiers that we could put in place — spam detection, aggressive discussions in the discussion spaces of Wikipedia, and that sort of stuff. So I'm not sure that we're doing a great job at pulling everybody in, but we're definitely pulling a lot of people in to propose these things. We're mostly feeling it out as we go. We haven't built a lot of infrastructure; we haven't decided, this is the way that you propose something. It's generally: if you find any way to interact with us — in a space on a wiki, by sending an email, by finding us on IRC, by talking to me after a talk — then we encode it in our backlog and start talking about how we can actually implement that classifier or that predictor.

Do you imagine it could get beyond the kind of requirement for a high-touch discussion, to something that is more streamlined? Or do you think that high-touch discussion is required in order to really understand context?

I think that it can get beyond that. Right now, I really want the system to be there and work, and I don't think that we can engineer a low-touch sort of system right now and do it quickly. But I definitely think that that's possible — that's tractable, we can do that. We could make it so that people who are in their local context can get a little bit of training about how to create a test set and how to evaluate it, and have the system give them general indicators — so that rather than coming to me at the beginning and saying, I have a problem, can you solve it, they would say: I've trained this model and I'm not getting good evaluation metrics, can you help me take a look at it? I definitely think that's possible.
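As a sketch of what that self-serve workflow might look like — the function names, feature layout, and the choice of classifier and metric here are all hypothetical, just one plausible shape for "build a test set, get general indicators":

```python
# A hypothetical sketch of the self-serve "train, then check your
# evaluation metrics" workflow described above; names and the choice
# of model/metric are illustrative, not the real system's.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def train_and_evaluate(features, labels):
    """features: one row of edit statistics per labeled edit;
    labels: the local community's judgments (damaging or not)."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, random_state=0)
    model = GradientBoostingClassifier().fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    # A poor AUC here is the "I'm not getting good evaluation
    # metrics, can you help?" signal mentioned above.
    return model, auc
```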
Over here, with the raised hand. Yeah — any insight about the social networks that you're seeing inside Wikipedia? Between the Wikipedias? Like how people engage with each other in these discussions — has anyone looked at that, and at how it's changing over time, like when you see the exponential rise?

I'd like to defer that question to Brian King, if I could.

So you care about the discussion graphs. There are researchers — Andreas Kaltenbrunner and David Laniado — who have done things around how long a discussion takes to wrap up. There's a really famous discussion around Danzig versus Gdansk, about what we should call this German-Polish border town on Wikipedia, and that discussion went on for 10 years and was very active — a lot of people, 500,000 words, involved in that discussion. So there are certainly many kinds of relationships you can pull out of people discussing with each other. A lot of Aaron's work that I like looks at, when you write a word, how long does that word persist? Maybe I'll pass it back.

Yeah, so one of the things I'd really like to look at more is not just the direct and explicit interactions people are having with each other, but also the implicit interactions, because I think we can learn a lot about who is removing whose words from articles and that sort of thing. A big research program I have right now is around developing efficient infrastructure for data sets that are easier to process, which encode, for free, a lot of the institutional knowledge I have about how to process these things efficiently, so that you as an external researcher can pick up one of these data sets and ask your own questions about these sorts of interactions. Another one we're digging into right now is the explicit conversations on talk pages, because we don't actually have discussion forums on Wikipedia; we have talk pages. A talk page is a page that you edit, and you indent your comment so that it looks like a reply structure. You sign it with four tildes — or actually whatever you want; you can even not sign it at all, so it's anybody's guess who actually posted the comment. But the norms generally hold, and I know a lot of those institutional norms, and we're developing software to encode them and extract the data, so that you can just say: I want to know who replied to whom and what linguistic characteristics it had; I don't really care how Wikipedians format their talk pages. That's something we're digging into really hard.
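A minimal sketch of the kind of convention-encoding extraction described here, assuming the common colon-indentation and signature norms; real talk pages are far messier:

```python
# Minimal sketch: recover reply structure from wikitext talk pages using
# the conventions described above (colon indentation, four-tilde
# signatures that expand to a user link plus a UTC timestamp).
import re

SIGNATURE = re.compile(
    r"\[\[User.*?\]\].*?\d{2}:\d{2}, \d{1,2} \w+ \d{4} \(UTC\)"
)

def parse_comments(wikitext):
    """Yield (indent_level, signature_or_None, text) per comment line."""
    for line in wikitext.splitlines():
        body = line.lstrip(":").strip()
        if not body:
            continue
        indent = len(line) - len(line.lstrip(":"))
        match = SIGNATURE.search(body)
        # Unsigned comments really do happen; the author stays unknown.
        signature = match.group(0) if match else None
        yield indent, signature, body

# Heuristic: a reply is the next comment at one indent level deeper.
```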
Back here. You showed us an example of what would be called an undesirable positive feedback loop. Is there a way that a positive feedback loop like that can be automatically identified, or... let's see — is that something that can be...?

Yeah, so this is something that I've been looking into a lot. Gosh, I wish this slide actually linked to the page I was hoping to show you — I'm not going to get the graph, whatever. Anyway, one of the projects I'm running in parallel to this is around the idea that a lot of the incentive structures in Wikipedia leave a big gap, especially for subject matter experts, to engage in the Wikipedia space: it's hard to claim credit for the work that you did. So I'm working on robust strategies for measuring the productivity of a Wikipedia contributor and the value of the work they're doing, based on proxy signals. For example, if you edit articles that get a lot of page views, that's probably more valuable than editing an article that doesn't get many page views. However, the article on Breaking Bad gets about 10 times as many page views as the article on chemistry. So is Breaking Bad more important to an encyclopedia than chemistry? I don't know. I don't know if I want to say, as an academic, that we should cover these subjects first; maybe Breaking Bad actually is that important as a social artifact. But this is one of the things we're exploring right now, actually trying to take these measurements. I think that if we can settle on a set of measurements that capture different aspects of importance and productivity, and let people look at them, people might then target their work and say: well, I care about academic importance; this measurement targets academic importance nicely, so now I'm going to use it to guide my work. We have recommender systems that will find articles you would like to edit; I would like to reweight those predictions toward articles you would like to edit that will also have high impact in the way you want to be impactful. We're just on the cusp of running experiments, interventions, in that respect. Oh, one other thing I would really like: when you apply for a tenure-track position, or for tenure, you could pull up a well-known metrics resource about Wikipedia and say, I added this much value to articles about my discipline of choice.
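A rough sketch of that reweighting idea; the pageview proxy and the log damping are illustrative assumptions, not the project's actual formula:

```python
# Rough sketch of impact-reweighted recommendations. The pageview proxy
# and the log damping are illustrative assumptions.
import math

def impact_weighted(relevance, pageviews):
    """Rank articles by relevance times log-damped monthly pageviews."""
    return sorted(
        relevance,
        key=lambda title: relevance[title] * math.log1p(pageviews.get(title, 0)),
        reverse=True,
    )

relevance = {"Chemistry": 0.9, "Breaking Bad": 0.7}
pageviews = {"Chemistry": 100_000, "Breaking Bad": 1_000_000}
print(impact_weighted(relevance, pageviews))
# With log damping, Chemistry stays ahead here despite 10x fewer views;
# a linear weighting would flip the order, which is exactly the tension
# discussed above.
```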
So, two related questions. The first is: in a world where you're scaling, why go the machine learning route and not just deputize a ton of new editors and throw people at the problem? And the second is: why bother bucketing into good and bad faith at all, instead of just treating everyone as good faith? Maybe the llama guy had a point or something, I don't know.

Yeah. So I think the last question is surprisingly fair, in that there's been a lot of looking into what is legitimate in Wikipedia and how we arrive at that, and it's a blurry line at best — really, describing it as a line at all is difficult. There are a lot of people who control content in ways that are not explicit. We might say that, of course, putting all-caps llamas growing trees into an article is not intended to be a productive contribution, but there are a lot of things in that space where it's totally unclear, and people will revert you for it anyway. But going back to why we bucket people into good faith and bad faith: I think that's a really interesting question, especially because I started editing Wikipedia by vandalizing Wikipedia. I did. But I only did it once, because I got a personalized message saying, hey, I saw that you were testing things out; maybe you shouldn't do that, and maybe you should consider contributing productively. And I did. I think we should consider that a lot of the damage that happens in Wikipedia comes from people who are just trying to see how it works, or trying to be funny, and if we have the right interaction with them, we'll find out that they're actually good faith and didn't really mean any harm. If we welcome them and say, there's space for you, we have a lot of work to do, and it would be cool if you stuck around — then they will. There are some people who are clearly bad faith, but they're a really, really vanishingly small proportion. People who just want to put a little bit of garbage in Wikipedia are pretty common.

Your experience segues into my question. So, in the example you gave with the llama, all the output was either true or false. In terms of the process of how the system gives feedback to the person who writes: do they see, for example, how the algorithm weighted the edit and what criteria it used, or does it just say yes or no? I think feedback on the process behind the output might be interesting.

So one of the things that's notoriously hard with a machine learning model, especially one that works particularly well, is attributing a real "why" to the prediction that it made. Rather than going in that direction, we're trying to do two things. First, we have an open feature request that I've been working on to publish the features that were extracted for each edit, so that if the model makes a weird prediction, you can look and see whether it actually measured something wrong about the edit, and get a sense that, hey, we're predicting false a lot when this type of measurement shows up in an edit, and it shouldn't be false. The other thing we're working on: we actually have several different models in place right now, which give you a little more context about a prediction. For English Wikipedia we have a model that predicts whether the edit will need to be reverted; a model that predicts whether it actually caused harm to the article, because a lot of edits don't cause harm but need to be reverted for other reasons; and finally a model that predicts whether the edit was made in good faith or not. So you can do something in good faith that didn't damage the article but needs to be reverted for other reasons, or you can do something in bad faith that did damage the article and needs to be reverted. What we're experimenting with is showing all of those different predictions in the same user interface. And I think we can do better: right now we're modeling at the edit level; if we model at the user level, we can do much, much better.
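If the classification service in question is the public ORES API (my assumption), the three per-edit predictions could be fetched roughly like this; the endpoint and response shapes are from memory and may have changed:

```python
# Sketch of fetching the three per-edit predictions from the ORES web
# service. Endpoint and response layout are assumptions based on the v3
# API; the revision id is a placeholder.
import requests

rev_id = 123456
url = f"https://ores.wikimedia.org/v3/scores/enwiki/{rev_id}"
response = requests.get(url, params={"models": "damaging|goodfaith|reverted"})
response.raise_for_status()

scores = response.json()["enwiki"]["scores"][str(rev_id)]
for model_name, result in scores.items():
    probability = result["score"]["probability"]["true"]
    print(f"{model_name}: P(true) = {probability:.2f}")
```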
My open bug request to the Huggle people is to incorporate our "is this editor working in good faith" model, so that whenever you go to revert an editor, if the model predicts they're editing in good faith, it will stop you and say: hey, you should really send a customized message to this person; don't just leave the standard vandal message; this is somebody worth your time to reach out to. It helps to be able to approach "should you revert this or not" in this way, because I'm not just saying, hey, you should be better. I'm saying: you have a lot of work, and I realize that, but I can help you know when it's going to be most beneficial for you to be better. So it's not going to be that much more work to be better; I'm going to help you maximize that impact. There's a lot of danger in this area, so really the biggest thing we're doing is keeping our entire process for building these models, and all of our evaluation metrics, open. Every conversation I have with the colleagues I work on this with happens in a space with a vibrant community talking to us about these things, so there are a lot of people developing intuitions about them. But regretfully, to come back to your original question, I don't know if we'll ever get to the place where we can say why a model made the prediction that it did, and that's too bad.

Time for about two more questions.

Following up on what you were just saying about the goal of having an open discussion: I'm curious about your biological metaphor, particularly when it comes to immune response, because you can have situations in which the immune system goes haywire, right, and starts attacking good actors and destroying the community, to bring it back to Wikipedia. So I'm curious: have you seen examples of that? Is the standpoint problem part of that issue when it gets codified technologically? And do machine learning and technical approaches maybe either exacerbate that, akin to a flash crash in algorithmic trading, or get away from some of the problems, where people adopt problematic policies that hurt themselves?

Yeah, so I really like that way of describing it. I've described this quite a bit as an autoimmune disorder that Wikipedia has, which is why I like to bring up that example of the immune system. I think there are a bunch of parallels you can draw between immune responses and this problematic pattern — how people with autoimmune diseases tend to get inflammation, and that's the innate immune system, and Wikipedia suffers most in its innate space. The question I want to ask, which I would really like an immunobiologist to help me think through, is: how do biological systems adapt, and what, in the abstract, is the failure of a biological system that results in an autoimmune disorder? Maybe we can look at that from the same direction as standpoints and see that this is happening in social spaces too — that there's something about these complex systems and their strategy for dealing with the open problem of damaging things sometimes coming in that we can use to gain insights, maybe on both sides, or at least on ours, because there are a lot more immunologists than there are sociotechnologists looking at things like Wikipedia, so I bet I could learn a lot from them.
Regretfully, I haven't found somebody who's super excited about that and has a background in that area, but I would love it if you would direct them to me.

I have a question here. You've outlined one way to bring on board critiques from feminist standpoints and feminist research, which is to take questions and critiques that sometimes come from qualitative researchers and use them to develop basically new dependent variables — new measures that are still quantitative in nature — to try to get at some of the spirit of what these critics are raising. I'm curious whether you feel there's also scope for expanding the methods of inquiry beyond machine learning and quantitative methods, while still making sense of Wikipedia at the scale that it is.

Yeah, so I would actually put that in the class of responses to critique. My methodological home is really quantitative behavioral measures, but my practice is that I'm a member of the community in which I operate — I mean, just yesterday I did four interviews with people working around these curation spaces. I think methodological breadth matters, and you don't have to be a proper ethnographer to benefit from ethnographic practices: becoming a member of your community, going and reverting some vandalism, trying to be a newcomer and seeing what that's like, can inform the way you think about the system, the strategies you employ to design systems, how you pull people on board, and how you evaluate them at the end. What I would really argue is that my position as a skeptic of how things work, versus what I'm really seeing actually being the truth, sort of necessitates that I figure out how these other methodologies work and how I can address my questions more effectively by incorporating these broad methodologies. One thing I also lean on quite heavily is finding collaborators who aren't like me and working with them, especially people who are particularly critical of my work, whether they're Wikipedia editors or other researchers, because I think the people who are most critical of your work, especially the ones who demonstrate a massive amount of competency, know something that you don't and have a perspective that you don't have. And I think we too seldom invite people in that sort of critical relationship to the design table, to say: here are the words I'm using to talk about this, here are the words you're using; can you line these things up and talk about how we can do it better?

Maybe we do have time for one more question, if there is one. Well, thank you so much, Aaron, and let's all thank him for such a fascinating talk. And are you around for a few minutes? Yep. Excellent.