Emerald, let us know when we're ready. And we are live. Hello, everybody out there in Wiki Research Land. This is Jonathan Morgan from the Wikimedia Research Team, and this is the July, sorry, June 2019 research showcase. Thank you for being here with us today. We have three presentations for you. The first is going to be by Jonathan Chang of Cornell University on trajectories of blocked community members. The second presentation, from Lane Rasberry, Wikimedian in Residence at the University of Virginia, focuses on automatic detection of online abuse in Wikipedia. And last but not least, data analyst Morten Warncke-Wang of the Wikimedia Foundation Product Analytics team will share some first insights from partial blocks on Wikimedia wikis. Without further ado, I'd like to hand the reins over to Jonathan. Jonathan, take it away. All right, let me just get the sharing set up. Oh, and as you're getting the sharing set up, I should say that Isaac Johnson on the research team will be triaging questions on the Wikimedia Research IRC channel and also coming in through the YouTube page. So ping Isaac if you have questions for any of our presenters and we will get them into the queue. All right, now, Jonathan, take it away. All right, can you see me, like my slide? Yes, we can. Okay, excellent, let's get started then. All right, so as other Jonathan mentioned, I am also named Jonathan. And today I'll be taking you on a journey exploring the trajectories of blocked community members. And to get started on this journey, I'd like to tell you a story, one that might sound rather familiar. A user joins some online community, maybe Wikipedia, and over the next few weeks or months, they engage in some nice conversations and do some civil edits and all that kind of stuff. Until one day, they engage in some bad behavior, right, maybe some disruptive editing or harassment or cyberbullying. Of course, this reflects poorly on the community. So when a moderator or admin sees this behavior later on, they'll take some kind of action against this user. Okay, so so far, old hat, right? You've probably all heard a story like this before in some other papers or research talks. But I will point out that oftentimes when the story is told, it seems to just end here, right? People focus on this bad behavior or on the moderator response. When in reality, the story isn't over yet. Because we can now ask, after this moderator action happens, what does the user go on to do? And I think we would all love it if the answer to that question was, well, they just keep living their life, they come back and they keep participating and nothing bad ever happens again. I'm sure many of the Wikipedia admins in this room would be very happy if this was true. But of course, the reality is much more complicated. And let's just say that there's other ways this story could end. For one thing, the user could end up coming back and participating, but then engaging in even more bad behavior. Alternatively, they could just not come back at all, right? They might say something like, these rules are totally unfair, this moderator was biased, and so this community isn't worth my time. So I'm leaving. And I would argue that both of these should be characterized as bad outcomes, right? You either end up with more bad behavior on your hands, which kind of negates the purpose of the moderator action in the first place, or you end up losing a potentially valuable contributor. And we can give names to each of these three outcomes.
And the purpose of this talk is to ask, at the time of the moderator action, can we tell which path this user will eventually end up on? And we're gonna look at this from two angles. First, we're gonna ask about the characteristics of the user. And this is grounded in prior work suggesting that certain types of users are more likely to violate norms. We are also going to ask about the moderator action itself, right? How severe was it? How did the user react? This is also grounded in prior literature. And if some of these citations look kind of old to you, like pre-World Wide Web old, it's because some of this work is actually going to be inspired by some previous work in offline law enforcement and law compliance. So as you might expect from the venue of this talk, the domain that we're gonna be studying is Wikipedia. And specifically, we're going to be looking at what Wikipedia considers disruptive behavior, which is defined as seen in this slide. Now, when a user engages in disruptive behavior, Wikipedia admins can respond by blocking that user. And just really quickly, what a block does is it prevents the user from making edits for a set amount of time, with the sole exception that they're allowed to keep editing their own talk page. This is gonna come into play really importantly later in this talk. And it turns out that on Wikipedia, when users get blocked for disruptive behavior, almost half of these users end up, after the block, on one of these two bad paths that I mentioned earlier. So we can see that there's a lot of room for improvement here, there's a lot of room for good contributions from this research, if we can figure out exactly what's going on here, why are people ending up on these bad paths? So for a dataset, we scraped the Wikipedia block log, giving us over 100,000 actions. These are blocks for many different reasons, for example, copyright violations, which we're not really interested in for the purposes of this talk. So we're gonna limit the blocks only to those that have to do with what we might consider anti-social behavior, things like incivility and harassment. Now we're interested in what happens to the user after the block, and we can't get that info from the block log alone. So what we're gonna do is we're going to combine our data with the WikiConv dataset. This is a pre-processed collection of all talk page comments that have ever happened on Wikipedia. Finally, we're gonna filter out bots and spam accounts, which are honestly not very interesting, and we do this through the use of a minimum activity filter. I'm gonna leave out the boring details, but you can check out the paper if you're interested. After all of this is done, we end up with a final dataset of over 6,000 blocked users. Now, we might be tempted to just take these 6,000 users and immediately start doing things with them, start doing experiments. But this would be a mistake. Why is that? Well, consider the case of comparing users who depart during their block to users who got blocked but stay on Wikipedia afterwards. This seems like a straightforward comparison until you realize that departure during the block is actually a subclass of a broader behavior, a more general behavior, which is just departure at all, right? Natural departure. People can leave Wikipedia for any number of reasons, not just because they got blocked.
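To make the filtering just described concrete, here is a minimal sketch of the kind of pipeline involved, in Python with pandas. The column names, the keyword list, and the activity threshold are illustrative assumptions, not the paper's actual implementation:

```python
import pandas as pd

# Illustrative keywords for "anti-social" block reasons; the paper's actual
# selection criteria are more careful than a keyword match.
ANTISOCIAL_KEYWORDS = ["harass", "incivility", "personal attack", "disruptive"]

def build_blocked_user_dataset(block_log: pd.DataFrame,
                               talk_comments: pd.DataFrame,
                               min_comments: int = 10) -> pd.DataFrame:
    """Filter the block log to anti-social blocks and join with talk activity.

    Assumed columns: block_log has 'user' and 'block_reason';
    talk_comments has one row per talk page comment with a 'user' column.
    """
    # 1. Keep only blocks whose free-text reason mentions anti-social behavior.
    reason = block_log["block_reason"].fillna("").str.lower()
    antisocial = block_log[reason.str.contains("|".join(ANTISOCIAL_KEYWORDS))]

    # 2. Count each user's talk page comments (standing in for the WikiConv join).
    activity = (talk_comments.groupby("user").size()
                .rename("n_comments").reset_index())

    # 3. Minimum-activity filter to drop bots and throwaway spam accounts.
    merged = antisocial.merge(activity, on="user", how="inner")
    return merged[merged["n_comments"] >= min_comments]
```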
And the challenge in this work is going to be: we want to compare the orange to the green while making sure that in doing so, we are not accidentally comparing the blue to the green, which is less interesting. How are we going to do this? Well, we are going to borrow some techniques from causal inference, namely a technique known as matching. Matching is going to let us make non-trivial comparisons between departure, the orange, and redemption, the green. And the way it's going to do that is, for each user who departs during the block, let's call this user D, we're going to construct an experimental pair, pairing user D with a reformed user, a user who was blocked but didn't depart, who actually stayed on Wikipedia. And importantly, the reformed user has to be blocked at around the same time as user D. This controls for temporal effects. For example, maybe Wikipedia changed the rules between, say, 2006 and 2007. And if we didn't have this control in place, we might accidentally just be comparing blocks that happened in 2006 to blocks that happened in 2007, which wouldn't be very interesting. And now, to make sure that we're not accidentally comparing the blue to the green, we're going to add a second level of matching where we pair the pairs. For each experimental pair, we will construct a control pair containing a user N who was never blocked, but nevertheless departed Wikipedia around the same time as user D. And user N will be paired with a clean user, one who was active at and beyond the time of user N's departure. I just want to point out that there's going to be an analogous process for comparing recidivism to reformation, but it's very similar, so I'm not going to go into details here again. It's in the paper if you're interested. So now that we have our controls, we can finally start asking the interesting questions, beginning with the user characteristics. What characteristics might matter here? Well, prior work suggests that norm violations correlate with how involved a user is in their community. And a simple way of measuring this is to just count the number of talk page comments. We refer to this as activity level. There are actually two versions of this. We can ask about how many talk page comments this user wrote to other users' talk pages. This is contributed activity. Or we can ask about how many comments other users wrote on this user's talk page. That would be called received activity. And it turns out that users who depart tend to have lower activity levels, both in terms of received and contributed activity. This is very intuitive, right? Users who have less activity are less involved in the community and therefore have less reason to stay after they get blocked. Let me also point out that these differences are unique to departure during a block. So if you look at our control pairs and try to do the same comparison, you'll wind up finding no significant differences. So again, this is pointing out the importance of this control pairing. Now, activity level is a nice difference, but it only measures the amount of involvement. It doesn't measure what the involvement looked like, which you might also be interested in. For this, we're going to need a different measure, which we might term activity spread. Here's how this works. Let's say we have an example user who's written 100 comments. There are two ways this could have happened.
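Before getting to activity spread, the temporal matching just described is simple enough to sketch. What follows is a rough greedy pairing on block time, under assumed column names; the paper's actual matching procedure has additional controls:

```python
import pandas as pd

def match_by_block_time(departed: pd.DataFrame,
                        reformed: pd.DataFrame,
                        max_gap_days: int = 30) -> list:
    """Pair each departed user with a reformed user blocked around the same time.

    Assumed columns: 'user' and 'block_time' (datetime) in both frames.
    """
    pairs, used = [], set()
    for _, d in departed.sort_values("block_time").iterrows():
        gap = (reformed["block_time"] - d["block_time"]).abs()
        candidates = reformed[(gap <= pd.Timedelta(days=max_gap_days))
                              & (~reformed["user"].isin(used))]
        if candidates.empty:
            continue  # no suitable control; drop this user from the analysis
        best = candidates.loc[(candidates["block_time"]
                               - d["block_time"]).abs().idxmin()]
        pairs.append((d["user"], best["user"]))
        used.add(best["user"])
    return pairs
```

With that sketched, back to the example user who has written 100 comments.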
For example, this user could have written the 100 comments across a lot of different users' talk pages, where each interaction is very short, a very small number of comments. Or the opposite: they could have written to a very small number of other users' talk pages, where each interaction contains more comments, longer, deeper interactions. Based on this intuition, we define the contributed activity spread as the ratio between the number of comments written and the number of other users' talk pages those comments were written to. Intuitively, a low ratio correlates with the right picture, a high ratio correlates with the left picture. We could also flip these arrows around and analogously define received activity spread. So now we run our comparison again. We find that users who depart tend to have higher received activity spread. What does this mean? Well, remember that high activity spread means that the interactions this user is having are many in number, but very narrow, very small and not very deep. So what this possibly indicates is that these users are not as tightly integrated into a social circle. They don't have these deep and meaningful conversations. And once again, this comparison only holds for departure during a block. I will point out that there are similar results for both activity spread and activity level when comparing recidivism to reformation, but with a slight twist, and one that really underscores the importance of these control pairs. But for the sake of time, I'm not gonna go into that here. If what I just said intrigues you, I invite you to look at the paper. For the moment, I'm gonna move on and ask a follow-up question. Now that we've established that these engagement features are correlated with future trajectory, we might wonder, is the correlation strong enough to actually predict future trajectory? Well, we can set up a simple SVM-based classifier experiment to test this. And we're gonna compare to two simple baselines. One is the reason for the block, right? Some types of blocks might have different outcomes. And the other is how long the block was. This can be thought of as a proxy for the severity of the block. We're gonna throw in an extra feature, which is just how long the user's been active. This is just a basic measure of engagement. So starting with departure, we find that we can predict, using the baselines, what path this user will end up on. But if we now use these engagement features, right, the activity level and spread, we find that we actually get even better predictive power. So this is a good sign. This is telling us that the task is doable, prediction is possible, and that these activity features are actually giving us some information beyond what simple baselines would have told us. We can also combine these features with community age to get even more predictive power. For recidivism, interestingly, the baselines have no predictive power, but our features maintain their predictive power. So this is a really good sign as well. It tells us that these features are actually generalizable across different settings. Okay, so that does it for user characteristics. We found encouraging signs that indeed user characteristics carry some signal of future trajectory. Now let's move on to the second part of our question, which was moderator action. And specifically in this case, what we're gonna be asking is: how do properties of the first block affect the likelihood that another block will happen in the future?
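To make the engagement measures above concrete before moving on, here is a short sketch of how activity level and activity spread could be computed from a comment table. The column names are assumptions, and the spread ratio here follows the interpretation in the talk (higher spread means activity spread thinly over many different talk pages); the paper's exact definition may express the ratio the other way around:

```python
import pandas as pd

def engagement_features(comments: pd.DataFrame) -> pd.DataFrame:
    """Per-user activity level and activity spread.

    Assumes one row per user talk page comment with columns
    'author' and 'talk_page_owner' (whose user talk page it is).
    """
    # Only interactions with *other* users' talk pages count.
    comments = comments[comments["author"] != comments["talk_page_owner"]]

    contributed = comments.groupby("author").agg(
        contributed_level=("talk_page_owner", "size"),
        pages_written_to=("talk_page_owner", "nunique"))
    contributed["contributed_spread"] = (contributed["pages_written_to"]
                                         / contributed["contributed_level"])

    received = comments.groupby("talk_page_owner").agg(
        received_level=("author", "size"),
        authors_received_from=("author", "nunique"))
    received["received_spread"] = (received["authors_received_from"]
                                   / received["received_level"])

    return contributed.join(received, how="outer")
```

With those features in hand, back to the question about properties of the first block.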
Note that this is specific to recidivism, as opposed to getting blocked in general, because it assumes the existence of two different blocks to gather data from. I mentioned that this is loosely motivated by studies in offline law compliance. I put the emphasis on the word loosely. The extent of the motivation is that in offline recidivism, there are two views of repeat offense. The first view states that the likelihood of repeat offense depends on how severe the punishment was. This is a very traditional view, right? It tells us that if you want people to not offend again, just punish them really, really harshly. But there's actually a competing view, which states that the likelihood of repeat offense is actually based on whether the offender perceives their punishment as fair. The intuition is that if the user thinks that the law is stacked against them, they have very little motivation to actually follow the law. Now think back to our prediction table. We saw that block duration, which is kind of our best estimate of block severity, actually didn't really have an effect. So what this is telling us is that in this context, it might be more promising to look at this other path, this perception of fairness. But this seems challenging, right? Perceived fairness seems like a very abstract idea. How do we even go about measuring it? Well, we can break it down from two angles. We can ask what blocked users can do to signal that they view the block as unfair. Or we can ask what admins can do to signal to blocked users that the rules are fair. For blocked users, we can take advantage of the talk page comments. Remember, they're allowed to keep posting on their own talk page during the block. And intuitively, if they think the block is unfair, they might complain about it there. For admins, we can actually look at the phenomenon of unblocks, right? These are cases where admins will manually lift a block before it ends, essentially shortening the block. You can think of this as a way of showing leniency, right, of telling the user, hey, we are not jerks here. If we think you deserve a second chance, we will give it to you. Indeed, we can start by looking at these unblocks. And if you look at users who were never unblocked, right, who served their full sentence, you find this base rate, almost 50-50, of reformation versus recidivism. But if you now look at users who were unblocked at some point during their block, we find a much higher rate of reformation. And this is actually a statistically significant difference. So tying this back into the theory, what this is telling us is that when users, you know, get told like, hey, you know, we are nice moderators, right? We're not unfair here. They're actually more likely to follow the rules in the future, as the theory would suggest. Now, for the user's perspective, we're gonna look at linguistic indicators from talk page comments. We're gonna look at three different indicators. The first is apologies. If users apologize, this suggests that they acknowledge that they actually did the wrong thing, right? They acknowledge that the block was fair, right? Why would you apologize for something that you think you didn't do? On the other hand, a user could engage in direct questioning, right? Like, for example, "so what policy precisely have I violated here?" This is very hostile and it suggests that the user is fighting back, which they would only do if they felt the block was unfair.
Finally, as a very naive, simple approach, we can also look for explicit uses of the word unfair or similar synonyms. Starting with apologies, we find once again that users who apologize are more likely to reform. And again, tying this back into the theory, this is consistent, right? This is telling us that users who think that the rules are fair, who acknowledge the fairness of their punishment, are more likely to actually follow the rules in the future. By contrast, if a user uses direct questioning or uses unfairness phrases, they're actually slightly more likely to recidivate compared to the base rate. And again, this is consistent with the theory, right? This is telling us that when users think that the rules are stacked against them, then, as the theory suggests, they have less motivation to follow those rules in the future. So coming back to this two-way split, earlier we found that block severity doesn't seem to carry very much signal, but these results do seem to suggest that, at a very high level, some notion of perceived fairness does carry some signal. And the measures that I used here today are very simple, right? They're very naive and very coarse. So this really opens the door to the possibility of future work, right? Looking at more sophisticated measures of perceived fairness and seeing if these results hold up. But these initial results do seem to indicate that this is a good direction for future research. And that is a perfect segue into my conclusions, right? Some closing thoughts. In this work, we have looked at two different measures. We have looked at user features and features of the block, both of which seem to carry some signal about the future outcome of a blocked user. But there are a lot of limitations here that open up the path to future work. And the first thing I wanna be really careful to point out is that there's a key takeaway that I don't want people to take away from this talk. I don't want people to conclude like, hey, these people created a classifier that can predict with better-than-random accuracy whether a user will leave or re-offend; let's take this classifier and put it into production use, right? I would think that's a very bad idea. For one thing, this classifier was just a proof of concept, right? It was not trained for a realistic setting, right? In real life, you don't have this sort of well-balanced, well-controlled data. And even if the classifier was really good, we would require careful analysis of potential risks and biases before we were comfortable putting it into production, right? Remember that in real life, classifiers that are used to predict recidivism have, you know, revealed these very disturbing biases, right? So this gives us reason to pause and think, you know, are we really comfortable putting this model into use? What is this model actually doing? Even beyond that, at a more basic level, the setup that we used here does have a lot of drawbacks. As I mentioned already, our measures of perceived fairness are very incomplete and crude. Also, the use of a binary departure and non-departure feature is kind of simplistic, right? Users could, you know, come back but become less active, right? That's not departure, but it also seems very interesting in its own right. And finally, I kind of glossed over this, but the way that we're measuring re-offense is by the lack of another block. And this is not always a good measure, right? Maybe these people just get better at not getting caught.
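To show just how simple the perceived-fairness measures mentioned above are, a tagger in that spirit fits in a few lines. The word lists and patterns below are illustrative assumptions, not the lexicons used in the paper:

```python
import re

# Illustrative patterns only; the paper's politeness-strategy lexicons are richer.
APOLOGY = re.compile(r"\b(sorry|apolog\w*|my mistake|my bad)\b", re.I)
DIRECT_QUESTION = re.compile(r"\b(what|which|why)\b[^.?!]*\?", re.I)
UNFAIRNESS = re.compile(r"\b(unfair|unjust|biased|stacked against)\b", re.I)

def fairness_signals(comment: str) -> dict:
    """Naive perceived-fairness indicators for one talk page comment."""
    return {
        "apology": bool(APOLOGY.search(comment)),
        "direct_question": bool(DIRECT_QUESTION.search(comment)),
        "unfairness": bool(UNFAIRNESS.search(comment)),
    }

# fairness_signals("So what policy precisely have I violated here?")
# -> {'apology': False, 'direct_question': True, 'unfairness': False}
```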
Finally, I will point out that the key drawback of all observational studies like this one is that we can't make any causal claims, right? So we might want to be able to say, like, oh, unblocks cause better behavior, so we should prescribe this as a moderator behavior. And the truth is, you know, in this setting, we really don't have any causal claims, so we can't conclude that. We can't make that prescription. And this might seem very unsatisfying to some listeners, right? You might think, like, well, if we can't make any policy prescriptions, then it seems like we haven't learned much, right? Are we just right back where we started? Well, I would argue the answer is no. And the answer is no specifically because of this fork in the road. In the past, as I mentioned, work has focused on the bad behavior and the moderator action. And we've kind of treated this moderator action as a standalone thing, right? Like, oh, bad behavior happens, let me just block the user and forget about it. And what this research has revealed is that this is not a valid way of thinking. Blocks have consequences, and it's really important for moderators and Wikipedia admins to pay close attention to how their actions might be perceived. Because how the block is perceived could very well determine whether the blocked user ends up on the path to redemption, recidivism, or departure. That's my talk. Thank you very much. I will now open the floor to questions. That was wonderful. Thank you, Jonathan. And I will, I think we're gonna take one or two questions. I know there was someone in IRC, and then hopefully, if we are disciplined with our full docket today, we'll have time at the end for additional questions. Isaac, do you wanna relay some questions for us? Yeah, from IRC, I think this is about methods, Jonathan. A question was raised by Risker about, could you speak more to why you chose talk page or user talk page activity? They were pointing out that, due to templates and things being shared across talk pages, something like article talk page activity might be more pertinent, but just wanted to hear your thoughts on that. Yeah, that's a good question. I do want to be, there's a slight complication in how this data was handled. And again, I kind of left out those details during the talk for the sake of time and to kind of defer to the paper. But the research does not actually focus solely on user talk pages. So for example, when doing our activity filters, we do account for activity on article talk pages as well. And similarly, when we're measuring whether the user departs or not, article talk page contributions are allowed. So if a user contributes to any talk page, not just user talk pages, we count them as still being active. The main place where we restrict to user talk pages is in counting the activity spread, right? So I drew that picture earlier of a user posting to other users' talk pages and other users posting to this user's talk page. And the reason for that is that, with that feature, we're trying to account for how directly this user engages with other members of the community, right? And since there's no, like, you know, Facebook chat or anything like that on Wikipedia, the best proxy that we can use is to see whether user A, you know, writes to user B's talk page. And we count that as, sort of, user A interacted with user B. I think we can do another one. From IRC, that's it. I have my own questions, so I'm happy to ask them, but maybe there are people in the room who want to ask.
Well, can we go to the room? Okay, go for it, Isaac. And then we'll switch to Lane as soon as you're done. Yeah, and Jonathan, I wanted to clarify first which language — I might have missed this — but which language communities you're working with. And if you're working with multiple, whether you saw kind of differences across them and could speak to that. Yeah, so sorry about that. I did leave that out. Thank you for pointing that out. So this work is restricted to English Wikipedia. So unfortunately, I can't answer your question about whether there are differences in different language communities. I will say that this is a great avenue for future work, because I pointed out that our data partly comes from this WikiConv dataset. And WikiConv is actually available in multiple languages. It's not available for all Wikipedias, it's only a very small subset, but at the very least, I believe, for example, that the Spanish and French Wikipedias are also available in WikiConv. So we're planning to release some sort of code for kind of reproducing this work, using this toolkit that I have on the slide, ConvoKit. And once that's released, if you're interested, you can start picking up on that and start looking at applying this directly to other language Wikipedias. But yes, thank you for the question. That was a very good point. Great, hopefully we see some replication and extension in the future. Thank you. Yeah, this research seems ripe for extension or replication and maybe eventually turning into some sort of product or intervention. I liked your comments around that, Jonathan. It's easy to get carried away with exciting work like this. All right, with that, we are going to transition over to Lane Rasberry, coming to us live from the University of Virginia, I presume. And Lane, I'm seeing your slides. Are you also hearing me? I am. So you should take it away. Thank you, Jonathan. I'm gonna be talking about how to detect misconduct on Wikimedia projects, particularly English Wikipedia, using some automated system. It used to be the case that when gentle people came to the internet, they were on their best behavior; now some people misbehave. And I'm talking about misconduct, things like hounding, biting, being rude, griefing, flaming, spamming, spreading propaganda, or, on Wikipedia, sock-puppeting, pretending to be someone who you are not. In this project, I serve the role of the Wikipedia or Wikimedia liaison for a team of graduate students who are unable to present here today. They did the technical aspects of this project, and I only facilitated. They've actually published a lot of media about the research they did, including a video, which is on the Meta-Wiki page. This presentation that I'm giving is going to be targeted to the Wikimedia community and to other researchers who might want to replicate or go in a different direction with wiki-based research on moderation and detecting misconduct. So thanks to Arnab Sarkar, Charu Rawat, and Sameer Singh. They just graduated with their master's degrees at the Data Science Institute here. Please check out their meta page, and I'm going to give an overview of the media that they published about this project in the course of this presentation. So in this research project, what happened was three goals were set. One is to evaluate data, characteristics of blocked users in English Wikipedia, for the benefit of the wiki community to discuss what it means to be blocked.
Why do users get blocked, and what kind of response should the wiki community develop to that? After we get the data about these blocked users, the goal was to come up with an algorithm which would rank users by the naughtiness that they were up to. Identify who is up to the most trouble, put them at the top, and then give this list to the wiki community to consider. And then the third goal was to actually come up with some kind of ongoing process, a bot, that would patrol Wikipedia, consider unblocked users who are engaged in whatever behavior, have the bot evaluate them according to the algorithm, and then again present that to the wiki community for some actions to be considered or taken. In this presentation today, here's what I'm going to cover. First, I'm going to talk generally about what it means to set up robot moderation of an online platform where users generate their own content. I'm going to talk about this particular project that we did here at the University of Virginia. And I'm just going to back it up here. The reason why you should know about the general system of moderation is because this project that we did is only one of 100 that ought to be done. So we looked at some aspects of English Wikipedia, but anyone could go deeper, looking only at the promotional editing, or only at the harassment, or only at the use of proxies or sock puppets to evade moderation. And people could do these same kinds of projects in other languages and for other Wikimedia projects. So this actually needs to be done, broken down into parts by research teams, to be replicated for more specific reasons. I'm going to say something about what barriers we faced in this project, which, if those barriers were addressed, other people would be able to go further more easily. And I'm going to say something about what it means to do research in collaboration with the wiki community in general. All right, so talking about machine learning for moderation: this is basic for some of you guys, but it's kind of new; it's only been done a few times so far. So of course, the big platforms, Facebook, YouTube, Twitter, they have these automated processes in place. What's unusual about Wikipedia is that the data's open; anyone can replicate the algorithms and the process, apply this to Wikipedia and learn this for themselves. Whereas in the more closed platforms, it's not possible for people to get access to the data. So there's a lot of students who are interested in learning something about data science and want to know more about online platforms of communication. They learn about moderation and machine learning through Wikipedia. So this is just one research project that's an instance of that kind of student research and student project. So to do moderation of an online platform with automation, you definitely need a digital platform and you need a lot of data. So millions of user interactions, and you need to be able to have access to those. This is possible through MediaWiki; Wikimedia makes this freely available. You can get these for any language or any project. There's a database schema that you can navigate to get the user behaviors or user activity records and publications that you want. You come up with some kind of artificial intelligence that's going to look at what kind of human moderation has already been performed.
That is, you need the user behavior and you also need years of precedent, or millions of instances, of human moderators making some kind of judgment, and you've got to suss out from the data when the humans made a judgment and did something. Like, for example, when did humans block a given account, and what was that account doing in the time leading up to its block? So this is what we were looking at, and this is what you look at in general. Okay, after you've looked at the behavior of the blocked accounts, then you look at the behavior of unblocked accounts and see if the algorithm can detect some kind of trend, or whether users who are not blocked are engaging in the behavior that users who were blocked were doing. After your algorithm does that, you give the algorithm's results back to human moderators so that they can see: is this bot doing the correct thing or not? And as the humans correct the algorithm, it gets better, and it needs less human support to do its own work. If anyone wants to learn more about this, a good place to start is our favorite encyclopedia, Wikipedia. And in these slides, you can click on any of these links; these bring you to the Wikipedia articles of the same name. I'd like to say something else that some researchers neglect. And that is, when you do research like this, you're doing it in partnership with the community which is on that platform. And it's good for researchers to explain, in the simplest terms they can, in a way that's accessible and understandable to the community who's affected by these systems. So not only are these Wikipedia articles good for researchers who are getting started doing this kind of activity, but also, when those researchers are explaining to the community what they're doing, they should understand that that community is gonna look at Wikipedia and try to learn something about this themselves. So these Wikipedia articles are relevant as a sort of community outreach. Now I'm gonna talk about the particular research that we did at this university. And I'll begin by saying, you should look at the researchers' own publications, what the graduate students themselves published. So of course they published a research article. It goes into a lot of detail. Anyone can replicate what they did. They've got their code on GitHub. It's possible to get their data sets. They say how to replicate everything they've done. So much of that is in the research article. I put the research article on Meta-Wiki in a project page just for this research project, and the students contributed to that meta page. There are links out to community conversations, and other notes about the research. And this is how research works in the wiki community. The researchers should have some page where anyone from the community can contact them and give feedback in comments, and you also put your research products, whatever you publish, on that page. To complement the research article and its on-wiki reporting, each of the students who were involved in this published their own essays about the ethical decisions that they had to make, when there was some kind of tough decision, and they reflected on what the challenge was in navigating those issues. And also on this meta page, on YouTube as well as Wikimedia Commons, there's a video of the students talking for 17 minutes, a technical summary of everything that they did in this project, showing their data, their processes and everything else.
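As a concrete illustration of the "collect the human moderation precedent" step described above, here is a small sketch that pulls recent block-log entries from the public MediaWiki API. The endpoint and query parameters are standard MediaWiki API features; everything else (the limit, the user agent, what you do with the result) is just for illustration:

```python
import requests

API = "https://en.wikipedia.org/w/api.php"

def fetch_recent_blocks(limit: int = 50) -> list:
    """Return recent block-log entries from English Wikipedia."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "block",     # only block/unblock/reblock log entries
        "lelimit": limit,
        "format": "json",
        "formatversion": 2,
    }
    headers = {"User-Agent": "block-research-sketch/0.1 (research demo)"}
    resp = requests.get(API, params=params, headers=headers)
    resp.raise_for_status()
    # Each entry includes the acting admin, the target user page, a timestamp,
    # and the free-form reason the admin supplied.
    return resp.json()["query"]["logevents"]
```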
For anyone who wants a more technical explanation of this project, hear it from the researchers themselves. It's available on wiki. All right, so what do we do at this university? What do you do in general for these kinds of things? So you've got a MediaWiki and you get the amount of data that you want — it's gonna be too much data. You navigate through the schema to find the parts that you want, and you have to have conversations with the wiki community to understand what this means. There are some things that you can learn from an experienced Wikipedian, and some things that you're just gonna have to navigate around the website to ask about and see what other people are doing in this space. With these kinds of projects, it's possible to get some kind of result within a month, but we were taking it easy with this, and the students did this iteratively. They did the project, they asked for feedback, they did it again, and especially they had a lot of questions about: what is the social behavior in Wikipedia? If I'm looking at a certain kind of data or a certain kind of interaction, how significant is that? To what extent is that a problem? Which things are more important, and which things should they ignore? They published this content on Meta-Wiki, and then we had some conversations with real live Wikipedians, in particular the New York City Wikimedia chapter, which is very interested in moderation and safety. They gave a lot of essential feedback on this research. All right, so here's a technical summary of some of the things that we did in this research. So we found a million blocked users, okay? Looking back from 2004 until the present time, a million accounts have been blocked by humans, or by human-established process. Then we considered six million users who do not have a block, but we only looked at recent behavior, because these are the people who might need moderation right now. There's absolutely too much data to compute everything, so it would have been much too expensive. We had to sample the data; we had to only look at some of it in developing the algorithm. And again, if somebody were to do this project in another way, they could go in a different direction with this and look at other data, or in the future, people are gonna be able to put more data into this to come up with an algorithm. This team came up with an algorithm that, after it looked at the available data, had an area under the curve of 84%, which means that they had the algorithm put aside everything it had seen, look at held-out data, and grade it to see if it would get the right answer about whether a user was blocked, yes or no, based on what it had learned from the data. And then we generated a list of the users who have not been blocked but who, based on the behavior of other users, likely merit a block. And this raised one of the first social issues. Supposing you generate a list like this, what can you do with it? Should you publish such a list, a blacklist, and start accusing people in the wiki community of misbehavior with this algorithmic system? So lots of ways to take this conversation. Should wiki have these kinds of automated moderation systems? There are pros and cons, and it's going to take wiki community discourse to sort this through. At what speed should this develop? When should we use it? In what way should it be used? So of course, the reason why you wanna do this is because wiki needs the administrative labor. There are not enough humans on wiki to watch everything that's happening.
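For readers who want a sense of what an evaluation like the one just described looks like in code, here is a minimal sketch: train a classifier on accounts labeled blocked or not blocked, then score held-out accounts with ROC AUC. The feature names and the choice of model are assumptions for illustration; the students' actual features and code are in their GitHub repository:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def evaluate_block_classifier(users: pd.DataFrame) -> float:
    """ROC AUC for predicting 'was_blocked' from recent-behavior features.

    Assumes one row per account with numeric features from the weeks before
    the reference date and a 0/1 label column 'was_blocked'.
    """
    features = ["edits_last_2w", "talk_edits_last_2w",
                "reverted_fraction", "mean_toxicity_score"]
    X_train, X_test, y_train, y_test = train_test_split(
        users[features], users["was_blocked"],
        test_size=0.2, stratify=users["was_blocked"], random_state=0)
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_test, scores)  # the UVA team reported roughly 0.84
```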
There are so many edits being made, and we want to support the humans in doing moderation with whatever tools are appropriate. Of course, one of the cons is that with this kind of system, people tend to respect it, perhaps even more than humans. And that's never the intent of making these kinds of programs. This is supposed to support human decisions, not replace them. Another advantage of this is that it scales to meet new projects. As more people edit wiki and as we need support in other languages, it's very useful to apply things like this. It's not gonna be expensive to develop the technology for these kinds of moderation systems. What is gonna be very expensive is the community conversation about the circumstances under which we should be using these. So if we were to go further with this, or as we go forward with this, because it's gotta be explored, we have to talk about this. We need to have an on-wiki conversation with the people who are going to be affected by this, about the circumstances in which humans and bots are going to interact. If you have a patrol bot that's recommending that a particular user might need to be considered for a block because of misconduct, then what is that human and bot interaction gonna look like? What kind of records are we gonna keep? How public do we make these kinds of things? For these kinds of data science projects, anything in machine learning, it really helps to get standard labels on why someone was blocked. And what we identified in this project is that the most common reason for someone getting blocked is that they were doing proxy editing. But the thing about proxy editing is that it's almost always tied to some other kind of history of problematic behavior. Someone's trying to avoid detection because they're up to something else. Maybe proxy editing could be deconstructed into multiple categories. Maybe it's tied to something else. I don't know. That's another research project. But right now, there are about five popular labels for reasons why people get blocked. Also, the moderators can apply free-form text. Maybe we should have more standard labels so that researchers can pull out the kinds of blocks that they're interested in, or the kind of bad behavior that they're interested in, and research deeply in those. It would be nice if more researchers had an easier time coming to Wikipedia and getting access to the data. Definitely many more people are gonna be interested in the future. And it would be nice to have a place on wiki to have private conversations — like, if we have this list of users who are engaged in problematic behavior, where can we discreetly share this with experienced Wikimedians who will handle it appropriately? If anyone's thinking of doing on-wiki research, my recommendations are: document on wiki, so that you have a communication channel with the wiki community. Use free and open licensing; these are the recommended terms. Be cautious about requesting wiki community members' time. There are a lot of researchers who ask a lot of the wiki community, survey them for hours — be very cautious about doing this. Publish an ethics report. Ethics isn't a side aspect of the research; your ethical evaluation is an essential component of the research. And whoever's doing research should talk about the problems with the research that they come to know about. And lastly, share credit with wiki community organizations who contribute something to the project. So, going through some thanks.
Again, this project belongs to the student researchers. They just graduated: Charu, Arnab, and Sameer. The faculty advisor for this is here. Thank you, Wikimedia Foundation Trust and Safety team: Sydney Poore, Patrick Earley, Claudia Lo. Thanks so much, Wikimedia New York City. Wiki LGBT, long-suffering, the subject of much misconduct — thank you for your feedback. Wikimedia Medicine — medicine is where most of the spam comes from in Wikimedia projects, and they were essential in helping this team find out about spam information. Thanks, Wikimedia Tool Labs, for setting up online database access. Thanks to everyone who contributed to the project. I could take questions, and I'm here at the end as well. Awesome. Thank you very much, Lane. That was a great presentation, well executed. As somebody who also sometimes has to present research that I didn't directly do and may not fully understand, I feel like you did a good job. All right, let's start with Isaac. Do you have any questions from IRC for us, Isaac? Yeah, there's a question from IRC. So, Lane, if you're familiar with the Google Detox project, this was a classifier for toxicity. And one of the criticisms of it was that it oftentimes would focus very heavily on bad words, like it was a bad-words detector. And so the question was whether you could speak to the differences between that project and maybe some of the approaches that were being taken in this project. Yeah, so the Google Detox project — the research paper mentions this. This project did pull information from the Google Detox project. And it's great that Google set a precedent with this. They made a first pass at doing moderation with their Jigsaw tool. This project pulled data sets from multiple sources. And what we anticipate for the future of this kind of research is community-curated data sets that we share and develop collectively, so that when researchers are developing these moderation processes, there are some standard data sets that they look at and that the community is recommending. That wasn't available at the time that Google did their Jigsaw project, because it was so new. There's gonna be Google and Facebook and all these other big companies that are making standardized data sets as well. And somehow, in the wiki community, we're gonna have to learn to index and catalog these kinds of data sets, so that when researchers come to wiki, they can start running with recommendations rather than browsing the open internet looking for data sets that apply. Great, thank you. Excellent. We're doing pretty well on time. I'll open it up to a question or two from the room if there are those who want to ask. Are there some questions, or not? Was there a question? Okay, well, it looks like we'll have time for questions at the end. So stay tuned if you have a burning question but are working up the nerve to ask it. With that, let's move on to Morten Warncke-Wang, who is going to be presenting, in this case, a more preliminary presentation on some ongoing research. So thank you once again, Lane, and Morten, take it away. Thank you, Jonathan. It's a pleasure to be here. I'm just gonna share my screen so everyone can see, hopefully — oh, there we go. And yeah, my slides started. I see your screen. You see my screen — can you see my slide? I see your slides. Wonderful, thank you. So I wanna talk a little bit about what we are learning from the partial blocks being rolled out on Wikimedia wikis. This is work in progress. This is going to be lightweight.
I basically just wanted to share some insights into how these are used and what's going on. So first, I wanted to provide a little bit of context. Jonathan Chang in his presentation also talked about the historical site-wide blocks in Wikipedia, which are kind of more like a wall that tells you to stay out. If you get a site-wide block, you might be able to edit your talk page; some users are actually not even able to do that. But now there are partial blocks, which attempt to be more like a fly swatter. You take out the little thing that is wrong or that is not wanted, and the user then potentially has the ability to go do other things. And here's a rough timeline of the way I've been involved with the product and what I've seen. The development of partial blocks started, or continued, in 2018. I worked with the Anti-Harassment Tools team to discuss and figure out how we measure block effectiveness. There's a link to our meta page where we discussed this and also had some conversations with the community around this. In January of this year, partial blocks got deployed to Italian Wikipedia as the first wiki to use them. And now, at the end of June, it's deployed to 19 wikis and more are coming. There's a Phabricator task link for those who wanna keep track of what the status is there. And during this process, I've been gathering some statistics and doing analysis around both site-wide and partial blocks. Here's the agenda for today. We'll focus in on statistics on Italian Wikipedia, partly because they were the first to adopt it. They've had it for almost six months now and they've been using it quite a lot. Some of the other wikis have not had them as long or are not really using them that much. And the things that I wanted to focus on, to keep this lightweight and provide some useful insight, are: to what extent are they being used, and when they're being used, how are they being used? Are partial blocks being used to block registered users or IPs? With partial blocks, you can set them to block a specific page or a complete namespace. So I wanted to figure out: are they used to block pages or namespaces, and if they're used to block namespaces, which ones do they end up blocking? So how often are partial blocks being used on Italian Wikipedia? I was trying to figure out some kind of measurement, to provide some scale, to get a sense of what the counts of partial blocks are. So I ended up with this historical statistic on the number of site-wide, non-indefinite blocks that are set every month on Italian Wikipedia for registered users. So this is the past five years, from 2014 up through December 2018. And as you can see on this graph, the raw data is the dark blue line; from about 2016 through 2017, you're looking at roughly a hundred registered users getting a non-indefinite site-wide block every month. So it has an expiry date set, which I thought was a little bit more like the partial block in the sense that, okay, you're getting blocked, but you're not getting blocked indefinitely. It's not like we don't want you to be here; you can come back later and do some work. And then we see an uptick in 2018, and at the end of 2018 as well, where it's a little bit more, around 150 to 200 blocks a day — blocks a month, sorry. And then here are statistics on the number of partial blocks set every month from January through June this year. The data for June is up until the end of last week.
So there are still a few days left in June to increase that number, but I thought I'd add it anyway. As we can see, you're looking at roughly maybe 50 or up to 100 partial blocks set every month. This is partial blocks for both IPs and registered users. So in comparison to the previous graph, we're looking at perhaps around half of what you see for site-wide, non-indefinite blocks for registered users. But kind of in the ballpark of about 50 blocks a month. The second question is, do they block registered users or do they block IPs? And the answer turns out to be roughly 50-50, except for that March month where we saw a lot of blocks being set and where more registered users were being blocked. So in March, it was about 60-40 instead. But often, partial blocks are used to block specific IPs from editing, and about half the time they're used to block a registered user. The third question is, are these used to block specific pages or are they used to block entire namespaces? And largely, they are used to block specific pages. So here I decided to plot the number of blocks that are set every month that block one, two, or three namespaces. The most I've seen is a block of three namespaces. In January, none of the blocks blocked namespaces; all of them blocked pages. I think that's partly because the namespace block functionality wasn't around then. In February, you see five, and then from there on, it's about 10, maybe 15 blocks a month. As we saw previously, in March, we had 100 blocks set in total, so 15 of those is roughly 15%. April, May, June, you're looking at about 50 blocks in total, so then you're looking at maybe 20%. So the vast majority of partial blocks that we're seeing block specific pages; they don't block an entire namespace. But the feature is being used — there are blocks that are blocking one or multiple namespaces. But if they block a namespace, which ones do they actually block? Well, it's the article namespace. This was not surprising to me; I would expect that if they block an entire namespace, they're most likely going to block the entire article namespace. The other one that shows up here is namespace six, which is the file namespace. So you can use a partial block to say: you can't upload files anymore. And then for the other namespaces, it's mostly single digits, very low counts. So, kind of in conclusion, if they're blocking a namespace instead of specific pages, they're most likely blocking the entire article namespace. So in summary, what have we learned? The partial blocks are being used. There are about 50 partial blocks a month on Italian Wikipedia. About half of the blocks are on IP users and half are on registered users. About 80% of them block pages, 20% block namespaces. And if they block namespaces, it's typically a single namespace, and it's typically the article namespace and secondly the file namespace. I thought I'd mention a little bit about next steps that are happening in this space. I have continued work on this where we're looking more at the effectiveness of the blocks. So we're looking at two questions: in particular, do partially blocked users make constructive edits while they're blocked? So if you're restricted from editing a specific page, will you then go on and make constructive edits, meaning non-reverted edits, to other pages perhaps? That's the potential consequence of this. We're interested in learning that.
The other is: if someone gets a partial block, is that block likely to be extended? So do pages get added to it, is it converted to a block of an entire namespace, does the duration get extended, or does it get upgraded to a site-wide block? Roughly, from what I'm looking at in the data, as far as I can tell, the answers to both questions are no, but we'll have actual good data on that pretty soon. There's a Phabricator task, there's a GitHub repository where I have my Python notebooks and stuff, so it's possible to follow along on this as well. So that is my presentation. Thank you so much. Please ask any questions you might have. Happy to be here. And we're happy to have you. Thank you very much, Morten. So I'll check in with Isaac first. Do we have any questions for Morten on IRC? There's nothing on IRC right now. Okay, good, because I have a question. So Morten, I was curious about — so you say that you're seeing an average of 50 partial blocks a month on Italian Wikipedia. And presumably, before this feature was implemented, people were still blocking people on Italian Wikipedia; they were just using different sorts of blocks. So I wondered if you'd looked at, or intended to look at, whether people were using partial blocks instead of site-wide blocks, or if this was actually being used as kind of an additional graduated-sanction approach for people who might not otherwise have been blocked. So are partial blocks cannibalizing full blocks, or are they increasing the overall number of blocks? That is a most excellent question. And that is one of the questions that we do have in our list of questions that we want answers to: specifically, trying to understand, do the partial blocks come in place of a site-wide block? So are they actually doing these kinds of more targeted, very specific and time-limited blocks on specific users? Or do they end up being in addition to the site-wide blocks? We're definitely interested in that. So that's something that we want to look into as well. So right now I don't have any answer to that, unfortunately. Well, thank you for presenting the work you've done so far and your plans going forward. Any questions for Morten from the room today? Okay. In that case, I've got another question for each of our presenters, since we have this generous amount of time available to us. But first, I wanted to check in again. Isaac, are there any follow-up questions for either Lane or Jonathan? Yeah, actually one just came in, well, for Morten, from Risker, about what user actions are resulting in partial blocks. Are they used as a step down from a full block, or is it something else, yeah? That's a great question. And that is currently not something that I think we're looking to look into. The way we've been looking at it is more on the side of what happens once the block is set, rather than why the block is set in the first place. I'm not sure if other members of the Anti-Harassment Tools team that we're working with have good answers to that or have looked into it. But yeah, the work that I'm doing is more about, okay, what happens once the block is set. Yeah, it's interesting. It seems like this new technical capability has implications for both Lane's group's presentation and also potential follow-up work for Jonathan's group, right? Because we were looking at what are the reasons why people tend to get blocked and then what are the outcomes for blocked users? And now there's this more nuanced tool.
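For anyone who wants to reproduce counts like the ones Morten walked through, here is a minimal sketch in the spirit of the Python notebooks he mentioned. It assumes a pre-exported table of partial blocks; the column names are illustrative assumptions, not the MediaWiki schema:

```python
import pandas as pd

def summarize_partial_blocks(blocks: pd.DataFrame) -> None:
    """Monthly counts and target/restriction breakdowns for partial blocks.

    Assumed columns: 'timestamp', 'target_is_ip' (bool),
    'restriction_type' ('page' or 'namespace'), 'namespace_id'.
    """
    blocks = blocks.copy()
    blocks["month"] = pd.to_datetime(blocks["timestamp"]).dt.to_period("M")

    # Partial blocks set per month.
    print(blocks.groupby("month").size())

    # Share of blocks targeting IPs vs. registered accounts.
    print(blocks["target_is_ip"].value_counts(normalize=True))

    # Pages vs. namespaces, and which namespaces get blocked.
    print(blocks["restriction_type"].value_counts(normalize=True))
    namespace_blocks = blocks[blocks["restriction_type"] == "namespace"]
    print(namespace_blocks["namespace_id"].value_counts())
    # On Italian Wikipedia, expect 0 (article) and 6 (file) to dominate.
```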
I think that raises a whole new set of really interesting research questions there. So actually, I'd be interested if Lane or Jonathan had responses or thoughts about extensions of their work, given we now have the ability to track blocks in a more fine-grained way and people can do blocks in a more graduated way. Yeah, I remember actually reading about the partial blocks thing a few weeks after, or like, I think after we finished our paper or were about to submit it, I don't remember the exact timing, but I remember thinking like, damn, if we had known about this earlier, we could have incorporated this somehow. But I definitely feel like Morten brought up something very interesting in his talk, which was: what are people doing while they're subject to a partial block? Or what are they doing elsewhere, on the pages that they're not blocked on? And for us, we were limited to looking only at the user talk page, because that was the only page they were allowed to keep interacting on. But you would imagine that some of these signals — you could imagine someone being passive-aggressive on other pages, right? Like, well, I'm banned from the page about culinary diversity, so I'm just over here, stuff like that, right? But right now, that's entirely hypothetical. So I would love to be able to look into that. Let's go for it, Lane. Same with me, of course. So we didn't consider partial blocks in this research, but it's another kind of data. It's very nuanced. It's for a specific reason. It's more closely tied to the activity which we want to discourage. Other things that could be useful are any kind of flagging from users that a particular edit is problematic. We don't have that right now. And we have this system in Wikimedia projects of applying warnings to talk pages, but these warnings are so diverse. Well, there's maybe a hundred different types of warning that could be applied, and also many people do free-text, free-prose warnings. So it's not as if we can systematically pull all these warnings out and find out what exactly the problems with users are. I'm very hopeful that the more specific we can get with the kind of moderation, then, as in Jonathan's project about discouraging recidivism — bad behavior, continued bad behavior — we can identify that more quickly and then encourage users to behave nicely, to do the right thing. Yeah, plus one to that. Isaac, do you have anything from IRC? There is a question from YouTube, from Jane, for Lane. They said the research paper had a discussion of the features extracted, which all seem to be neutral with respect to behavior, and they were wondering whether you could talk to some of the more predictive features found by the modeling. In the modeling, everything's based around whether a user has just gotten a block, and then, once we have identified a user with a block, we looked back as far as we could, as far as we had the computational capacity, which turned out to be two weeks of behavior. So this is assuming that if somebody gets a block, they were engaged in bad behavior and detected relatively quickly. So that's one of the biases of the study. If somebody had done bad behavior before and it was only now coming to be known, then we missed that. So, yeah, it's not as if we were looking at warnings on a page or particular connections where an administrator had flagged certain behavior, and then we went and looked at that particular behavior back at a point in time.
It was only that this project pulled out everything that people had recently done. And if we had better data or a better way to flag certain behavior, we certainly would have wanted to do that. I can't think of a clever way to do that easily at scale. So I think, this is a question I had too when I was looking at your slides, and maybe you just don't have this information available. But I think that the question there is more about — and I may be wrong, this may just be my question. So you have all these different features: somebody's toxicity score, their ORES scores for the revisions they made in the previous two weeks, their username, activity stats. And these are all, we assume, predictive to some extent of whether they were subsequently blocked. And I wondered if you had an idea which of these features turned out to be the most predictive, right? Which was the best source of signal to predict whether somebody would subsequently be blocked? I don't have that information. Cool. Well, it's a good thing that we have the research paper and a video from the students themselves, which are linked on the Meta-Wiki page. I encourage everybody to go and watch that as well. I had one more question for Jonathan. So, and I don't have your slides in front of me, but I think you were showing that you did this linguistic analysis and you looked at kind of three types of responses that somebody might make to a block, and then the impact on recidivism, the word of the day. And I was curious how you decided on those three types of, I don't know what you'd call them, rhetorical strategies or politeness strategies or things like that. Well, you used the word politeness strategies, and that's actually exactly what they're called. So this actually does come from — I guess you might remember some of the work that my group presented last year, but for everyone else, there has been prior work out of the lab that I'm part of on anti-social behavior and, more broadly speaking, on politeness. And one foundational kind of work in this area is this notion of politeness strategies, which actually comes out of the linguistics literature, from around the 1980s. And basically, the literature identifies these different types of behaviors that are associated either with politeness or with impoliteness. Apologies and direct questioning were just two examples, where apologies are associated with politeness and direct questioning with impoliteness, but there are a lot of others that weren't shown here, and the reason that they weren't shown here is simply because we tried them and they didn't do anything. And it does make sense, right? For example, one politeness strategy, or rather impoliteness strategy, that is actually very indicative of anti-social behavior in general is starting sentences with the second person, right? Like, you did this, you made this edit. Whereas that didn't really come into play here, and if you think about it at kind of an intuitive level, it's almost because you would expect sentences to start with the second person in this specific context, right? This is a person who, in 90% of the cases, is probably talking to the blocking admin, and it's like, hey, you blocked me. And it's like, well, okay. Well, if everyone is starting sentences with "you", then it no longer carries any signal, right?
So, but yes, there are a lot of other politeness strategies out there that aren't shown here, but they are useful for other tasks, just not this one. Awesome, yeah, that's really interesting. All right, this was a wonderful set of presentations. I'm so glad we were able to place them all together today. I wanna thank Jonathan Chang and Lane Rasberry and Morten Warncke-Wang. We hope to see you all back in the future to follow up on this work, or hopefully other people grabbing this work and running with it as well. Thanks for meeting with us, and thank you everybody who listened in, chimed in and otherwise participated. We will see you next month, which will be the July Research Showcase. That's scheduled for the 17th of July. So we'll see you back here on July 17th, and that's it. Have a great morning, afternoon or evening, everybody.