It says it's streaming. Okay. So the stream link is also in the etherpad. And the IRC client is disconnected. Did we get the IRC or the stream link posted? I think we're still working on it. Let me see if I can reconnect. Did you have a remote? It may not work. Yeah, it's not working. It's still not working? So I did have a stream link that I know worked. Here we go. As soon as I get confirmation from IRC, it'll kick us off. Okay, great.

Okay, wonderful. So welcome to the Algorithmic Dangers and Transparency session. Thanks for joining in this discussion. I put this slide up at the beginning of the session, and a few people came in afterwards, so: this is our etherpad, and our back channel is the Wikimedia AI IRC channel, which I'm hoping to invite you all to, because that's where most of us AI folks congregate.

So I wanted to kick this off with: why do we use algorithms like artificial intelligence in our projects at all? What's the value that we actually get out of systems that use advanced algorithmic methods? I wanted to use the example of a vandalism detection model, because it's one of the AIs that I think helps us a lot to get our work done, but it's also kind of scary.

Beneath the vandalism detection model is a machine classification system that takes statistics about edits that happen to an article or a talk page and feeds them into a machine learning model. In this case, for statistics, I have the number of characters added, the number of characters removed, and the number of bad words added, using a word list of curses and racial slurs, that sort of stuff. And we ask the machine classifier to learn the patterns, the correlations between these statistics and the things that we want to predict, such as vandalism and not-vandalism.

You can use a machine learning model like this to take a massive review stream, like the 160,000 edits that English Wikipedia gets every day, and split it into the subset of edits that need review because they might be damaging, and the subset of edits that we know pretty well are not damaging, or at least that the machine learning model can tell us it's pretty sure are not causing harm to the wiki, and that therefore don't need immediate review.

I'm using ORES as an example because ORES hosts one of these edit quality prediction models. By the way, ORES is a machine-learning-as-a-service system that's running in production at ores.wikimedia.org; we can talk about that more later if we want to get to it. But anyway, without this type of prediction, my estimate, based on how long it takes people to label vandalism, is that we'd have to spend about 267 hours every day to review the entire recent changes feed for English Wikipedia. That's about 33 people working eight hours a day. But with a machine learning model like this, we can cut that down quite dramatically. If people use this model and only review the 10% of edits that the machine learning model thinks need review, that cuts it down to about 27 hours per day, which is about four people times eight hours, or 33 people doing less than an hour of work each. So you can see why people want these types of algorithms operating in their systems: they make things more efficient; they reduce the amount of work that we need to do.
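To make the shape of that pipeline concrete, here's a minimal sketch of the kind of classifier described above, assuming scikit-learn; the feature values and training labels are illustrative stand-ins, not ORES's actual feature set or data. (For the workload arithmetic: 160,000 edits at roughly six seconds of review each comes to about 267 hours; reviewing only 10% of them comes to about 27.)

```python
# Minimal sketch of an edit-quality classifier like the one described above.
# Assumes scikit-learn; features and labels are illustrative, not ORES's real ones.
from sklearn.ensemble import GradientBoostingClassifier

# Each row: [chars_added, chars_removed, bad_words_added]
X_train = [
    [12, 0, 0],     # small good-faith copyedit
    [0, 4200, 0],   # large blanking
    [35, 10, 3],    # added curse words
    [640, 20, 0],   # substantial content addition
]
y_train = [False, True, True, False]  # True = damaging, per human labels

model = GradientBoostingClassifier().fit(X_train, y_train)

# Score a new edit; patrollers only review edits with high P(damaging).
p_damaging = model.predict_proba([[5, 2, 1]])[0][1]
print(f"P(damaging) = {p_damaging:.2f}")
```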
So what's the problem? Well, this algorithm is making a subjective judgment. And to make an argument about this, I'm channeling Zeynep Tufekci. Zeynep calls these subjective algorithms: algorithms, often powered by big data, that now make decisions in subjective realms where there's no right decision and no anchor with which to judge outcomes. Examples of this are what's good and what's bad, what's relevant, what's important, what's desirable and what's valuable. There's nothing really objective to measure against; people's judgment is the best thing we can use to find out whether the AI is actually doing what it's supposed to do or not.

And the problem with having a vandalism bot that makes these kinds of subjective judgments is that it can put a barrier on who's allowed to participate. If the AI targets somebody who's not vandalizing but accidentally catches them as a vandal, then it will make it much more difficult for them to participate in our spaces. It decides, or at least has a large effect on, who gets labeled as a bad-faith editor or a vandal, and what types of contributions get labeled damaging.

Essentially, one of the things we should worry about with this type of modeling strategy is that it can pick up biases. It won't always just predict good and bad with some random mistakes, or high quality and low quality with random mistakes, or harassment and civil behavior with random mistakes. We can encode terrifying things into these prediction weights.

This is another quote, from WNT, who posted on the Wikipedia Signpost after we first announced the deployment of ORES and these vandalism detection models. WNT asked us to "please exercise extreme caution to avoid encoding racism and other biases into an AI scheme" and goes on to discuss how we might accidentally encode a bias against editors working on articles about Pakistani villages, because there are a lot of editors who revert all of those edits and delete all of those articles, because they have a bias against coverage of those topics, as opposed to villages in the United States, where we definitely have an article for every single one. So, essentially, we could find that a model that's supposed to predict, say, harassment versus civil behavior could instead predict "messages from South Africans," because it turns out that the way South Africans speak tends to get labeled as harassment, and so the machine learning model learns the biases of the editors who are already there.

So this is scary, and this is why I want to bring it up. The first question I want to ask is: what do we want to use AI for, and why? I've given one example of where we use an advanced algorithm to help us solve a problem. But I wanted to socialize the idea that these things are kind of scary, so that we can also talk about the kinds of problems we're worried about, the kinds of things AIs might do that we don't think are okay, or that at least make us feel uncomfortable. And, if we can, to move towards ideas on how we might minimize these problems, and the general policies we should apply for allowing these things to operate in our spaces, to operate on our wikis.

So, the protocol for this discussion... oh, by the way, did we get a gatekeeper? The gatekeeper... Did I see a no? Okay, alright. Oh, was that...? This is the advocate role? Advocate role? I don't know the advocate role. What's the advocate role?
And I'll see if I understand. So, the advocate role is pretty good for this. I meant a gatekeeper role, but if you're an advocate, then we'll work with that. The protocol that I would like to encourage us to adopt for this discussion is: if you have something new that you want to talk about, you raise one hand, and if you want to continue a thread in a conversation, you raise two hands. It'll be the gatekeeper-slash-advocate's role to track the stack, so that we generally get through the continued conversation before we move on to new things. And we'll have to leave it to the gatekeeper-slash-advocate's judgment, when we've been pushing on a topic too long, to move us to a new topic, to make sure that new topics get discussed. And, no, not my judgment: your judgment. There is no... Sorry? There's no AI... It will be a subjective judgment; there is no objective basis by which you can measure the quality of your work in this regard.

So with no further ado, I want to open the floor. One of the things that I'm most concerned about with AIs is how they might limit people's ability to participate in our projects: that we might flag new editors who edit in ways that aren't bad faith, and aren't really damaging if you look at them carefully, but look different from how the people who are currently part of our communities work. And because they're different, they'll get flagged as bad or wrong or damaging, or at least as anomalies, and that might make it hard for them to participate. That's one of the problems that I see, and I think it's something we should be very concerned about with these kinds of strategies, especially around quality control.

Oh, one more thing. I'm not going to hop into IRC, because it's kind of hard to watch the room and watch IRC, so... I'll let you know when there's something from IRC. Oh, you are? Okay, wonderful, I'll lean on Matt. And we can use the etherpad to track the stack if that makes it easier. So, if anybody has something to add on top of that concern: two hands. If you would like to say, "Matt, I want to start a new topic": one hand.

You've been working on this stuff for a while, and you've said we need to take extreme caution to avoid encoding biases like racism. I assume there's stuff you did to actually address that, but what's worked so far, and have you actually caught some of these cases?

So, there are a couple of things that we picked up on. We haven't found racism exactly; that's much harder, because there's no real explicit encoding of race in the wiki, so if we were picking that up, it would be hard for us to notice. But one of the things that we did find was that our vandalism detection models were strongly biased against anonymous editors.
Now, it depends on how you look at the world whether this is a problem or not, because a lot of vandalism does come from anonymous editors. It's not that the model was learning something wrongly: we asked it to use whether the editor is anonymous as one of the features for making a prediction, and it did so effectively, because if you're editing anonymously, you're more likely to be vandalizing than if you're not. But we did figure out that one of our models was using that feature to an extreme extent, in order to maximize the fitness statistic we were evaluating it on, which is the area under the receiver operating characteristic curve (ROC-AUC). And we found that, after switching to a different modeling strategy, we didn't have to give up that core metric, which essentially measures how much of the recent changes feed you can filter out and still catch all the vandalism, but the model wasn't so directly biased against anonymous editors. We still do have a bias against anonymous editors, but it was dramatically lessened by this work.

The way that we discovered this was actually by talking to people about what they thought the predictions were doing. We didn't find it by taking measurements; we decided to take measurements because people told us they thought this was going on. And they were able to do that because we gave them spaces to come together to talk about false positives. So that's worked for us in the past, and we've addressed some problems that way.
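A sketch of the kind of follow-up measurement described here, assuming scikit-learn and entirely hypothetical data: compute the overall ROC-AUC, then compare false-positive rates for anonymous versus registered editors at a fixed review threshold.

```python
# Audit sketch: does the model flag good-faith anonymous edits more often
# than good-faith registered edits? Hypothetical scores and labels.
import numpy as np
from sklearn.metrics import roc_auc_score

y_true  = np.array([0, 0, 1, 0, 1, 0, 0, 1])              # 1 = actually damaging
scores  = np.array([.2, .7, .9, .6, .8, .1, .3, .95])     # model P(damaging)
is_anon = np.array([0, 1, 1, 1, 0, 0, 1, 1], dtype=bool)  # editor anonymity

print("ROC-AUC:", roc_auc_score(y_true, scores))

threshold = 0.5
flagged = scores >= threshold
for group, mask in [("anonymous", is_anon), ("registered", ~is_anon)]:
    good = (y_true == 0) & mask                   # non-damaging edits in group
    fpr = (flagged & good).sum() / max(good.sum(), 1)
    print(f"{group}: false-positive rate = {fpr:.2f}")
```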
Do we have any mics? Ah yes, there's a mic here, but it doesn't look like we can move it. I wonder if that goes to IRC. Yes, people on IRC say that you're fine, but they can't hear what's being said in the room. Yeah, the mic is a stretch for me. Okay, you're going to have to come up, then. Sounds good. Sorry about that, IRC folks; we're going to have people actually walk up to the mic in order to continue the conversation. Adam, since we kind of interrupted you, could you repeat what you said into the microphone? Thank you. Is this a microphone? Is IRC screaming back? Is this a microphone? Out of these speakers? I think it just goes into the stream, so I imagine that one is working.

Okay, sorry, it was just a small note about how to find effects like racism. One way you could build the model is to say: anything that's been reverted, that's our training set, and then we should revert things that match that. But then that will perpetuate whatever biases are in there, which could include ethnocentrism and racism; we don't know what's in there, necessarily. So instead they used a labeling campaign to have people key in whether edits were damaging or bad faith. Those are maybe more discrete things to measure, and using that as the error signal hopefully eliminates some of the other types of bias. And I think we were discussing this yesterday, and you mentioned that you deliberately avoided using a bag-of-words model in some of these classifiers, so that you didn't inadvertently pick up biases against people who use certain types of language.

Hi, this is Dario. Just to make a quick note on the comment you made about racism. I got into an interesting conversation lately, triggered by a series of articles published in The Guardian about racism that is basically embodied in predictive search recommendations, on Google for example. And someone pointed out that even on Wikipedia, which we consider free from these problems, if you start searching for "Islam", the second result that the search engine recommends is the article about ISIS, and that can be perceived by Muslims as insulting and racist. In the same way, we have a prevalence of content that is more oriented towards topics that are of interest to the male population, and if we build recommender systems that are based on existing content, we'll perpetuate and propagate biases that are gender-related. So, first off, I do think there are already issues around racism that we should start thinking about. The other question I have, and it's an open question, is: how do we deal with intrinsic biases that are in the content, if most of our recommender systems and AI are based on Wikipedia content and not something that comes from outside of Wikipedia?

Yeah, hi, this is Stuart Geiger. I think the labeling campaigns, and the distinction between that and something like revert data, are really interesting. I get the sense that if there's a labeling campaign that takes place in a public space, where people can debate it and discuss it and decide, "do we want this particular edit to be in the good-faith or bad-faith, damaging or not-damaging bucket," there can be discussion and negotiation about that. As opposed to reverts: we have a whole lot more data on reverts, and in one sense, if you look at the statistics around it, that can make a more robust model, because there's just a lot more data, but it might actually be more problematic, because you're not able to have that discussion. A lot of things in Wikipedia get written and then reverted and then put back and then reverted again, and that's not exactly the process that I think should stand in for ground truth. I think there are a lot of benefits to doing this in a more wiki-like way, with a discussion-based labeling campaign.
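A sketch of the distinction Adam and Stuart are drawing, with hypothetical data structures: revert-derived labels collapse everything into "was reverted," while a labeling campaign records separate human judgments per edit.

```python
# Hypothetical illustration of the two training-signal strategies discussed.

# Strategy 1: infer labels from revert history. Cheap and plentiful, but it
# bakes in whatever biases drove the reverts (ethnocentrism, newcomer-biting).
revert_labels = {
    987654321: True,   # reverted => treated as "bad"
    987654322: False,  # survived => treated as "good"
}

# Strategy 2: a labeling campaign. Humans answer two separate questions per
# edit, in a public space where the labels themselves can be debated.
campaign_labels = {
    987654321: {"damaging": False, "good_faith": True},
    # ^ a good-faith, non-damaging edit that happened to get reverted --
    #   exactly the case the revert-derived signal gets wrong.
}
```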
So, I understand that I can still be heard from this microphone, so that's why I'm not getting up. I think this has been a very interesting conversation, and I've heard two sorts of principles come out. One is that the people who are affected by a prediction model should have the means to talk about it. So, Dario asked how we deal with this when we only look at Wikipedia content, and I think that's definitely a thing, but part of his point was really also about the people who are affected by this, who might be able to speak to "hey, this seems problematic to me," and we don't necessarily have those types of people involved in on-wiki discussions. And the other thing: this idea that the training data we give to models should be a proper wiki artifact, so that we can employ our wiki critiques and that sort of stuff, seems like a principle that we can draw out and see if it's something we want to adopt and talk about later. I wanted to highlight these two things because they're things that I can go away with later and write about on the wiki when I summarize this discussion and try to drive towards principles. So thank you, that's really cool.

Okay, this is from Nettrom on IRC, by the way: you can have a similar problem in predicting article quality. Contributor experience helps predict quality, but you might end up encoding that higher quality comes from the experienced folks. This is probably why ORES's quality model is the way it is: it's trained to label articles without considering contributor experience.

So, if I might wave the flag of Morten's research — he's Nettrom; it's "Morten" backwards. He really did the preliminary work for the article quality model that we have inside of ORES, and he made this strong case that we should not include a lot of predictive features, like how many edits the article got or how experienced the editors were who were working on it, and should only include features that are actual characteristics of the content of the article, regardless of how it got to that point or who contributed to it. And this has a whole host of benefits that, as he explains, allow us to not bias our assessment of articles around the experience level of the editors or how central they are to the community. And it actually has a lot of other benefits too.
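A minimal sketch of that feature-selection choice, with illustrative feature names (the real model's features live in the revscoring project): everything is computed from the article text itself, and contributor signals are deliberately left out.

```python
import re

def quality_features(wikitext: str) -> dict:
    """Content-only features, in the spirit of Morten's model: nothing about
    who wrote the article or how many edits it took. Illustrative, not the
    model's exact feature list."""
    return {
        "length": len(wikitext),
        "num_references": len(re.findall(r"<ref[ >]", wikitext)),
        "num_headings": len(re.findall(r"(?m)^==+.+?==+\s*$", wikitext)),
        "num_wikilinks": wikitext.count("[["),
        # Deliberately excluded: edit count, editor experience, editor
        # centrality -- including them would bias assessments toward
        # articles written by community insiders.
    }
```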
This is a really interesting example, I think, of choosing to reduce the fitness of a prediction because it meets the goals that we have better. I'm sure that Morten's reviewers pushed on this, saying, "well, we should include all the features that make the prediction better; why would you ever want the prediction not to be better?" But like other subjective things, "better" is subjective, and in a lot of our cases we don't want it to be certain kinds of better; we would like it to be better for our purposes.

If you did include those features, it could also lead to an overfit model: one that fits common cases but fails on notable exceptions. For example, someone might be working on an article for a class paper, so they draft it all in a Word document, spend a month on it, and then post it as one edit. It might be a pretty good article, but if you're using the number of edits as a heuristic, that would fail; it's like an overfit model.

There's also an example, in terms of dangers and past errors, that Aaron's told me about, for whenever you apply a model trained on English data to non-English content. I think it's the Italian "ha": "ha" in Italian is a verb, but in English, if an edit has a lot of "ha ha ha ha", that's vandalism, that's a very strong predictor. But if you take that model from English and you apply it to the Italian Wikipedia, it's all of a sudden finding that a whole lot of vandalism is taking place when it's just people putting the word "to have" into articles.

It turns out that I had the slides for this part of another talk handy, so I thought I'd explain this real quick. So yeah, the Italian "ha", which is literally not a laughing matter. This actually came up in one of the central spaces that we've put together; actually, this one was put together by Italian Wikipedians who were working with us on ORES. They didn't come to our central one on Meta; they made their own. As they were looking at the false positives of the prediction model, one of the trends that they saw was this "ha": edits that added "ha" were picked up as likely to be vandalism or damaging, or needing to be reverted. And you can see Rotpunkt ask, "why is this happening?"

So, it turns out that we use two different word lists in these prediction models: informals and bad words. Bad words are things like curses and racial slurs, and informals are things like "haha" and "hello", things that you might use in casual conversation but that usually don't belong in an article, unless you're quoting somebody or it's about a song or a poem or something like that. Oh yeah, there's my explanation of the two. The usual examples that I use for informals are "hello" and "hahaha". We catch these with regular expressions, and so here's the regular expression that catches "hahaha" and all variants of it, and the test that we have to make sure we are in fact catching those things.

It turns out that people vandalize in the English language all over the place, so we usually include features about English-language bad words and informals in all of our other models. And, as Stuart said, "ha" is laughing in English, but "ha" is not laughing in Italian. So we cleaned that up: we removed it from the set of informal words that we apply in our Italian Wikipedia models, and it made a substantial difference. I love this a little bit, because this is the stuff that makes it worth it, you know: "nice job, thanks" from Italian Wikipedia.
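A sketch of that fix, with illustrative patterns (the real per-language lists live in ORES's revscoring language assets): the English-derived matcher flags a bare "ha", while the corrected Italian list requires actual repeated laughter, so the plain verb never matches.

```python
import re

# Illustrative informal-word patterns; the real lists live in ORES's
# revscoring per-language assets.
INFORMALS_ENGLISH_DERIVED = [
    re.compile(r"\b(?:ha)+\b", re.IGNORECASE),   # matches "ha", "haha", ...
    re.compile(r"\bhello+\b", re.IGNORECASE),
]

# The bug: on Italian Wikipedia the first pattern flags "ha", the
# third-person singular of "avere" (to have). The fix: require repetition.
INFORMALS_ITALIAN = [
    re.compile(r"\b(?:ah|ha){2,}\b", re.IGNORECASE),  # "haha", "ahah", ...
]

for text in ["Lei ha scritto un libro", "ahahah che ridere"]:
    flagged = any(p.search(text) for p in INFORMALS_ITALIAN)
    print(f"{text!r} -> {'informal' if flagged else 'ok'}")
```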
Yeah, taking that example: how important do you think it is to provide tools for the community to do this kind of debugging? Because we've been talking about ways to complain, to say "okay, this is failing." But at the end of the day, that's a hypothesis that I have: "okay, my edit was good, and I guess that maybe it's because I added this 'ha' thing." What should I do? Should I create more edits with "ha" to check if those get reverted? Do I have a tool to evaluate what would happen, what the artificial intelligence would do? What kinds of tools do you imagine could help in that process?

That's a really good question. It's been a big part of my thinking about the ORES system, and I think it should be a big part of any AI-type system like this, this sort of black-box subjective-judgment system. There are sort of two sides to this question. One is that it's kind of hard to even get to the place on Italian Wikipedia where you might be able to file a false positive, and see what other false positives people have filed, to even develop the hypothesis that this has something to do with "ha". That's kind of problematic, and I think there's a lot that we can do there. I think we're about ready to start designing things that will help people gather false positives in a more direct way, so that we can include those assertions: this is a false positive, this is a false negative, there's something wrong with the prediction itself. You could imagine having a tool say, "well, I predicted that this was bad, but somebody else came in and flagged that it's actually good," and you could use that in your judgment about what's going on. So anyway, I think we're about ready to start designing something like that.

But the other part of it was: what about what Aaron did in this situation — I'm Aaron — couldn't the editor do that themselves? And I think we have no idea how to design something like that yet. There are a lot of technical challenges, and there are definitely some "how would people interact with it" kinds of challenges. But I think that in working in this kind of dynamic, where we're iterating aggressively with the users of the system that we have, once the users have better options for recording and identifying trends, they're going to start talking to us about what they want to do next with those trends, and then we'll be ready to start designing. So I think that we will be, soon; there's a little bit of preliminary work left. A big part of working on these projects, at least for me, is figuring out how to get to those points — the holy grail of machine learning in our social spaces, where our community has control over it, it works really well, and we have social infrastructure where it ought to be and technical infrastructure where it ought to be. And if I may continue: I think that's another sort of principle that I'd like to argue for, that we don't know what kinds of problems will arise here, and so we have to use strategies that make it so that problems we didn't expect can be brought to the surface.

I don't want to get too detailed, but in terms of the case of anonymous users, whether they're just new or they're editing articles about, say, some small villages in Pakistan, making new ones: if you're trying to avoid bias against them, what do you do? For example, when I see a string of edits from an IP, there's a certain point where it's a different user, because the IP gets rotated out. How long does an IP tend to last, in your experience, and how do you detect that and try to avoid bias? And how much of that is trying to adjust up front in your modeling, versus having a way for them to dig out of it quickly?

I think this brings up some really interesting questions about what we want to model and how we're okay with getting at certain information about a user. One thing I should say right away is that we don't use a user's history in figuring out whether they're vandalizing in a particular edit, and that's kind of a big loss: it's a big opportunity for signal. I think anybody who does patrolling work will say that the history matters, and it can really help you understand things, and humans are a lot smarter than ORES; if a human needs it, ORES definitely needs it. So that's in our future work. But it brings up this question: how do we feel about modeling an editor, as opposed to just their edit? Is there something problematic about flagging an editor as good or bad, as opposed to the work that they did as good or bad? I'm not quite sure that I have the answer to that question, but I do know that we can model it with high fitness — I've published about it in the past — it's just not something that ORES does right now.
But I think the question that I want to turn around is: how do we have a discussion about how we feel about doing that kind of modeling, as opposed to just modeling the edits?

Yeah, one quick related thing: this whole conversation kind of reminds me of all the heuristics used for promoting a user to the class of reviewer, or whatever, on the wikis that have flagged revisions, like German Wikipedia and Russian and so forth. Part of that includes things like your revert ratio, how new you are, how many of your edits get reviewed. So, again, if you were editing those Pakistani villages, or contentious articles, even though you weren't vandalizing anything, you might have gotten more reverts, and then you have to get manually promoted, if you're on the threshold or something. So that's kind of what you're saying: it's not meant to be a judgment of the editor, but effectively it kind of is, a sort of algorithmic "you're a known good editor." It's not completely algorithmic — which could be tyrannical or amplify biases, or it could be fine.

Yeah, it's kind of scary. I think we have to ask the question about exploring and investing in things that seem like they have a clear path toward people not thinking carefully about the people that they're working with anymore. If we label people, then you might feel like you don't have to spend so much time critically considering their edit history when you're considering promoting them, or something like that. My bias is that we should experiment with it and see what happens, and have follow-up discussions to decide whether we want to continue doing it. But maybe we also want to have discussions that decide whether we're even interested in that experiment or not. I'm not quite sure how to have those yet; it just seems like we ought to.

Gotcha. And, for the stream: you probably didn't hear Aaron, but he said that's essentially what they're doing on these wikis — using these heuristic approaches while they're active — but whether they have biases or not, nobody's studying that.

A couple of things from IRC, following up on what you said about grading an editor's history — "okay, this is good faith or bad faith" — and how you might take that too seriously and not review it for yourself. There's some similar discussion on IRC. DarTar: "I think there's a higher risk of people blindly trusting the output of black-box machine learning tools," and this is in the context of comparing tools like ORES to tools like AbuseFilter. So, basically: even if people accept that AbuseFilter is flawed, they might incorrectly think machine learning is perfect, when obviously nothing's perfect. And BSVB said, also somewhat related, that the main differences might be more unexpected biases, and it being harder to spot them. And: "I've also seen AbuseFilter reject external links added by a librarian as part of #1Lib1Ref."

I have an idea of something that might ameliorate these things: whenever you show a prediction about an editor, or some type of classification that has to do with a real human or a real human's actions, you should also provide some real data about that person. For example, let's say that we were flagging an editor who tends to get reverted a lot; we could also provide a sample of the kind of edits that they get reverted for.
That would make it easier to at least take the next little step, to look into why this classification might have happened. Maybe we're okay with experimenting with something if it includes those real statistics or real examples, but we're not okay with experimenting if it's just the prediction.

I would kind of like to step... does it work? Oh yeah, that's not a speaker. I would like to take a step back and look at what we are doing with these tools. One way to see them is as a force multiplier: we just give editors the tools to do the work they've been doing, a bit faster, more effectively, et cetera. And what we're doing now is saying, "oh, but they might be discriminating against IP users or against new users." We are kind of injecting our own values: this part is okay, we want new users, so yeah, it makes sense not to discriminate against them; anonymous users, questionable. I personally think it's not a case of discrimination, as you would say with racial discrimination, because everybody is free to get an account. And it's also a trade-off with the community: if you're an anonymous user, you give the community less data to judge you by, right? And again, it's great that we have that, I'm all for having it, but it also comes with some disadvantages; you make that choice, right?

So what's happening here? We in this room — there are not a lot of vandal fighters and patrollers here, mostly developers, or people on the technical side — we're the people who hold this power, right? Because if you're not building this tool, it's not going to get built. And so we kind of have this opportunity to tell them: "yeah, we'll force-multiply this kind of thing, but we don't like this other kind of thing you're doing." We, on the technical side, have that leverage. And it also speaks to the relationship between the Foundation and the community, right? Because mostly we've been saying we are hands-off, we let the community edit, no editorial influence on decisions. But with ORES and such, we kind of get into a region where we're steering this a little, right? We're saying, "oh, you, community, maybe you're too newbie-unfriendly, and now we have this lever to nudge you a bit in the other direction, because you shouldn't weigh these criteria so much when you're judging edits." I'm not entirely saying it's a bad thing, but I want us to be really conscious of what's happening here, and what it really means for the relationship with the community, and also the value judgments we're making. We're saying new editors are good; I'm totally on board with that. We're saying anonymous editors should be treated exactly the same; I personally am not sure I agree with that. So it's about interests and values. And let's not forget the main outcome is readers, right? Readers want non-vandalized articles, and that's why we support the work of editors. That's something we should also keep in mind: not just the right to edit, but also the right to read.

I'm not too afraid of AI per se, given that we continue, as has already been said, to keep humans in the loop. My real question is: how much of it, and for what value? If the value brought by AI is so wildly important in comparison to what can't be done without it — yes, there will be mistakes, yes, there will probably even be bad mistakes, and that's not cool, that's true — but if we can't do without AI, there is not even a question.

You could say: well, how many labor hours of a patroller is a false positive worth?
You know, if we can reduce false positives but it requires us to increase the labor of each patroller by another half an hour a day, then maybe that's okay; but if it's five hours a day, maybe that's not okay. There's a sort of economics that we can do with this.

That exact problem is behind issues around racial profiling: if you want to increase the rate at which you find crime in some areas, just target some demographics, and you'll make your system more "efficient." So I guess this consideration applies completely to what we're discussing here. One really good point that was made on IRC is the fact that we lack demographic information almost completely, and I don't know how we can handle these complex issues around discrimination while lacking almost all demographic data.

So, one thing that I think ties a few of our conversations together is this notion of protected classes, which is part of US law about hiring and that sort of thing. There are formally defined protected classes of individuals, and we essentially do statistical analyses on hiring approaches and filtering strategies — things that decide whether somebody is eligible for a program — to make sure that the filtering strategy is not substantially biased against a protected class. I think this comes back to what Tilman was saying, because it sounded like anonymous editors aren't really a protected class: they have the opportunity to register an account, in which case they'll have a name and will no longer be anonymous. And it's arguably more private to edit under a pseudonym with a registered account, so in a lot of ways it might make sense for them to do that. They have the choice of what class they appear in. Whereas somebody with a particular skin color or gender representation doesn't get to choose, and so maybe it makes more sense to consider those things protected classes.

But I think the cool thing that Dario just brought up, and that we've been pushing on, is this idea of how we balance benefit to one group against effects on another group. In the case of protected classes, US law says: we're drawing a hard line here; you just can't, no matter how much more efficient it is. But for non-protected classes, we're allowed to make that trade-off and see how we feel, or what's important for hiring in your business or whatever. So I think it would be very interesting for us — for our communities — to decide: which classes do we want to consider protected, where we say that no trade-off is good enough? And which classes, or cross-sections of users or edits or articles, are not protected, where it's okay to have a slightly higher false positive rate so long as it balances against the utility of the system? Right now, I think we're just on the cusp of being able to have that conversation.
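A sketch of the statistical test alluded to here, with hypothetical numbers: US employment practice conventionally applies a "four-fifths rule," comparing selection rates (here, flag-for-review rates) across classes and treating a ratio below 0.8 as evidence of disparate impact.

```python
# Disparate-impact check in the style of the US "four-fifths rule".
# Numbers are hypothetical; classes could be e.g. anonymous vs. registered.
def selection_rate(flagged: int, total: int) -> float:
    return flagged / total

rate_a = selection_rate(flagged=120, total=1000)  # class A flag rate: 12%
rate_b = selection_rate(flagged=300, total=1000)  # class B flag rate: 30%

ratio = min(rate_a, rate_b) / max(rate_a, rate_b)
print(f"impact ratio = {ratio:.2f}")
if ratio < 0.8:  # the conventional four-fifths threshold
    print("flagging rates differ enough to warrant scrutiny")
```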
Just from a general perspective, it seems like what's going on here is that AI is able to freeze a moment of the social — of what people do together, right? And so that doesn't include what's inherent in a social system, which is social change, be it positive or negative. That's just trying to frame it in a general way. A slightly more specific question would be: how could you incorporate ideals for change into a system like this? And then the whole issue is also: what are you missing as a social group? Which was also mentioned.

Also, could I ask: maybe when people come to the mic — maybe it's only me, but I was having trouble hearing and understanding what people were saying — so maybe we could speak louder, not only into the mic, if that's okay? Thanks.

So, I think there are two ways that I interpret that. One is that any prediction should account for change in the dynamics around that prediction. Say, for example, on English Wikipedia in 2006, we decided we were going to delete a lot more pages; a prediction model trained before that point, on what kinds of pages we delete, would suddenly be wrong, because we changed our practices towards deleting more stuff. But the other side of it is that there are things we can do to look towards the future with the modeling that we decide to do. One of the ways that I've tried to do this with ORES: the models that were dominant before ORES focused on vandalism and not-vandalism, which kind of stretches broadly to damaging edit and not-damaging edit. I like that distinction, because there are a lot of edits that are damaging but aren't intentionally damaging, and therefore they're not vandalism: you have to have bad-faith intent for it to actually be vandalism. We had a lot of good-faith newcomers who were getting caught up in this damaging-versus-not-damaging dichotomy, and so I modeled good faith in the system, in order to see if people want to take advantage of that to push forward social change. So, coming back to the two sides of this: one, our models should reflect how we behave today and how we behave tomorrow — they should reflect our values as our values change. But also, there's an interesting idea in how we can use certain modeling strategies to, if not push forward an agenda, make it easier for others to adopt an agenda: to open certain doors and not others; doors that allow us to find good-faith newcomers, but not doors that help us find more anonymous editors to revert.

I think actually maybe we could do this... there you go. So, I really like that point; it matches how I see a lot of these algorithmic and AI systems: as freezing a certain set of assumptions at a certain point in time. But I want to say something around the protected-class issue, and around the argument that people can register and then they'll sort of be in the community and start to build a record, and that sort of thing. I encounter a lot of people who are academics who, when I talk about Wikipedia, say "oh yeah, I edit," and I ask, "what's your username?" and they say, "I don't have an account." And they've edited for years, but they fix typos and they add a sentence every now and then. These are very casual contributors, maybe not even an edit a month, but they're adding a particular kind of value, and they have been for a long time, and they're largely lurkers. It makes me wonder what would be a socially good thing for them to do. A lot of times the conversation goes: "I made an edit, I made a contribution, and a lot of times it gets reverted." I'll suggest to them to make an account, and that idea is a very new idea for them; they had no idea that having an account versus not having an account meant anything on Wikipedia.
So it makes me wonder, about these decisions and thinking about protected classes and bias: if this is going to be a feature that we use prediction on, how do we communicate that in a way that's open and transparent, and lets people make those decisions?

Something that we probably wouldn't want to do, but that's sort of a fun idea anyway, is to have ClueBot, whenever it reverts an anonymous editor, include as part of that message: "if you think this was a false positive, that might be true; in fact, it's more likely to be true because you're not registered, and if you would register an account, then I, as a robot, might judge your edits more fairly." We might not want to tell that to anonymous editors, because when they do vandalize and ClueBot does catch them, that's great, and we'd sort of be saying "here's how to circumvent our AI bots." But on the other hand, we would be empowering those people who were caught as false positives. Maybe we could have our bots say that only when there's some higher likelihood that this could be a false positive. It's not a high likelihood if you're adding curse words and racial slurs, but maybe it is if you added a particularly long token that doesn't seem like it normally belongs in an article, or if the prediction probability is just a little bit lower than it might otherwise be.

Yeah, I feel like one way in which AI can become more empowering is if it becomes less of a black box, showing more of what's going on inside those algorithmic decisions. For example, to take that idea one step further, ClueBot could actually show the details of how the person's edit scored, and why it was close to, but not quite above, the threshold for not being reverted. Are there some metrics that let us see that?

So, there's some research that I'm aware of on explanations for these types of predictions, where you have a totally black-box model; neural networks are particularly problematic for making it very difficult to get at any meaningful account of what was used to make the prediction. However, we don't use any neural networks in ORES. It turns out that you can get pretty high fitness without investing in a neural network, with tree-based strategies, and those tree-based strategies have weights on individual features that have meaning: they were engineered, like the number of characters added, the number of bad words added, that sort of stuff. It's hard to say how much weight each individual feature had, but you can take the same observation and send it back to ORES with minor changes, to see how ORES's prediction changes. In fact, this was one of the features requested by Sage Ross as he was working on using our article quality model to help students editing articles on Wikipedia know when their articles were ready to move to the main namespace. He wanted to be able to ask ORES: what if we added one more header, or a few more references, or another paragraph, or some images — what would your prediction be then? And then you can turn around to the editor and say: I'm not sure, it's a machine making a prediction, but we think this is the next best thing you can do here. You can imagine doing that for vandalism, too, and saying: "hey, we flagged this as vandalism, we auto-reverted it, we don't want you to keep vandalizing Wikipedia; but if you hadn't done this, we wouldn't have thought that it was so vandalism-y."
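A sketch of that "what if" querying against an ORES-like scoring endpoint. ORES supports injecting feature values into a scoring request; the endpoint shape below follows the ores.wikimedia.org v3 scores API, but the specific feature name is an illustrative assumption, not necessarily the exact identifier.

```python
import requests

# Counterfactual scoring against an ORES-like service: score a revision as-is,
# then re-score it with an injected feature value ("what if it had more refs?").
BASE = "https://ores.wikimedia.org/v3/scores/enwiki/"

def score_wp10(rev_id, feature_overrides=None):
    params = {"models": "wp10", "revids": rev_id}
    if feature_overrides:
        params["features"] = "true"  # ask the service to accept injected features
        params.update(feature_overrides)
    return requests.get(BASE, params=params).json()

# Baseline prediction for a draft...
baseline = score_wp10(123456789)
# ...versus "what if this draft had ten <ref> tags?" (hypothetical feature name)
what_if = score_wp10(123456789, {"feature.wikitext.revision.ref_tags": 10})
```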
I think this is scary, though, because we can train vandals to circumvent the system. But we come back, again, to the fact that helping people understand is also training people to circumvent. I'm not sure that we can have both, but we may decide which one we want, and how worried we are about circumvention. For what it's worth, vandals generally aren't very clever.

Yeah, so, like you're saying, there's always a trade-off between giving people more information and people using it to vandalize more. But for something like ClueBot, which I think does use a neural network, it might be interesting to break the change into logical segments, just like a diff is, and then do automated counterfactuals on each of them, and see how the score changes: which part changed it the most? So you could say: if you hadn't added this line here, or if you hadn't deleted this whole paragraph, that was the largest contribution to the confidence being high enough to trigger the revert. You might be able to give users some hints as to why their edit was reverted when they don't think it should have been. That might be interesting, since ClueBot does a large portion of the reverting.

I was just thinking about what came up about whether we're training the vandals to be better vandals. A way to look at it could be that we are training editors; we're sort of using it as a teachable moment. A silly example, not real: if the problem is that this is a racist edit, and you get a message that says "this edit would have been treated a lot better if it wasn't so racist," then that potentially teaches the still-racist editor to make their racism more secret and hidden, rather than to make good edits. So I think you'd have to pick, possibly hand-pick, what makes a good teachable moment, rather than "here's the way to game this." Saying "you made three edits in the last minute, and that makes us flag it as vandalism" — there's nothing teachable there. But if there's actually some kind of content or words or behavior involved, it's an opportunity.

So, we've been talking a lot about vandalism prediction, and I think that's one of the things that we should be particularly concerned about when it comes to AIs, but I thought I would offer to the conversation that there are other, very different types of AIs that we might be concerned about as well. Has everyone heard about the idea of the filter bubble? The general idea is that if you have a recommender system, or a filter system, that recommends stuff to you that you would like, then you can get into a bubble where you're only shown things that you like. Sounds really great when you're watching movies; not very great when you're reading political opinion. We might, for example, only recommend certain types of articles for editors to edit in an article recommender system, and exclude others. We could perpetuate biases by recommending that people edit the things that they've already been editing. A collaborative filtering system, one that recommends you edit the articles that other people who edit the same articles as you have also edited, just couldn't recommend articles that don't get many edits, because no one's really editing those. So if we have a coverage bias now, we get new coverage biases. It's a completely different problem space, but it still causes similar problems for Wikimedia as a whole.
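A minimal sketch of why an item-based collaborative filter behaves that way, with hypothetical edit histories: every recommendation comes from co-editing counts, so an article nobody has edited can never surface, and existing coverage gaps persist.

```python
from collections import Counter

# Hypothetical edit histories: editor -> set of articles they've edited.
histories = {
    "alice": {"Paris", "Berlin", "Lyon"},
    "bob":   {"Paris", "Berlin", "Rome"},
    "carol": {"Berlin", "Rome"},
}

def recommend(editor, histories, n=3):
    """Item recommendations from co-editing overlap. Note what this can
    NEVER do: recommend an article with no edit history at all."""
    mine = histories[editor]
    scores = Counter()
    for other, theirs in histories.items():
        if other != editor and mine & theirs:        # overlapping interests
            for article in theirs - mine:
                scores[article] += 1
    return [article for article, _ in scores.most_common(n)]

print(recommend("alice", histories))  # ['Rome'] -- never an unedited village
```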
Sorry, a late response regarding vandals not being very smart. dbb on IRC says: not true, it depends on the motivation and skill level of the vandal; see the attacks on our wiki for an example — we had high-skill, high-motivation attacks on our beloved Wikimedia instance.

Another thing I have to say about vandals being "stupid": often, when I look at my watchlist, I'll see somebody change, say, a birth date in an infobox from 1987 to 1787, and that's not always easy to catch. I'm curious: if you want to catch that, and you don't want to rely on the editor's past history, do you check against Wikidata? What do you do? It would really help to use the editor's past history, but if you don't want to do that — and ORES seems not to do that; I don't know if that's a permanent decision or just how it is right now — how would you go about it?

So, right now, ORES will tend to flag edits like that, especially by editors who are anonymous or who haven't been around for very long, because essentially we base the thresholds that we set on an evaluation metric that I call "filter rate at recall." Essentially you say: I want to catch 90% of the vandalism; how much of the recent changes feed can you filter out? And if, in order to catch 90% of the vandalism, you have to review every minor edit to a number, then it's going to flag every minor edit to a number for review. It's not really saying that it's damaging; it's saying that it's not confident enough that it's not damaging. That's how we deal with it right now. But to your second question, whether this is a long-term decision: I think that looking at histories and pulling in other information from other spaces is a great way to improve the fitness of this stuff. If somebody changes a number to something that's the exact same as on Wikidata, it would be really great if we didn't bother a patroller about it. We just don't do it yet; it's a matter of performance issues and complexity on top of the system that we have in place.
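A sketch of that metric, with hypothetical scores and labels: fix the recall you demand on known vandalism, find the corresponding score threshold, and report how much of the feed falls below it and can safely be skipped.

```python
import numpy as np

def filter_rate_at_recall(scores, is_damaging, recall=0.90):
    """Find the threshold that still catches `recall` of labeled vandalism,
    then report what fraction of all edits scores below it (skippable)."""
    scores = np.asarray(scores, dtype=float)
    is_damaging = np.asarray(is_damaging, dtype=bool)
    damaging_scores = np.sort(scores[is_damaging])
    # Lowest threshold that keeps `recall` of damaging edits at or above it.
    k = int(np.floor((1 - recall) * len(damaging_scores)))
    threshold = damaging_scores[k]
    return (scores < threshold).mean(), threshold

# Hypothetical model scores for a day's feed (1 = labeled damaging).
scores      = [.02, .9, .1, .05, .8, .3, .95, .07, .6, .04]
is_damaging = [0,   1,  0,  0,   1,  0,  1,   0,   1,  0]
rate, thr = filter_rate_at_recall(scores, is_damaging, recall=0.75)
print(f"skip {rate:.0%} of the feed at threshold {thr}")
```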
Yeah, it seems, from my experience in quality control and vandal fighting, that there are definitely the obvious, blatant, "stupid" vandals, and then there's another kind — I don't even know whether to call it vandalism — the malicious, subtle kind, which I feel is more dangerous. It might be worth thinking about the people who — and I've actually gotten to talk to some of these people in interviews — transpose numbers, like 1957 to 1975, with a kind of "I want to take down Wikipedia, I want to test its resilience" attitude. It makes me wonder about thinking through use cases for these models, because a lot of this conversation operates at a general level, and that subtle kind might be something that can't be caught with a machine learning classifier. Or maybe it could play a little bit of a role, but that's going to be a different sort of work, happening a lot farther down the chain than recent changes patrol.

In my research and in the development of ORES — what features it uses, how it makes predictions, that sort of stuff — we've generally made the assumption that no automated machine learning strategy is going to catch subtle vandalism, at least the really subtle vandalism. Happily, as I said — and really what I should have said — almost all vandals are totally unoriginal in how they vandalize, and that means we can get a high filter rate by addressing those unoriginal vandals. I think we're generally going to have to rely on humans to catch the subtle vandalism, and really, I think watchlists are the best strategy we have for that: people who are familiar with the subject space, who can know that that's obviously the wrong date, that this really didn't happen, that that's a dubious assertion. Whereas most recent changes patrollers are reviewing edits to all articles, ever, and it would be very hard even for a human in that position to catch the subtle vandalism.

Sorry, it's looking like we're just about out of time, so I think this would be a good time to wrap up. I don't want to cut anyone off, though; maybe we should have one more quick call for people who haven't had a chance to talk.

My plan for this discussion, the notes that have been taken on the etherpad, and the discussion in the IRC channel is to bring them to a research page on Meta, where I'll summarize all of this. It's linked in the etherpad. If you go there now, there's nothing there — pretty sure there's nothing there — but that's where I plan to create it, and sometime in the next couple of weeks there will actually be an article there that you can go check out. I think you can actually watch a page before it's created, right? It's based on the page title. So if you want to watch that page now, I'll make sure that, even if I end up renaming the article as I'm writing it, it pings there, so that you can see that it was created. You can make any amendments, you can expand on something that I summarize, you can correct the mistakes that I make; you'd be very welcome to do that. I'm hoping to continue this discussion. I'm not sure quite what form it will take next, but I'm thinking that a summary of what we talked about today will be good fodder for taking the next steps towards best practices and policies and that sort of stuff. So thank you.