Hello everyone, and welcome to the July 2016 Metrics and Activities meeting. Thank you all for being here in person and on the stream. First, if you haven't met me, I'm Chuck Rosloff, one of the lawyers here; you can say hi afterwards. This is the agenda for today's meeting: we'll start with the welcome, then we'll have a community update, metrics, a feature, research, a product demo, and time for Q&A at the end. First, I want to welcome our new hires and our new contractors, interns, and volunteers. We have Gretchen, who's a new hire, and Daniella, Helen, and Blanca, contractors and volunteers. Yes, give them a round of applause. I'd also like to acknowledge our anniversaries. Celebrating one year here at the Wikimedia Foundation, we have Katie, Trey, Cherie, Mikhail, and Emerald. Celebrating two years, we have Kristin and Josephine. Celebrating three years, we have C. Scott, Dennis, Nick, Ty, and Brian. Celebrating four years, we have Lynette, and Michael, Jeff, Nicholas, and Tillman have all been with us for five years. Arthur has been with us for six years, and finally, Erin has been here for eight whole years. Give them all a round of applause. Next, we have the community update. Hi, my name is Maria Cruz. I'm communications coordinator in the Community Engagement department. For this metrics meeting, I wanted to share a different kind of story, since we're doing retrospectives on the past fiscal year elsewhere: I wanted to talk about how we consulted with communities in the past few months. In the last eight months, we consulted with communities many times. Edward has been tracking this on the calendar on officewiki and on Meta, and we have tracked 42 of these consultations. This is an important clarification, because some teams, for instance in the Community Engagement department, actually have dozens of conversations every quarter, so this is only a sample. This is how consultations were distributed in the past eight months. Some were lengthier than others: as a reference, that is the harassment consultation and the harassment workshop, and this up here is the Community Wishlist vote. The biggest circle represents 45 days, and the smallest one two weeks or less. The majority were conversation consultations, and they are also the ones that take the longest. Others were surveys, which tend to be shorter. Nine were events, and one was a community vote. All together, they form a very colorful consultation map. So why is this important? Conversations are important because they help us make better projects: projects that volunteers can relate to, own, and stay engaged with. They also help make a better foundation for the movement. Part of the reason I wanted to share this and make these visualizations is to pass on how much time and effort these consultations take. Why do we spend so much time? Because, as we have seen in the last few months, listening to communities matters, and we listen in order to make decisions as an organization that improve the overall Wikimedia experience. We can also improve how well we listen to communities. So I want to introduce a new project called Community Engagement Insights. This project uses surveys to listen to Wikimedia communities. There are two surveys this year.
One larger survey in December, and a smaller survey happening in April and May. These surveys are going to five audiences: very active, involved editors; active editors; Wikimedia affiliates and program leaders; volunteer developers and technical collaborators; and external movement partners, such as Wikipedia Zero partners, researchers, or external organizations. So if your team needs data from these communities, there is a form, it's not quite a letter of intent, but an intent form, due next Friday. Then we will spend the month of August developing and working on questions; I'm going to guide people through that process. But talk with your team about whether this is important for you, whether your team needs data from communities. This is also a very collaborative process. So again: intent form due next Friday, and I'm going to be sending emails about that. Looking forward to collaborating with everyone on this project. Thank you. Hello everybody, I am Neil Quinn from the Editing department. Before I start, I should say I know the photo of Neil Quinn on the staff page does not have a beard. But I have checked: we are the same person, so there is absolutely no need to be alarmed. But I'm not just here to talk about my facial hair. I'm also here to talk about our high-level editing metrics. This is, of course, the main big editing metric we always come back to: global active editors across all our projects. The main thing you can see here is that over the time period of this graph, which is six years, quite a long time period, we see a slight decline. There are two things to say about this. The first is that Wikipedia and the Wikimedia projects are not dying. That's a narrative you hear a lot, particularly in the media, not so much within the movement, but it is not true. There's a long version of this rebuttal, but the short version is right here: this is not what imminent death looks like. At the same time, it's not what success looks like either, because as we know, there is still a huge amount to do within our mission, and there's no intrinsic reason why the size of our movement should be right at 80,000. Another thing to keep in mind is that during the time period of this graph, the global population of internet users increased by 75%, and that definitely didn't happen here. But all that said, in the short term, you don't see a lot of variation in this graph. So on the whole, active editors is fairly stable. But there are some interesting things going on when you dig into it. One of them you see when you split active editors into new active editors, people who are active in their first month after registering, that's the area in red, and everybody else, in blue. The thing to note here is that new active editors only make up about 20% of our total active editors, but over the time period of this graph, they account for about half the overall decline in active editors. So there's definitely something interesting there. Also, when you pull out new active editors by itself, you see that it's a lot more volatile than active editors overall; there's a lot more movement from month to month.
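(As a rough illustration of how these two metrics can be computed, here is a minimal pandas sketch. The input table, its column names, and the five-edits-per-month activity threshold are illustrative assumptions, not the Foundation's actual pipeline.)

```python
import pandas as pd

# Illustrative monthly per-user edit counts; the schema is assumed, not
# real. Anonymous (logged-out) edits carry no user_id, so they are
# invisible to this metric -- the limitation discussed later in this talk.
edits = pd.DataFrame({
    "month": pd.to_datetime(["2016-06-01", "2016-06-01", "2016-06-01"]),
    "user_id": [101, 102, 103],
    "registered": pd.to_datetime(["2016-06-15", "2014-02-03", "2016-05-20"]),
    "edit_count": [7, 12, 3],
})

# "Active editor" here means a registered user with at least five edits
# in the month (a common working definition).
active = edits[edits["edit_count"] >= 5]
active_editors = active.groupby("month")["user_id"].nunique()

# "New active editor": active in the same month in which they registered.
is_new = active["registered"].dt.to_period("M") == active["month"].dt.to_period("M")
new_active_editors = active[is_new].groupby("month")["user_id"].nunique()

print(active_editors, new_active_editors, sep="\n")
```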
In particular, if you look at 2015, you see that starting in about March 2015, which is where the black arrow is, and going through the end of the year, there's a pretty dramatic decline. To put it in context, that decline kind of takes us back to the level we saw in 2013. But still, it's a pretty dramatic thing, and it would be very useful to know what was driving it. I don't know the full explanation, but I do know a significant part of it, and it has to do with mobile editors. So this is a graph of mobile edits, not mobile editors, broken down by whether the edits were made through a registered user account, in blue, or anonymously, in red. You can see that from when mobile editing really got started in mid-2013 through early 2015, basically all the edits were made through registered accounts, because we really didn't give people the option to edit anonymously before then. That changed in April 2015, when we turned on anonymous editing pretty much everywhere on the mobile web. Pretty much immediately, we see anonymous edits skyrocket, while registered edits stayed pretty much flat. From the perspective of the users and the product, this isn't necessarily a bad thing, because it seems people were editing anonymously when they otherwise would have edited while registered, but they were still making the edits they wanted to make. And in doing so, they were avoiding the hassle of registering for an account and logging in, which can be particularly intense on mobile devices. So from that perspective, it's not necessarily a terrible thing. But it did make a really big impact on our statistics, because a lot of our metrics, like active editors, ignore anonymous edits completely; with anonymous edits, there's no way to say how many editors were behind them. So for example, this is what new active editors looks like when you break it down into editors who edit pretty much exclusively from mobile devices, the area in red, and everybody else, in blue. You can see that even back in 2013 and 2014, when mobile editing was quite new, a significant proportion of our new active editors were coming from mobile devices, and that proportion increased until about April 2015, which is where the black arrow is, when it starts to decline pretty significantly. Now, this is not just because of our missing mobile editors, the ones who are now editing anonymously, because you can see that the non-mobile new active editors also declined in the same time period. But I did a very rough estimate of the number of editors we might have expected to see if they hadn't switched to editing anonymously, and you can see that it does make the decline significantly more shallow.
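(Neil explains the assumption behind this rough estimate in the Q&A later: treat the mobile share of active editors as frozen at its level just before anonymous mobile editing launched, and read the shortfall as "missing" editors. A minimal sketch, with made-up numbers rather than the real series:)

```python
import pandas as pd

# Made-up monthly counts of active editors, for illustration only.
months = [f"2015-{m:02d}" for m in range(1, 9)]
desktop = pd.Series([70000, 69500, 70200, 69800, 69000, 68500, 68800, 68200], index=months)
mobile = pd.Series([5200, 5300, 5400, 4600, 4200, 4000, 3900, 3800], index=months)

# Assume mobile active editors would have kept their March 2015 share of
# desktop active editors (just before anonymous mobile editing launched).
baseline_ratio = mobile["2015-03"] / desktop["2015-03"]

# The shortfall against that expectation is a very rough estimate of the
# editors now invisible to the metric because they edit anonymously.
expected = desktop.loc["2015-04":] * baseline_ratio
missing = (expected - mobile.loc["2015-04":]).round().astype(int).clip(lower=0)
print(missing)
```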
This is the same graph, but for all active editors, not just new active editors. You can see that mobile editors are a significantly smaller proportion here. But still, even among our roughly 75,000 active editors, mobile active editors and these missing mobile editors are a noticeable proportion. So there are two main things to take away from this. The first is that a lot of our editing metrics ignore anonymous editors completely, and that really affects how we understand events like this one. So it's definitely worth looking at whether there are alternative metrics that better account for the impact anonymous editors have. The second thing to keep in mind is that even among our huge numbers of active editors, there are thousands of editors who pretty much edit exclusively from mobile devices. There are a lot of interesting questions there. What kinds of contributions are they making, and at what quality? And in particular, could they make more contributions, and higher-quality contributions, if they had better editing tools available to them? That's also something we hope to investigate. Thanks. And now, Maggie, for the feature update. Hi, I'm Maggie. I am today's feature. If we can move to the next slide, please. There we go. Okay. I am the director of Support and Safety and Programs, and the interim senior director of Community Engagement. I'm here to talk about community culture, specifically harassment and healthy environments, on the internet and on our projects. Next slide. Harassment is not a Wikimedia-only problem. Almost half of internet users have been harassed, and for women it's nearly three quarters. It's a serious problem that causes lasting distress and reduces participation in sites across the web. A lot of people are talking and thinking about the issue of online harassment, but there's not a lot of clarity on what it is or how to handle it, and there's no silver-bullet solution. Next slide. We know that our culture and the behavior of other contributors affect Wikimedians. In the last editor survey, 27.6% of respondents, that's more than 2,500 people, called out critical contributors as a deterrent to their contribution. Almost as many called out editors who are unpleasant to work with, and 10.8%, that's over 1,000 people, indicated they found harassment a substantial reason to reduce their participation. The Support and Safety team hears about these situations regularly. They investigate cases that may rise to the level of Wikimedia Foundation global bans, but often don't. The stories they hear from people highlight their frustration, their shame, and their helplessness. So last fiscal year we set out to learn more, to see what more we can and should do to help. Next slide. We wanted to know how harassment affects people, what it looks like, what its impact is, what's working, and what people want. We also wanted to know what other people are doing and thinking, what's working elsewhere on the web, what the research shows, and how our technical and social realities may present additional challenges or could be used to give us additional strengths. Next slide. We're still learning, but we've turned over a lot of stones. We've been collaborating with academic researchers at Harvard's Berkman Center and at Berkeley, alongside members of our own community, on defining common terms and identifying possible approaches. We've been exploring the specifics of other online companies, both reviewing and evaluating their approaches and talking to their representatives. And critically, we've been talking and listening to our community, both deeply, in our harassment consultation, and widely, in our survey. Next slide. With our survey, which ran last fall, we set out to learn more about the impact not just on the people who reach out to us, but on those from whom we might never have heard. We wanted to know how it happens to them, how it affects them, what they're doing about it, and whether what they're doing about it works. We wanted to start more people thinking and talking about the issue, and to help us focus on areas to improve, so we can build better systems, and so that later on we can figure out whether our efforts are proving effective. Next slide. We invited people on all our projects, and over 3,800 people responded. People reported issues across the board, more so for women and marginalized genders.
And of course, people find it distressing. The top approach people take to seeing or experiencing harassment is to hope it goes away. It often doesn't; instead, the victims do, or at least they reduce their time on our projects. More than 50% of people who reported being harassed reported subsequently decreasing their participation in Wikimedia. And note that those 50% are the ones who bothered to respond; we invited both current and a sampling of departed users, and some people, when they leave, don't come back to tell us why. We learned that our users find reaching out to the harasser, or to other users, or even to us at the Wikimedia Foundation, often ineffective, and that those users who choose to intervene to try to help somebody being harassed frequently wind up targets themselves. Next slide. Harassment takes a lot of forms. It ranges from content arguments that become too personal and go too far, to prolonged sexual harassment, to gendered, political, or racial bigotry, to doxing and swatting. The top form of harassment reported in our survey was harassers vandalizing the content of articles their victims work on, which impacts the quality of our content itself. Secondarily, people reported trolling or flaming, and personal attacks or name-calling like these on the next slide. We have many more examples, some far more vicious. Harassment runs the gamut from the extreme to the everyday, and comments like these do happen every day: persistent, unacceptable personal attacks intended to hurt and discourage, intended to drive people away. The good news here is that we don't see this as an insurmountable obstacle. We see it as a challenge, an opportunity to turn things around, an opportunity for the Wikimedia movement to take the lead. Next slide. To many in our community, it's time to commit to change. The second most supported point in the 2015 community strategy survey was a call to reduce harassment and the gender gap, to facilitate a safe, welcoming, and supportive environment for contributors and editors. The community wishlist and our deep consultations gave clear guidance. People want to understand and learn. They want support: technical support, such as better blocking tools, and educational support, through functionary training and better techniques for managing disputes. They want protection. They want clearer policies and consistent, effective enforcement. Next slide. We're doing a lot of things as a movement. We've been clarifying expectations and policies to protect people at events, and also in specific areas like grantmaking and technical spaces. We've been providing capacity-development training on how to resolve disputes, with the first pilot program in the Ukrainian community in May. We've been becoming part of a growing community of academics, social scientists, and experts also looking for ways to improve internet culture and laws. And we've been supporting experimentation with new options like Detox, which you'll be hearing more about right after I finish. Next slide. There's additional work in the pipeline. We recently concluded an Inspire campaign on the issue of harassment. For those of you who don't know, Inspire campaigns draw people to share ideas on specific problems, some of which then receive grant support. This was our third Inspire campaign, after one on content creation and one on the gender gap. We had 280 proposals from 700 participants in 30 days. There were a lot of good suggestions.
I've listed here a couple that caught my eye. The top one proposes giving users the option to protect their user space, or doing it by default. I imagine most of you know that user pages are like portraits people create to represent themselves. Some have a lot of personality, some only a little, but they're the public face of our people, and trolls and harassers frequently use them to bully others with vicious alterations. Some of the other popular ideas included guiding people to help for harassment and doing further process-specific research. I'd recommend looking on Meta at the leaderboard for more. Some people will run with the ideas that have been created, and in October we will begin funding practical ideas from this campaign that require financial support for the community to achieve. We may also take up some of these ideas ourselves, just as one of our major annual plan projects for this coming fiscal year is based on ideas proposed in our last consultation: translatable online modules to help train community leaders. We're starting with two. One will provide best practices for handling harassment complaints to functionary groups; this will be created in consultation with the community and the expert advisors we've been coordinating with. The second will focus on de-escalating in-person conflicts at events, so we can help avoid reaching the point where bans become necessary. Next slide. We face a lot of questions as we look forward and determine how best to address these challenges. Where does user protection fall in our global movement values? What responsibilities do we have to the people we invite to join us? How strongly are we motivated to move forward on this issue? What will happen if we make changes? What will happen if we don't? What would change look like? How do we create good policies and enforcement mechanisms in a diverse global movement? And what role do we have in confronting these problems on the broader internet? There are going to be opportunities to discuss and collaborate on this issue throughout the coming year. You can also reach out now to ca@wikimedia.org if you have thoughts you'd like to share. We have a lot of work ahead, and a long way to go to do this right. Culture change is slow, and pushing too fast may lead us in the wrong direction, but the health of our environment matters. The safety and security of the people who contribute to our projects matter. We believe we need substantial, thoughtful action to support a stronger community, one that will continue to thrive and improve for the long term. And that's it for me, so let me move on to the research presentation by Ellery and Nithum, who are talking about that project I just mentioned. Hello, my name is Ellery. I am a member of the research team here at the Wikimedia Foundation, and I'm here with Nithum, who is joining us remotely. Nithum is a researcher at Jigsaw, which is a technology incubator within Alphabet. We're here to share with you an overview of some of the recent work we've done on modeling personal attacks on English Wikipedia. Thanks for the introduction, Ellery. So, as Maggie quite clearly demonstrated in her session, harassment is a major problem on all of the Wikimedia projects. Not only does it hurt the individuals who are being harassed, it also hurts the culture of, engagement with, and contributions to the community. So what we wanted to do in this session was talk about one approach to tackling harassment: the development of technological tools.
And this is a collaborative project between the WMF and Jigsaw. Our goals for this project were two-fold. The first was to develop a set of algorithmic tools that could be used to detect harassment on the Wikimedia projects. Because this was such a broad goal, we decided to focus initially on one specific form of harassment, personal attacks, and on one specific context, English Wikipedia. Our second goal was then to use the algorithms we developed to do a large-scale analysis of personal attacks and harassment on Wikipedia, an analysis at a scale that was either cost-prohibitive or impossible before. To achieve these goals, we undertook the project in three phases. Next slide, please. In the first phase, we constructed a data pipeline to extract data from English Wikipedia, clean it, and then annotate it with crowdsourced annotators, so that we could feed it to our machine learning models. In the second phase, we used this data to train machine learning models to detect harassment at the same level of accuracy as a group of humans. And in the final phase, we used these machine learning models to label data at a scale that would have been very expensive for humans to label, to try to come up with better insights into and understanding of the behaviors around harassment and personal attacks on English Wikipedia. Okay, so as Nithum mentioned, in order to build our models, we need to collect examples of talk page comments that are personal attacks, and we also need examples of comments that are not personal attacks. To do this, we start by processing the raw revision history and extracting all comments that were added to talk page discussions. Then we get human judgments on whether a subset of these comments are personal attacks; here we use a crowdsourcing platform called CrowdFlower. In particular, we have each comment rated 10 times, and we use the fraction of people who thought the comment was a personal attack as our label. Next, we can use the subset of labeled comments to build models for detecting personal attacks. Here we use a conceptual framework called supervised learning, where we feed numerical representations of the labeled comments into a learning algorithm, and this produces what is known as a classifier. This classifier can then take an unlabeled comment and output how certain it is that the comment is an attack. Since we want to rely on our model to study personal attacks on Wikipedia, we need to evaluate how reliable it is. To do this, we can compare our model's judgments to the pooled judgments of a group of people. It turns out that the model produces results comparable to pooling the judgments of six CrowdFlower workers. So since the model is fairly reliable, we can use it to get approximate labels for the entire history of talk page comments, which would be prohibitively expensive and time-consuming to do via crowdsourcing.
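(As a minimal sketch of that supervised-learning setup, here is a toy version using character n-grams and logistic regression. The tiny dataset, the feature choices, and the 0.5 threshold on the rater fractions are illustrative assumptions; the actual Detox models and training data are much larger.)

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data. Each label starts as the fraction of ten raters who
# judged the comment a personal attack, thresholded here at 0.5.
comments = [
    "thanks for the helpful revert",
    "good catch, I fixed the citation",
    "you are an idiot and a vandal",
    "only a moron would write this article",
]
rater_fractions = [0.0, 0.1, 0.9, 0.8]
labels = [f >= 0.5 for f in rater_fractions]

# Character n-grams give some robustness to creative spellings, similar
# in spirit to the demo's handling of symbol-masked expletives.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 5)),
    LogisticRegression(),
)
model.fit(comments, labels)

# The classifier outputs how certain it is that an unseen comment is an attack.
print(model.predict_proba(["you absolute id.iot"])[0][1])
```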
So we've made a demo version of our model, which is freely available online. If you go to wikidetox.appspot.com, you can start playing with it and evaluate its accuracy yourself. When you go to that website, you'll see an interface like the one in front of you: a text box where you can enter either a piece of text or a revision ID for a revision on English Wikipedia. And then, when you hit score, the algorithm looks at the text and provides two scores reflecting its confidence about whether the comment is an attack or not an attack. So for example, for this comment, "Congratulations. I don't know whether you're aware of this or not, but you've shown your qualified stupidity," the algorithm scores that this is an attack with a confidence of 82%. This is an interesting example, because our algorithm had to navigate between positive keywords like "congratulations" and negative keywords like "stupidity," and it did so successfully. We'd also like to show some other examples of the algorithm's successes. Unlike a simple search for bad words, our algorithm is robust to different spellings. Here you see an example where certain expletives have been changed to include random symbols, and again, because the model looks at the structure of the sentence and the characters in it, it correctly identifies that this is an attack, with a confidence of 69%. So it doesn't really care how any of the particular attacks are spelled. In the next example, you'll see that the algorithm is also able to use the context of the sentence to distinguish between different usages of a word. In these two examples, the word "punch" is used differently: in the first, "I will punch your lights out," it's used in a very aggressive manner, whereas in the second, "let's drink punch," it's used in a neutral manner. And again, our algorithm, unlike a bad-words list, is able to look at the context of the sentence, figure out which use is aggressive, and correctly identify that one as the attack. I don't want to leave you with the impression that the algorithm is infallible, though. In our last example, you see that the algorithm misclassifies the sentence "your intellect is lacking" as not an attack. The issue here is that this word pattern hasn't appeared anywhere in the training data, nor has anything similar to it. This highlights the need to continually retrain our algorithm going forward, so that it can keep learning new patterns and modalities of attacks and harassment as they occur. Despite some of its quirks, though, the model does open up really exciting avenues for analysis. As I mentioned earlier, we can run the model on the complete historical data set of talk page comments and get the probability, in the eyes of the model, that each comment is a personal attack. This in turn allows us to investigate questions surrounding the prevalence, dynamics, and impacts of personal attacks on English Wikipedia. The most basic question we might ask is: how many personal attacks are there? Or, maybe more precisely, what fraction of comments are personal attacks? This plot shows the fraction of comments that fall above different attack-probability thresholds, broken down by the user talk and article talk namespaces. The takeaways are that there's a greater proportion of personal attacks in the user talk namespace than in the article talk namespace, and that at a conservative threshold of 80% certainty, roughly one in 400 user talk comments is a personal attack.
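(A sketch of that threshold analysis: score every comment, then compute the fraction above each attack-probability cutoff per namespace. The scores here are randomly generated stand-ins, not the real model output.)

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Stand-in data: an attack probability per comment, tagged with the talk
# namespace it came from (randomly generated, not real model output).
scored = pd.DataFrame({
    "namespace": rng.choice(["user talk", "article talk"], size=100_000),
    "attack_prob": rng.beta(0.05, 2.0, size=100_000),
})

# Fraction of comments above each threshold, by namespace -- the quantity
# behind the "one in 400 user talk comments" figure at the 0.8 cutoff.
for threshold in (0.5, 0.6, 0.7, 0.8, 0.9):
    above = scored[scored["attack_prob"] > threshold]
    frac = above.groupby("namespace").size() / scored.groupby("namespace").size()
    print(threshold, frac.fillna(0).round(5).to_dict())
```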
Efforts to moderate personal attacks involve warning or blocking users, and here we show what fraction of users who have committed at least one attack have been warned or blocked. For users who have made at least one attack with 90% certainty, only 25% have been warned and 15% have been blocked. What this means is that most attacking users go unmoderated. Now, this next graph is kind of complicated, and I won't get into the details, but I think it illustrates an interesting point. What it tries to do is show how the total number of revisions a user writes is related to the percentage of all attacks on Wikipedia that that user group is responsible for. From the graph, you see that there are two very different types of attackers active on English Wikipedia. The first group is the vertical bar on the left: users with a very, very small number of revisions, but whose revisions have a very high proportion of attacks. It turns out that 75.7% of the attacks we found on Wikipedia, so about three quarters, are conducted by users who have fewer than 10 total revisions, most of which end up being attacks. These users are likely to be sockpuppets. The second big group of users we're interested in studying is users with a very high number of total revisions, over 200 revisions. These users tend to have a very low proportion of attacks among their comments, but because their total revisions are so high, they're still responsible for a significant number of the attacks that occur on English Wikipedia. The purpose of this graph was to demonstrate that it's important to look at the different types of attackers and to come up with different solutions for the different segments.
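(A sketch of that segmentation: bucket users by total revisions, then ask what share of all detected attacks each bucket accounts for. The per-user table and bucket edges are illustrative, not the study's data.)

```python
import pandas as pd

# Illustrative per-user summary: total revisions, and comments the model
# flagged as attacks at some probability threshold (made-up numbers).
users = pd.DataFrame({
    "total_revisions": [3, 7, 1, 9, 150, 5000, 12000, 800],
    "attack_comments": [2, 5, 1, 6, 1, 4, 2, 0],
})

# Bucket users by activity level.
buckets = pd.cut(
    users["total_revisions"],
    bins=[0, 10, 200, float("inf")],
    labels=["<10 revisions", "10-200 revisions", ">200 revisions"],
)

# Share of all attacks each activity bucket is responsible for.
share = users.groupby(buckets, observed=False)["attack_comments"].sum()
print((share / share.sum()).round(3))
```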
Finally, we wanted to note that we're still at the very early stages of this research. In the next few weeks and months, we're going to continue to improve our modeling, extend our analysis, and release our annotated data sets so that other researchers can conduct their own research. In addition, we'll be trying to integrate our model with the ORES API system, so that extensions and tools can be built on top of the models we've developed. If you have any more questions, or if you'd like to participate, please check out our research page. Am I live? Good. All right, great. I'm Joe Matazzoni. I'm the product manager for the Collaboration team, and I'm here to show you some of the improvements the team has been making to the notification system recently. Go ahead to the next slide. In particular, as you can see, I'm going to talk about the notifications page, which has been rewritten from the ground up recently, and give you a quick tour. So go ahead and launch the demo, please. Great. Okay. This is the new notifications page. Those of you who are familiar with the old notifications page will see that it has a new, much cleaner, more logical design, based on a kind of email metaphor, with selections on the left and messages on the right. After spotting that, your eye will probably go to this recent activity panel, which is brand new and is designed to do two main things. The first is to give people a big-picture view of their whole wiki world. As you can see, there are various wikis listed over here; those are the wikis where this demo user has received new notifications. And if I click on those, I will see all of the notifications that have been arriving. Well, actually, I see both the new and the old notifications on that wiki. That so-called cross-wiki capability, as some of you will know, was launched last quarter in the notifications panel, the dropdown that comes from the top up here. But in the panel, you only see a small subset of your notifications, and only your unread notifications, whereas on the page it's really as though you were on that wiki: you see your complete messaging history on the various remote wikis. The second thing this page is designed to do is to let users focus in on a particular discussion or page. You can see that if I click on any of these pages, I'll see all my notifications on those pages, so I can concentrate on what's going on in one discussion I might be interested in. At the top, you'll see I can also filter by read or unread, which lets users concentrate on what's new, or surface old messages they're looking for among the read ones. And while I'm talking about read and unread, I'll mention that there's a new interface for that. This blue dot over here now lets you toggle between the read and unread states, which is much simpler than the old way, where you had to click an X, which confused people, and then go to a menu to mark things as unread. So it's more logical and a lot cleaner, and it's one example of the many, many changes that have been made to make the notification system more convenient overall. I'll mention just one more before I go, which is not on the notifications page, but up here in the panels that come down. For some time, users, particularly very active users, have complained about getting repetitive notifications, particularly about things like thank-yous and page links; some very active users get scores of messages saying that a particularly popular page has been linked, and things like that. So-called expandable bundles make that a lot more manageable by bundling those notifications into a group that you can collapse or expand. If that looks familiar to some of you, it's because expandable bundles were actually introduced last quarter as part of cross-wiki notifications, but they only used to work for cross-wiki notifications; now they reduce clutter across all notification types. Let's go back to the slide deck, to that slide. Okay, so the Collaboration team has been working on notifications for the last 10 months, making improvements all along the way. With this last round of releases, we're concluding major development for now, but of course we're always interested in new ideas, and we'll keep fixing bugs and making minor improvements. Here are a number of ways you can get in touch with us if you need to, and that's it for me. Thank you very much. Okay, now we have some time for questions, and hopefully answers for those questions. If you have any, there's a microphone right there, and I believe Guillaume is on IRC to take questions there as well. So we have three questions from IRC so far; we can ask maybe one or two, and then we can alternate. The first question was from Matt Flaschen, for Neil. Matt was asking: how is the number of mobile anonymous active editors estimated?
So the mobile active editors aren't really estimated; that's just looking at how many of these editors' edits were tagged with the "mobile edit" change tag. The missing mobile editors, the ones editing anonymously, I estimated very roughly, because we had the discontinuity from March to April when anonymous editing was switched on. I just assumed that mobile active editors as a percentage of desktop active editors stayed constant from then on. Then, if mobile editors fall short of that percentage, we can very roughly estimate that the remainder is missing editors. Also for Neil: is there a graph anywhere comparing how many new articles are created versus just edits, to see whether some of the fall-off in editing is because fewer new articles are being created, anything like that? I don't know if there's a specific graph that gets at that question, but it sounds like what you're asking about is low-hanging fruit: is it possible that there's somewhat of a decline in editing because the low-hanging fruit has been picked? That's a good question, and I'm not exactly sure how I would address it. I think some of the researchers have tried to get at that before, and I think the consensus is that it's not a huge driver of the decline in editing; it probably has more to do with community dynamics. But that's not firm knowledge. It is a good question. I have a question about the harassment scoring and assessment work. I was wondering if you had thought about looking at the edit summaries as well, because it seems like a very large percentage of personal attacks are put in edit summaries so that they can't be removed. That will also affect things like assessing the percentage of comments that are considered personal attacks, so it might skew the results and should be considered. Yeah, we had not considered labeling those, but that's a good point. We do have those in our corpus, but we have not sent those comments through CrowdFlower. That's a great suggestion; we can look into it. Okay, I also have a question about the algorithm. I think it's amazing to see the data in front of us, though I think it can be a little problematic to find the solution only from that kind of data. I'm wondering: did you consider also checking the context? It seems to me that it might be a little problematic to look only at specific comments, because we know that a lot of the harassment that has gone on, especially in the bigger events we know about, has been a kind of passive-aggressive barrage of things where, if you see only a single comment, you might not consider that specific one harassment, but when you see the entire thing, it is clearly horrible. Did you consider doing something like that? Yeah, you're right: harassment is a huge complex of behaviors, and right now, in the early stages of this work, our unit of analysis is the diff, the single revision. That might be a good way to get at individual personal attacks, but if you want to get at all the other ways harassment manifests, you need to look at the context, the conversational dynamics. That is something we're planning on doing, but it's significantly harder, given the way comments are structured.
And right, the fact that they're not really structured. Yeah. Thank you. So this is a question for, I guess, both Joe and Ellery. I see such strong connections between the great work you've both done, and I wonder: how do we make these two great tastes taste great together? How do we start integrating this kind of filtering and scoring of comments with the systems that present comments to our community? How do we make these things work well together? Because it's the tools that determine how these comments, and the harassment, get presented to our communities. Could you clarify which two things should work together? We have a system that can detect attacks; what is the other system it should work with? Like Joe's system, the notifications. Oh, notifications, right? Yeah. So at Wikimania we had a brainstorming session about this: given that you have this oracle that can take a live stream of revisions and score them according to whether they're personal attacks, what sort of moderation tools could you build from that? Several ideas came out of that. One was that you could subscribe to a feed of the worst incoming comments, and if you'd like to intervene and moderate, that's something you could do. There were suggestions about, as you're typing, giving a notification that this comment is unlikely to be well received. Another idea was that once you detect that a posted comment has a high chance of being a personal attack, you delay the rate at which that user can submit new comments. A whole bunch of ideas were documented, including how to integrate with notifications. So maybe Joe can speak to that. I don't know if the notification system is the relevant place, because so many of the messages that come through the notification system are about specific events, like you've been approved for new rights or whatever it might be. But the Collaboration team's new project, the Edit Review Improvements project, is focused on using ORES to identify good-faith users and to try to create a more supportive edit review process for those new users, and the harassment work is certainly also aimed at retention of new users. There is a very interesting crossover that we should be exploring with Nithum and Ellery's team on that, I think, and ultimately this kind of harassment identification could become part of that more supportive edit review process, I would think. Another thing that would help tremendously is if comments were a stronger notion within the software, if there were a notion of a coherent comment. That would make it much easier for researchers to work with the data, and also to annotate it. The big thing would be being able to click on a comment and have anybody tag it, as in "this comment is harassment," for example; that would facilitate building the models in the future.
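(A sketch of the "feed of the worst comments" idea from that Wikimania session. The scoring endpoint, model name, and response shape below are hypothetical, since the planned ORES integration had not shipped at the time of this talk.)

```python
import requests

# Hypothetical ORES-style scoring service; the URL, model name, and JSON
# shape are assumptions for illustration, not a real deployed API.
SCORE_URL = "https://ores.example.org/scores/enwiki/attack/{revid}"
THRESHOLD = 0.8

def attack_score(revid: int) -> float:
    """Fetch the model's attack probability for one revision (assumed API)."""
    resp = requests.get(SCORE_URL.format(revid=revid), timeout=10)
    resp.raise_for_status()
    return resp.json()["probability"]["attack"]

def moderation_feed(revision_ids):
    """Yield revisions scoring above the threshold, worst first."""
    scored = sorted(((attack_score(r), r) for r in revision_ids), reverse=True)
    for score, revid in scored:
        if score >= THRESHOLD:
            yield revid, score

# A moderator's tool could poll recent changes and surface these for review.
for revid, score in moderation_feed([123456, 123457]):
    print(f"revision {revid}: attack probability {score:.0%}")
```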
Yeah, sorry. I think, to Muriel's point that context is important, having some notion of that context inside the system will probably give you a better signal. It strikes me that the other system that's extremely relevant here would be Flow, and that team is certainly exploring ways to get back to working on that project as a comment and discussion tool, so there might be some very interesting crossovers there as well. Thanks for the suggestion. We have a few questions from IRC, and I don't want them to go unanswered. There was a question about Detox from Pine: is there a plan to create an anti-vandalism tool, or to integrate the tool into the existing anti-vandalism tools, so that the edits detected by the attack algorithm are flagged for review by editors? From our end, we're primarily working on refining the models and building an API, so we want to integrate these models into ORES, and then we'd like to help members of the community who are interested in building tools around that as much as we can. But from our end, the current plan is the ORES integration, to effectively enable other people to build tools. The other question, from Chris, was about the same topic. His question is: does the Detox tool evaluate the text holistically, or does it basically answer the question, does this text contain an attack or aggression somewhere in it? You can think of it more as the latter, ideally: if you have a piece of text and anywhere within that text there is a personal attack, you should label that entire text a personal attack, in my opinion. That's not exactly how the model works right now; you can think of it as additive, in a way, where if you put in a personal attack and masked it with lots and lots of things that have particularly positive sentiment, the model would get confused, and that is the current shortfall. Although in the demo it got some of the cases right, like the "congratulations" one, if you tried it you could just say "thank you, thank you, thank you, thank you, thank you" and an expletive, and probably all the positive words would override the one negative word, which is a shortcoming. Ideally, it would be like a trigger: if there's an attack anywhere in the text, it fires, no matter what the rest of it is. But that also makes it a little bit brittle, right? It could fail if it was tripped somewhere. So yeah, that is something. We have no more questions on IRC. I had a question about the new notification features. I noticed that the new Special:Notifications page doesn't really seem usable on mobile, and I was wondering if you were planning on rolling out a mobile-friendly version. That fix should be rolled out, when, Monday? Something like that. It should be; we're on it. Just checking, thanks. Back to Ellery, on the comments: when you were looking at those and you had people rating them and saying whether they represent harassment, did you look at how many women rated them as harassment versus how many men? Because, quite honestly, men will miss a lot of things, because we say a lot of shit that is harassing without ever knowing it. Yeah, that's an excellent point. I think we have the genders of the raters, but Nithum would know more about that. It's something we haven't looked at, but we could; we could see whether how bad people find a comment varies by gender. Along similar lines, right now the people who annotated this aren't Wikipedians; they're basically
people who have signed up for CrowdFlower from around the world. So another hypothesis we'd want to test is how people who are on Wikipedia often differ from someone in the general population, in terms of the experiences they've had on Wikipedia, and how that changes how they evaluate comments. But I really like that point; hopefully we have the gender information to look into it. I'm going to say this will be the last question we have time for. So, my question is for all of the teams working on harassment: how does all this different work actually come together? For instance, Ellery, your graph shows there are a lot of users with fewer than 10 revisions who are doing something like 75% of the attacks on English Wikipedia, and I'm wondering how that information is actually used when we're developing new things: processes, new trainings. I'm just kind of wondering how it all comes together, and I don't know if any one person can answer this question. Yeah, so this research is brand new, and we're right in the stage of producing all of this knowledge. How Patrick and Maggie want to pick this up, those are conversations that we need to have, but maybe they would like to chime in. I'll speak up briefly just to say that the answer to that question is Patrick. Patrick is involved all across the organization, with lots of different teams working on harassment, so that he can make sure we have a global view of the work being done and a global view of the information we're collecting. Yeah, and I think a lot of these aspects come together, not, sorry, into a single thing, but there are the social aspects of the issue, which we have to find social solutions to, and the technical aspects, which can be solved through technical tools, and hopefully those will come together in the end. One of our focuses right now is providing better tools for our contributors to make human decisions around these problems, and I think Nithum and Ellery's work is going to come into that pretty strongly. I wish I had an answer for the one thing we can do to bring all of these together, but I think it's going to have to be a multi-pronged approach, and I think we're finally getting to the point where those prongs are a bit more defined than they were previously. It will take a combination of things, and it will take some leadership from the community as well, to decide how and where these things should be used; those decisions have to be made after some good discussion and consultation. But I think we're getting to the point where there are quite a few possibilities for progress, and we just have to decide where and when these are going to happen. And by we, I mean everyone in the movement, not just the Foundation. I think it's a great question, and just to put a little bit finer point on it: how are Neil's metrics influenced by harassment? Very briefly, are you talking about the metrics I presented today? I'm just saying I don't think I'm the only one who sees a bright line connecting all of these things, right? And to follow up on Sati's question about how all of these things fit, I think that's a very good point, because Robert asked whether it's low-hanging fruit, that there's not as much easy stuff to do and that's why we're seeing fewer new people. And while that is a
possibility, and I haven't seen it conclusively refuted, a lot of thought and a lot of work, particularly by Aaron Halfaker, has gone into the view that it has more to do with the social environment of the projects, and how it's not easy to get started as a new editor, and that that has put a hard damper on our growth. So I think harassment is a really big part of this, and I think this work could be a very important part of changing that situation. And the most ambitious analysis we have planned for this project, and it's unclear whether it will work, is to evaluate the causal impact of being attacked: how it changes how you edit, where you edit, and the frequency of your editing. There are a lot of methodological issues there, but that would be one of the really exciting questions for us: can we really measure the impact of being harassed using the algorithms we have? It's unclear whether we'll be successful in that. Thank you for your questions, everyone. And yes, now we want to set aside a couple of minutes for WikiLove, if anyone has any shout-outs or thank-yous they want to say this month. You can step up to the microphone or let us know on IRC. I know you all want lunch. Okay, I think that's it then. Thank you all, and we're done.