Okay, everybody, we're going to get started. Thanks for coming. I'm Corey Floyd. This session is about where to surface AI in our projects. If you went to the previous session on AI this morning, I just want to talk about how this session is different from that one. That session was more about discussing types of models for analyzing our data and analyzing user behavior. This discussion is much more focused on our products: where do we show things? How do we show things like results from AI? And how are we going to change our products to incorporate AI?
Yeah, so this one's more product-based. So before we get started too far, Aaron is going to give a brief talk over existing models. Another difference is the previous session was like, hey, what can we do? We can do anything in the world. This is going to be about what things we already have in AI and where we can show them. So Aaron's going to give a brief overview of those things. And if you open up the Etherpad, there are some links in there and some resources you can look at.

Okay, so I want to talk to you about a few different AIs that we have available right now, or that are right on the cusp of becoming available. And I'm actually going to steal this second link because it's got a bit of a reference, and I'm just going to move to a different URL quick. So who here has heard of ORES? ORES is a web service that will give you predictions about various things. There are two major types of predictions that we do right now: one is called edit quality and the other is called article quality. For edit quality, we have a few different types of predictions. The most used one is called "damaging": if you give it the ID for an edit, it will tell you whether that edit is damaging or not. Essentially we use this to filter the recent changes of Wikipedia. The internet makes a lot of edits to Wikipedia, and ORES can use this damaging model to flag edits as they come in so that they can be reviewed by editors. That's one of its use cases, anyway. We have two other models that fall under this edit quality type. One is called "good faith": it predicts whether an edit was probably saved in good faith or not. The opposite of good faith is really vandalism, although the literal opposite word would be bad faith. And "reverted" predicts whether an edit will need to be reverted or not.
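As a side note for readers of this transcript: the predictions Aaron describes are served over a plain web API. The sketch below is a minimal illustration of how a client might build a scoring URL and unpack a response. The `sample` dict is invented for illustration and only mirrors the general shape of the ORES v3 API (per-wiki, per-revision, per-model scores); it is not a real server reply, and the numbers mean nothing.

```python
ORES_BASE = "https://ores.wikimedia.org/v3/scores"  # historical ORES endpoint

def score_url(context, rev_id, model):
    """Build a scoring URL for one revision and one model."""
    return f"{ORES_BASE}/{context}/{rev_id}/{model}"

def extract_prediction(response, context, rev_id, model):
    """Pull the prediction and class probabilities out of a v3-shaped response."""
    score = response[context]["scores"][str(rev_id)][model]["score"]
    return score["prediction"], score["probability"]

# Invented response shaped like the ORES v3 API, for illustration only:
sample = {
    "enwiki": {"scores": {"123456": {"damaging": {"score": {
        "prediction": False,
        "probability": {"true": 0.07, "false": 0.93},
    }}}}}
}

pred, probs = extract_prediction(sample, "enwiki", 123456, "damaging")
print(pred, probs["true"])
```

A real client would fetch `score_url(...)` over HTTP and feed the decoded JSON to `extract_prediction`; the point here is only the shape of the request and response.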
These predictions — is the edit damaging, was it saved in good faith, will it need to be reverted — are pretty similar. But it turns out that reverted is a superset of damaging: not all reverted edits were actually damaging, and not all damaging edits were saved in bad faith. And so by using these models together you could, for example, build a tool that would differentiate why you might need to review a particular edit. We have a few tools that are using it to do that right now.

The article quality models, which we refer to as WP10, are based off of the assessment scale that a few wikis use — English Wikipedia, French Wikipedia, and Russian Wikipedia. We've been able to get a lot of observations of Wikipedians labeling articles by the level of quality that they're at. And so based on that, we can build a prediction model that can tell you, for a given version of an article, what quality level it is approximately at. We're currently using that to help people find articles to edit in SuggestBot, which is one of our article recommender systems that I'll get to in a little bit. And I also use it for data analysis. We currently don't have it surfaced anywhere else on the wiki, so that might be an interesting product idea to throw out there. Okay. So those are the predictions. Oh, wait, no, I forgot. There are a few other predictions that I would like to talk to you about that come from this ORES system, but they're not quite online yet. Maybe... How do tabs work on Macs? What's that? Oh, I don't know how to use any closed-source systems. Command-T. There you go. What was the Etherpad? Oh, Etherpad. Oh, yeah. So you can do a four-finger swipe. Four-finger swipe. I'm sure that's patented or something. So one of the other models that we're just about to release predicts the type of edit that was saved — the type of change that was made in an edit that was saved to Wikipedia.
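The article quality model described above reports a probability for each assessment class rather than a single number. One common trick in research around these models (not necessarily in any deployed product) is to collapse those probabilities into one continuous score by weighting each class by its rank. The class names below follow the English Wikipedia assessment scale; the probability values are invented for illustration.

```python
# English Wikipedia assessment classes, lowest to highest quality.
WP10_CLASSES = ["Stub", "Start", "C", "B", "GA", "FA"]

def weighted_quality(probabilities):
    """Collapse a per-class probability dict into one number in [0, 5].

    Each class contributes its index weighted by its probability, so a
    confident 'Stub' scores near 0 and a confident 'FA' near 5.
    """
    return sum(i * probabilities.get(cls, 0.0)
               for i, cls in enumerate(WP10_CLASSES))

# Invented example: an article that is most likely 'Start' class.
probs = {"Stub": 0.05, "Start": 0.60, "C": 0.25, "B": 0.07, "GA": 0.02, "FA": 0.01}
print(round(weighted_quality(probs), 2))  # → 1.44
```

A continuous score like this is what makes the "quality trend" ideas later in the session workable: you can subtract two versions' scores and get a meaningful delta even when the predicted class didn't change.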
And so we worked for a while with Wikipedians to come up with this taxonomy. Essentially, we asked Wikipedians: hey, when you're looking at a history page for an article, what would be nice to know about the types of edits that were happening on that page? Regretfully, I didn't find the mock-up in time to show you — it would just put little tags next to the edits. What should those tags be? We came up with this list: copy edit; clarification; simplification; point of view, which is usually moving things towards a more neutral point of view; refactoring, which is moving around content; fact update; elaboration, which would be adding substantial new content that's not just a fact update; verifiability, which usually has to do with removing unverified text or adding a citation to make text verifiable; disambiguation, which has to do with fixing links between pages; wikification, which is mostly formatting and linkifying text — turning bold text into proper headers and that sort of stuff. Vandalism and counter-vandalism should be pretty obvious: destroying articles, or reverting back to a good state. And then process: a lot of activity on Wikipedia is people putting clean-up templates on articles that say this article needs sources, this article needs to be expanded, this article needs images, that kind of thing. So that's one more of our models. We're pretty good at predicting what types of changes somebody made in an edit based on this. And then four-finger swipe up. Ha ha ha. And then the final one is draft quality, and I'm realizing that I don't have a very good page to describe what this model does. So I'm not going to click on this link, but you can see one of my own product ideas if you follow it. Essentially, reviewing new article creations on big wikis is a huge burden.
And so people don't usually have that much time to work with newcomers when they create new pages, but newcomers need help getting pages up to the quality that Wikipedia will accept — and everything that's not up to quality just gets removed. So we built a model that will flag the things that are truly concerning so that they can be separated out and worked on. If you give this model an article that was just created, it will tell you if it's likely to be a personal attack, spam, or vandalism. Of course, there are other not-good things that fall outside of that — not notable and that sort of stuff — but they're less urgently problematic than these. So those are the...

I was just curious, why didn't you do semantic intent? Why didn't you break out vandalism here?

So we actually dropped vandalism from the edit types taxonomy because we already have prediction models for vandalism, and so we already had good ways of predicting that class. We included it in the high-level taxonomy because the taxonomy is what Wikipedians wanted defined, but what we actually model is the set of types minus that. It turns out that the edit types model and the good faith/bad faith model combine: one finds out, well, this is bad faith, and the other says it was also clarifying something — it was just clarifying it to something that was offensively wrong. Yeah. Does that answer your question?

So the last thing that I want to show you is article recommendation. The best example that I can give you right away is — Command-C. Okay, Command-C. Command-N. Actually, I bet you this is going to work. So SuggestBot is a bot on English Wikipedia which... Oh, darn it. There you go. It's very weird that it's not actually linking to the user page. You can find the research report about it and the repository that it lives in. So SuggestBot will recommend articles for Wikipedians to edit. But I'm really hoping to show you all the user page. Which, okay, so I'm not going to find...
Anyway, if you search for User:SuggestBot on English Wikipedia — it's actually active on a lot of other large wikis too. What SuggestBot does is it takes your edit history and predicts articles that you haven't edited that you might like to edit, based off of that. One of the major ways it does that is by finding editors who edit the articles that you edit, taking the articles they have edited that you've never edited, and recommending those to you. It's a collaborative filtering strategy, just like how Netflix recommends movies you might like: people who like the movies that you like also like these other movies. And so we're pretty good at recommending articles that people might like to edit. As you saw, there are research reports showing that this actually works in practice. So we have pretty good strategies for making those kinds of predictions, too. All right. And that's the list of AIs that I have for you.

All right. Thanks, Aaron. So let me go back to the Etherpad real quick. We're going to try something out here: a little brainstorming and discussion. On the Etherpad, below where Aaron was writing, there's a place for ideas. I realize this is pretty short notice; we'll just start coming up with ideas. To seed you a little bit: the reading product team actually brainstormed where to use AI about a month or two ago and made a nice spreadsheet of several ideas, and there's a link to that in there. Josh from the reading product team is here now, and just to get started, he was going to talk about one while maybe you all throw some ideas down. And then what I'm hoping is, you know, we come up with an idea and then discuss it — pros and cons and things like that. So, yeah, Josh.

Yeah. So by the way, this list was not made for general consumption, so I apologize in advance — it's pretty shorthand.
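The neighbor-based recommendation Aaron describes can be sketched in a few lines. This is a toy illustration of the general idea — editors who share articles with you "vote" for articles you haven't touched — and not SuggestBot's actual algorithm; the editor names and article titles are made up.

```python
from collections import Counter

def recommend(target, edit_history, top_n=3):
    """Recommend articles the target editor hasn't touched, scored by how
    often they were edited by editors who share articles with the target.

    edit_history maps editor name -> set of article titles they edited.
    """
    mine = edit_history[target]
    scores = Counter()
    for editor, articles in edit_history.items():
        if editor == target:
            continue
        overlap = len(mine & articles)  # shared interests with this editor
        if overlap == 0:
            continue
        for article in articles - mine:  # articles new to the target
            scores[article] += overlap
    return [a for a, _ in scores.most_common(top_n)]

history = {
    "alice": {"Cats", "Dogs"},
    "bob":   {"Cats", "Dogs", "Ferrets"},
    "carol": {"Dogs", "Horses"},
}
print(recommend("alice", history))  # → ['Ferrets', 'Horses']
```

Real systems weight the votes (e.g. by how rare an article is, or how recent the edits are), but the core "people like you edited this" signal is just this counting step.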
One thing on the reading side — for those of you who don't know, the reading team at the Foundation is focused on the consumption of Wikipedia: readers and people who actually read the content, not the editors or some of the other things that are focused on in the other verticals. One thing that we've been thinking about a lot is the trustworthiness of Wikipedia, especially in the context of the fact-free world and the sorts of topics that are in the media right now — basically getting people to understand where the content comes from and whether they can trust an article and the information contained in it. Right now on the apps, for example, you don't even see the hatnotes for a lot of things. You just get the article, and it looks nice, but you don't get that sense of: is this a good article? Can I trust it? So one idea we were talking about is using the ORES scores to just tell the users: okay, for the articles that don't have a human-assessed quality score, here's what we think the quality of this article is. Or: maybe you shouldn't trust this article right now, because it's been identified as potentially vandalized recently. So that was one idea. And building on that was the trend in the article's quality. One feature we're researching on the reading side is reading lists — saved sets of articles, things that you're pretty interested in — and one thing you might want to know is whether the quality of those articles you've saved is getting better or worse. Was the version that I saved two weeks ago a better version of this article than the version that's live on Wikipedia now? We could use the ORES scores, and the change in ORES scores, as a signal to users that, okay, yeah, this has gotten even better.
So come back and read it because it's been cleaned up — or it's been subject to a lot of vandalism, so maybe skip it this week. So that's a couple of specific ideas from this list. As Corey mentioned, the general thing we're trying to do is build ideas for using machine learning, taking some of the research that's been done and actually figuring out ways to apply it to users' actual usage.

Thanks, Josh. So I guess just to play devil's advocate a little bit on some of those: did you discuss any downsides to displaying article quality? There's some stuff in the notes there. I mean, obviously, there are some social aspects to it. As an editor, if I work on an article — and maybe it's not the best article — and I open the app and it says this article is terrible, that may not be the most encouraging editing experience. So there are a lot of things to consider with letting a machine have judgment over the human editor. It's one thing in the literature to use it to find articles to improve or to help monitor queues for patrolling, but when it's presented to millions of people who are reading that article, it becomes maybe a little bit more of a sensitive subject. So there's a cons, or concerns, column — with all these things, we have to be careful about the social impacts, particularly on the motivations of editors upstream, when we're talking about how we present things. All right, thanks. So, open floor: anybody have any ideas of where they would like to see AI surfaced, or what type?

So I just came out of that photo review session, and one of the things that was suggested there — it touches a little bit on previously mentioned topics — is that one of the things that hurts is when the problem is just style.
It's not really a problem with your content; it's a problem with how you structure it, and you should be alerted as part of the editing process — hopefully before you even make the edit, not after. I know that if you did it before the edit is saved, you would need UI for it, and it would be a more expensive feature to implement. So: before the edit, actually, instead of after.

Why before instead of after?

Just because I don't think you should commit things that shouldn't be committed, ideally. I don't know if that makes sense, but from a development point of view, I wouldn't submit a commit that I knew failed tests that could have been run beforehand. So, to have some sort of feedback as I'm making changes, making edits — I think I should be able to know the impact of my changes before I save. Like, know the impact you're making on the article right now. So, just to throw it out, it could be kind of cool — maybe a color, like green and yellow, so maybe you know without actually seeing the score. Yeah. So, before. So instead of telling the editor how it went after the edit is made, you could make, say, a red button. Yeah. So, real time. Yeah, real time. Not even a save button — just a stage in the saving process. Oh right, because you could hit the button and it could tell you, oh, this is kind of seen as damaging, or this is seen as missing a reference or something like that. Yeah, that would be a visual preview. Yeah. Sorry, who's next?

There's also another idea we were discussing in another session. Well, a couple of ideas appeared related to reading and readers. One was about the context of current events — an article about the Olympics, some other news, an accident, an earthquake, or whatever.
The way people keep up with updates now is reloading the page and looking for changes, going through views that are at a granularity level more interesting for editors than for readers. Answering "what's new, what changed" in a way that is useful for readers — I think there's an opportunity there for using AI to really surface the meaningful updates.

I'll give a response. I think that would make a lot of sense, in that any sort of jump in the ORES article quality prediction should represent a substantial change to a reader — at least that's how it's intended to work, and it seems to work that way in practice too. I guess my question is: are you picturing almost highlighting a section that is new — not the actual changed content of the diff, but actually extracting a meaningful part of the article and saying, that's the new thing?

Yeah. Instead of using Twitter to keep up with the subject, we could just go to the article and see how it's evolving, from a more meaningful point of view. So you could almost have a news feed within the article, extracting things that have happened to it. We can discuss how best to present it, how best to highlight the content. Well, that's the purpose of the session. Yeah, that's cool.

So one question I have is: how much does the organization want to have control over how this content is presented — whether it's through services or other types of content — versus basically giving the building blocks to editors or users and letting them decide? For example, in the case of ORES scores, whether the score is shown before you save or after, you might make a different decision depending on your culture. If you're in Japan, your users may react differently to seeing the score before hitting the save button versus after saving. And this is a decision which is very difficult to make at the Foundation level but much easier to make at the community level.
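The real-time "save signal" idea from a moment ago — green/yellow/red rather than a raw score — amounts to thresholding the damaging model's probability. A minimal sketch, with invented cutoffs that a real deployment would tune per community (which is exactly the configurability question being raised here):

```python
def edit_signal(damaging_prob, warn=0.4, alarm=0.8):
    """Map a damaging-model probability to a pre-save traffic light.

    The cutoffs are illustrative defaults; per-community configuration
    could expose them as settings, as discussed in the session.
    """
    if damaging_prob >= alarm:
        return "red"     # likely damaging: ask the editor to reconsider
    if damaging_prob >= warn:
        return "yellow"  # borderline: suggest a second look before saving
    return "green"       # probably fine

for p in (0.05, 0.55, 0.92):
    print(p, edit_signal(p))
```

Because the thresholds are plain parameters, the "Foundation level versus community level" decision reduces to who gets to set `warn` and `alarm` (and whether the signal appears before or after save at all).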
So I'm wondering — and the counterpoint to what I was saying is that by providing these building blocks without best practices or frameworks, we won't have a homogeneous system, which has its own benefits, and in some cases the blocks just won't be used. But I think providing building blocks at the UX level and also at the service level makes sense, because if practices are managed per community, there are also things you won't see unless you have a vantage point for watching across all projects. So: treating this as a tool like we would treat almost anything else — like our templating system or something like that — that editors can use. Aaron, do you have any specific comments on that? Obviously you're doing a lot of the AI work.

So, the current status of things: there are a lot of tool developers who generally target individual communities, but some of them do target Wikimedia projects generally. There are a lot of tools that help with counter-vandalism, and SuggestBot is an example that targets individual wiki communities. But I don't think that we have anything that sounds like UX building blocks. I think that we do have service building blocks. I'm not sure that we have a best practices manual, but we do have a little bit of this per-community user interface design that serves as an interface to an AI. Interesting next step. Anyone else?

Categorization is an important part of both Wikipedia and Commons, and the other projects as well. I think maybe AI can be used to suggest categories for something where we just don't know the right placement. This issue is especially important on Commons, because there are lots of files that don't have any categorization, and because of that they are impossible to actually find and reuse in the future. So that would be kind of fun to use.
And just because categories mean different things to different people, can you maybe elaborate on what you mean specifically by categories?

Yeah, sure. Now that we have Commons, for example, AI could put categories on an upload so that it can be better found. So, for example, a picture out the window here could be categorized under, say, San Francisco, the Presidio, several other categories. Then in the future, someone who wants to find pictures of San Francisco can browse a comprehensive tree of categories. But if you simply take a picture and upload it without any category, the picture simply gets lost, because nobody can find it. Categories — that metadata — allow people to reach the content. This is especially important because, sure, you can use the search bar, but you can only find images by title, and titles aren't always useful — sometimes I wrote DSC-001, and that doesn't help. It's probably very difficult to write an AI to find a category for a picture, but...

If I understand correctly that categories are just a fancy name for tags, as in the software world, then a lot of other software platforms are using AI to suggest tags, so that would be an amazing use case — if that's not what you meant.

Does anybody have thoughts on whether we should be suggesting to editors the tags they should be adding, or whether we should auto-generate the tags and apply them directly?

Suggest?
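To make the tags-versus-categories point concrete: once files or articles carry category sets, even naive set overlap gives a "related items" signal that could back a suggestion UI. A toy sketch with invented titles and categories — loosely in the spirit of treating categories as plain tags, not any deployed system:

```python
def jaccard(a, b):
    """Overlap of two category sets: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b) if a | b else 0.0

def nearest_articles(target, categories, k=2):
    """Rank other items by how many categories they share with the target."""
    sims = [(jaccard(categories[target], cats), title)
            for title, cats in categories.items() if title != target]
    sims.sort(reverse=True)
    return [title for _, title in sims[:k]]

# Invented example data: items with loose tag-like categories.
cats = {
    "Golden Gate Bridge": {"San Francisco", "Bridges", "Landmarks"},
    "Presidio":           {"San Francisco", "Parks", "Landmarks"},
    "Brooklyn Bridge":    {"New York City", "Bridges", "Landmarks"},
    "Eiffel Tower":       {"Paris", "Landmarks"},
}
print(nearest_articles("Golden Gate Bridge", cats))
```

For an uncategorized upload you would run this in reverse: compare the file's known signals against already-categorized items and suggest the categories its nearest neighbors carry.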
Right now, when you save an edit — there's a gadget that's pretty popular that does something like this, marks it really quick — this is kind of the same thing, but with the minor edit checkbox: it seems so strange that I have to tell the software that my edit is minor. Maybe people just feel really differently about what counts as minor — that's probably another problem — but it would be great if the checkbox would just set itself.

Going back to categories and tags: even if you just do a suggestion, you can still leverage search as well. Even though we only surface the one the editor picks, we could still, under the covers, be running searches against the others — though syncing that might take a long time. It still doesn't fix, but it improves, the way that the categories are defined, because at the moment, as Josh mentioned, you might have a category like "men wearing red masks". Is there a way around that, Aaron?

I think she means: can we organize the categories into hierarchies or other structures artificially? Because right now they're not actually a taxonomy; they're just a loose list of tags at different levels of a hierarchy. Can we build a taxonomy out of the categories? I'm not sure what we could do with a prediction model, but there's been some research published about ignoring that there's any sort of hierarchy to categories at all — just assuming that common categories between articles indicate relatedness, and using graph methods on that. You could, say, take an article and ask: give me all the articles that are closer to this article than, say, "military" and "food" and "art" and that sort of stuff. That works really well for carving up a wiki. On English Wikipedia, WikiProjects work really well for that too, because they have essentially a single-layer hierarchy and no loops — loops being really the biggest problem of the category hierarchy. So, yeah, I'm not sure
that would be a prediction exactly, but we do have some strategies that might actually manifest as a service that looks a lot like an AI. We could still do it — I'm not saying it's not doable — and it's anybody's opinion what you call an AI these days.

On whether we should be telling people that an edit is minor: if there were more options, you could offer them as part of a Wiki Labels-style campaign — almost training the model the opposite way. You could ask the people who are already attuned to the environment; they just want to keep trouble out of it, but they can make those calls — or just use the AI to set the default state. So using AI to suggest is probably a good thing to think about, as opposed to making decisions and putting things on articles or edits directly — using them as default suggestions for people, which also, as she said, trains the AI. That makes it a little bit different from other software: this is a community, so you can ask more of them than, you know, Facebook or whatever can ask of a normal average user. So I would say definitely go for it if you can get more AI in right there.

I just wanted to go back to what she was saying in terms of standardization. I'm very much in favor of standardization in free software — good standardization; I think we need that a lot. Sorry, but Wikimedia in general is just all over the place; the style guides are amazing. I think AI can be a really good influence here, especially because the AI actually learns from the community, so it's not like someone is dictating something: it learns the best practices, the community decides, and it propagates and compounds. This applies to writing titles, citations, so many things like that — writing articles, commenting, so many things. Just a little bit of suggestion, even something like what OPower does with utility bills — I
think they say 30% of the people in your area did this — something like that. Not forcing people to do the right thing, but nudging them to follow what the community really does. It's still a suggestion, but usually people want to do the right thing, or the better thing for other people. So: standardization in general, and interweaving stuff. Also, if I can add to that — sorry, I have a whole list; I have a list of grievances.

Just to tease that out, I think what I would add there is that it sounds almost like education. I think some people would see AI as replacing some of the roles of the community, but I like the way that you put it: it's empowering the community to affect larger spans of articles than they could ever do just by themselves manually, by training an AI with their taste and their concerns and deploying that across the whole corpus. I think that's interesting. Not to plug a project, but Wikidata has the ability to attach descriptions to articles, which we surface within the app. And I think somewhere AI could be helpful is, say, you're looking at an article about a park in New York and you want to add a description: the AI could tell you, these are all the articles about parks that have a human-written description, and this is how they format it. That lets the editor say, okay, I should format the description as "public park in New York City, United States". Just helping people, as well as suggesting established patterns.

But on the comment about standardization: AI carries your concerns and your views as the editor and as the reader. I just want to say that, in general, machine learning and artificial intelligence is a very hard task, and I think industry in general is not doing a very good job. There
are many parts of industry currently encouraging and incentivizing people in ways we shouldn't be on the edge of. I would just be very careful about these kinds of incentives — about what the ratings are and what we take on under the umbrella. It shouldn't stop us from trying, but we need to keep in mind that as soon as we start shaping visibility in editing and reading, we're entering an editorial process and inserting our own voices.

Are you trying to hint at the recent elections, and how machine learning shaped how stories and news were written and surfaced? It's all in the system, in some basic form, in some sense. So: things like that are among the social considerations with AI. I don't know if we got off track.

I used to sell software that did document review for lawyers through AI, and what you would always hear is: the computer's not going to be as careful or as good as a trained lawyer. It turns out that the computers were better — but the trained lawyers had never been measured, so they thought they were better. They thought they got 100% of the classifications correct, and it turns out, when you measure it systematically, they only got 92% or 93% correct, or whatever. So with the bias issue — I think it's a legitimate issue, but those biases already exist. The AI is not making them worse; it's making them measurable, essentially, because now you can see the bias and measure it with a computer, and then it's in your face. And the owner of the AI gets held responsible — this machine that I made is biased — when it's just holding up a mirror to a bias that already existed and that no one noticed, because no one had bothered to measure it explicitly. But it is a hard...
You can say that to people all you want. I had friends who worked on the grammar checker at Microsoft, and their thing was that they had lots of studies showing it was 99% effective and correct, and they would still get emails saying: your grammar checker is stupid, it got something wrong that my high school English teacher would get right, or my idiot brother. Because the computer has to be perfect; the human doesn't have to be perfect.

I think the other side of that is that while it might expose the biases in a particular system — just like in analytics — what we choose to measure also introduces inherent bias. So if we're looking at a recommendation system, and we're measuring sessions of a certain set of people for that recommendation system, what we're choosing to measure also introduces a bias. So it cuts both ways. Were we off track?

So, one of the proposals that I linked to at the beginning, talking about this draft quality model: I think there are a lot of cases around quality control for our wiki projects where we started out thinking that the only thing that was valuable to think about was good and bad, because we needed to get rid of the bad stuff and let the good stuff stay. But we learned after a while that there's more nuance. It's not just a dichotomy — there's good stuff, and good stuff is always good, but within the bad stuff there's unintentional bad stuff and intentional bad stuff. There might even be intentionally bad stuff that's not really trying to hurt people: key-mashing and then hitting save to see if it works is very different from putting racial slurs into Wikipedia. So I think, as sort of a general product strategy, we could take the values around this nuance and encode them into both the AI and the interface. Coming back to that proposal about new page creation and the attack/spam/vandalism models, we can have interfaces that allow people to target the bad and make
space for them to interact with the good, or to socialize the almost-good. The newcomers who are legitimately trying would totally appreciate some feedback and would probably do something useful with it. Essentially what I'm imagining is splitting the review streams into separate parts, where people review the terrible, nasty stuff in one stream and review the stuff that's good or almost good in the other. We can do that with the good faith and damaging models for edit patrolling, and we can do it with the spam, vandalism, and attack models for new page creation.

When you talk about splitting that into a separate stream, are we talking about labeling edits, or giving that information as part of the review process? I think it's different work to remove the damage, so when I say split I really mean a whole different UI, maybe a specialty UI, where one community focuses on damage and another community focuses on socialization. They might have some overlap, but yes, splitting it in a way where people would even think differently about what it is they're looking at.

One thing we were talking about in our group session was that things like the code of conduct and its limitations could be supported by an AI interface. That would be the social, community part again, so harassment could be easily flagged, or even met with a template response.

Could you say more about what you mean by limitations encoded in a policy? Basically, going over contributions on talk pages, or on mailing lists, which is technically a different channel, and watching for certain patterns of harassment.

That reminds me of an AI I forgot to bring up: we already have some models that detect personal attacks in messages posted on talk pages, and they could reasonably be used for mailing lists too. I think it's very interesting to imagine what we would do with that. Would we warn people before they
post? Would we notify administrators that they might want to review something that looks like it's becoming a tense discussion? Would we maybe want to prevent that post from being made altogether?

What do you think? It would be really interesting to have results about what happens if I'm writing something offensive to you and I'm told, and encouraged, to be more polite. Do I just disguise the message, in which case it should actually be flagged as, okay, you're being passive-aggressive about it? Or do I actually change what I said? I think experimenting with that matters, and it's also an area where making mistakes has serious consequences. Getting these kinds of answers, and I don't know if there are any results yet, would be really interesting, especially because from my point of view a lot of this harassment eats up a lot of contributors' energy, so there would be an opportunity to address that. I don't know what the right mechanism is, whether it's a mechanical voice that shouts at you, but it's really interesting.

I think one great theme that has come up across several of these ideas is that we're not the kind of community that will accept the computer just making some black-box decision: no, we're not going to send the email because it looks rude; we're not going to let you make the post because it looks like vandalism. Instead we're going to tell somebody, this looks like vandalism; we're going to tell you, it looks like you did something bad. It's suggestions and, at most, paternalistic nudging, rather than the bots enforcing our rules. For all of these, that seems to be the general rule.

Just one more addition: it would, for example, take weight off people's shoulders if we had appropriate answers that are also researched, like
templated answers for the contributors to use in addressing those issues. That might be a further step.

Also in this area, even really small details can tip things in one direction or the other. There's a case from Facebook where they improved the way people report issues with an image just by adjusting the language in which they were communicating. I'd be really interested to see this kind of feedback from the tools we're talking about, adding these tips for editors or users or administrators, and I'd be just as interested in feeding it back into our processes. For example, maybe you see that there are a lot of really negative interactions on a certain wiki, and you develop a UI that shows a picture of a random person on the receiving end: this is Bob, he's a volunteer on this wiki, I hope your message is kind enough. Or a quote from Thomas Jefferson: when angry, count to ten; when very angry, a hundred. Something like that. I think we need to let this influence design decisions and products, and be aware of it in a very explicit way.

One idea, going off the harassment stuff: on talk pages, it would be interesting, alongside the good faith and damaging models, to see how many really good articles someone has contributed to, almost like seeing someone's scorecard. If you're an editor reviewing someone's edit, knowing that this person is in general a very good editor, or not a good editor, might change how you deal with that edit, whether you have a conversation, and so on.

I've run some studies on whether you can take these edit quality prediction models and then infer things about the editor, and it really takes only two or three edits before you can confidently predict whether somebody is bad-faith or not. I think it would be really interesting if we could do the same thing with these types of talk page posts. Then we wouldn't just say, this person has contributed to a lot of
articles, and then forgive them for being a jerk. We would say, well, is this person generally a jerk, or are they just having a bad day? That would probably be very good to know. I'm not sure how we could surface it without getting in trouble, but we can always run the experiment.

One of the things you could do on talk pages: anger most of the time is bad, but it can also be warranted in some instances. What I'm trying to say is that rating an entire talk page, rather than just "this edit is good, this edit is bad," can also be interesting as a holistic view: is this a toxic talk page in general, or a healthy one? Maybe the components of the talk page, looked at individually, are all bad, but when people look at the entire talk page, what came out of it was good. She was right when she said there are natural classes in how people interact in general that we have to be careful around. That's one of the things that's very important in design: when you design the AI, the assumptions you make, as you said before, are very critical. So try to make it holistic, and also think about scoping: part of the talk page versus the entire talk page. These are different ventures, like whether it's improving, and what the outcome was. As I said before, what's novel in this community is that there would hopefully be a way to provide feedback at some point in time.

The biggest problem, I would say, for you folks is that you are data-hungry, so scoping is bad for you. You don't like scoping; you want more data so the model becomes better. So when I tell you to scope, I know I'm throwing a lot of contradictory things out here, but reality is a little bit complex.

So why do you think scoping is
important? Well, as I said, a talk page is basically people having an argument, or sometimes several arguments, even like between you and your spouse or your boss or whoever. They were emotional, they were heated, maybe they were screaming. If you just look at it with, sorry, I'll call it a dumb AI, component by component by component, I would say this was a very bad conversation, they really had a horrible time, and I would give it a score of minus three hundred if you like. But if you talk to the people afterwards, maybe it was actually good. Maybe it saved your job or your marriage or whatever, because you finally realized something about the other party. So this is why scoping matters, and with it standardization of what we are talking about. If you don't do scoping properly, you will just say that for the entire English Wikipedia, this is the standard, this is what you folks mostly think is the standard, and you should follow it. But if you really do some scoping, and I won't deny that you are still making a judgment about which scoping and how, maybe you will find some pockets or cultures within the English Wikipedia where the templates and standards need to be a little bit different. For political articles it's a little different than for economics articles, or for the big authors. The classic example is BLPs, biographies of living persons, which tend to be more controversial than some other areas. Without scoping, you just flatten all of that.

I'm hearing many of you typing in there, but here's a question. I know we can look at the individual edits on talk pages and find out where they were bad, and we can look at the whole talk page, but can you actually detect whether the resolution was positive? I feel like that's a different problem, because there might
be a lot of bad stuff still on the talk page, but it came out well. There's some past work looking at resolution in article deletion discussions. Researchers were able to make some sense of the quality of a discussion based on whether the deletion decision was ever changed. So if, say, we decide to delete something, or decide not to delete it, but upon review it had to get undeleted, or deleted after all, then that's an indication the discussion didn't progress the way it ought to have. I think we might be able to say that discussions leading to consensus are generally good, but a lot of discussions shouldn't lead to consensus, because there's no consensus to be found either for or against. So yes, I think that would be intensely difficult. Where you have a clear outcome, like a deletion, or a page being created, it would be a lot easier than something like whether a paragraph was added and a subject was covered in a fair way, or good article content.

Even then, with these analyses, people were running models that can absorb large amounts of error by using large numbers of observations. But when we have an interface that's trying to affect these discussions, it's working with an individual example. Essentially, it's the classic problem: if you want to predict what a population of people will do, it's easy; if you want to predict what an individual will do, it's nearly impossible. I think this is a similar situation. If you want to talk generally about what happens in these discussions, then yes, we can do that; it's going to have a lot of error, but that's okay in the aggregate. For the individual cases it might be different: flagging things, probably fine; deciding, probably not so good.

Anybody else? I don't know who you are, but thank you. I think we're, what's the time? 3:40?
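The stream-splitting idea discussed above, one review queue for likely-intentional damage and another for good-faith mistakes that deserve socialization, is ultimately just thresholding on the damaging and good faith model probabilities. A minimal sketch follows; the thresholds (0.7 and 0.5) and the stream names are invented for illustration, not tuned values from any deployed patrolling tool, and in a real system the probabilities would come from the ORES web service rather than being passed in by hand:

```python
# Sketch of routing edits into separate review streams using the damaging
# and goodfaith model probabilities. The thresholds and stream names are
# illustrative assumptions, not values from a deployed system.

def route_edit(p_damaging, p_goodfaith):
    """Pick a review stream for an edit, given model probabilities in [0, 1]."""
    if p_damaging >= 0.7 and p_goodfaith < 0.5:
        # High damage, low good faith: likely vandalism, send to damage patrol.
        return "damage-patrol"
    if p_damaging >= 0.7:
        # Damaging but probably well-intentioned: a newcomer to socialize.
        return "socialize"
    # Probably fine: no special review needed.
    return "no-review"

print(route_edit(0.9, 0.1))   # likely vandalism
print(route_edit(0.8, 0.9))   # good-faith mistake
print(route_edit(0.05, 0.95)) # routine edit
```

The point of the two-dimensional split is exactly the nuance argued for earlier: "damaging" alone can't distinguish key-mashing from slurs, but crossing it with "good faith" separates the people to revert from the people to teach.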
8 minutes. So we can keep going with a couple more ideas, or maybe just wrap up, I guess. What do we want to do with this information? You're typing pretty hard over there; I think you want to use this for something. Got any thoughts?

I'm typing the wrap-up. What's your take?

Generally what I do with these kinds of sessions is somebody sits down with the notes for several hours, summarizes them, pulls out proposals, and tries to contact the people who proposed them or expressed interest in them during our discussions today, to see what it would take to at least bring them to the point where they aren't lost here. I had a session earlier today about a wishlist for the kinds of AIs we want; now we're connecting them to products, which is what we did here. My plan is to take all of those proposals, laugh at the funny ones, but take the others and turn them into Phabricator tasks, and start pulling resources together to see what it would take to actually turn them into an AI. I think doing the same kind of thing with these interface ideas would be a great step forward, because I might not carry them forward myself, but somebody might, and they'd benefit from having the discussion we had here collected.

Yes, and to build on that: some might not know this, but on the reading team we have been starting to work with research to figure out how to put AI in the product, so we're trying to solve exactly this problem. As I mentioned at the beginning of the session, Josh and the other PMs are starting to come up with some ideas. So what we can do with this is take it, actually augment that list, and start having more discussions around some of these ideas and how we might bring them into the products. And I guess beyond that,
over the next quarter we're also actually making a proposal, a plan for the executives, to figure out how we're going to implement this: where the responsibilities sit, whether we need more people to help us. So yes, there's going to be potential for input, like, hey, AI is a big part of our future with Wikimedia, and you might actually help us get people hired or form a team around that.

So I guess we're done, right? Wrap it up. Good. Thanks, everybody, for coming. You really killed this; you're a live-shorthand man, you could have a career as a stenographer. Next quarter we'll just be talking about the product ideas we came up with; most of us think that seems appropriate anyway.

One more thing; it's a bit of a stretch and a lot of extra work, but: taking the ideas and putting them down in a public space. To be fair, I mean taking the ones we actually think are worth it but can't do ourselves, putting them up, and saying: if somebody is going to do this, they're going to have to do some engineering work, in one of the third-party project management tools or something like that. Then, if we got there, we could use that to build out the phases of a proposal. I'm hoping to bring a few of the ideas from the AI wishlist to the IdeaLab. Like I was saying, in our community the IdeaLab is pretty well regarded, and once we actually get to that point, people will see that you've been discussing this publicly for a while, even if it's really only four months of putting it in the IdeaLab and discussing it. It's kind of a
hidden gem, and it's a good one. People should be encouraged by that, because we're an operation where it's important for people to be able to talk to you about this. One thing we should talk about is where to put all this, so that we can contribute to it effectively and so that people looking into this sort of stuff can find it. It almost seems like we should have an AI, machine learning, or algorithm-focused project that socializes the ideas. There have always been a lot of ideas; some have been forwarded, some implemented. If we had an index of them, then if you had an idea you could set it down there, whether it's something that needs more research or something more global. I think a lot of people are going to be happy that we're taking a hand to this stuff.

I was just going to say, that's a great idea. That's why I think the IdeaLab would be a great place, because it's already active. Here's what I think it will take to put this together. The IEG process is essentially: take your idea, write a proposal, and propose a small grant for it. An alternative is, okay, take your proposal and make a business case for the annual plans, or the grants, sorry, so that the advancement teams can help you get a major grant. We might even be able to do one ourselves, just to get more people signed onto the process, whether with volunteers or something like that.
With AI in the front end, it really helps when I work with researchers if I can say: hey, here's a problem we've been discussing for a while, here's where we can get some training data, do you want to look into this? Do you want to dig into it? That's how it has worked a couple of times. If researchers were part of the decision from the start, which I'm not sure about, then we would be having those conversations earlier. So it's interesting.

And finally, there's language. A lot of this has started on the English language, but we need to make sure it doesn't stay only on the English language, because other projects have very different ways of speaking.