All right. Good morning, good afternoon, everyone. This is Dario from the Wikimedia Foundation's research team. Happy new year, and I'd like to welcome you to the January edition of our monthly showcase. As usual, a quick introduction: we have a guest speaker today, Yan Chen from the University of Michigan, and our own Caroline Sinders, who will be presenting in the second part of the showcase. Very excited about these two presentations today. Yan has been here in the past. She's been doing extensive work on incentives to participation in various types of communities, and today she's going to present about expert participation in Wikimedia projects. And Caroline's going to talk about methodological challenges in understanding a form of digital stalking that we call wikihounding. The format is going to be the usual: we're going to have about 25 minutes for each presentation, followed by a short Q&A, and there will be ample time towards the end for anything else. Diego, who's here on the Hangout, is going to be our IRC host, so please relay to him any questions you may have, and he will ask them at the end of each presentation. So with that, Yan, the stage is yours. Thank you. Thank you so much for having me here. Very exciting to visit the foundation. So I'm going to present a joint project with Rosta Farzan from Pitt, Bob Kraut from Carnegie Mellon, and two of our Michigan graduate students, Iman YeckehZaare and Ark Fangzhou Zhang. This is a field experiment on motivating contributions to public information goods. I'm going to skip most of the motivation part. User-generated content is important, and what we're trying to figure out is what motivates domain experts to contribute to public information goods. Because these are public goods, we know they're vulnerable to the free-rider problem. Lots of our colleagues say, oh, I just read this Wikipedia article, it has this problem, but they don't fix it. So we want to look at two factors. One is how motivating social impact is, which we manipulate through the number of readers, the number of views. The other is how motivating private benefits are, such as, in this case, citation benefits. I will skip the introduction to Wikipedia. Here's an example which shows something that has a problem that experts can easily fix. This is the article on instrumental variables, an econometrics subject. It says that the method of instrumental variables is used to estimate causal relationships when controlled experiments are not feasible. This was actually accurate when the method was first developed, but right now it's used extensively in experiments, where the randomization itself is oftentimes a good instrument. So if someone studies instrumental variables or uses them, it's very easy to actually update this information. I'm going to do a very quick literature review, which says there is an extensive literature of field experiments on public goods; there are two landmark surveys of the lab experiments and field experiments on charitable giving and public goods. There's also a large literature on what motivates Wikipedians who are already editing, who are already insiders. So this is a very incomplete list. What we're interested in is how you motivate outsiders, domain experts who are not editing yet. And as far as we are aware, Dario and colleagues have a survey study that looks at what motivates domain experts.
And we're going to look at a field experiment, so methodology-wise it's quite different. I'll dive directly into the experimental design. We implement a two-by-three factorial design. On one dimension, we look at social impact. We downloaded and processed a data dump a month before we started our experiment and computed the average number of views of a Wikipedia article, which was 426. We then select only articles which had been viewed at least 1,000 times, but we vary the information that we give to the experts. The other dimension is the private benefit dimension. We either don't mention a citation benefit, or we mention that we'll recommend articles to you that might cite your work, that might include some of your publications in the references. And the third level pushes the private benefit further by adding: we'll acknowledge your contributions publicly on the WikiProject Economics page. So this is the basic design, and this is a view of the basic design and the number of observations in each cell. Altogether, we intended to treat about 4,000 domain experts. These are economists; I'll talk about how we selected them. The people who actually opened our first email, which is the treated group, is about 84% of these, so 3,288. We retrieved our participant information from RePEc, which is a working paper archive used in economics. Their data use policy does not prohibit the crawling of emails, which we use. The other convenience is that they have an open paper archive, so we can match the experts' paper abstracts with the Wikipedia articles to ensure close matching of expertise. We can also get their ranking information, which will be interesting as a control. We select experts who have at least six articles in English on RePEc, and this is because of the accuracy of the recommender system. If we recommend one article, it might not hit, and we're hoping that with five to six recommendations we have a good chance that some of them are close matches. This is an overview of the distribution of the number of papers or publications by experts on RePEc. We basically ignore the experts with one, two, or three papers; more than 99% of our experts have at least four papers uploaded on RePEc, and 83% have at least six. For article selection, there's a set of criteria: they are not edit protected, they're not a stub, so there's something to comment on; they have to have at least 1,500 characters; and they've been viewed at least 1,000 times in the past 30 days. So every article that we recommended has been viewed at least 1,000 times, but in some of the emails that we send out we don't tell them that. The experiment is a three-phase design. In the first phase, we send out a personalized email, which I'll show in a moment, an invitation to the experts. That's where we implement the treatment. In the second phase, we recommend relevant articles to the interested experts. And in the third phase, we send a thank-you email, and we also give them the URL to the posted comments on the talk page of the relevant Wikipedia article, along with a tutorial on how to edit Wikipedia articles. And the reason we do that is because they don't actually go to Wikipedia when they make their comments.
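As a reading aid, here is the two-by-three design laid out as a grid, compiled from the description above; the per-cell observation counts are on the slide and not reproduced here.

```
Private benefit mentioned:   none        citation     citation + acknowledgment
Average view info (426)      cell 1      cell 2       cell 3
High view info (1,000+)      cell 4      cell 5       cell 6
```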
So here's an example of a personalized email. The first paragraph is the same for everyone except the field of expertise, which we retrieve from RePEc. In this case it's behavioral and experimental economics; it could be labor economics or industrial organization. Then we anchor everyone's beliefs by telling them that a Wikipedia article is viewed on average 426 times each month. This person is in the high view condition, so we say: if you're willing to help, we will select only articles with over 1,000 views in the past month. So everyone knows the average view count, and this person additionally sees the high view information. This person is also in the private benefit condition, the citation condition, which says that the articles might include some of your publications in the references. When we send out the email, these two small paragraphs are randomized, so sometimes they see the public impact first, sometimes the private benefit. Then they can click, and Bob and I sign our names. If they say yes, we immediately send them an email saying, here are six articles. We again reinforce the experimental treatment: we say these articles might refer to some of your research, and here's the number of views in the past month. These are the actual numbers of views, and you see that they're way above 1,000. If they are willing to review an article, they can click, and where this takes them is our server. So this is actually not Wikipedia. Iman built this interface, primarily to lower the entry costs for the experts, because of the 4,000 people we contacted, only one person said, yeah, I already edit Wikipedia articles. We have a split-screen design. On the right-hand side is the Wikipedia article that we recommended. On the left-hand side, as long as they can type, they can make comments. They can also make referrals. So this design basically separates the experts' comments from the incorporation of those comments into the Wikipedia article; it separates the two processes. The emails are always sent during the daytime in the expert's local time, as inferred from their primary affiliation. And 84% of the people opened the first email; we treat them as the treated group. If they respond yes, we immediately send the phase-two email. If they say no, they're dropped right away. And there's a third type of response, which is that they don't respond. If they don't respond, we send up to four reminders over four weeks, and if they still don't respond, we drop them. Once they submit comments, the comments are verified before being posted to the article talk page. So what happened to these comments? Iman developed a bot called Expert Ideas Bot. Last time I was here, I was trying to figure out how to get the bot approved; it was approved. The bot posts the comments on the article talk page and alerts the Wikipedia editors watching that page. There are three scenarios for the comments. We curated about 1,200 comments in this experiment. The best-case scenario is that the Wikipedia editors incorporate the comments. The intermediate case is that they comment on the comment but don't incorporate it. And the worst scenario is that nothing happens. As a little side project, which is not part of this, I worked with the Wikimedia Foundation to incorporate some of these comments using my game theory class. It became a homework assignment, and students actually really liked it. We checked four months later, and 100% of the edits stayed.
So that attests to their quality. Our hypotheses are derived from a very simple theoretical model, and I'm just going to talk through it very quickly. Basically, we treat the Wikipedia article as a public good, and the number of consumers of this article, n, is the viewers. The contribution level is a, and it's costly to contribute. The social impact is captured by the first term, V(n) times (y + a): how much you contribute improves the quality y of the existing public good. Contribution might also be privately good for you; that's the W(n) term. But there are two types of costs. One is that you could have used that time to work on your own paper, so there's an opportunity cost to your time. And the last part is that the editing process itself is costly. If we assume a quadratic cost function, we can solve for a unique optimum, and that gives us a set of hypotheses. All four hypotheses are what we call comparative statics: we take the optimal solution, keep everything else the same, and vary the parameters. The first says that experts should be more interested in contributing when the citation benefit is made salient. The second says they should be more interested when we report a higher number of views. The third hypothesis says that an expert with a higher reputation will contribute less, because the opportunity cost is higher. And the last one says that better matching between the content of the public information good (the article) and the expertise leads to increased levels of contribution, provided a necessary and sufficient condition is satisfied, namely that the benefit is high enough. So we're going to test these hypotheses.
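As a reading aid, here is a minimal formalization consistent with the verbal description above; the specific notation (V, W, ω, c) is assumed for illustration and may differ from the paper's.

```latex
% Expert i chooses a contribution level a >= 0 to an article of current
% quality y that is read by n viewers:
U_i(a) \;=\; \underbrace{V(n)\,(y + a)}_{\text{social impact}}
        \;+\; \underbrace{W(n)\,a}_{\text{private (citation) benefit}}
        \;-\; \underbrace{\omega_i\,a}_{\text{opportunity cost of time}}
        \;-\; \underbrace{\tfrac{c}{2}\,a^2}_{\text{quadratic editing cost}}
% With the quadratic cost, the unique optimum is
a_i^{*} \;=\; \max\!\left\{0,\;\frac{V(n) + W(n) - \omega_i}{c}\right\}
% Comparative statics: a_i^* rises with V(n) (more views) and W(n)
% (salient citation benefit), and falls with \omega_i (reputation).
```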
And so let me just jump to the results. The first one looks at the first phase, the treatment effect on positive response. In the graph, the first two bars are the no-citation-benefit conditions, comparing average view, which is the white bar, and the high view condition, which is the gray bar. The middle pair is when you mention the citation benefit, and the last pair is when you mention the citation benefit plus public acknowledgment. The most astonishing part of the result is actually the baseline for us: if you don't mention the citation benefit at all and just mention the average view, the baseline positive response rate is 45%. So 45% of the people say, yes, I'm interested. That's very high compared to, for example, the American Psychological Association's campaign to over 9,000 psychologists, where the positive response rate was 2%. So basically, this says that to elicit contributions, you need to ask people, and it's important that someone in the community does the asking. The second result is that high view by itself, if in each pair of bars you compare the white and the gray bars, shows no significant difference. But citation in combination with high views significantly increases the positive response rate. Now I'll show you the regression analysis, looking at average marginal effects. This is the multinomial logit regression; to fit it on one page, I cut out the interaction terms, which are in the paper. What you see is that the citation benefit reduces negative response, that's the first blue line, by about 6 percentage points. And moving to the second panel, the bottom panel, citation at high view increases the positive response rate compared to the baseline by about 6 percentage points. So essentially, the two conditions combined give you another boost on top of the 45% response rate. And the last part is that acknowledgment at high view reduces the negative response rate by about 6 percentage points. So that's the first part, the average treatment effects. So in the acknowledgment condition, what exactly did you say? We say: it might cite your work, and we will acknowledge your contributions on the WikiProject Economics page. When we conducted the experiment, there were a number of economists who contributed, and then we organized it; I can show you afterwards. Yeah. So you said you found it was important that the request comes from inside the academic community? Yes, the emails were all signed by us; Bob and I signed every email. Was there a control group? We don't have a control group. The control group would be, you contact them but you don't ask them anything, which is bizarre. So it's all comparisons between treatment conditions. I mean, the natural control group would be if you don't contact them, or somebody contacted without the academic framing. Yeah, so we didn't run that condition; we don't have enough power to do that. The second part is essentially a continuation of the same analysis, looking at covariates. Author abstract views is how many times this author's abstracts have been viewed, which is recorded by RePEc. The lowest is 51 views, and the highest is 46,000; we normalize that to an interval between 0 and 1. We find that, as predicted by our model, a 1,000-view increase in the number of author abstract views is associated with basically a 0.83 percentage point increase in the likelihood of a negative response. So more famous academics are less likely to respond positively. And they don't hesitate at all: they either don't respond, or they say no; they don't hesitate at saying no. The second interesting part is social distance. Behavioral and experimental economics is our field, and behavioral and experimental economists are 21 percentage points more likely to say yes and 13 percentage points less likely to say no. So people with closer social distance are more likely to respond positively to requests to contribute. Here's an overview: we contacted 4,000 people, 1,600 said yes, and about 1,500 people opened the second-phase email, which is about 94%. Of all the people who said, yes, I'll do it, a third actually commented. So about 512 experts commented on at least one article; the average is actually two. Altogether we collected about 1,200 comments. The experiment was conducted between May and the end of November 2016.
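For readers who want to see the shape of this analysis, here is a minimal sketch of a multinomial logit with average marginal effects in statsmodels; the variable names and the synthetic data are placeholders for illustration only, not the authors' code or data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 4000
df = pd.DataFrame({
    "citation": rng.integers(0, 2, n),        # citation benefit mentioned
    "acknowledgment": rng.integers(0, 2, n),  # public acknowledgment mentioned
    "high_view": rng.integers(0, 2, n),       # told "articles with 1,000+ views"
    "abstract_views": rng.random(n),          # author reputation, normalized to [0, 1]
})
# Synthetic three-way outcome: 0 = no response, 1 = no, 2 = yes (placeholder only)
logits = np.column_stack([
    np.zeros(n),
    0.2 + 0.5 * df.abstract_views,
    0.4 + 0.3 * df.citation * df.high_view - 0.5 * df.abstract_views,
])
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
df["response"] = [rng.choice(3, p=p) for p in probs]

X = sm.add_constant(df[["citation", "acknowledgment", "high_view", "abstract_views"]])
fit = sm.MNLogit(df["response"], X).fit(disp=False)
print(fit.get_margeff().summary())  # average marginal effects per outcome
```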
Now I'm going to move to contribution quantity. There's a huge variance: some wrote one-line comments, and some rewrote the entire article. And by the way, I also wanted to explore this: of the people who rewrote the entire article, every single one of those rewrites was deleted on the page, while the shorter comments are actually more likely to be incorporated. Here's an example of the type of edit. This is on the article Traveler's Dilemma, which is a game theory article. The original is what's in blue, and what the expert proposed is added in red. If you know the literature, the first sentence is a summary of the first experiment on the Traveler's Dilemma. And if you look at the proposed change, it actually summarizes the whole body of experiments, so it's more nuanced and complete. The expert is a professor of economics in Rome, and we know his expertise because he's published a paper on the Traveler's Dilemma. This is fairly representative of the type of comments we received. I'm going to skip these two examples and talk about the determinants of quantity. We use a measure called cosine similarity, which people are fairly familiar with; it's a standard measure of how close two documents are. We basically take the expert's abstract as the first document and the Wikipedia article as the second document, and compute the cosine similarity between the two. After we do that, we run a compound Poisson linear model to look at log word count as a function of the treatment and also a number of covariates. At this stage, the treatment variables themselves are no longer predictive of how much people contribute. But cosine similarity is fairly robust: it predicts how much people contribute. Basically, the more similar the recommended article is to your abstract, essentially a close match, the longer your contribution. And again, behavioral and experimental economists write longer comments. So that's quantity. We also have some results on overall contribution quality: higher cosine similarity also increases overall quality. How do we evaluate quality? We use raters. These raters are either doctoral students in information or economics, master's students, or junior or senior undergraduates. Especially for the juniors and seniors, we look at their transcripts and the courses they've taken, and assign comments based on their expertise. Here's the distribution of overall quality on a one-to-seven Likert scale. On quality, I'm just going to make one quick mention of a result, which is kind of fun. We look at self-citation, at people who come from different treatment conditions. Mostly it doesn't seem to make much of a difference, but if you're in the high view and acknowledgment condition, you're less likely to cite yourself. So if we say we're going to put your comments on this WikiProject Economics page, so there's a public acknowledgment, you're less likely to cite your own work. Some self-citations are justified, but we also worried a little bit about whether people come to edit just so they can cite themselves, and we show that there are certain experimental treatments which discourage this behavior. So let me quickly wrap up. This is the first in a sequence; we could actually scale this type of field experiment, which elicits interest and contributions from experts. We showed that to elicit interest, citation benefit at high view increases participation, and public acknowledgment at high view decreases negative response. We also showed that longer social distance and higher reputation decrease participation. In terms of contribution quantity and quality, the technology is very important, in the sense that the recommended articles should be close to an expert's publications; if it is a close match, then contribution quantity and quality are both higher. And the lessons learned echo the charitable giving literature: we know that you need to ask people to give, and who asks makes a difference.
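To make the matching measure concrete, here is a minimal sketch of TF-IDF cosine similarity between an expert's abstract and a candidate article; this illustrates the measure itself, not the authors' actual pipeline, and the two texts are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder documents standing in for an expert's paper abstract and a
# candidate Wikipedia article.
abstract = "We use instrumental variables in randomized experiments to ..."
article = "The method of instrumental variables is used to estimate causal ..."

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform([abstract, article])  # 2 x vocabulary matrix
score = cosine_similarity(tfidf[0], tfidf[1])[0, 0]    # value in [0, 1]
print(f"similarity: {score:.2f}")  # higher = closer expertise match
```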
And so this highlights a third component, which is specific to information goods, which is what you ask people to do. The matching between the quality of the recommender system, the recommendations matching the expertise, is important. And, I can't see the slide, but the other part is that the whole experimental setup is generalizable to other communities. All we need is a publicly accessible body of work. So arXiv, which the physicists and computer scientists use, is a natural second step in pushing this out to other communities. I think I've used up my time. Thank you so much. Thank you. Let's see, we could have a few minutes now for questions, then continue at the end, or move on to the next one. Is there any immediate question from IRC, Diego? We won't be able to take many. Yeah, we have two questions from Leila. Leila is asking: in slide 21, in the model on that page, n is the number of consumers, and n is assumed to be the number of page views, while the page view count can come from a smaller set of people, the consumers. Have the researchers thought about this and the potential implications on the theory side of the research? For example, do these models require independence between consumers? Because that would be violated in this case. So let me pull up the model. Does it require independence of the viewers? Oh, yes. And by the way, can you please hide the screen-sharing notification? Some people are commenting that they can see it. OK, so I'm trying to pull up slide 21. OK. So the question is whether this model requires the independence of viewers. As a theory model, it doesn't. The viewers can all know each other, but that still does not affect the comparative statics result, which is what we care about, which says what happens to the optimal contribution level when n increases. So that should still hold. I'm not sure whether that answers your question, Leila. We'll wait and see if Leila says anything. We have another question, from computermacgyver. He asked: did all the phase-two emails have the actual number of page views for each page in them? By page views, do you mean for the recommended Wikipedia articles? computermacgyver? Let's see. Yeah, we have a bit of a delay. In the meantime, there was also another question in the chat. On YouTube, someone was asking if there's any evidence of publications that are cited in Wikipedia articles getting more citations than others that are not in Wikipedia. I don't know if you know. Yeah, that's the question. So, whether the experts' published articles cite Wikipedia articles, we did not look at that. I think if it happens it's going to be rare, because there's this culture and tradition of citing the primary source. I've seen it occasionally, but in general, at least in economics, it's not common. I think the question refers more to whether the actual article gets more citations: if your article is cited in Wikipedia, does that imply more citations of your article? I see, if your article is cited in Wikipedia. So that was one of our original intents in implementing this condition: Wikipedia is viewed by the general public, and therefore, if your paper is cited there, it brings more awareness of your work.
And so that was the primary reason for implementing the private benefit condition, the citation condition. OK, do we have time for one more question? I think I'd like to put questions on hold; I think there are also going to be more from the audience. So if you can stick around until the end of the second presentation, we're going to have the second Q&A at the end of it; we're running a bit tight on time. Yeah, that's OK. All right, thank you, Yan. And Caroline, I'll let you take the stage. OK, how should I go about this? You're in the hangout, right? No, I'm not actually in the hangout. Sorry, can you give me one sec? I actually closed a whole bunch of my hangouts. Dario, can you gchat me that hangout link? Thank you. Looks like I'm in there. Here we go. OK, great. I'll see if I can do a presenter view, or if that will... nope, that will not work. All right, well, that's fine; I'll do without my notes. Hi, I'm Caroline. I am a design researcher with the Anti-Harassment Tools team at the Wikimedia Foundation, and I'm presenting a work in progress, I guess a project in progress, that I'm collaborating on with Diego and Jonathan from the research team. It's called Analyzing Wikihounding and Harassment. Quick overview: this is my team, the people I work with on a day-to-day basis, Trevor, Sydney, David, and Dalen. I want to give a little background on what our team is focusing on. We were formed in part by a grant from the Craig Newmark Foundation to analyze online harassment inside of Wikimedia. This is part of a bigger initiative to sort of bridge the gender gap amongst editors. We have four focus areas: detection, which we're thinking of as prevention; reporting; evaluation; and remedies. Some of those remedies do include blocking. My background prior to the Wikimedia Foundation: I worked at IBM Watson as a design researcher on chatbot software, and I've been studying online protests and online harassment for the past five years. I had a fellowship at BuzzFeed where I was looking at various forms of online harassment inside of commenting sections. I also did some work with the Mozilla Foundation, specifically as part of the Coral Project, analyzing online harassment inside of commenting sections as well. So I've spent a lot of time studying harassment. Usually I make an awkward joke that the way I got this job is probably by harassing people, but that's not true. I find harassment really fascinating, in a way, in how you think about how people interact online. How does digital conversation flow? How is it caught? How is it saved? How does it differ from the different kinds of conversations we have offline? I think online and offline are equal spaces to exist in. I don't think they're necessarily different, but I think the ways in which we converse, the ways in which we relate to each other, intended or not, become slightly different. We lose a handful of things when we start to converse online; we lose context. You can't see someone. You can't see the reactions on their faces. You have to go based off a variety of smaller things: written text, emojis or emoticons if they're used, different affordances that the platform can give you. Right now that can be things like GIFs, interactions like hearts, favorites, barnstars if you're thinking of Wikipedia, different things like that. You do lose some context.
One of the things that I think is really fascinating as a design researcher is thinking of all the different ways in which we can analyze conversation, and how conversation differs culture to culture and also from subculture to subculture. What are the affordances of the platform you're on? What are the different subcultures that exist there? What are the different kinds of communities that exist in those spaces? Do they differ when they're placed on a different platform? And also, how has the platform been designed, and how do platforms differ? So that's what I've been working on for the past five years. We have goals on our team, obviously. Our main goal is to reduce the amount of harassing behavior that occurs on Wikipedia projects. But we're also thinking about how we start to create more agency and autonomy, so editors and admins can resolve a higher percentage of the incidents of harassment that do occur around Wikipedia projects, and how we empower individuals and create tools, or at least research and awareness, so they can protect themselves across MediaWiki. Some of this does result in tools; a lot of this is also resulting in new research. One thing I will be showing is the Interaction Timeline. We're creating more user muting. We are looking into AbuseFilter improvements. We're specifically analyzing blocking tools. We are currently starting research on what a reporting tool could potentially look like across Wikipedia projects. And then eventually, perhaps, a dashboard for admins to track open cases of harassment. One of the things I would like to caveat with is that I think debate is really important, and I think open source projects need debate. But it's important to highlight that while debate and conflict are necessary in digital spaces, there is a difference between conflict and harassment, and a further difference when harassment becomes abuse. One of the things I also think is important to highlight: as individuals on the internet, we live all of ourselves on the internet. We should have the ability to be angry and upset online. We should have the ability to be tenacious and bold, and also the ability to be sad and upset. How do you start to think about emotions that exist in a collaborative space like that? How do you create the space to have conflict and conflict resolution, but also create delineations or policies or parameters around what is defined as inappropriate anger or inappropriate actions? I would like to highlight that conflict is incredibly important. What we're trying to do is not remove conflict, but rather create better ways to mediate, mitigate, and study conflict. So this is a wireframe we made; we're actually at the alpha version of this tool now. It's called the Interaction Timeline. One of the things we had noticed was that a volunteer named Sigma had made a tool called the editor interaction timeline. It was designed primarily to look at sockpuppet accounts, and we noticed it had also been used to look at cases of wikihounding. Wikihounding is a form of online stalking that exists inside of Wikipedia. So we started to think about what could be a better, more readable way to visualize editor interactions, so an admin could quickly scan and see the interaction history of multiple editors and get a much more visual view of these interaction patterns. Could they see very quickly whether something was hounding or not?
Could they see that there was a problem occurring? This is just a close-up of this; I can send you all the link to the tool. We're still testing it. While we were building this, at every step of the way we were showing variations of wireframes and getting user feedback from the community. After this presentation, I can share our community health page, where you can follow and track all of our work in real time. The kind of design we engage in is participatory design. We cannot move forward unless we have buy-in from the community, and that's really important when you're making things for an open source community, when you're making things with the community. But then we started to really wonder: what are the different forms of wikihounding? Is there one form, or are there thousands of forms? This led to us having a conversation with the research team and reading a bunch of different cases of wikihounding. This is a good time to share a caveat: harassment inside digital spaces is not just qualitative or quantitative, it's both. Harassment is incredibly contextual. For example, alongside wikihounding there's a friendly practice called wiki-mentoring, where someone will follow you around the encyclopedia and clean up your edits. The difference between hounding and mentoring is the context. One is an invitation: we're friends, we've had positive interactions, and I need help as a new editor if I'm making edits that are not quite correct. The context of wikihounding is to antagonize someone and push them off of the encyclopedia. But in general, at a high level, when we think of harassment even outside of Wikipedia, it's important to acknowledge that harassment in all digital spaces is qualitative and quantitative. All of the interactions that we have inside a space like Twitter or Facebook are analytical data: how large my friendship groups are, how many followers I have, how often I interact with one person, how often we favorite or retweet each other, how often a status is shared. All of this is caught and captured as a form of data. But on the other side, what is not caught in this data is the context of what I'm saying. What I've been interested in as a researcher is: can you use quantitative methods to analyze forms of harassment? There are some aspects you can. There's a form of harassment called dogpiling, which is often used in harassment campaigns. It's most visible on Twitter, where a handful of people can send a large number of users to attack one user. And if we think about the network analysis of that, we could probably start to very quickly build or analyze different models of what dogpiling looks like, almost at a mathematical level. You almost don't even need the qualitative aspect of it, because we're talking about volume. We're talking about volume of interaction.
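As an illustration of that volume-only point, here is a toy sketch of a dogpiling signal; the data model and the thresholds are hypothetical, made up for illustration, and this is not any WMF or Twitter tooling.

```python
from datetime import timedelta

def looks_like_dogpile(messages, target, window=timedelta(hours=24), min_senders=10):
    """messages: iterable of (timestamp, sender, recipient) tuples.
    Flags when many *distinct* senders converge on one target in a window."""
    events = sorted((t, s) for t, s, r in messages if r == target)
    for i, (start, _) in enumerate(events):
        # distinct senders within `window` of this message
        senders = {s for t, s in events[i:] if t - start <= window}
        if len(senders) >= min_senders:
            return True
    return False
```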
Something like sealioning, for example, which comes out of a meme, is when a user will ask another user, effectively, why, over and over and over again. So it's like: why do you think harassment's a problem? I think it's harassment because of X. Why do you think it's harassment because of X? And it goes back and forth until the victim is worked into a tizzy, or into an emotional state that will probably produce a negative response. The other person will then delete all of their questions, the ones they used to work this person into having that response, and then usually do a screen grab and say, look, they just got mad at me. We call this sealioning. That's incredibly qualitative. There are quantitative aspects to it, but what you need to see and read are the interactions between the two users. You actually need to read the conversation. You couldn't just look at the interaction pattern or timeline of when they're talking to each other; you have to see the content of the messages. So to hit it home: harassment in digital spaces is both qualitative and quantitative, and this is true of wikihounding. Can I ask a question? Yeah. Could you count the whys? You could, but you'd still need to see that they're whys. The thing I always try to bring up with people is, when you think of harassment on one hand, try to think of all the ways in which you've had a similar positive interaction. Think of a friendship, when you're debating with your friend, or you and your friend are talking about a TV show on Twitter. Why did you like that show? Why did you like this one? So, you liked it for these reasons. Oh yeah, I didn't think about that. But why this other scene? Oh, I also liked it for these reasons, right? So it could be challenging. Yeah, challenging in a positive sense versus... Totally. So what constitutes wikihounding? Well, a good place to look is the different definitions that exist on Wikipedia. There are a lot of long definitions, and I'll try to go really quickly through them. This one is: the singling out of one or more editors, and joining discussions on multiple pages, with the apparent aim of creating irritation, annoyance, or distress for another editor. Wikihounding usually involves following the target from place to place on Wikipedia. This sounds like a pretty decent definition. The problem is it doesn't really give you use cases; it doesn't give you clear-cut examples. It's a definition, but it's not actually defining a lot of things. It's not saying how many times you're followed. It cites irritation, but how could we argue whether someone is perhaps deserving of their irritation, for lack of a better word? The use case that often pops up when arguing something is not wikihounding is: someone was making not-great edits, and they needed to be corrected, and now they're upset. So the problem with this definition is that there are many workarounds to argue that someone who appears to be engaging in wikihounding is not actually engaging in wikihounding; they're protecting the content of the encyclopedia. This is why a use case, a very literal example, can be really helpful. So let's go to another definition. This is from the harassment page, and it's a lot more in-depth. Towards the bottom, you'll see it actually highlights that the important component of wikihounding is disruption to another user's own enjoyment of editing, or to the project generally, for no overriding reason. If following other users around is accompanied by personal attacks, disruptive behavior, et cetera, it may become a very serious matter and could result in blocks and other editing restrictions. I think this is a better definition. The problem is, how do users square the two, when we already have two separate definitions that exist inside of Wikipedia?
And then we have a third and final one, which is per policy, and which is referenced often on ANI. The policy definition of wikihounding comes down to reverting someone more than three times on the same page. So what we actually have here are three very different definitions that somewhat reference each other obliquely, but not transparently; they're not literally referencing each other. The previous pages did not outline anything about three reverts. So I decided, as a qualitative researcher, to read a bunch of cases on ANI. ANI stands for Administrators' noticeboard/Incidents. An admin is someone who volunteers and is elected to have more, for lack of a better term, rights inside of Wikipedia. Often admins will help mediate certain interactions, and this page, the Administrators' noticeboard for incidents, is at the beginning of the pipeline for reporting online harassment. It's where conflict goes to be mitigated, but this conflict is tried openly: anyone can weigh in, and it's often up to admins to mitigate and interact and come into this space and help provide feedback. And this is where the debate of what is considered harassment, of what is considered wikihounding, is played out. So when you have policy that is not specific about what defines wikihounding, like the first definition, where it's just talking about intentionality and irritation, then it's very easy to argue that wikihounding can be all these different things, or that wikihounding isn't happening. In the second case, where it gets a little more specific with personal attacks, we're starting to create a much stronger definition and really outline what the parameters of wikihounding are. And I think this outlining is incredibly important. And then if we look at the policy definition, we get into hyper-specifics, to where you can almost create a checklist of wikihounding. What's important about this is that it's a good basis for starting to think about wikihounding in a more quantitative way: if it's more than three reverts, you could start to actually write a model to analyze cases. Which is what we're potentially focusing on. So with Jonathan and Diego right now, we are analyzing cases that are labeled hounding and stalking from ANI. The reason this is important is that we have to have user-generated definitions. This cannot be something that's policy decided by just the people in this room. We have to engage with the community, especially on my team; we have to work with the community to define or redefine any forms of harassment. What's great about cases posted to ANI is that users, or rather editors, are coming and labeling: I think this thing that's happening to me is wikihounding. So it's already a user-guided definition. We're also looking at cases that have been labeled or resolved as hounding and stalking on ANI, because those are admin-labeled cases. So what we're actually getting here, without having to ask for it, are user-labeled cases that the community considers hounding and stalking, which is incredibly important. We don't have to infer; we can go based off this. Yeah, OK, that's a good question. So the three-revert rule, is that based on the cases brought up to ANI? So, I'm not sure of the history of that rule; I think it was created a while ago.
Actually, what you'll find, or what we found in reading these cases, is that very rarely do they adhere to the three-revert rule, because it's so low a number. So, when starting this collaboration, Dario tasked me with a great task that seemed very easy at the time and now seems much harder: finding canonical cases of wikihounding that we could use and analyze to start to build a model. What was hard about this, what I noticed, is that a lot of wikihounding cases were not labeled as wikihounding cases. They were wrapped up in other cases of harassment. So there would be a case with, say, a lot of personal attacks, and if you actually went and read the case and compared the editor interactions, the interaction timelines, it looked like there was a case of wikihounding hidden inside this other case, but it was only being labeled and discussed in terms of personal attacks. But one of the things I noticed, or rather the hypothesis, has been: can wikihounding be analyzed, before context, through time, frequency, and location? What do I mean by that? Frequency: generally, if interactions were going to be taken seriously on ANI, there had to be five-plus interactions. So people were not necessarily paying attention to the three-revert rule, even though it was being referenced a lot. Length of time: it generally had to be longer than 24 hours, and could be a little less than a month. And location had to be more than three pages, and this included an article page and an article talk page, because to try any case on ANI, you have to tell someone: I'm taking you to ANI. But here's the thing: the length of time can actually differ. What I'm curious about as a researcher is, what happens, or what does a case look like, if it's longer than a couple of months? Would the frequency have to rise? If the frequency is lower and the length of time is longer, is it wikihounding, or is it just random interactions between two users? So frequency, length of time, and location are all important to analyze. If you have, let's say, more than 10 interactions under 24 hours, that could be an easier case to argue that this is in fact wikihounding, a high amount of wikihounding. And especially if you start to add more locations to that, and especially if those locations are not considered similar ones. When you start to jump across different domain-specific pages, from makeup to spiders, say, that starts to build a much clearer case that this is a form of hounding: someone is going through and checking your edits. So one of the things we're working on now is breaking down these identified characteristics of cases and trying to plot them out. Does frequency rise as length of time continues, and is that considered, or labeled, a clearer case of wikihounding? Does frequency play into this at all? Does time matter? Is it about the number of locations, or can something be wikihounding, or argued successfully as wikihounding, with a minimal number of locations? These are just the ways I started to break down how to think about analyzing the data inside these cases. Is there a standard model of wikihounding? That's also a question. Probably not. And then, how do we start to break apart these different cases?
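Purely as an illustration of those three signals, here is a toy first-pass filter over one editor's interactions with another. The thresholds are the rough numbers mentioned above from reading ANI cases, not policy, and a real determination would still require reading the interactions in context.

```python
from datetime import timedelta

def hounding_signals(interactions):
    """interactions: list of (timestamp, page) events where editor A
    showed up on editor B's edits. Returns the three raw signals."""
    times = sorted(t for t, _ in interactions)
    span = times[-1] - times[0] if times else timedelta(0)
    return {
        "frequency": len(interactions),                        # 5+ taken seriously on ANI
        "span": span,                                          # > 24 hours, up to ~a month
        "locations": len({page for _, page in interactions}),  # 3+ distinct pages
    }

def worth_a_closer_look(interactions):
    s = hounding_signals(interactions)
    return (s["frequency"] >= 5
            and s["span"] > timedelta(hours=24)
            and s["locations"] >= 3)
```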
Now, these are not real terms inside of the encyclopedia. These are ways in which I have taken notes and tried to think about how you define and break down different buckets of wikihounding. Is there a short-term wikihounding, an escalated wikihounding, an aggressive wikihounding? These terms would have to be determined in conversation with the community, but these are internal labels I've been thinking about. And the reason I think about that is a project I worked on for Fusion a couple of years ago, now called Splinter, part of the Gizmodo group, on how you start to break down different aspects of trolling. I think of trolling and wikihounding as perhaps metaphorically related. Trolling is a very big umbrella term, and you can have a lot of different kinds of trolling; trolling can be positive and trolling can be negative. So how do you think about dividing up interactions of trolling? The reason I'm showing this is that it's kind of what I'm thinking about when we start to break down different forms of hounding. How would you plot out short-term hounding, or escalated hounding, or aggressive hounding? It's the way in which you can start to plot out or diagram what could be harmful and serious trolling versus casual and surface trolling, and analyze the differences between these cases. If you're interested in reading the story, this is a close-up of the diagram; I can send it to you. This is what we've done so far. We've created a very large 30-page document with notes on potential canonical cases of wikihounding, and this was built with our Support and Safety team; that's the team that will actually intervene if harassment is incredibly escalated and needs any kind of immediate foundation intervention. We started by identifying characteristics of those cases. Jonathan Morgan has grabbed 269 ANI archive threads that have the word hounding or stalk in the title, and we placed them in a spreadsheet, and we are currently reading and sorting and plotting those cases. It's a new project and we're looking for feedback, and we can have that feedback now, or if you want to wait, you can find us here; that's a link to our research page. So, thank you. Thank you, Caroline. All right, so now we have time for questions about this presentation, and also Yan's presentation. I can stop screen sharing if that's helpful. OK. I'm going to ask again, Diego, if there are questions in the queue, and maybe we can start with questions for Caroline first. Yeah, there are some questions. Is there anything in the literature, or anything you can think of, that looks like this tool in other contexts, like Twitter or Facebook? I think he's talking about the tool that you showed, the Interaction Timeline. So, throughout my research on online harassment I've had the delightful opportunity to meet a handful of designers and product managers at Twitter and Facebook, and they've alluded to the fact that this kind of tool exists, but they will never share their versions of it because it's proprietary software. So part of the downside of making this tool is that we recognize, my team recognizes, that we're probably reinventing the wheel, with no idea of all the leaps forward that maybe Facebook has made in creating something similar. How are they analyzing larger harassment interactions, between three people, to ten people? The tool would have to radically change. How are you plotting that inside a specific kind of graph? What worked best for their designers?
What was the most serviceable way to look at this kind of content? Most likely, a lot of the things that we will be making public, because we can make things public at the foundation, have probably already been invented in some form at another company; they just don't share their research. OK, there's another question here, from Leila. She asks: based on the presentation, it seems that the intent of person X in doing what they are doing has an impact on whether a case is wikihounding or not. If this is correct, how are you planning to capture the intent of the users? So, that's specifically why we're going based off cases that have been labeled on ANI as wikihounding or stalking. We're not trying to guess based off more benign, or unlabeled, or innocuous cases. We're going specifically based off cases that have already been user-labeled, and cases that are at this point also resolved and have been labeled by admins as hounding and stalking. That's incredibly important. We don't want to create, I would say, a very, very noisy dataset of things that... I'm trying to find the best way to say this. There are plenty of cases, and this came up in multiple conversations we had with SuSa, where wiki-mentoring can look a lot like wikihounding. But if you're being wiki-mentored, you would not bring a case to ANI and label it wikihounding. OK, Dario, I think you have a question. I do, yeah. So you said that one of the issues is the fact that we don't know if the people who report are those who are most familiar with the actual label or phenomenon. Basically, we don't have a good sense of whether the population of reporting users corresponds to the population of those who are actually affected by this problem. And so I'm wondering to what extent... there's definitely a component of, you know, tooling for reporting, but I guess the other question is also about the socialization of the label itself, and the understanding in the community that this is an instance of something that may have happened to you. I'm curious if there's any thinking about how to get awareness of the labeling, you know, spread out in the community? Well, that's one thing Sydney Poore specifically has discussed: how do we reach many more diverse voices inside of our community, especially community members that don't report on ANI? So part of this very large ANI survey that the Support and Safety team, SuSa, and my team, the Anti-Harassment Tools team, recently put out, and that we're now analyzing, was specifically on: what are the problems with ANI? What does it do well? What do people like about it? What does it not do well? What are improvements that could be made? And there has been feedback that some people just don't report, right? That it's not a place where they weigh in, that it's helpful for things like vandalism or sockpuppeting, but it's very hard for things that are much more personal forms of interaction, which could be something like wikihounding, or much more direct harassment, where you have to spend a lot of time with complex cases, and ANI may not be the best spot for that. And so one thing we are really interested in, and we're starting now to reach out to about 60 different community members, people who expressed interest over the years in discussing harassment with the foundation, is: where do people discuss harassment? Where do they go? What does it look like?
Why do they not report things, or report things? And this is the basis for our user interviews, for qualitative research on what a reporting tool could look like, but also on the usefulness of this tool. So I guess the TL;DR is: we're incredibly concerned about what is reported and not reported, and also about awareness of how to use these things, but even more importantly, about the people who decide to never report a case because they don't think it will actually benefit them. That's, I think, our biggest concern. Thank you. What do you know about the impact, sorry, of harassment on editors? I mean, I think there's a sense that people actually leave because of this threat; is it possible to quantify this, or how this happens? That's something we've talked about, definitely further down the line: how could we study when people leave and why they leave? And one thing we're hoping to do with a reporting tool is be able to see how cases grow, and grow sadly in a negative way, when someone reports something, when cases are reported again, over and over, by similar users. Right now we actually lack the data on what the most frequent kinds of harassment inside English-language Wikipedia are. When does something grow from conflict, to harassment, to abuse? And what are specific use cases we can look at? And this is where I want to highlight that policy, looking at something like legislative policy, is important. We really lack use cases, specific cases that we could then maybe anonymize back to the community, in terms of training to recognize harassment, specifically labeled harassment cases. Right now, you have to sort of dig through ANI. If you were to try something in a court of law, you can pull up a history of court cases, right? And you can look at, oh, this one case was tried in this particular way, and it was won in this way. Or we know, for example, what certain kinds of cases can look like: if you're looking at domestic violence, for example, there are court cases you can reference and pull into research. We don't have that in terms of online harassment. We don't have that in a very real legislative way: there aren't a lot of wins for online harassment cases, nor have there been a lot brought forward. But we also don't have that even in terms of our own policy. We have ANI, and that's what we can look at. But one of the things our team has been thinking a lot about is how we create more knowledge around, again, better definitions and examples: this is wikihounding, versus, you know, a personal attack, which people generally know what a personal attack is. But what do you do with the more complex cases? And then, how do they grow? Which is a much bigger thing for us: what does wikihounding look like before it's wikihounding? And what does it look like when it's an aggressive case of wikihounding? What are the different steps? So, there are a number of externally developed tools used by community members, especially the very highly active ones, and what I'm wondering is what role they play in either facilitating harassment or in addressing or preventing it. For example, a lot of tools can be used to follow people's edits around, and there's one that's straight up called Stalktoy, which kind of creeps me out. What I'm wondering is, have you researched, or what research there is on, the effect of those tools on harassment; are they positive or negative?
So that I don't know about. I've never heard of Stalktoy, so we should chat after this meeting. But, given that I'm still pretty new to the foundation and the wiki world, there are a lot of tools we didn't realize existed, because there's not necessarily a giant repository where, if you're a community member building a tool, you must register it, right? We're a collaborative open source community, and the way our community functions is different from the way, I would say, a private company functions. So, through user interviews, we discovered, by someone mentioning it, a brand-new tool we had never heard of that is specifically designed to analyze edits, also for sockpuppeting. One of the things we've noticed is that there are a lot of tools that community members have built to analyze sockpuppets. Those can also be used to then stalk people, because they're looking at edits, right? Looking at the edits a specific user makes and the places that they go to. I haven't heard of cases, which doesn't mean they don't exist, of, per se, the editor interaction tool being used to harm someone. But that doesn't mean it doesn't exist. All right. Thank you. You're welcome, thank you. I have one more question. Have you thought about doing some form of community design outside of the participatory design within your, I guess, convenience community? Like going out to design schools, giving them a brief, and asking them to solve the problem, as sort of a way of bringing in different options, and then using those options as something you take back to the community? Interesting. So before I worked here, I did that a lot. I designed a series of online harassment workshops specifically for NGOs and nonprofits. I did a workshop with Tactical Tech, based out of Berlin, on analyzing online harassment, walking researchers through: how would you illustrate harassment, and how would you start to define parameters around cases of harassment? What do you consider harassment or not? And then, how would you start to think about interventions in this space, from a policy level, to a technical level, to a design level? I also got to do this again at Data & Society in New York, but I haven't had the opportunity to do it yet while employed by the foundation. I would like to do that. Do you know about Architecture for Humanity? No, I don't. That's a great precedent for that. Yeah, they put out a brief and get all kinds of participation. Prior to the foundation, I was at a harassment hackathon at Intel, where I learned a lot about how sometimes engaging design firms to study harassment can be incredibly problematic. And that's sort of what spurred me into being really interested in running design workshops as a harassment researcher and design researcher. So that's something I would like to do. But I will also caveat that, especially on my team, we're very aware of not presenting a finalized solution to the community, but really trying to actively involve the community, because we're designing with the community at every step of the process, not trying to hand off solutions to the community. And sometimes going and saying, well, we did this thing and people decided on these things, can look like a decision even though it's not quite one yet.
Right, it can sort of look like we've already arrived at the solution. So what if you had some sort of participatory narrowing, leading from 10 or 20 options down to five and then down to three, progressively?

I think that's something that could be incredibly useful, and something that, as a designer, I would definitely like to engage with and do. Even as a thought experiment, or as a form of design advocacy, it would be a fantastic thing to do: get a bunch of great people in the room together and say publicly, we can do this, this doesn't have to be hidden behind an NDA, and this can just be design thinking. We may not actually create a tool out of it, but it's good to align expertise around the problem. And it would be a thing to try to do with community members, maybe in a space like Wikimedia, and then see how many community members would also want to be involved in that workshop, because they are the experts. Thank you.

I believe we have a few more questions. I'm checking whether we've exhausted the queue for Caroline. Have you got anything else from IRC? We have a question from Aaron, I think. No, I don't see that question. I think we have no one. So I'm going to claim the queue and give Tilman an opportunity to ask a first question to Jan, and I have one myself.

Yeah. You mentioned that you were categorizing the Wikipedia editors' responses to the expert comments: whether they took the suggestion and edited the article, responded on the talk page, or nothing happened. I wonder what the distribution was, how often each happened. And then, are there further thoughts on how to increase that rate? Because from what I've seen, the reaction rate seemed pretty low most of the time, and when there's no reaction, as we were discussing last year, the comment doesn't get incorporated.

Yes. So unfortunately, most of the time nothing happens. I can pull up the WikiProject Economics page, which has a listing for our pilot project; it's indicative of what has been done. [Some back and forth over screen sharing; the page is WikiProject Economics/Expert ideas, and the link was posted in the chat.] Could you repeat the question, Tilman? So the first one is just how often each of these cases happens: someone reacts and makes an edit, someone reacts and just posts a comment, or nothing happens. Correct. So this is actually one of the points where I would like to get some feedback, because we're not quite sure what to do about it. When we ran the pilot experiment, we sent it out to a number of experts, and each of them made some comments.
One of the doctoral students at Lancaster, who is a Wikipedian in his spare time and edits Wikipedia articles in economics, found this project. He organized it alphabetically by all the people who contributed content, and he used a notation where a green checkmark means the comments on the talk page have been organically incorporated, a little smiley face means someone has noticed but nothing has been done, and a red mark means they are not going to be incorporated, or at least have not been so far. If you scroll down, you see that the green checkmarks are fairly rare. This is fairly indicative of what has happened to the roughly 1,200 comments: very few have been incorporated. My students rated some of them, and I think by the end of the month, after we completely finish rating the quality, we will know exactly how many have been incorporated. So the question is what one should do about this. Once a comment is on the talk page, it follows a generic format; for instance, on kernel density estimation, the Expert Ideas posting always says: Doctor so-and-so has commented on this article, and these are the new references. It always takes the same format, and we say we hope Wikipedians watching this page can take advantage of these comments and improve the quality of the article accordingly. And why is this person's comment useful? We say that she has published the following research, which is the basis for us to recommend the article to her. Because these are extracted from RePEc, a lot of them were posted in working paper format but have since been published in a journal. So it has a recognizable format, and in this case it says done: a user incorporated these comments.

If I may chime in, the underlying problem you're emphasizing here is that we don't really have a good mechanism to flag something as a suggested change and to identify whether anyone has actually been aware of it. We have very poor ways of tracking whether something has been seen unless it is acted upon, and the current way of doing this via talk pages is a really blunt tool; I know it's the common way of doing it. There are some experiments involving annotation tools such as Hypothesis, where experts can make annotations that can then be flagged. That brings another set of problems, because it makes the annotations a layer separate from where community members participate, which is a separate problem altogether. But the issue you have here, which is similar to what we see with other types of recommendations posted on talk pages, is that it's really hard to tell whether something is pending because nobody has paid attention to the request, or because somebody processed the task and simply forgot to decline it. Does that make sense? I see, yeah.

Actually, I was talking about this briefly last year with Iman, and the problem is probably that it doesn't fit under the existing processes editors have for handling feedback. One thing that does happen is when an article is a candidate for featured article or good article status, that is, for quality assessment by the community: there is a whole operation going on to get that article ready.
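To make the tracking problem just described concrete: whatever form a better mechanism took, it would need to represent explicitly the states that the talk-page convention collapses into silence. Here is a minimal sketch of that state, under the checkmark/smiley/red-mark taxonomy from the pilot; all names are hypothetical, not an existing Wikimedia tool or API.

```python
# A hypothetical data model for tracking expert suggestions, so that
# "pending because unseen" and "processed but never closed" separate.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum

class Status(Enum):
    PENDING = "pending"            # posted; no evidence anyone saw it
    SEEN = "seen"                  # acknowledged (the smiley face)
    INCORPORATED = "incorporated"  # edited into the article (green checkmark)
    DECLINED = "declined"          # reviewed and rejected (red mark)

@dataclass
class Suggestion:
    article: str    # e.g. "Kernel density estimation"
    expert: str     # who made the comment
    text: str       # the suggested change
    status: Status = Status.PENDING
    history: list = field(default_factory=list)

    def transition(self, new_status: Status, actor: str) -> None:
        # Attribute and timestamp every move, so the record shows who
        # looked at the suggestion and when, not just its final state.
        self.history.append((datetime.now(timezone.utc), actor, new_status))
        self.status = new_status

# Hypothetical usage:
s = Suggestion("Kernel density estimation", "Dr. Example",
               "Add the following recent references ...")
s.transition(Status.SEEN, "SomeEditor")
s.transition(Status.INCORPORATED, "SomeEditor")
```

The specific shape doesn't matter; the point is that every transition is attributed and timestamped, so an untouched suggestion can be told apart from one that was processed and silently dropped.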
And maybe, at that point, when somebody is actively seeking feedback on their article, there would be an option to bring it to the experts as well, I don't know.

I see. So how would we know which articles are at that boundary, up for the next quality level?

Well, there are points where somebody wants feedback, and that's probably where this would happen in particular: there are featured article candidates, for instance.

I see. So is it noted on the article itself that it is a featured article candidate?

On the top of it, yes, but there is also a separate noticeboard, basically, for that kind of candidacy. A community of editors comes in to see what's up, spends some time evaluating the article, and comments on it, and then the article's assessment gets set one way or the other. That's the stage where editors are much more responsive to suggestions: there's usually a main author who put the article up, they're open to feedback at that point, and the feedback actually gets incorporated. Outside of that, editors are not always watching the talk page. The other thing is edit requests, which are mostly used for articles that are locked by administrators, so that's not quite the same situation: there's a process where you can go to the talk page and say, I want to have this edit made, and it gets flagged for administrators. That's only for concrete suggestions, but you could imagine something similar for editors who are specifically interested in getting expert feedback on an article, and then it would be a system where they get notified that here's an expert who wants to give feedback they can act on. Okay. Yeah.

So one other question I actually have for you: we have several cases like this one. Josh Angrist, an econometrician at MIT, was asked to comment on several econometrics articles, regression discontinuity design and instrumental variables among them. He looked at the articles and said, actually, this would be something really good for doctoral students preparing for their prelims; it's like an assignment, but at a higher level. And one of his doctoral students went in and rewrote everything, essentially wrote a new article, and it was very promptly reverted. So I think we lost this potential Wikipedian, because he had spent so much effort rewriting the entire article.

That sounds to me like diving in too deeply too soon; it needs to look like incremental improvement. We can also characterize the types of suggestions. The very specific ones, this paragraph should be modified like this, this sentence should be changed to that, tend to be incorporated and to stay, while the fairly radical ones do not. Well, if you're rewriting other people's previous contributions wholesale, I'm not surprised it's not received well. Okay, okay.

So we have a few more minutes, and I want to check if there are any final questions. Thank you. I do have one quick comment myself about this research, but first let me check if there's anyone else. There's one comment from IRC about quality classes, which I'll relay. So Aaron, I've got you covered on that one. Briefly, then, two quick questions. The first one: you mentioned the survey that we ran in 2011.
It was a very unstructured, exploratory survey, just to collect some initial directions for a later, maybe more controlled, survey. Even the recruitment method definitely suffered from selection bias, which we're very much aware of. However, in the results, the single factor that stood out as an incentive for people to contribute to Wikipedia was their general attitude toward open access and open science. Again, this could be, and definitely partly is, a function of selection bias in who responded. But I'm wondering, considering the types of incentives, whether that's something you have any thoughts on or have tried to incorporate in this line of research: basically the idea that people who, for example, publish extensively in open access venues or release data sets may have a higher propensity to also contribute to public goods in their areas of expertise. We might be able to identify these people just from their publication records.

Yeah. Although one confound is the fact that open access journals, at this point, are not the highest-ranked journals, at least in economics. So if I'm an assistant professor, I'll try to hit the big five first, and if none of those work, then I might move down, and there I have a choice of open access journals. To my knowledge this is true in a lot of the sciences as well. So that's the trouble with traditional open access journals, but another related path is open courseware. For instance, there's the Open.Michigan site, where instructors volunteer to post their course material online, open to the public. MIT has that, and I think lots of universities do. So there is the willingness to share, and I think that's actually a very good signal. Let me give you an example on the publishing side. One of our Kiva field experiments was published in PNAS, the Proceedings of the National Academy of Sciences, where you have the option to make the article open access, and I think the price tag was something like $2,000. At the time it didn't strike me as important, but my co-author said, I think we should just make it open access. Then, when Kiva started to post these articles on their website, they said, oh, we can post this one because it's open access, but your Games and Economic Behavior article is behind Elsevier's paywall, so we're not going to post that one. And then I realized what a difference that makes. So it is expensive, and it's confounded with a lot of things: if you're in a resource-rich university, or you have lots of grants to pay for it, you might choose open access, and otherwise not. So I think open courseware is probably the best bet, because it has nothing to do with your resources or with journal rankings; it's just a willingness to share something.

Or any mandates from funders, which could also be another confound. Yeah, that's a fair point. I had a final question, but I think I'm going to save it for lunch. It's about the policy implications of self-citation and original research, something that the community has been discussing for a while; there are different views on how that policy applies to peer-reviewed research, since original research typically describes a very different type of work.
Especially people asking questions on the street and reporting the answers on Wikipedia, as opposed to work that has been peer-reviewed. But I've seen instances in the past where the original research policy has also been used to curb self-citation. So I'll save this for lunch, because we're out of time. Okay, yeah. So let me thank both of our speakers, and the audience both here and on IRC, and we'll see you again in mid-February for the next showcase. Thank you, everyone. Yeah, thanks everyone. Thank you.