Okay, we're live. Hi everyone, my name is Kinneret Gordon and I am the Lead Research Community Officer on the Wikimedia Foundation Research team. I would like to welcome you to this month's Research Showcase. Research Showcases are monthly convenings organized by our team to recognize and share recent research on or relevant to Wikimedia projects. For those of you joining live, we welcome you to ask questions of the speakers in the YouTube chat. We will monitor this channel and pass questions to the speakers at the end of their presentations. We kindly ask that attendees follow the Friendly Space Policy and the Universal Code of Conduct. Before we get started, I have one quick announcement: the call for proposals for the upcoming round of the Research Fund will go out in the next couple of weeks. Please expect an update on the exact dates via our regular communication channels, such as wiki-research-l and the WikiResearch Twitter account. I will now pass it over to my colleague Pablo, who will introduce this month's theme and speakers.
Hey, thank you, Kinneret. I'm Pablo Aragón, research scientist at the Wikimedia Foundation, and I'm glad to introduce the speakers and the topic of our showcase today. Wikipedia and the projects of the Wikimedia ecosystem are quite different from many of the platforms that exist on the internet, and one of the most notable differences is their governance. User-generated content on many social media platforms is expected to comply with a usually very general set of terms and conditions that are designed by the owners of the platform. In contrast, Wikimedia content has to be aligned with a set of rules and policies that are created by members of the community. Some of these policies have become core pillars for Wikimedia projects to be what they are today. These rules might differ across projects, and they also evolve over time as they need to address emerging challenges for information literacy.
For that reason, our September 2023 Research Showcase will focus on rules in Wikipedia. And for this exciting topic, we have outstanding researchers today. First, Zachary J. McDowell, assistant professor in the Department of Communication at the University of Illinois Chicago. His research focuses on access and advocacy in digitally mediated peer production spaces, in particular on digital literacy, self-efficacy, and how digitally mediated tools like Wikipedia shape these areas of inquiry. Zachary will be presenting with Matthew A. Vetter, professor of English and affiliate faculty in the Composition and Applied Linguistics PhD program at Indiana University of Pennsylvania. His research asks questions related to technology, writing, and digital culture. Their talk will focus on Wikipedia community policies and experiential epistemology, critical information literacy, social justice, and inclusive practices. After that talk, we will have a presentation by Sohyeon Hwang, PhD student in the Media, Technology, and Society program at Northwestern University. She will present research on variation and overlap in the peer production of community rules. In this talk, she will focus in particular on a comparative study of self-governance and rulemaking activity in the five largest language editions of Wikipedia. As usual, after these talks, we'll have 10 minutes for discussion, and we are happy to take your questions in the chat on YouTube. My colleague Isaac will monitor that channel and relay the questions during the Q&A. So with this, let's pass it to Zachary and Matthew.
Thank you very much. Let me just share my screen and we can get going. All right. Okay. So thank you very much for having us. As you can see from the title of the talk, there's a lot of words here, because this is going to be an overview of quite a bit of research, probably over the last eight years, between Matt and I.
And this research is informed not only by all the interactions we've had, but also by a combined 20 years, maybe more now, of teaching with Wikipedia. My name is Zach McDowell. I am an assistant professor at the University of Illinois Chicago, in the Department of Communication. I started teaching with Wikipedia as a grad student, and that is actually how Matt and I met. And here's a plug for our free book. It is available open access, as it should be for all things Wikipedia. Matt.
Hey everybody, I'm Matt. I'm at Indiana University of Pennsylvania, which is, confusingly enough, outside Pittsburgh in Pennsylvania. So happy to be here. Like Zach, I'm a long-time education-plus-Wikipedia researcher. The book, though, is really more of a policy analysis and a cultural analysis of Wikipedia's impact as an arbiter of knowledge. We'll make frequent references to the book throughout our talk, and we'll also be referencing a few other research articles and studies that we've done. So, a quick overview of our talk today, which is really building from a study that we published recently in the journal Social Media + Society. You can see the citation there at the bottom: Wikipedia as open educational practice, experiential learning, critical information literacy, and social justice. We'll go through some of the slides fairly quickly, but all of the relevant research is available on the bibliography slide. I can also share our slides in the chat. They are hosted on Commons, or you can search on Wikimedia Commons for them as well. We'll be discussing what we're calling the information literacy crisis in the context of the changing media landscape. So much is happening right now with generative AI, and there are a lot of challenges that the Wikipedia community in particular is facing.
We'll also be focusing on a framework for information literacy put out by the Association of College and Research Libraries, which I think has been very forward thinking in terms of being able to capture some of the necessary information literacy skills and practices that anyone needs, right, not just students but anyone. The main thrust of our talk is really on Wikipedia policy as an opportunity for experiential learning, and Zach's going to talk more about that experiential learning, or experiential epistemology. We're also going to be making connections between opportunities for information literacy and opportunities for getting newcomers to the encyclopedia to think about knowledge equity, systemic biases, gaps, etc. Finally, we'll discuss the dangers of what we call the Wikipedia detour, and this is where we'll get into more of the recent changes to the information landscape and the media landscape as AI systems become more and more frequent throughout our lives.
So, as everybody's pretty much aware, and I think you'd have to be living in a cave somewhere not to understand this, a lot of people perceive that there is a major fake news crisis and information literacy crisis, and these things are all very much connected. Despite all that being pretty terrible, what is really nice is that Wikipedia has come out as a shining star. Twenty years ago, people were told don't even look at Wikipedia, because it's just a bunch of people who randomly edit it and it's made out of garbage. Wikipedia has now become the place that even mainstream media points to, saying, hey, maybe you should go to Wikipedia and look at that rather than some other place. So it's very nice, because a lot of our research points to the same things, and a lot of research through many years has shown that Wikipedia is as reliable as or more reliable than other sources.
These are just a couple of papers that we've released in the last few years, which show what we're going to be pulling from throughout this talk to blend together and show you the different aspects of how this comes together.
Exactly. So on this slide you can see a little bit of our research paradigm visually illustrated. We are doing kind of a meta-analysis of previous research studies. One of the biggest ones is McDowell, Vetter, and Stewart from 2019 in Computers and Composition, a large-scale survey of both students and instructors using Wiki Education or enrolled in that particular program. But we're also building from some smaller studies and, in general, as Zach mentioned, our decades of experience teaching with Wikipedia, doing workshops, and participating in the movement itself. So we're doing a meta-analysis of previous research to illustrate some of the main takeaways that you can use to bridge research and practice. We're hoping in particular to spark some conversations regarding information literacy, representation, even the future of Wikipedia. We're going to talk briefly at the end about what we call the Wikipedia detour, because we recognize that even in some of the open meetings of the movement there's a direct indication of some of the challenges facing Wikipedia, such as the Knowledge Graph or other ways that users are being directed away from the site and don't always understand how important Wikipedia is downstream, as it's being used in all these other kinds of applications.
So, many of you who have edited Wikipedia or worked with Wikipedia might have forgotten some of the steps it takes to participate in Wikimedia authorship. And so what we want to remind you is that there are a lot of steps.
Even if they're not steps. This might be a terrible infographic, because they might not even be steps: these are things that you have to learn throughout and build on, such as the evaluation of information, not only where it comes from but also what's missing, and how you choose articles or topics. All of these are really important steps that pair with very fundamental ways that you need to learn information literacy. And really, this is going to pair with the six frames for information literacy here. The next few slides are going to go over some of the research in that article that was mentioned here, Wikipedia as open educational practice. What we did in that article was read a variety of policies on Wikipedia, as well as how things work, through these frames to understand overlapping ways of thinking about information. The six frames here are not just prescriptive, like the ways we used to think about information literacy: do this, do this, understand what this is. Instead, these are descriptive frames for understanding how people have these interconnected ways of understanding knowledge, and that helps build robust information literacy, or what danah boyd calls antibodies, so as not to be deceived. But more than these antibodies, these frames illustrate where learners are gaining insight and deep understanding of where information comes from, how it's used, and how to understand it in a more comprehensive and interconnected way. And the research that we've done, which assessed how people learn to use Wikipedia, is really illustrating the learning that we refer to as experiential epistemology. Essentially, they are experiencing the frames and the way that information comes together, rather than just trying to learn about it.
It's about experiencing it rather than just having it told to you, which really embeds these skills in a much more robust way.
So the first frame put out by the ACRL, again the Association of College and Research Libraries, addresses authority: authority is constructed and contextual. This is a longstanding concern in much of our research. As a writing studies person, I come to the table trying to think about how students can understand authority, how they can gain authority in their own writing, how they can understand the authority or the lack of authority of sources when they're evaluating information, etc. And what we found in our research is that, through this kind of experiential epistemology, students and other novices to Wikipedia can begin to recognize the complex system of authority that scaffolds Wikipedia and how to interact with it. Chapter two of our book really goes into how we understand and theorize that complex system. It's an interplay between these important policies like neutral point of view, verifiability, and no original research, right? It's an interplay between those policies and the genre, as well as the longstanding encyclopedic epistemology, which is, as a tertiary source, always operating from other secondary sources. For each of these frames we have some quotes from students from some of the qualitative data, but we also have some relevant research from library science researchers, actually Laurie Bridges, to reflect on and interpret some of these frames. This is a student quote. I love this, and it really reflects the value that qualitative research can bring to these discussions as well. We're not going to read the quotes. You guys can look at those, and you can access the slides as well.
To me this is really an amazing quote because it demonstrates the massive perception shift that the student, or the novice to Wikipedia, undergoes as they're starting to learn about some of those behind-the-scenes operations: what's actually happening behind the article on talk pages, on the Wikipedia project pages that host those policies, throughout the editorial history, etc. You can go to the next slide. Oops. Okay. So, moving right along, the second frame from the ACRL is all about process, a big thing for teachers of writers. What's especially important about this is that the process of information creation in Wikipedia comes with particular checks and balances for vetting the information, and through that experiential epistemology novices also have to go through that particular process. They have to navigate some of these policies like neutral point of view, no original research, and verifiability. So here we have another student quote, and the student is referencing what I think of as the radical transparency of the encyclopedia as it manifests in Wikipedia. We're talking about the writing process, but it also has tremendous implications for processes of information literacy and vetting information. A lot of the emergent generative AI chatbots are still not really able to be transparent about where they're getting their information or where their sources are coming from. Bard is doing a little better with this, in terms of "okay, I got this from Wikipedia" or "I got this from a movie index," etc., but a lot of the emergent AI isn't doing enough. Okay.
So the third frame for information literacy is related to the value of information, which for us and for our research resonates particularly with the ways that Wikipedia has become such an important arbiter and index of public knowledge, and thus also, without getting too philosophical, an arbiter of reality itself, which of course is the subject of our book. So Wikipedia and its uneven terrains, if I can steal a phrase from Mark Graham, is a way in and a way through to think about the different values given to information through representation. Mark Graham's phrase "uneven terrains" references the frequent overrepresentation of some things on the English Wikipedia, as well as the underrepresentation of other items, and these have significant implications that newcomers to Wikipedia need to understand. And here we have another student quote: it's not random, the information that's missing from Wikipedia. It's a history of knowledge, of the events that have been documented and historicized in the world. So with that third frame, the student is able to see the value that the information has, and is able to see how, in participating in that experiential learning process, they can do more to work through and address systemic biases.
So the fourth frame here, and I'm going to speed it up a little because we're running short. The fourth frame is research as inquiry. It helps think through how people put information together in Wikipedia as a space of compiled information, although Wikipedia doesn't allow the traditional synthesis of information, you know, no SYNTH. What it is instead is a compilation, and really in a way it is a synthesis, because it's taking a variety of different voices and combining them to tell a whole story, a story that should be represented in a neutral way, but still it is a combination.
And the student here is saying that they're far more aware of these articles. I love how a lot of these quotes are like "oh, and something or whatever," and we kept that in because these are real quotes and they're trying to work through these ideas. They're saying that they're more aware of where the sources are coming from because they're able to see how things are connected. The next one, scholarship as conversation, is interconnected with the others. These are not just individual frames, they have a lot of overlapping ideas, and the last one really blends with this. This is about the conversation that you can see when you're reading a Wikipedia page, because you can see how these different pieces of information come from different sources and how that represents a conversation, especially when it comes to things like scientific articles, and because of NPOV you have to represent a variety of ways in which things have been discussed. But it's also about the back side. And this is Dowell and Bridges here. They're really talking about this frame, scholarship as conversation, by also seeing how students, or all learners, can view the discussion about the actual article coming together. They can see the actual meta-conversation that is emerging, which forms the framework for how things get represented. And then, finally, searching as strategic exploration is really about how learning to evaluate sources is not a simple task. It's really difficult in a complex media landscape. And these contributors need to understand, as Dowell and Bridges say, that searching does not always produce adequate results. People are used to this idea where they search on Wikipedia or they search on Google, they get the first thing, they look at the Knowledge Graph.
People are used to a lot of information that they can just ask their personal digital assistants for, and I'm not going to say any of the names so as not to set them off. They get an answer, and then they're happy with that answer rather than understanding that they need to dig deeper to understand these things. And this is a lifelong process that can really affect the way in which people understand all information.
So I'm going to shift gears a little bit to talk about social justice work, social equity work, and how our research has uncovered opportunities for addressing systemic biases in Wikipedia. The first part of, at least, my realization about this was a large-scale survey I did with Jiawei Xing. It's published in First Monday and should be in the bibliography. It asked instructors who were participating in Wiki Education about why they were motivated to do that. And that research has shown that there's this emergent group of instructors, but also potential Wikimedia contributors, who are really motivated by social justice and knowledge equity concerns. What we would like to do is call on the Wikimedia community to leverage that group of people in both educational projects and more public intellectual work with WikiProjects and other knowledge equity initiatives. We're thinking especially of Whose Knowledge? and AfroCROWD, initiatives that are really trying to work towards bridging the gap in terms of some of these systemic biases. But a lot of it really comes back to policy, and we're going to come back to the policy here, recognizing that verifiability itself is what we have called a double-edged sword. Because, on the one hand, it's so important in teaching information literacy.
And I've noticed as well that the Wikipedia policy on large language models is taking a similar approach: to vet, to take care, and to make sure that what you're using is actually working towards the information's reliability. But on the other hand, it's blocking indigenous and folk knowledges that are not based in these print-centric, published sources. So we need to keep the community dynamic, recognizing that fifth pillar especially, in order to stay flexible for the ever-changing media landscape.
So here we're also thinking about how that appropriation of Wikipedia content starts to have devastating effects on Wikipedia's ecosystem. First, and this is from McMahon et al., users rarely know that their answers are owed to Wikipedia when it comes to the Google Knowledge Graph, generative AI, and a lot of other things. Even on TikTok there is just a voice reading Wikipedia, or somebody actually just reading Wikipedia to them, and they don't know this. This is also precluding users from accessing and experiencing Wikipedia, because they don't know the content is coming from there, because they're not clicking through, or they don't have the option to. It reduces the ability to find new editors, new users, and potential donors, although I don't think there has been a huge problem with donors lately. Wikipedia has a distinct problem recruiting new editors, particularly in diversifying editorship, which is also there in McMahon et al.'s study. But really, in the end, we want to say that this is a really quick rundown of a lot of work that Matt and I have done over the last eight years, and I want to close it with three main things.
One, despite some of the issues with representation biases, Wikipedia's policies have been shown to have incredible effects in helping teach information literacy in an increasingly complex mediated space. Two, if the community and researchers can come together to explore better ways to combat systemic biases and find new ways to expand representation on Wikipedia, it only gets better. There are many projects that are doing quite a bit, and some of the people doing this amazing work are literally on this Zoom call. But some of these other issues, such as the representation of indigenous knowledge, are baked into the system and we need to address them. And three, all of this remains threatened by the increased detours around Wikipedia. It's not just the knowledge here that is under threat, but the potential for change that makes Wikipedia so special. It's the community that is committed to providing access to the world's knowledge that has created the space that teaches not just about topics, but about the interconnectedness of information, and creates a space that allows for representation of the underrepresented. So let's work on ensuring we can keep that going. Thank you very much.
So many ideas in this talk. I was not familiar with the theory of experiential epistemology and find it so appropriate for Wikipedia. Thank you very much for sharing this. I think we don't have questions on YouTube yet, but Isaac, I think you mentioned that you have a question. So please unmute yourself.
Sure. I can start with my first question. It's on the difference, I think, between showing versus doing, and I was thinking in particular of point number two, about process, where the student had the quote about understanding how the article has evolved.
And it makes me wonder. One idea would be, well, maybe if we had better visualizations, for instance, that would show a reader "this is how this article evolved." I'm wondering just how close that gets you to the understanding you get by actually attempting to add a revision to that article, and if you all have thoughts on that.
I can start with that real quick. One of my favorite videos, which is really old right now, is this great video somebody did on the evolution of the heavy metal umlaut. It's great because it goes through and explains the process. So my students will watch as this weird little article about the heavy metal umlaut comes together. They can even see how there was vandalism, how it gets erased, and everything. And that's just from somebody paging through the history. So I don't know if we need a bunch of extra tools to show that. I wish somebody would remake that video, because it's kind of getting old at this point. But what we need is to be able to illustrate that in ways that they can then go find, see, and experience themselves. That's really what it comes down to: they can see this come together very quickly and then they're like, oh wow, this is how easy it is to do. I can do it. But then they also realize the complexity of it. So it's a lot of layering of different experiences and understandings that come together all at once.
Yeah, I'll just very briefly add, I think showing can be powerful. Having them actually experience it is doubly powerful, even when they sometimes have a negative experience. It's not uncommon at all for students participating in Wiki Ed and trying to edit the English version to get their edits reverted.
It can still be very, very powerful, because they actually connect all of the dots, and they become a user in that experience.
Thank you both. Yeah, I like that, showing essentially as the spark for hopefully getting to do it.
The important thing to remember, I think, is that a lot of us Wikipedians take a lot of this for granted, but when you're teaching with it you start to see how these processes come together. And it's amazing how very simple things are incredibly powerful, and we just overlook them a lot because we don't realize how incredible that process of learning is and how it changes the way that you fundamentally approach everything else in the world.
Can I jump in with a quick question? Well, first, thanks so much for the talk, it was really cool. On the last slide, you mentioned that one of Wikipedia's greatest potentials is change, and I was wondering what you were thinking about the role of rules in facilitating that kind of change.
I can take a stab at that one. It's definitely a big topic, because for the English Wikipedia folks have used the term "ossified" in terms of its particular editing guidelines, behavioral guidelines, etc., especially when compared to other language versions, where you're going to see more dynamism, more ability in younger Wikipedias for there to be more flexibility in rules. But for the English Wikipedia it's important to remember that the community mediates the rules, and the community can reinterpret and rewrite some of these rules and policies in order to create a more equitable encyclopedia. That's my short answer.
It was great to discuss the limitations of looking at only one Wikipedia and the potential of change, because I think this creates a great intro for our second talk. So Sohyeon, the floor is yours.
So yeah, let me... sharing screen is always... Okay, there we go.
Is that showing up right? Okay, great. All right. Thanks for having me again, and I'm also so happy to be going after Zach and Matt, because I think there was a very nice segue just now. I've been talking about this work on variation and overlap in the peer production of community rules for quite some time, and I'm very excited to be sharing it at this showcase. As we saw in the talk just now, we know that the governance of Wikipedia happens through its communities of users, and the stakes of doing this well are quite high. The possibilities and potential when we do well are also immense, especially now that Wikipedia has downstream impacts, with language models, augmenting Google search, etc. But an important fact about these communities doing this governance work is that they don't exist in isolation. They are networked and connected to one another. So if we want to understand the governance of systems like Wikipedia and their successes and failures, what we really need to do is also look across communities and get a sense of the governance ecosystem that they're starting to make up: how they're creating a really complex arrangement of norms, values, governance practices, etc. In the context of Wikipedia, by communities what I really mean is language editions, and I think of language editions in this study as connected, self-governing community organizations. Thinking of them in this way is useful for our purposes because we can see that language editions are distinct in their content and contributors, but they remain aligned in some key ways. I think these key ways are probably very familiar to the audience, but it's worth emphasizing them because they are why language editions are so interesting to compare as a set of communities from a social science perspective.
So first, all the language editions share the same goal: producing a peer-produced encyclopedia. Second, they're hosted on the exact same wiki technology infrastructure, on which all the interactions are mediated. And third, the language editions all follow the same self-governing model, where people are collectively participating in the rulemaking and rule enforcement in their respective communities. And so, taking together all the ways that these communities are the same, we want to know: are the rules the same, or are the governing practices that they come up with the same as well? That brings us to the research questions that were driving the project I'm primarily drawing on for this talk. First, how do the patterns of rulemaking over time compare across language edition communities? And second, how do the sets of rules become more or less similar over time across the language edition communities? We conducted a descriptive study, examining the rules and rulemaking on the five largest language editions, so that's English, French, German, Japanese, and Spanish. We do this because they have comparably long and rich histories to examine, covering almost 20 years. But it is of course true that it would be interesting if we could extend this further, especially to the smaller communities, I think. The details of the exact research method are available in the manuscript, which I'll share a link to at the end, but I think it's also available online on the Research Showcase page and my website. I'll just give a brief overview here so that we can set the groundwork a little. So first we constructed a list of rules per community.
Rules aren't actually identified in exactly the same ways across communities, which already points to some organizational differences. What that really means is that we had to develop a consistent logic for what would be included as a formal rule in the analysis, and this came out to something like 700 rules or so that I manually went through. Then, for the rules, we collected their revision histories to understand their development over time through the view of their edit logs. We also collected something called interlanguage links. This is a technical feature of Wikipedia that connects conceptually equivalent pages across wikis, if those pages exist. So if there is a Wikimedia research showcase page on the English Wikipedia and on the Spanish Wikipedia, they should be connected with an interlanguage link. In the case of rules, they indicate whether rules are shared across these language editions. And then, finally, using all this data, we conducted our descriptive analysis. In the next few slides I'm going to highlight some of the main findings and talk about what they suggest. So first, directly to the question about rulemaking patterns. When we compare the patterns in rules and rulemaking over time across the wikis, we see noticeably similar trajectories across all of them. Here I'm showing a set of figures, A, B, and C. Panel A is showing rule creation over time, and panel B is showing rule activity over time, and we can see that in both cases most rule pages were created and most actively revised in something like the first five years or so of rulemaking activity. In particular, I want to draw your attention to figure B, because it is showing this rise-and-decline pattern, which is pretty well documented in patterns of historical content on English Wikipedia specifically. What we're seeing here is that this is also true for governance page content, and also across language editions.
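As a rough illustration of the two measurements just described, here is a minimal sketch; it is not the study's actual pipeline. It assumes each rule page's interlanguage links are already available as a list of language codes and its revisions as ISO 8601 timestamp strings (in practice such data could come from the MediaWiki API's `prop=langlinks` and `prop=revisions` queries). The function names and the sample data are hypothetical.

```python
from collections import Counter

# Language codes of the five editions in the study.
STUDY_WIKIS = {"en", "fr", "de", "ja", "es"}


def shared_link_count(home_wiki, langlinks):
    """Count interlanguage links from a rule page to the other four study wikis."""
    return len((set(langlinks) & STUDY_WIKIS) - {home_wiki})


def revisions_per_year(timestamps):
    """Tally revisions by calendar year from ISO 8601 timestamp strings."""
    return Counter(ts[:4] for ts in timestamps)


# Hypothetical example data for one English Wikipedia rule page.
links = ["fr", "de", "ja", "es", "ko"]  # "ko" is outside the study set
stamps = ["2004-03-01T12:00:00Z", "2004-07-15T08:30:00Z", "2011-01-02T00:00:00Z"]

print(shared_link_count("en", links))   # 4
print(revisions_per_year(stamps))
```

Plotting the yearly tallies for each wiki is one way a rise-and-decline curve like the one in panel B could be drawn.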
And finally, panel C is showing that the revisions of rule-related pages shift, in proportion, towards discussion of rules rather than revision of the rule text itself. So, amid the declining rate of revision that we're already seeing, figure C is telling us that people are increasingly unlikely to actually edit the rules, although they might still discuss them. Additionally, we also note that revisions are getting smaller over time, which is shown by the red line here, and the time between them is increasing. In other words, they're becoming less frequent. This figure, along with the figure C that I had here, is something that was observed in earlier work by Brian Keegan and Casey Fiesler on English Wikipedia, so here we're basically seeing how it extends to other language editions. Together, all of these figures really do suggest a shared pattern of formalization in terms of written rules across the language editions. Earlier we were hearing about ossification happening, and it is true that formalization is something that has been observed and discussed quite a lot in the context of English Wikipedia. One of the key takeaways from our set of figures is that several rulemaking patterns that we observe in English Wikipedia, especially patterns that indicate a tendency towards formalizing, do replicate across non-English language editions. From a researcher's perspective, this is good to know. There's so much work, with so many interesting findings about self-governance, that has looked to English Wikipedia. But because English Wikipedia can feel so exceptional in many ways, and it's large even compared to the other language editions, sometimes it can be a little hard to say how much those findings translate to other scenarios.
But by looking across communities, our work does suggest that some of the lessons, challenges, and takeaways from a very rich existing body of work about the trajectories that self-governance takes online might be more broadly generalizable. I think it might also help us change how we perceive some of these patterns. A lot of work suggests that formalization is undesirable, and even in my manuscript I take that tone of "it's calcifying, so what do we do?" Part of this is because it runs counter to the sense of open participation, and that has been shown to be a very real challenge in retaining newcomers and diversifying editors especially. But seeing it happen across communities does suggest that it may be part of the natural trajectory of stabilizing as a community too, and there must be some kind of benefit to this kind of stabilization as well. So rather than asking how to prevent formalization completely, some better questions to ask might be: is formalization happening for the same reasons every time across these communities? What does that mean when it's not, and what does that mean when it is? What do we know are the unwanted side effects of formalization, and how can we specifically target and mitigate them? And so on. One of the ways to think about what's happening as these communities formalize is, of course, simply looking at the actual rules and the resulting sets of rules. That brings us to the second research question, which is how the governance of these communities relates to one another. Before we go into how the rules are overlapping over time,
I want to take a moment to contextualize what unique versus shared rules are. To understand the overlaps in rule sets, we're taking advantage of the interlanguage links that I mentioned earlier. As a reminder, they show that these rules are shared, or conceptually equivalent, across communities, and the more interlanguage links a rule has, the more connected, or more shared, it is across the language editions. In this study, we're focusing on just the links among the five wikis, so the highest number would be four, for the four other wikis. In the list of our 700 or so rules, 28% have four interlanguage links, so they're completely shared across all the wikis, and 34% have none, so they are completely unique to one wiki. Overall, you can see from the descriptive numbers here that shared rules tend to be older, longer, and a lot more widely referenced internally on Wikipedia, as well as more edited and edited by more people. So the question is how these overlaps start to change across the communities over time. This panel of figures is showing the proportion of rules that have more versus fewer interlanguage links to other language editions over time, and the darker tones indicate that the rules in that proportion are more widely shared. You can see that over time all five language editions are starting to see a creeping increase in the proportion of rules that are more unique to them, so the lighter ones, although the degree to which this happens is quite variable. Shared rules are pretty dominant in the smaller wikis, Japanese and Spanish, while unique rules are more prominent in English and German, which were the largest two in terms of active editors when I did these analyses. But it's not completely explained by scale either, because French and German are actually not that different in size, yet you can see that German looks a lot more like English Wikipedia.
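The shares quoted above (for example, 28% of rules with four interlanguage links and 34% with none) are simple proportions over per-rule link counts. A minimal sketch of that calculation follows, using made-up counts rather than the study's data; the function name `link_count_shares` is hypothetical.

```python
from collections import Counter


def link_count_shares(link_counts):
    """Map each interlanguage-link count (0 to 4) to the fraction of rules with that count."""
    total = len(link_counts)
    if total == 0:
        return {k: 0.0 for k in range(5)}
    tally = Counter(link_counts)
    return {k: tally.get(k, 0) / total for k in range(5)}


# Made-up link counts for ten rules (not the study's data).
toy_counts = [4, 4, 4, 0, 0, 0, 1, 2, 3, 0]
shares = link_count_shares(toy_counts)

print(shares[4])  # fraction fully shared across all five wikis: 0.3
print(shares[0])  # fraction unique to a single wiki: 0.4
```

Computing these shares separately for each wiki and each year, using only the rules that existed by that year, would give the kind of stacked proportions over time that the figure describes.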
But in any case, while there is growth in more unique rules, attention to rules actually goes the other way, and that's the really interesting part here. The panel is now showing how revisions are distributed among widely versus less widely shared rules. We're seeing that over time, editing activity really substantially concentrates among rules that are widely shared, so the darker tones. What this suggests is that people are actively discussing and thinking about the broad shared rules, and I want to recall the earlier figure that was showing how the edits are increasingly going to the discussion pages. So this is really about discussion. And this is an indication that a community is really taking the time to make the rule its own, or, as we might say, internalizing the rule. A much broader body of work from political science underscores how internalizing rules is a really core ingredient for success in self-governance, and on Wikipedia I think that comes through in what we're seeing here. There's also a lot more work to do to understand exactly what's happening when these rules are getting internalized, and what that really implies, which I'll get back to in a moment. So for now, with the two figures together here, you can see how there's both a sense of divergence and convergence in the rules that communities are developing, discussing, and using. Attention is going to these broad rules, which emphasizes their shared importance and value across communities. But this also means that communities might be individually changing the meaning of those rules over time. And this is also supplemented by the fact that the count of unique rules is going up. So what are the key takeaways from this? I think there are two main ones.
First, these are communities that are aligned in key ways. Earlier I was talking about goals, technology, and governance models, but through the slides we've been seeing that they're also aligned in their core rules and their rulemaking patterns. Despite being the same in all of these ways, they still develop substantial and sustained institutional variations in practice. This is evident not only in the continued edits and discussion concentrating on shared rules, but also in the growth of more unique ones. And second, the broad, widely shared rules that we are observing to have continued importance may be a really key factor in helping to coordinate communities towards shared goals. Insofar as Wikipedia has worked pretty well so far, we might reasonably expect that whatever communities are doing to internalize rules has not been completely disastrous for the most part. I am aware that there are some cases where wikis have had some problems with neutrality, but they tend to be exceptional outliers. So how this coordination through rules works is a question for future work that I would love to explore. If we're seeing variations in practice, like our figures are starting to suggest, when is variation bad, and when is it too much? I think understanding when broad rules fail to coordinate communities adequately towards shared goals is really important, because if a community breaks away from a shared goal and produces harms, it impacts other communities in turn. On Wikipedia, I think this has some particular implications for safeguarding against misinformation or informational harms across these distinct self-governing spaces. So to sum it up, I would say there are three core trends that our work is emphasizing. First, we see the shared patterns of formalization. Second, we're seeing an increasingly diverging set of rules across these communities.
And this is, third, supplemented by or complemented with the finding that there's a continued importance of shared rules, which may or may not be evolving in different ways. I do want to say, of course, that our analysis is limited to five language editions, and some of the patterns here are at a pretty high level. But what we are seeing overall is a story about how communities are adopting and internalizing rules, and I think this is a story that matters for many other platforms, which are increasingly relying on communities and users to govern. When platforms are doing this, community rules become basically the fundamental mechanism around which the successes and failures of online governance occur. With the scale and history of Wikipedia, understanding how it's happening on Wikipedia is not only practically important for thinking about the downstream impacts of Wikipedia, but also for gaining a lot of valuable insights and lessons for how we govern and organize online more broadly. To that end, here are some directions for future work that have come up in discussion in the past, and that I've also been thinking about. First is the question of scale. We are still looking at pretty large communities in this study, and I think it would be super interesting to look at smaller communities and what it means for them to be internalizing rules successfully. Second is the question of characterizing community-specific rules. I think this is really a complement to the first: what kinds of rules are unique, and what makes them so? I did take a non-systematic peek at unique rules, and I will say a lot of these unique rules tend to be quite complementary to the broad shared rules, rather than adversarial in some ways. And then finally, there's a lot of work to be done to understand how enforcement of broad shared rules varies.
Are language editions using the same rules to different effects? Is there variation within a language edition that we should be keeping an eye on? How can we address harmful effects that we might see, or leverage the rules in different ways to combat biases and things like that? I'm very happy to chat about any of these or anything else. Thank you, and here's the contact information for myself and Aaron, who is my advisor and the supervisor of this project.
Thank you very much for presenting this amazing work. We already have some questions here in the room. Let's start with Isaac, Matt, and then Caroline.
So, really interesting talk. I'm going to try to phrase this well. I was thinking about what you were sharing, and about what Matt and Zach shared earlier, and trying to connect the two, and wondering at what point, as a new editor, it becomes really important to start understanding some of these nuances of the rules, and maybe the differences across language editions. There was a second part of this, and I would welcome Matt and Zach's input on this then. I assume you've generally been teaching for the English Wikipedia community. Taking Sohyeon's work into mind, does the class look dramatically different if you're teaching a class that's going to be editing a different language edition?
So I have not a lot, but I have some experience. I've done some work down in Ecuador, where we've done miniature edit-a-thons with students, specifically around representation of indigenous women. I would say a lot of the takeaways for learning things are very similar, but it is so much easier for them, because first of all there are a lot more knowledge gaps that are easy to fill, and second of all they are a lot less persnickety when it comes to, you know, "are you sure you want to cite that?", that kind of thing.
So I found that representation of knowledge there is a little bit easier. There is a little bit of a lower bar in the smaller Wikipedias, because there's a lot more work to be done. And besides just the ossification of all the rules, people have now done a lot in Wikipedia, so there are a lot fewer spaces to fill that are low-hanging fruit. There are obviously huge knowledge gaps, but one of the reasons for that is systemic: knowledge representation, and where the sources are. So people have been able to find the low-hanging fruit in a lot of the English Wikipedia much more easily. So I found it to be a little bit easier when it comes to that, but I think the takeaways for the students have been very similar, especially in that they're really motivated, because there they see there are even bigger gaps, and they're like, "How is the town that I grew up in not covered in Spanish Wikipedia?" The Spanish-language Wikipedia especially has a huge issue with knowledge gaps around South America, so they were finding a lot of things in Ecuador that just didn't even have pages.
Similarly, in the journey to close gaps across language editions, I think there are people who are doing cross-lingual editing quite a bit. I was lurking on the Teahouse the other day, and someone was asking, "Why isn't this page on English Wikipedia when it qualified on this other Wikipedia that I've been looking at? I want to bring it over." And then folks were talking about how the standards of neutrality and notability are different. I think when people are trying to bring over material that is not local to that particular Wikipedia, that's when these nuances start to become trickier, and insofar as we are actively trying to start closing these gaps, this is going to become more of an issue.
Thank you. Matt.
Well, thank you so much for this presentation. It's wonderful, great research. I'm interested in the granularity of your analysis. Did you identify the kinds, types, or classes of rules, for instance behavioral, editorial, or content rules, that were more divergent? Did you differentiate between the types of rules in that way in your analysis, and if so, what did you find?
So I actually did not, and part of this is because of my own linguistic barriers for interpreting the rules fairly. I speak English, Korean, and German, but only English and German are in this set, and I didn't want to impose my own misconstrued understanding of what's being said in these languages, so I kept it at a pretty coarse level. In the future, if I were to try to do this, I might follow an existing taxonomy of rules, like Elinor Ostrom's taxonomy of rules for analyzing institutional development. But my sense of what's more divergent or varied is really these more complementary rules, where people are developing extra rule pages to hash out or speak in greater depth about how they see neutrality, say, or how they see notability applying for a particular case that they feel is more culturally relevant, maybe, or that happened to come up in discussion, so it now has its own page. That kind of thing.
Thank you. So the next one in the queue is Caroline. I don't think we can hear you. We cannot hear you.
Can you hear me now?
Perfectly.
Great. Thanks so much for the presentation. I was thinking about your findings and conclusions related to the decrease in revisions to rules that you saw, and formalization, and the pros and cons of formalization. I was wondering about the role of democratization or consensus building in the decision-making process surrounding rule revisions, which I'll admit I don't know a lot about.
So my question is, were you able to incorporate any data or metadata about requests for comment or voting on proposed changes to rules? In other words, I'm wondering... well, maybe I'll let you answer the question and then I can follow up.
Yeah, I think that's a really good question. I didn't incorporate these more deliberative or consensus-seeking processes, because they map more to a specific subset of rules, and here it's a pretty broad set of rules. I am starting to develop a research project about articles for deletion, so maybe I'll have more answers for you in a few months or so to that end. I think one way that I probably could have gotten at this is by focusing specifically on text mining the rule discussions themselves. Because of the multilingual aspect, I thought maybe this is not within the scope of this particular project, but it's definitely a good thing for next steps, especially now that there are a lot more multilingual tools that we can use to do that kind of thing.
That answers my question. I was thinking about whether it's possible that the decrease in changes to the rules you're seeing could have something to do with a growing importance of consensus building before a change is implemented. But yeah, I realize that's an additional project. So, thank you.
Cool.
Thanks for the answer. I appreciate it.
Thanks.
So thanks so much. This is really fascinating. I think my first question is a little bit superseded already by your comment, since it depended on the idea of doing some kind of content coding with digital tools. But you mentioned that in general you saw an emergence of common patterns in rulemaking across language editions.
And I was wondering if you had the chance, or if you're thinking about doing something in the future, to look at the determinants of that emergence. So, do we see it at a certain threshold for community activity? Can you see a certain threshold for edit activity? I just thought that would be an interesting idea, seeing at what point a community feels like it needs to formalize. And along similar lines, it stimulated a thought in my mind about research I'm familiar with on the role of norm entrepreneurs. We know that there are certain folks who participate in multiple language editions, and I was wondering whether there were nodes: not just rules that were shared, but folks who maybe had some role in promulgating shared rules across the communities.
So I think there are two questions, the first about the determinants and the second about cross-language contributors. The determinants question I think is super interesting, and I actually hadn't thought about it, so now I want to go look at it. I will say I do think scale is definitely a factor here, but I don't think it will explain everything completely, because even French and German were pretty similar in scale. I think German was bigger than French, but now French is bigger. And they have slightly different patterns: French is a little more similar to the smaller ones, and German is not. So I think there are also just organizational tendencies that these communities are taking on that are making a difference.
And in terms of cross-lingual contributors, I did not do this systematically, but I did look at it for neutral point of view, and I actually found that there are almost no cross-lingual contributors. The ones that I was seeing were mostly bots. So I don't think this is a case where people are really systematically coordinating across the language editions' rules. But that doesn't foreclose the possibility that people are referring to other language editions as they make edits. For the early revisions of some of Japanese Wikipedia's rules, if you look at the comments, it's quite clear that they actually copied and translated them from English Wikipedia. So I think there are a lot of different ways that they might be subtly coordinating with each other, or trying not to reinvent the wheel. So hopefully that answers your question.
So, again, thank you for your presentation. I truly appreciate when people build on existing work. In your case, you took this work by Brian Keegan and Casey Fiesler in 2017. I think Isaac and I attended the presentation of that paper as busy students, so that's six years ago. You took it further to address many more language editions, and this is great because it gives us a deeper understanding of Wikimedia communities. I'm also very curious about the challenges, because policy pages are not easy to analyze. So what were the main challenges that you had to address when studying policy pages in multiple language editions of Wikipedia? And what have you learned that you can share with our community so that we can foster more multilingual studies?
Oh, that's a good one. I think the first is the language barrier. That one was hard, especially when I was starting to develop the list of rules.
I was mostly using Google Translate, but I spent a lot of time on it because I didn't want to mess up, basically. Especially for the languages that I didn't know, I didn't want to unfairly misconstrue what they're constituting as rules. Fortunately, there are fairly similar ways of categorizing rules across these communities, and that's why I focused mostly on these higher-level, coarser levels of analysis. In terms of what we can learn for promoting multilingual tools or multilingual analysis, I think maybe knowing what kinds of multilingual tools have been trained on Wikipedia data would be helpful, because then we already know that a tool is trained on the existing data set, and it can be a bit more reliable for the actual evaluations. One of the things that I had really wanted to do with this work was to look at multilingual content, but at the time it was really hard to figure out what kind of text mining tools would actually be sufficient to do this in a way that wouldn't totally skew my results. I do think that creating more opportunities for community engagement with the smaller communities would be really helpful. In my case, I focused on the large communities, and part of that was that I wasn't sure how to approach the smaller communities, which I don't have any linguistic or cultural ties to, without feeling like I was just extracting their labor. So having these opportunities, and maybe help interfacing with communities so that researchers understand what problems communities want to get solved, would be valuable, because then it's not so much about my curiosity as a researcher. I want it to be grounded in what the language editions want to know about cross-lingual dynamics.
A great answer. Building those bridges with those small communities is, I think, something that deserves a lot of our attention. So, I think we have no more questions, so let's hand it back to you. Thank you again.
Thank you, Pablo, and thank you everyone for joining the showcase today and for this very lively discussion. I especially want to thank our speakers, Zachary, Matthew, and Sohyeon, for your contributions today. The Wikimedia Research Showcase is made possible thanks to the lovely coordination team with my colleagues Pablo and Isaac, so thank you both. Isaac was also managing the Q&A today, so thank you for that as well. I would also like to thank Emerald for the support with the audio and video. Our next showcase will be on Wednesday, October 18th, and will focus on data privacy. We're looking forward to seeing you all there. Thank you, everyone.