Good morning, good afternoon, everybody, and welcome to the January edition of the Wikimedia Research Showcase, and a late happy new year, everybody. I hope you had a good break. And I'm very excited today to have here two guest speakers to kick off our 2019 series. We have Martina Balestra and Tassos Noulas from NYU, who are going to be presenting some exciting new research on our Wikimedia projects. Before we start, I want to give Miriam a moment to make an important service announcement. Yes, so just a reminder that this year again we're organizing the Wiki Workshop, which is the annual gathering of all researchers who are interested in working with Wikimedia data. It's going to be held in conjunction with the Web Conference, so May 13 and 14 in San Francisco. We have two deadlines for paper submissions. The first one, if you want your paper in the proceedings, is January 31st. If you don't want your paper in the proceedings, you can even submit at the second deadline, which is March 14th. So please submit your amazing work to the Wiki Workshop. Thank you. All right, thanks a lot, Miriam. And as a usual reminder of our house rules: we're going to have two presentations, each lasting about 30 minutes, and there's going to be a quick Q&A at the end of each presentation. We have Jonathan, who's, as usual, our host on IRC. So if you do have questions on the YouTube channel or on IRC, please ping Jonathan; he'll be relaying them to the speakers. We're going to have a final Q&A session at the end of the presentations. So with that, without further ado, Martina, I think you are the first one on today's agenda. Thank you. Let me just share my screen with you. Thanks for having me. Can you guys see my screen? Yes, we can. Okay. Cool. So, for me, in this presentation, I wanted to give you an overview of a series of studies that I did that are aimed at providing a better understanding of the relationship between new Wikipedia editors' motivations and their editing activity. 
And this work was undertaken in collaboration with Oded Nov, Ofer Arazy, Coye Cheshire, and Lior Zalmanson. So just as a quick agenda, I'll begin by providing some general background on collaborative production or co-production platforms like Wikipedia and why studying editors' motivations can help us to understand activity on these platforms. And then I'll go through the three studies that we did. The first one relates to how new Wikipedia editors' motivation is related to their activities. So effectively, what do people do and how is it related to their early levels of motivation? Next, we investigate how these motivations change over time. And finally, we try to understand how editors' activity and their motivations influence each other as a function of time. Now, a lot has been written about the many ways in which the internet has changed how we connect and communicate and share. And one of the places where these changes have been most significant is in how information is produced and disseminated. So in the last 15 to 20 years, we've seen the emergence of these really radically decentralized production communities in which the individual has gone from a passive consumer to a potential author or contributor. Now, these community-based models of production span a vast range of contexts, from open-source software projects like GNU/Linux, Apache, and GitHub to citizen science projects like iNaturalist. And of course, one of the biggest and most prominent examples is Wikipedia. Now, we use Wikipedia in our work as a sort of case study for understanding behavior more broadly in these types of collaborative production systems because it's one of the largest, most prominent, and most often studied platforms of this type. And for us, it's an especially useful arena for examining editors' emergent behavior because we can really distinguish between specific types of co-production activities. 
And there's minimal organizational design sort of dictating what types of activities people engage in. And finally, activity in Wikipedia is observable and it's persistent. So we have access to this really rich and detailed longitudinal data set on the work that's done on an article and also the work that's done by these individual editors. So the analyses that I'll present here are largely based on Wikipedia transaction log data that were accessed through the public API paired with surveys of Wikipedia editors where appropriate. So these types of platforms broadly share several characteristics. They tend to be decentralized, self-organizing, free and voluntary. And a key element that we're interested in certainly is that participants can contribute as they wish. So unlike in conventional organizations where activities and responsibilities might be hierarchically prescribed, people engaged in these types of peer production systems can largely participate as they wish. That is, they can choose a task and an approach to the task that they find most compelling or most interesting. Now, even though participants can choose how they engage in these systems, most people are either minimally engaged or drop out soon after beginning to participate. And this might be related to their motivation. Now the inequality between the majority of participants who are minimally engaged and the minority who are responsible for most contributions is characterized from different perspectives in various popular frameworks. So for example, you have the notion of the core and the periphery of participation, the reader-to-leader framework, the power law of participation and the funnel of participation, excuse me. Now one explanation for this variation in people's level of activity might be related to their motivation. So broadly speaking, we believe that people who are more motivated to participate in these communities are more active and those who are less motivated are less active. 
But empirical evidence of these relationships between motivation, activity, attrition, and time is still largely missing, particularly for individuals in the earliest stages of participation who are at highest risk of dropping out. And we believe that this is a function of certain methodological limitations within the prior research. So first, the existing research tends to survey veteran participants, usually long after they've been enculturated into a given online community. So these types of studies really don't help us to understand the motivations and activities of this really vulnerable class of newcomers. They also tend to use cross-sectional survey designs, which inadvertently assume that motivation is a constant construct over time, which we believe might not necessarily be the case here. And finally, they often aggregate these measures of activity so that it's difficult to distinguish between different functional behavioral profiles for the editors. But these differences are actually really important, because different activities provide newcomers with different experiences and different levels of socialization and exposure to community norms. And this may in turn influence both their motivations in the future and how persistent they are in their editing. So these are important limitations to address. And the studies that I present here take on these limitations very deliberately by targeting individuals at the earliest stages of engagement and by measuring motivation and activity at multiple points in time. We also used stratified sampling in our recruitment of participants to ensure that the distribution of engagement within our sample reflects that of the population. So we're able to get data from people at various levels of activity, and not just those highly active individuals who are more likely to respond to our surveys. 
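To make the reweighting idea concrete: one common way to make a stratified sample reflect the population, as described above, is post-stratification case weighting, where each respondent is scaled by how over- or under-represented their activity stratum is. Everything in this sketch (the strata, the population shares, and the per-stratum counts) is hypothetical, not the study's actual data:

```python
def case_weights(pop_props, sample_counts):
    """Post-stratification: weight = population share / sample share."""
    n = sum(sample_counts.values())
    return {s: pop_props[s] / (sample_counts[s] / n) for s in pop_props}

# Hypothetical strata: the population is dominated by low-activity editors,
# but the stratified sample deliberately includes many mid/high-activity ones.
population = {"low": 0.80, "medium": 0.15, "high": 0.05}
sample = {"low": 80, "medium": 70, "high": 56}  # 206 respondents, illustrative

weights = case_weights(population, sample)
# Each low-activity respondent now "counts" for more than one person, and
# each high-activity respondent for less, restoring the population mix.
assert weights["low"] > 1 > weights["high"]
```

Any weighted statistic (means, regression estimates) computed with these weights then reflects the population's distribution of engagement rather than the sample's.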
So I think taken together, this approach should give us a somewhat more accurate view of the relationships between motivation and activity across a more representative sample of the population. Now, all three studies draw from the same data collection event, though each used a different subset of the data as appropriate for the research question being addressed. And I'll make sure to say what data was used for which study when we talk about those studies in more detail. So we began by monitoring the first two weeks of editing behavior for all newly created accounts during the recruitment interval to classify these new editors according to the activity level stratum that each one belonged to. And this enabled us to target participants for recruitment at various levels of engagement and then to adjust their data using case weights in order to reflect the distribution of participation within the community at large. We sent these individuals links to the first survey, and 206 people responded. After approximately six months, a second survey was sent to the same individuals, and 90 participants eventually responded to both surveys. So then, through the API, we downloaded survey respondents' edit transaction logs for the first edit period, which was the period of time between account creation and the submission of the first survey, and again for the second edit period, and these are edits made between the submission of the first and the second survey. We also downloaded activity data for six Wikipedia namespaces: the main article, main article talk, user, user talk, Wikipedia, and Wikipedia talk namespaces. And finally, we surveyed participants on eight motivations that we saw were prominent in the prior literature, and on the screen you can see a brief description of each of those motivations. 
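Per-editor edit logs of the kind described above can, in principle, be pulled from the public MediaWiki Action API's `usercontribs` module. This is a hedged sketch rather than the authors' actual pipeline: the username and dates are placeholders, and paging (`uccontinue`) and error handling are omitted for brevity:

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# The six namespaces used in the study, keyed by MediaWiki namespace ID.
NAMESPACES = {0: "article", 1: "article talk", 2: "user",
              3: "user talk", 4: "wikipedia", 5: "wikipedia talk"}

def contribs_params(user, period_start, period_end):
    """Build a usercontribs query for one editor over one edit period."""
    return {
        "action": "query", "format": "json", "list": "usercontribs",
        "ucuser": user, "uclimit": "max",
        "ucnamespace": "|".join(str(ns) for ns in NAMESPACES),
        # Results come newest-first, so ucstart is the newer bound.
        "ucstart": period_end, "ucend": period_start,
    }

def namespace_counts(user, period_start, period_end):
    """Count one editor's edits per namespace within a time window."""
    url = "https://en.wikipedia.org/w/api.php?" + urlencode(
        contribs_params(user, period_start, period_end))
    with urlopen(url) as resp:
        edits = json.load(resp)["query"]["usercontribs"]
    return Counter(NAMESPACES[e["ns"]] for e in edits)

if __name__ == "__main__":
    # Network call; the user and window below are placeholders.
    print(namespace_counts("ExampleUser",
                           "2018-01-01T00:00:00Z", "2018-06-30T23:59:59Z"))
```

The resulting per-namespace counts are exactly the kind of six-dimensional activity vectors the clustering analyses operate on.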
Those were collective motives, self-expression, social motives, intrinsic motives, which here we interpret as fun or enjoyment, norm-oriented motives, reputational motives, obligation, and identification motives. Okay, so in the first study we investigated the relationship between these newcomers' earliest motivations to participate and the roles that they come to occupy during the first few months of participation on the platform. So this analysis incorporates the responses from only the first survey but consolidates the editing activity from both edit periods, and we took both the count and the proportion of edits that were made in each of the six namespaces as two independent but complementary data sets. We applied k-means clustering to each of these data sets to identify or surface prototypical behavioral profiles based on the distribution of edits made across these six namespaces. And finally, we used analysis of variance to compare motivations across these behavioral profiles, looking for differences that might give us some insight into why certain people went down one behavioral trajectory versus another. So the cluster analysis of the count data yielded three main patterns of behavior. First, we have low-volume main article editors, and these are people who made a really small number of edits in the main article space and virtually no edits elsewhere. And just as an aside, this represents the overwhelming majority of our participants and probably of people within the population. We also identified a class of high-volume main article editors, and these are people who made a high number of edits to the main article namespace and very few edits elsewhere. And we also found that these people tended to be significantly more motivated by social and reputational motives than the other two types of editors identified in this analysis. 
And finally, we have a class that we call comprehensive editors, and these are people who made a high number of edits in both the main article and the user spaces and a fairly high number of edits in the remaining namespaces. Now, the analysis of the proportion of edits made in each namespace also yielded three main behavioral profiles. So first we have main article editors, and these are people who predominantly edited only the main Wikipedia articles. We have generalist editors, who made many different types of edits across all of the namespaces, and these people tended to be more motivated by intrinsic motivation, so enjoyment or fun in editing Wikipedia, than the other two types of editors identified. And finally, we have user page editors, and these are people who predominantly edited their own user pages, and these people were significantly less motivated by a sense of obligation to the community than the other two types of editors. Now, I think that the main takeaway from these particular results is that people who are motivated by different things when they begin to participate tend to be drawn really quickly into different forms of participation. And what's interesting is that some of these people seem to be pulled immediately into core forms of participation on the platform, like the comprehensive or the generalist editors. And this is contrary to the conventional wisdom, where newcomers begin with small and peripheral contributions and then move to the core over time. So we would expect these newcomers' contributions to increase in complexity and centrality, but only over a longer period of time and with more experience. But I also think this may provide some useful information for platform designers who are interested in engaging these very differently motivated participants to contribute in more substantive and persistent ways. 
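The k-means step behind the profiles above can be sketched with a toy Lloyd's-algorithm implementation over per-editor edit-count vectors, one dimension per namespace. The editors below are invented to mimic the low-volume, high-volume, and comprehensive profiles; the study itself presumably used a standard k-means implementation with proper initialization:

```python
def kmeans(points, k, iters=20):
    """Toy Lloyd's k-means. Naive deterministic init: the first k points
    serve as centroids (real analyses would use k-means++ and restarts)."""
    centers = [points[i] for i in range(k)]
    for _ in range(iters):
        # Assign each editor's vector to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster.
        centers = [tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical per-editor edit counts over the six namespaces:
# (article, article talk, user, user talk, wikipedia, wikipedia talk)
editors = [
    (2, 0, 0, 0, 0, 0),           # low-volume main article editor
    (90, 2, 1, 0, 0, 0),          # high-volume main article editor
    (70, 10, 40, 15, 12, 8),      # comprehensive editor
    (3, 0, 1, 0, 0, 0),
    (85, 1, 0, 1, 0, 0),
    (65, 12, 35, 18, 10, 9),
]
centers, clusters = kmeans(editors, k=3)
```

Running the same procedure on each editor's edit *proportions* instead of raw counts, as the study did, surfaces a complementary set of profiles (main article, generalist, user page editors).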
So our second study delves deeper into participants' initial motivations and how they change in the first few months of participation in Wikipedia. Now, in this study we only used the responses from Survey 1 and Survey 2, so we didn't incorporate any of the activity data, and we essentially used linear mixed-effects models to analyze how motivation changed over the survey period, so changed over time. Now, the results of this analysis showed that the extent to which people valued their community's values, felt that their knowledge made a difference on Wikipedia, and believed that editing Wikipedia is enjoyable all decreased significantly with time. And this may indicate that Wikipedia isn't altogether a welcoming environment for new editors, and I think that there's some other research out there that might be consistent with this observation. On the other hand, we did see that social motives increased as a function of time, so the extent to which people value social interactions with other editors increased. Now, this type of result, I think, also hints at a design intervention that could be used to improve these newcomers' experiences by strengthening these early social experiences with other editors. Now, given what we know of the prior research, I think that these results provide a more detailed and more nuanced understanding of editors' motivational paths in peer production and in Wikipedia specifically. We found that different motives change in different ways, and these changes happen really early within the editor's career, at least within the first six months and possibly sooner. 
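The study fit linear mixed-effects models (in Python, statsmodels' `MixedLM` would be a natural tool); as a simpler stand-in for the same question, a paired comparison of each editor's rating of one motive across the two surveys captures the core idea of testing whether that motive declined. The ratings below are invented for illustration:

```python
import math

def paired_t(before, after):
    """Mean change and paired t statistic over (survey 1, survey 2) ratings."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    return mean, mean / (sd / math.sqrt(n))

# Invented 1-7 ratings of "editing Wikipedia is enjoyable", per editor.
survey1 = [6, 5, 7, 6, 4, 6, 5, 7, 6, 5]
survey2 = [5, 4, 6, 6, 3, 5, 4, 6, 5, 5]
change, t_stat = paired_t(survey1, survey2)
assert change < 0 and t_stat < 0  # intrinsic motivation fell in this toy sample
```

A mixed-effects model goes further than this paired test by adding a per-editor random intercept, which is what lets it handle several motives and unbalanced panels at once.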
So if we think back to the average participant who drops out or is minimally engaged, these results show that it's almost insufficient to say that they experience an overall or generic decrease in motivation. Rather, you have to look at the specific types of motivations to understand the differences between sustained and unsustained behavior, and how we might target these individuals as a function of the changes that are specific to these different motivations. So in the final study, we try to combine these ideas about newcomers' behavior and motivation to understand the interplay between early intrinsic motives and edits made to the main article specifically over time, so the main co-production activity. And we focus really specifically on intrinsic motivation because it's one of the most prevalent and consistent predictors of activity in the prior literature. So we first investigate the relationship between newcomers' early intrinsic motives and how persistent they are in their editing, and in the second part we investigate how intrinsic motivation and activity influence each other over time for those participants who persist in their editing. But first, we have to define two new classes of editors. Here we have dropouts, and they're editors who stop editing within the first two weeks of engagement, so before they submit that first survey. And we have persistent editors, and persistent editors are those people who continue to edit after the first two weeks, so after they've submitted that first survey. And within our sample we had roughly equal numbers of participants in both groups; of course, within the population you're going to have a lot more dropouts than you do persistent editors. 
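The dropout/persistent split just defined can be operationalized directly from edit timestamps. A minimal sketch, assuming the two-week window runs from account creation:

```python
from datetime import datetime, timedelta

def classify_editor(account_created, edit_times, window_days=14):
    """'dropout' if every edit falls inside the first `window_days`
    after account creation, 'persistent' otherwise."""
    cutoff = account_created + timedelta(days=window_days)
    return "persistent" if any(t > cutoff for t in edit_times) else "dropout"

created = datetime(2018, 1, 1)
early_only = [created + timedelta(days=d) for d in (1, 3, 9)]
kept_going = early_only + [created + timedelta(days=40)]

assert classify_editor(created, early_only) == "dropout"
assert classify_editor(created, kept_going) == "persistent"
```

An editor with no edits at all would also land in the dropout class under this rule, which matches the intuition that silence after sign-up is a form of dropping out.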
So we first used t-tests to compare editors' early motivations, as well as the number of edits made in the earliest period of engagement, across the two groups. And perhaps it's unsurprising that we found that persistent editors are significantly more motivated by intrinsic motivation, so that was fun and enjoyment, and they tend to make significantly more edits in the initial period of engagement relative to dropouts. And I think this is consistent with our intuition about participants in these types of systems. But we then asked whether these two things are related: essentially, are people reporting higher levels of intrinsic motivation because they've been more engaged on the platform in that earliest critical period of engagement? So, using OLS regression, we found that motivation was not a function of the number of edits that were made. So essentially, early motives, or at least intrinsic motives, are formulated through some other means and are not contingent on the individual's earliest experiences in the main co-production activity, so editing the main Wikipedia article. And in fact, they might even enter with these motivations fully predetermined or preconceived. So next, we used structural equation modeling to investigate the interplay between these levels of motivation and editing activity for persistent editors only, over a longer period of time. The figure on the screen illustrates the composite path model that was used for the analysis. Path A measures the effect of the number of edits in the initial period of activity on the level of motivation that was reported in survey one. Path B measures the effect of motivation from survey one on the number of edits between survey one and survey two. Path C measures the effect of motivation in survey one on motivation in survey two. Path D measures the effect of edits in the intermediate period of activity on motivation in survey two. 
And finally, path E measures the moderating effect of the activity done in that intermediate period on the change in people's motivations between the two surveys. So again, we found that the early number of edits does not influence the level of motivation reported in survey one, essentially reproducing that first regression that we did, and that's the result on path A. But we also found that these early motives do not turn around and in turn impact the amount of editing work that's done at later stages. So that's the result on path B. However, we did find that the number of edits made in this intermediate edit period does have a marginally significant effect on the level of motivation reported in survey two. So the more they edited, the more motivation they had to edit, but only at this later stage. And this is consistent with our expectations and what we've come to expect from the prior literature, but it's really interesting in contrast to our initial findings, which showed that their earlier edits do not influence their early level of motivation. And I think this serves to show that the relationship between activity and motivation is really different at different stages in the editor's career. But for me, the most interesting result from this analysis is the significant moderating effect of activity on the change in the level of motivation between the two surveys. So we dug a little deeper into this interaction effect by looking at the marginal effect of activity at different levels of initial motivation. To do this, we used a model where the level of motivation in survey two serves as the dependent variable, and we essentially vary the level of initial motivation to observe how the coefficient on that activity term changes. So the dots in the plot represent the predicted coefficients of activity at different levels of initial motivation. 
Okay, so you can see how starting or entering with these different levels of motivation in survey one results in a change in the direction and the magnitude of the predicted effect of their activity on their later levels of motivation, so motivation as reported in survey two. Now this shows, I think, that participants who begin editing Wikipedia with higher levels of motivation in survey one trended towards a virtuous cycle, where the more they edited, the more motivated they were to edit. And on the other hand, participants who entered with lower levels of motivation but persisted in their editing trended towards a vicious cycle, where the more they edited, the less motivated they were to participate. So essentially, their early levels of motivation are reinforced by their participation. Okay, so the results of these analyses are interesting for several reasons. First, people who persisted early on did initially have higher levels of motivation relative to people who dropped out, but it was unrelated to the number of edits that they made early on, so unrelated to that earliest critical period of activity. And this is antithetical to the prior literature. But the relationship does change over the next six months to more closely align with or reflect the results of prior research. We do eventually observe a positive effect of activity on the level of motivation, but only as reported later on in the editor's career. And this suggests that motivation is, again, a lot more nuanced than originally thought and may not be as directly or as consistently correlated with the main co-production activity as we expected. Instead, newcomers may have other early experiences outside of the main co-production activity that lead them to develop these higher levels of motivation. 
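The sign flip just described falls out of the interaction model's algebra: if survey-2 motivation is modeled as mot2 = b0 + b1*edits + b2*mot1 + b3*(edits * mot1), then the marginal effect of one extra edit is b1 + b3*mot1, which can change sign with initial motivation. The coefficients below are invented to illustrate the reported pattern, not the study's estimates:

```python
# Hypothetical fitted coefficients for
#   mot2 = b0 + b1*edits + b2*mot1 + b3*(edits * mot1)
b1 = -0.30   # main effect of intermediate-period edits
b3 = 0.08    # interaction of edits with initial motivation

def activity_effect(mot1):
    """Marginal effect of one more edit on survey-2 motivation,
    evaluated at initial motivation level mot1 (e.g. on a 1-7 scale)."""
    return b1 + b3 * mot1

# Sign flip: vicious cycle at low initial motivation, virtuous at high.
assert activity_effect(2) < 0 < activity_effect(6)
```

Plotting `activity_effect` across the observed range of mot1, with confidence bands from the fitted covariance matrix, reproduces the dots-and-intervals figure described in the talk.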
And one explanation that we might be able to draw from our own prior research, the results of that second study that we discussed, is that some people might be having more meaningful social experiences early on, which lead them to develop these higher levels of motivation, and over time those motivations evolve to become more closely related to their activities and their experiences on the platform. We also found that editors' activity plays a significant and surprising role in moderating the change in motivation over time. So regardless of how these motivations are formulated at the outset, this result really highlights just how important early motives are to the lifespan of the editor, because their activity is just going to reinforce these early motivations and expectations. And this can have really significant implications for how long the editor is likely to remain engaged on the platform. Okay, so in summary, taking all of these results together, I think at a high level this work highlights the importance of considering motivation in studies of online engagement. And I think it can provide some insights which can form the basis of a much more nuanced theory of motivational dynamics in peer production systems like Wikipedia. And I really hope that they can help to inform the design of interventions that can engage participants at the periphery of these communities in a much more targeted manner, by appealing to specific and relevant types of motivation. So that's it for me. That's the end of my show. I really appreciate your time, and I'm happy to take a few questions. Thank you, Martina. That was a great presentation. Thank you. And I think we have quite a few questions coming in. So Jonathan is the one who is keeping the queue. And also, for those of you who joined late, I see there are quite a few now in the hangout. Please poke Jonathan, and he's the one who will give you permission to ask your question. So with that, Jonathan. Awesome. 
Thank you, Martina. So the first question we have is from YouTube. James Salsman asks: one of the eight initial motivation categories was identification motivation. And James asks, how did identification motivation change over time? It seems that that category was excluded from the graphs. You're right. So within our study, we initially set out to collect this corpus of eight motivations. And then for each study, depending on our interests and precedent within the prior literature, we sub-selected motives for additional analysis. So we actually didn't analyze identification motives in that second study. I believe, if I'm not mistaken, the ones that are on the slide are the only ones that we used in that analysis. So those were collective, self-expression, intrinsic, social, and norm-oriented motives. But further research might reveal something really interesting about how these motivations change. And certainly I think it's an area that deserves further research, because our results certainly show that there are some really interesting patterns that only emerge when you look at the nuances of how these changes occur over time. Awesome. Thank you. The next question is from Leila. Leila asks: can you share your thoughts on the potential interplay of selection bias in the first study, by which I think she means the first round of surveys? That is, the current structure of Wikipedia may dictate that only certain types of editors with certain motivations stay around, for the specific editor groups you had. For example, the fact that social is not a motivation for certain editor types can be because the platform doesn't accommodate that motivation. Well, Martina, may I ask you to turn on your camera? We cannot see you right now. Oh. Maybe switching it on and off. Yeah, here you are. Yes. Sorry about that. You know, that's really interesting. It's something that we've actually thought a lot about. 
I mean, we really tried to address biases within the sampling using this stratified sampling method. You know, if you just randomly sampled Wikipedia editors, you would get this large population of veteran editors who are more likely to respond to your surveys and very few editors who are likely to drop out. And so we really tried to address that in the design of the research itself, but of course, inevitably, you're going to have other dimensions in which bias is going to emerge. I think, you know, with that first study, with the clusters, your point is really interesting, right, because you can sort of intuit how different clusters would persist and different clusters would kill themselves off, in essence, right? Like, you probably won't see those low-volume editors continuing to persist and become core contributing members. So I think that's a really interesting extension to this research: not only what these types are, but whether they are inherently predictive of the persistence of a certain class of editors, and what that might mean for results like the ones that I've presented. It's a really hard question. Yeah, I agree. Well, she's good at the hard questions. I believe, Marshall, you're next, and you can ask it yourself, since I believe you're on the chat. Hey, can you hear me? All right, so thanks for this presentation. I'm Marshall, I'm the product manager for the Growth team here at the Wikimedia Foundation. And so we think about how to grow new editors, and this is really relevant to us. One of the things that we've been thinking about is the clustering side. Because sometimes we think about what are the different kinds of editors that a healthy wiki ecosystem needs in order to keep being healthy. They probably need some of the high-volume types, some of the low-volume types, but we want to think about what all those types are and how to get the right proportions. 
And so the question I wanted to ask you is: it sort of seems like the depth of your clustering was limited by the fact that you wanted to be able to cluster amongst just the hundreds of people who had responded to the survey. But if you were liberated from that, and there wasn't the survey limitation and you could cluster all the editors, how would you go about that? And what kinds of things do you think you would include beyond the number of contributions in the different namespaces? Oh gosh, that's so interesting. You know, it's interesting too, because without being on your side of it, I see the clustering into these different behavioral profiles and their usefulness, and it's almost like a self-selection, a Darwinian sort of thing: you want these healthy communities, and that might mean that you don't necessarily need certain types of editors within the communities. Gosh, you know, I think that one of the limitations that we have in terms of the nuance of these clusters is that we limited it to these six namespaces, because the overwhelming majority of edits, certainly by new editors, is made in these six namespaces. But there are, what, 35 more different namespaces, which might all be indicative of different functional behavioral profiles. And I think opening the analysis to include these more diverse perspectives on behavior might reveal more nuanced or subtle clusters of behavior. I think that something I would like to see paired with that clustering analysis is the quality of the edits that are made by these different editors. 
I think that's a really interesting next direction to take this, because it helps you to understand not only how people self-select into these behavioral trajectories, but also what the implications for the articles are, if you know that editors on this behavioral trajectory are really making substantive edits and people on that behavioral trajectory really aren't. So I think that's one direction that could be really interesting to take this. Thank you. So there are more questions coming in, and we're going to put them on hold until the end of the showcase, because we need to move on to the second presentation. So thanks a lot, Martina. Hope you can stick around until the end; we can have more discussion after Tassos. And Tassos, if you're ready, we can also start with your presentation. Hi everybody. First of all, can you hear and see me? Yes. Fantastic. So my name is Tassos Noulas. I'm based at the Center for Data Science at New York University. And today I will be presenting, and basically also doing a small demo of, Wiki Atlas. It's a new project, so from a research and also production perspective it's by no means ready. Actually, I mentioned to Diego that we should be looking for some feedback. Diego is the collaborator on this project. And he said, hey, we are doing this meeting, so why don't you join us this Wednesday? I didn't know we would go live on YouTube and everything, so maybe I will be a bit unprepared in terms of presentation material, but hopefully we'll get to the tool, and what I'm hoping to get out of this is some feedback and some ideas in terms of how we can move forward. So without further delay, I'm switching to presentation mode. My entire screen is to be shared. And may I confirm with you that you can actually now see my presentation? Yeah. Fantastic. So this is a project going on with Diego Sáez-Trumper, who is based at the Wikimedia Foundation, and myself. 
It started about two months ago (not two years, that would be a long time), when Diego was around for the offsite meeting of the data science team. I do work on location-based services and location technology; Diego works for Wikimedia. So, bringing the two together, we are creating WikiAtlas. So what is WikiAtlas? It's an interactive cartography tool that enables Wikipedia article discovery by direct reference to geography. So essentially we're taking geotagged Wikipedia articles and we plot them on the map; but not only do we plot them, we are hopefully creating an environment where users can interact with and explore those articles, and by extension knowledge. It's currently accessible through the web, so you can go to wiki-atlas.org and play around with the tool. You will notice quite a few bugs, so please send us your feedback. But after we begin with this web version, as it progressively matures, we will start moving to other mediums. What we're hoping to do is build a mobile application, where mobile users who are, say, moving in a particular area of the city, or are tourists, can switch on the app and discover Wikipedia articles nearby. We are also planning to use augmented reality tech, so we can create immersive experiences for users who move about certain urban, or non-urban, environments, and they can explore content not just through the monitor, but hopefully through some sort of direct interaction with the physical surroundings. What sort of questions are we looking forward to addressing from a more research perspective? These we have not answered, but I'm posing some of them hopefully to give an idea about the scientific slash research purpose of the tool. So we want to understand what the impact on Wikipedia user experiences is when exploring content through these new mediums. So obviously we can explore Wikipedia through its web interface, but what about accessing it through other mediums?
Do users discover different things? Do they learn more? And do they remember content and knowledge better? And then, from the editors' perspective, the idea of providing a direct visual experience of Wikipedia articles could hopefully help them gain more insight into article coverage around the world, what languages are represented in what areas, and so on. Hopefully some might also be motivated by discovering that some areas are very underrepresented, and then we can hopefully proceed with more contributions there. Right, so this is an image, and I have not moved to the demo yet, that's just a slide of what the tool looks like. So this is basically, we can imagine, a user navigating in the area of Amsterdam and exploring different articles. What you notice immediately is that each article is represented by a three-dimensional cube. The height is proportional to the number of views that the article has. We will discuss this particular popularity feature a bit later, but essentially we have a few features: on the top left we can search by category, so we can look for parks, museums, buildings and so on. Top right, we have location search, so users can search by address, town, city and so on. And critically, on the bottom left, we have a popularity filter, which I will showcase later. So especially for urban areas that are extremely dense in terms of content, we can use this filter to remove articles that are less popular, so users can start exploring the most significant articles first. That's something we can discuss a little bit more in the next slides. So one might be wondering, why geography? Why do we seek to fuse together these two aspects? So first of all, geography is very important in learning. Spatial context and locations are key reference points in human cognition. Episodic memory is a lot about recalling specific things with respect to places and locations.
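To make the "height proportional to views" encoding concrete: one way the data for such a 3D layer could be prepared is sketched below. The article list, the scale factor, and the property names are invented for illustration; this is not WikiAtlas's actual pipeline, just the general shape of data a fill-extrusion map layer (e.g. in Mapbox GL) could consume.

```python
import json

# Hypothetical geotagged articles: (title, lon, lat, monthly pageviews).
articles = [
    ("Rijksmuseum", 4.8852, 52.3600, 120_000),
    ("Vondelpark", 4.8687, 52.3580, 45_000),
    ("Obscure chapel", 4.9000, 52.3500, 800),
]

def to_geojson(articles, meters_per_view=0.001):
    """Turn articles into a GeoJSON FeatureCollection whose 'height'
    property a 3D extrusion layer could read directly, so that bar
    height is proportional to pageviews."""
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [lon, lat]},
                "properties": {
                    "title": title,
                    "views": views,
                    "height": views * meters_per_view,
                },
            }
            for title, lon, lat, views in articles
        ],
    }

layer_data = to_geojson(articles)
```

The `json` module is only needed if you serialize `layer_data` out for the map client.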
So with WikiAtlas, we are seeking to realize this relationship in a deeper way. Number two, humans naturally organize in space, and information and knowledge are generated there. We have congregated since antiquity, since our tribal times and villages; then we have urbanization, larger cities and so on. We tend to organize in geographic space, and we tend to generate information and knowledge there. And perhaps as a consequence of this process, a large fraction of Wikipedia articles already comes with geographic coordinates. Essentially, that's how we built the map: we took all those articles where we have an exact location, and we generated the map of those. A question that remains is, what do we do with those articles that do not have a geotag? For this, as we will discuss a bit more, our intent is to also represent popular and notable people in specific locations. So given a person, we usually know their town of birth; but take a place, a city like New York City, where you have perhaps thousands of notable people: how do you represent those in space? Is it even meaningful? And from a more abstract perspective, when you move away from people, who are in a way more directly associated with space, and say you're given an article about, say, trigonometry: can we still find geographic associations, perhaps through processing the semantic or word content of these articles, and also find ways to associate them with space? Right, so a few points on the features we are using on the map for articles. We are fostering a popularity-based exploration with these three-dimensional objects. Each article is a three-dimensional object whose height encodes its popularity. The question there is, do we actually bias users toward the most popular content? And in a way, of course we do, but we also use other visual features, such as text, in order to allow users to also notice and explore the less popular articles as well.
And our goal in that respect is: can we enable users not only to discover what is the most important somewhere, in terms of collective attention by users, but can we also enable a certain deeper discovery of less popular items? Things that we could potentially be interested in, but that are not immediately brought forward in our Wikipedia experiences because they're not as popular. Then, a few questions there. Popularity is represented as the height of these 3D shapes. One big question we had was, do we use a three-dimensional space to represent things, or a two-dimensional space? There are pros and cons to that. Three dimensions allow for a more immersive experience; there's also more space available at a particular zoom level when you represent things in a three-dimensional environment. Of course, that comes with more computational costs and so on, so there are trade-offs to consider in that respect. Location search: as we pointed out, we have a geocoder tool. We use Mapbox; for those that maybe have not heard of it, Mapbox is a cartography platform on which you can create your own maps, given the right data. So they provide a geocoder tool that we have added as a feature to the map, so people can search by address, place, city or country. But going deeper into this particular aspect, something that we are considering now, and where there are problems, is something people refer to as geographic normalization. So, we talked about popularity before. Say we have a few thousand views for an article in New York, say 5,000 views; if you would compare this to an article that has the same number of views in a small village on a Greek island, would it mean the same? Obviously, bigger cities, metropolises and so on attract more attention, so in order to process information uniformly across heterogeneous geographic spaces, problems like that need to be somehow taken into account.
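The geographic-normalization problem described above can be illustrated with a toy sketch. The view counts and the rank-based normalization are invented purely to show the idea of scoring an article relative to its surroundings instead of by raw views.

```python
def local_rank(views, neighbour_views):
    """Fraction of nearby articles that this article out-ranks.
    A popularity score relative to the surrounding area, rather
    than the raw, globally compared view count."""
    if not neighbour_views:
        return 1.0  # nothing nearby to compare against
    return sum(v < views for v in neighbour_views) / len(neighbour_views)

# Same raw popularity, very different local meaning (invented numbers):
manhattan_neighbours = [120_000, 80_000, 60_000, 30_000]
village_neighbours = [900, 400, 150]

in_manhattan = local_rank(5_000, manhattan_neighbours)  # least popular locally
in_village = local_rank(5_000, village_neighbours)      # most popular locally
```

With 5,000 views, the article ranks at the bottom of the Manhattan neighbourhood but at the top of the village's, which is exactly the asymmetry the tool would need to correct for.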
And of course, one thing is what sort of variables you consider; the other thing is what sort of visual interfaces you build around that. Another issue that we are currently working on is enabling discovery across different geographic scales. So obviously you can open the map up and zoom in, and look at specific neighborhoods or cities; but then, if you zoom out and look at a country or a continent, how do we enable discovery at the higher levels of the hierarchy? This is something we have not addressed. But say someone takes the European continent and would like to visualize, say, all historic battles or acts of war that took place there: could they do this? And so, moving on, another feature we discussed before is category search. This is a snapshot of London, where we have searched for museums, as you can see on the top left. That sort of eases exploration, not only when you have a lot of content, but also when users are particularly interested in specific categories. Personally, I didn't know that London had so many museums. People who are museum freaks can use this feature to explore a city in its full depth. So, okay, I've probably been talking a lot already. Let me tell you a few things about the next steps, the immediate next steps we're thinking of taking. First of all, we're thinking of introducing more languages as different layers on the map. Currently we're only doing English Wikipedia articles, but it would be really nice to have German, French, Russian; so we'll be adding more languages as we move on in the next few weeks. One of the goals is also to provide visual, head-to-head geographic comparisons between languages. So you take an area of London, and then you try to see how it's represented in terms of Wikipedia articles in different languages. The second point, which I touched on a bit before, is biographies.
So adding notable people is something we have been thinking about, with the problem again being: how do I specify a specific location for a given person in a city? If everybody is, say, born in Rome, how do I plot all Romans across the city in a meaningful way? One idea that we've been processing, and I'm welcoming your input there, is having the most notable or most popular individuals in the center of the city, and, as we move towards the periphery, the less popular people. Right, so the third point with respect to next steps is to not only allow users to explore and read some content on the map, but also to add some social features: first of all, to allow them to share specific articles and their locations with their friends, for instance. So the way you would share a link on Google Maps for a particular address or location, perhaps you could do the same here. We would also like to allow users to save articles, so some sort of bookmarking. Say you imagine a tourist: the tourist is moving through a new city, and they have saved some articles for places they would like to actually visit, and as they explore these places, they can read Wikipedia and learn more about the historic context of their surroundings and so on. Something else that has been proposed, and that we are processing as well, is the idea of chronological search. Obviously, in terms of time, Wikipedia articles can span centuries or millennia; the way users focus on a specific location to search for articles, they might want to do the same with time. So adding a chronological search tool is something we are processing. And with this, I would like to say thank you. I'm happy to hear your questions, but also ideas. We are still at a very early stage in terms of how we're moving on with this.
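The center-to-periphery idea for biographies could be sketched roughly like this. The golden-angle spiral, the radius, the degree conversion, and the example people are all invented for illustration; nothing like this is implemented in WikiAtlas yet.

```python
import math

def place_people(center_lon, center_lat, people, max_radius_km=5.0):
    """Lay out (name, pageviews) pairs around a city center: the most
    popular person sits at the center, the least popular at the
    periphery, spread along a golden-angle spiral to avoid overlaps.
    Purely an illustrative layout, not real coordinates."""
    ranked = sorted(people, key=lambda p: -p[1])
    golden = math.pi * (3 - math.sqrt(5))  # ~137.5 degrees between points
    n = len(ranked)
    placed = []
    for i, (name, views) in enumerate(ranked):
        r_km = max_radius_km * (i / max(n - 1, 1))  # rank 0 stays at the center
        theta = i * golden
        # Crude km-to-degrees conversion near the given latitude.
        dlat = (r_km * math.sin(theta)) / 111.0
        dlon = (r_km * math.cos(theta)) / (111.0 * math.cos(math.radians(center_lat)))
        placed.append((name, center_lon + dlon, center_lat + dlat))
    return placed

# Hypothetical Romans; the more viewed person ends up at the center.
rome = place_people(12.4964, 41.9028,
                    [("Minor poet", 1_200), ("Julius Caesar", 900_000)])
```

Whether a radial-by-popularity layout is even meaningful is exactly the open question raised in the talk; the sketch only shows that it is mechanically easy to try.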
So I'd like to hear from you whether this makes sense at all, and how you would imagine it could become more useful for the Wikipedia community. And if you allow me, I would just do a quick demo, which usually fails at this stage of development, but I'm gonna give it a try. So I'm moving on with that. Right, so let's go. We have WikiAtlas, loading. Right. So it takes some time; we have quite a few elements to load here. Hopefully with a good internet connection, things are faster. So we are in New York already. In the center, you can see there's also more activity; you can see the most popular articles are around the World Trade Center, for perhaps good reasons. One of the main things that you notice as you explore is that in cities like this, there are too many things; in other areas, we have the problem of too few articles, and then the question is how you bring them forward. That's the problem of generalizing your system so it works across different areas. But here on the bottom left I'm playing around with the popularity filter. So now I'm showing articles only above 5,300 views, and you can see that already clears up the space. Okay, let's see, we're at Little Italy. Let's click, and we get basically a brief of the article: the first photo that appears on the article, and if users want to explore more, they can actually click through and read the actual Wikipedia page. Let's reset things. As we discussed, we have location search, so let's move to Rome. Rome is an interesting place from the point of view of Wikipedia and WikiAtlas, I guess, given its tendency to accumulate a lot of historic information. So the interface, as you can see, is sometimes not really fluid, but we're working on that. So this is Rome. Again, we have quite a few things to discover. Let's say we're searching for a particular category of things, in this case statues, and hopefully it will work. No, it didn't work. Oh yeah, it did.
It takes some time, usually. Okay, so we have the Ecstasy of Saint Teresa, and we can explore that particular statue, and so on. I guess that gives an idea of how things look. We can toggle from this basically three-dimensional view to a two-dimensional view of the map. Sometimes looking at things from above is easier; sometimes, when you search for particular things, it makes sense to go 3D and perhaps rotate and see things from the west or east. And we have not developed a mobile application yet, but that is to follow soon. But we have this basic feature working at least, centering on the user's location. So I click, and then you can see my location on the map; that's IP-geocoding based. And then I can see what is near me. I think you've probably had enough of this demo. You can go exploring on your own; please report problems. And thanks again for attending. Thanks a lot, Tassos. That's really exciting, and you have a full set of feature requests on IRC that we'll be packaging and sending your way. That's fantastic. It seems people have already been responding with questions. So Jonathan, do we have any specific question for Tassos, or is everything already answered? So, putting aside the various feature requests on IRC, we do have a feature request, or question about features, on the YouTube channel. One user asks: it would be nice to include Mapillary, the open source version of Google Maps, in WikiAtlas. What do you think about that? Is there any way to include a street-level map view in this project? Yes. So there are various considerations we have been making on that. I think having street-level map views also matches our idea of having an augmented reality application, where you actually have not only a view of the Wikipedia information in a particular location, but you also get direct visual clues in terms of the physical surroundings.
I think Google Street View would be cool to enable; as for how to do this exactly, whether we would just have a link to the street view on the Google platform, that's probably not very hard to do. So we are actually thinking about it. Another point, perhaps related to that, is the fact that currently we use the gray, dark layer of the map. The goal here is to highlight information and articles, but in the future, hopefully, users will be able to explore different layers, such as showing rivers or mountains, or something richer in terms of the physical characteristics of a particular area. So yes, the quick answer is yes, we are thinking about this, and yes, thanks for the suggestion. Also, thank you, Tassos. No other questions that I can see on IRC or YouTube? I will jump in and ask one myself, related to something that we're discussing on IRC. So first of all, it's really exciting, and thanks for pulling together all these different data streams for this application. The question I have is about how you represent the value, the y-value, for any specific item on the snippet. It looks like right now, when you open a specific item, you get a snippet about the article, but there's no information there about the actual data point. It would be valuable to have something like an indication of the popularity for that item on the snippet itself, or maybe a generalized value of it, like you were saying, maybe a rank within a geographical context. I would like to know, for example, if I'm exploring any given geographical area, I'd like to maybe filter and get a rank for what I'm selecting. Right now this is impossible in the visualization, and I think it would be super valuable to have that. Yes, yes, so definitely, providing more information context about the article is something that we should be considering. So it's easy to put in the number of views.
We could also perhaps consider having a specific link within the website, where you click a link and then you get some sort of general statistics about that article. Ranking is also something we are preparing. One of the things we were considering is adapting the height of these article bars according to the zoom level. To do that, for each article you would need to know not only the popularity of that article, but also the popularity of the surrounding articles, so you can normalize accordingly. So we are trying to do this, and of course we have constraints from the tools and platforms that we use for this particular production. I think Diego is preparing something in terms of ranking, so let's keep in touch on this; I'm looking forward to getting some more feedback, and we would happily introduce some more of this. Of course, one big question here is, who do we present this to? Who do we build this tool for? Is it for the general public? I showed this tool to my parents, and they liked it; but then, if I would show my dad some specific, scientifically oriented statistics about an article, maybe he wouldn't be able to pick that up. So I guess the trade-off here is to add more features that are relevant for specific audiences, but also to make it simple enough, or to present those features in a manner that doesn't disrupt the experience for more general audiences as well. That makes sense. Thank you. Any other question from the room or from IRC before we open the floor to questions about either of the presentations? No? Nothing else on IRC. All right, so shall we go back to the original queue we were trying to have? Yes, I believe it's you, Aaron, and me. Oh, okay, great. So Martina, you're still there? Yes, yes, I am. So first off, I really wanted to commend the effort you put into this work, thinking about the design and outreach implications, not just the research implications.
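The zoom-dependent, neighbour-relative bar-height normalization Tassos describes could be sketched as follows. The fixed maximum bar height and the simple linear rescaling are assumptions for illustration only.

```python
def viewport_heights(views_in_view, max_height_m=300.0):
    """Rescale bar heights so the most-viewed article currently in the
    viewport gets the tallest bar; recomputed on every zoom or pan, so
    heights are always relative to the surrounding articles in view."""
    top = max(views_in_view)
    return [max_height_m * v / top for v in views_in_view]

# Dense Manhattan viewport vs. a sparse rural one: the locally most
# popular article gets the same visual prominence in both.
manhattan = viewport_heights([120_000, 60_000, 5_000])
rural = viewport_heights([900, 450, 150])
```

The trade-off is that absolute popularity becomes unreadable from height alone, which is one reason to also surface the raw view count on the article snippet, as discussed above.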
I think it's very valuable for people who are attending the presentation. But the other thing I want to ask you: in the very first set of results, the fact that you focused on the question of what you call responsibility and obligation really caught my attention, because from what I recall from the literature that I'm familiar with, and it's been a while since I looked into the intrinsic motivation literature, the specific motivation related to a sense of responsibility or obligation is not one of the themes that comes up most frequently. Even though I think it's something that, when I talk to fellow Wikimedians, the urge to have something represented in Wiki projects, because it needs to be there, is something that many people feel strongly about. And I saw that this was basically, if I read the first chart correctly, one of the stronger drivers of the associations with the group of editors who almost uniquely edited in the main namespace. So I was wondering if you could unpack this notion of obligation slash responsibility a little bit, to help us understand it a bit more and see what kind of consequences we can draw from the fact that apparently this is one of the stronger types of motivation associated with that specific cluster of editors. Yeah, so, I mean, first off, in terms of how we came up with this list of motivations: we had a series of papers that we were compelled by, and we chose from those papers the full list of motivations, validated our own constructs, and went through that process. In terms of that finding, it's a really interesting one, and I kind of want to caution against reading too far into it, because it's a really small portion of our sample, and of the population, that is actually representing this result. I'm talking on the order of, like, 5% of our sample, which is something that wasn't reported here; it's reported in the paper, and the reference is in the PowerPoint, which I can provide.
But it's very small; if you're thinking about what the main drivers are behind the majority of activity in this setting, it's not that. But it's interesting nonetheless. And one of the things that we thought about with this obligation is: so, generally, it's people who are predominantly editing their own user pages who are significantly less motivated by this sense of obligation. And so the idea that we had was, perhaps these are people who are just generally more motivated by some form of self-interest than other contributors, and that emerges as a lower sense of obligation to the community at large. That's the explanation that we offer in the paper. It's definitely an interesting one, and you can see how it's tied to, or related to, or has a flavor of, these instrumental-versus-non-instrumental, intrinsic-versus-extrinsic motivation discussions; you can see how it sounds a little bit like that. And so I think it's definitely worth investigating more. It's a really interesting one, but again, it's really a small fraction of the population that it actually represents. Yeah, thank you. Eric, do you want to jump in? Yeah. Go for it. So hi, Martina. Thank you for coming to present this work. I really appreciate it. So I wanted to ask you about two places where I noticed that you said that your results are a challenge to past work. One of the cases was when considering legitimate peripheral participation and how people organize their activities. In another place, it sounded like you were saying that your results challenge the idea that motivation drives activity, if I was understanding that right; please feel free to correct me. Essentially, what I would like to ask you to do is to help unpack those, and to help me understand what exactly is refuted, and where you think we might look next to see what's really going on. Sure.
So I think the main claim is that our understanding, at least when this work was done several years ago, was rather simplistic, in that we had these two countervailing ways of thinking about the relationship between motivation and activity: on the one hand, the more motivated you are, the more active you are; on the other hand, the less motivated, the less active. And so my take on these results, or one of the implications that stood out to me, is that it highlights that it's not that simplistic; that you have these various relationships, and that these relationships between motivation and activity in fact change with time, or at least that's what our research begins to hint at. And I think that more research is needed to play that out better, or to a greater degree than we did in these relatively exploratory analyses. But that's the main issue that we took issue with: this is a far more complex relationship that evolves with time, relative to the way that we, or the prior research, had tended to think about these relationships. Is that, does that help? Yeah, I think that helps answer the question. Thank you. Cool, cool. So we have a good question from Pine on IRC, and I'm gonna bump him up the queue. Pine asks: if I may ask one question about the first presentation, my understanding is that there is a very steep drop in participation between an editor's registration and their first edit, and additional steep drops through their fifth edit. Does this research about motivation help us understand why those early drop-offs are so steep, and what we might be able to do to retain people for at least a few more productive edits? Yeah, I mean, we don't address that specifically, but I think that there are some hints at how we might be able to do this.
One of the things that stood out to me in reflecting on this series of studies is that we used to think of enjoyment, that people would stick around if only the task was more fun, or more interesting, or more compelling to the editor. And I think that our last study, not that it disproves that, but it challenges that, at least within that earliest period of engagement, that critical point where you're going to lose the majority of your participants. And so the question, and the natural extension to that work, is: what other motivations are at play in that earliest period, how do they change over time, and what's their relationship with activity? Which we don't do. But I think, if you look at the second study that I presented, you have this hint that perhaps interface interventions that appeal to the social motivations of the individual, their social expectations, sort of designing interactions, or a platform for meaningful interactions with other editors, might help to keep them, not through the main intrinsic motivation, the one that we tend to think of as associated with activity, but rather through these other, more socially oriented motivations. And hopefully, over time, that would all result in more sustained participation; but only more research in this area will tell us if that's a viable solution, or one of several viable solutions, to addressing that problem. Yes, I agree. So, no more questions other than mine, so I'm gonna ask mine. I was interested in learning whether you'd considered the impact of interactions, the kinds of interactions that editors had, on motivational shifts between, say, the first survey and the second survey. There's been a good deal of research on the impact of initial experiences with other editors, including reverts, warnings, welcome messages, et cetera, on retention.
And I wondered if you had thoughts about how positive or negative interactions might have particular impacts on motivation, and whether, I suppose, that might be different depending on the trajectory of the editor across the different groups that you were presenting. Sure, that's really, really interesting. I mean, the majority of the research that we did didn't necessarily focus on retention; it focused on the amount of engagement, and those are actually two different constructs. It's something that we begin to think about a little bit in our third study, when we distinguished between persistent editors and non-persistent editors, but really those are different phenomena. You could be really, really active in that initial period, but you could drop out, right? And so we don't address this question of retention that explicitly, at least not across all three of these studies. And I think it's a really hard question to answer, too, because we have access to these transaction logs, but these transaction logs reflect behaviors that are on the platform itself, and not necessarily between the editors. In terms of measuring those interactions, it's a much harder question. And so, for various reasons, we haven't empirically addressed them. But I have no doubt, perhaps this is a great thing to say as a researcher, I have no doubt that these interactions play an important role in the way that these motivations develop over time for the individual, and likely moderate the relationship between their motivations and their activity, and retention and attrition, and these other activity- or task-oriented measures that we're talking about.
I think that, in terms of this type of work, the next major thing that really needs to be addressed is probably the role of these interactions, and how they can be moderated by community management schemes or platform interventions, to bring about the best in these communities. But intuitively, to me, it makes a lot of sense that these interpersonal interactions occurring on the platforms are hugely important in understanding this question. Great, thank you. Awesome. And I think with that we're at time, so we're gonna close our showcase here. A virtual round of applause to our speakers. Thanks a lot for sticking around until the end, and thank you, NYU, for this amazing research. And see you all next month for the next showcase. Thanks for hosting us. Yeah, thanks for having us. Thanks for joining, all.