Hi, everyone. I'm Nicole. I'm a research fellow at the University of Oric, and I recently defended my PhD thesis. For my thesis, I collected data on the offline meetings among German-language Wikipedians, and in today's talk, I want to showcase that data set.
My thesis was concerned with Wikipedia, particularly with the people who write it. And I didn't only care about the online component, but I was especially interested in the people meeting offline in the real world. Such meetings come in different shapes and sizes, from casually sharing a beer in a pub, to more organized tours, for example, having a guided church excursion, or even booking a pilot to take aerial pictures. In my thesis, I was asking how offline meetups influence online behavior, and the first step to answer this question is getting the data. So the goal of my data collection was to collect all offline meetings of the German-language Wikipedia, organized since its launch in 2001, up to 2020, when they came to a halt due to the outbreak of the coronavirus pandemic.
Most meetings are clearly organized on Wikipedia. There are also a few overview pages which list them. On the right, you see all currently active regional entities which regularly organize meetups. The organization of a meeting then looks as follows. What you see here is the screenshot and the translation of the very first meeting of the Rhine-Hessian regional entity. They met for the first time in 2008. And what you can see is that they have a list of attendees, of people who have signed up, a list of apologies, of people who would have liked to come but couldn't make it, and they also have a sort of results section recording what happened. During my data collection, I aimed to include all the information which you can see here and which is relevant. So I tried to collect who met, where, and when, and also the apologies sent and the minutes recorded, if available.
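As a rough illustration of the information collected per meeting (who met, where, and when, plus apologies and minutes), here is a minimal Python sketch of one record. The class name, field names, and example values are assumptions made for illustration; they are not the variable names of the published data set.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Meeting:
    """One offline meeting; all field names are hypothetical."""
    place: str                                           # where people met
    day: date                                            # when they met
    attendees: list[str] = field(default_factory=list)   # who signed up or attended
    apologies: list[str] = field(default_factory=list)   # who could not make it
    minutes: Optional[str] = None                        # results section, if one exists


# Invented example values, not taken from the real data.
example = Meeting(
    place="Example pub",
    day=date(2010, 5, 1),
    attendees=["UserA", "UserB"],
    apologies=["UserC"],
)
```

Keeping the minutes optional reflects that not every meeting page has a results section.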
There are multiple pages which I checked for meetings, because I tried to collect all offline meetings organized. I started with overview pages which list all meetings, or which list all editathons and all open editing events. There's also an overview list of other events. I also checked all WikiProjects and all task forces to see whether they had organized any meetings. Throughout the scraping of all these pages, I used a snowballing approach: if any page linked to another one which was relevant and included a meeting, I also scraped that one.
There are a few pages and meetings which I excluded from the data. First of all, I skipped all virtual meetings. Second, I didn't check whether any portals included meetings, because portals provide landing pages and are directed more towards readers than authors of Wikipedia. I also didn't check meetings which were organized anywhere but on the German-language version of Wikipedia, so I didn't collect meetings organized on Meta or Commons. And I also needed to skip very regular meetings taking place in community spaces. Community spaces are interesting, but they are places of extremely high Wikimedia activity, and in most of them, meetings take place several times a week and people stopped recording their attendance. Because of this, it was not possible to reliably give a list of attendees in community spaces.
I ended up with a list of over 4,400 meetings and information on the place (the venue where they took place and the corresponding coordinates), the date, the type of meeting, so whether it was more social or more work-oriented like an editathon, and a list of attendees, collected from the minutes recorded after the meeting if available, and if not, from the list of attendees. I also collected the apologies for absence as well as the minutes available. So what we have is all attendees from 2001 up to 2020.
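The snowballing approach mentioned above can be sketched as a breadth-first traversal: start from the overview pages and follow every link that leads to another relevant page until nothing new turns up. This is a minimal sketch, not the scraper used for the data set. The function names, the relevance rule, and the toy link graph (which stands in for real wiki pages) are all assumptions.

```python
from collections import deque
from typing import Callable, Iterable


def snowball(seeds: Iterable[str],
             get_links: Callable[[str], Iterable[str]],
             is_relevant: Callable[[str], bool]) -> list[str]:
    """Visit the seed pages, then keep following links to relevant pages."""
    queue: deque[str] = deque()
    seen: set[str] = set()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)

    visited: list[str] = []
    while queue:
        page = queue.popleft()
        visited.append(page)  # a real scraper would parse the page here
        for target in get_links(page):
            if target not in seen and is_relevant(target):
                seen.add(target)
                queue.append(target)
    return visited


# Toy link graph with made-up page titles (hypothetical, for illustration only).
TOY_LINKS = {
    "Overview/Meetups": ["Meetup/A", "Meetup/B", "Portal/History"],
    "Meetup/A": ["Meetup/C"],
    "Meetup/B": ["Meetup/A"],
    "Meetup/C": [],
    "Portal/History": ["Meetup/D"],
}


def toy_get_links(page: str) -> list[str]:
    return TOY_LINKS.get(page, [])


def toy_is_relevant(page: str) -> bool:
    # Assumed rule: only meetup pages are followed, portals are skipped.
    return page.startswith("Meetup/")


pages = snowball(["Overview/Meetups"], toy_get_links, toy_is_relevant)
```

In this toy graph, `Meetup/D` is never reached because it is only linked from a portal, which mirrors the decision to leave portals out of the crawl.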
The first two meetings out of those 4,400 took place in 2003 in Munich, and the number of meetings increased over time up to around 2008. Since then, we've observed around 300 meetings per year. The dataset looks like this. What you see here is the first 5 lines of the dataset and the most important variables. You see 5 example entries of meetings which were organized in Switzerland.
Now, the offline data is interesting, but I think it really shines if you combine it with online data, for example with the data dump, which provides detailed information on all online activities undertaken. We need to merge the users on the basis of their user names, and this isn't that straightforward, because user names can change. Users can also sign up with new names, and scraping data and information from different sources comes with issues regarding encoding and capitalization. So to be able to merge online and offline data, my goal was to create a name-to-ID lookup table: for each actual person, I want to have one unique identifier and connect all user names this person uses to that one unique identifier. To do that, I collected all redirection links to a user. I collected all renames which were logged in the renaming log, and I also collected any explicit mention where users said that they were changing their user name or that they had a new one.
This then allows me to combine online and offline data, and it opens the door to many new research opportunities. I have already used the data to assess the meetups' causal effect on the number of contributions of editors. For all the sociologists in here, I also tested Coleman's mechanism of norm enforcement, and I explored how meetups matter in Requests for Adminship. But there's so much more. There's a spatial component in the data, there's a temporal component. We also have networks. There is lots to do. And if you have ideas now and you want to get the data, then you can do that on the Open Science Framework.
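One way to picture the name-to-ID lookup table described above is a union-find structure: every redirect, logged rename, or self-declared name change links two user names, and all names that end up connected share one identifier. This is a minimal sketch under that assumption, not the procedure used for the data set. The class, the normalization rules (underscores treated as spaces, first letter capitalized, Unicode NFC), and the example names are illustrative.

```python
import unicodedata


def normalize(name: str) -> str:
    """Harmonize encoding and capitalization variants of a user name."""
    name = unicodedata.normalize("NFC", name).replace("_", " ")
    name = " ".join(name.split())          # trim and collapse whitespace
    return name[:1].upper() + name[1:]     # assumed: first letter is case-insensitive


class NameLookup:
    """Union-find over user names; connected names share one person ID."""

    def __init__(self) -> None:
        self._parent: dict[str, str] = {}

    def _find(self, name: str) -> str:
        self._parent.setdefault(name, name)
        root = name
        while self._parent[root] != root:
            root = self._parent[root]
        while self._parent[name] != root:  # path compression
            nxt = self._parent[name]
            self._parent[name] = root
            name = nxt
        return root

    def add(self, name: str) -> None:
        self._find(normalize(name))

    def link(self, old: str, new: str) -> None:
        """Record that two names belong to the same person."""
        a, b = self._find(normalize(old)), self._find(normalize(new))
        if a != b:
            self._parent[b] = a

    def table(self) -> dict[str, int]:
        """Return the lookup table: normalized name -> person ID."""
        ids: dict[str, int] = {}
        out: dict[str, int] = {}
        for name in list(self._parent):
            root = self._find(name)
            ids.setdefault(root, len(ids) + 1)
            out[name] = ids[root]
        return out


# Invented example names, not real accounts.
lookup = NameLookup()
lookup.link("Old_Name", "New Name")     # e.g. a logged rename
lookup.link("New Name", "newer name")   # e.g. a redirect between user pages
lookup.add("Someone Else")
table = lookup.table()

# Attendee names scraped from a meeting page can now be mapped to person IDs.
attendee_ids = [table[normalize(n)] for n in ["old_Name", "Someone Else"]]
```

With person IDs on both sides, the offline attendance lists and the online activity records can be joined on the ID instead of the raw user name.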
The data and documentation about all variables are there. I shared the data under a Creative Commons Attribution-ShareAlike 4.0 license, so feel free to use it and also improve it, and check the preprint on arXiv. The preprint is currently under review, and the extended abstract is a very short version of this preprint. With this, I say have fun with the data, and many thanks for listening.