Hello everyone, and welcome to the September edition of the Wikimedia Research Showcase. I'm excited to be here today with some guest speakers. We have Michele Tizzoni from Fondazione ISI in Turin, Italy, who is going to be presenting a pretty exciting piece of research, done in collaboration with Wikimedia, on monitoring the impact of health outbreaks on public attention using Wikipedia traffic data. The second presentation is going to be by Jane Im from the University of Michigan and Amy Zhang from MIT, presenting a big overview of RFCs, requests for comment, one of the key processes for deliberation in Wikipedia and in our communities, looking at what we can learn from this large corpus of online decisions. So without taking more time, the format is as usual: we're going to have about 30 minutes for each presentation including Q&A, and you can stick around for a longer Q&A at the end of the session. Jonathan is going to be our IRC host, so if you have any questions on IRC or YouTube, please direct them to him and he'll relay them to the channel. And with that, Michele, the floor is yours.

Thank you very much, Dario, and thank you to the Wikimedia Foundation for giving me the opportunity to present work done in collaboration with colleagues at Fondazione ISI, the Institute for Scientific Interchange Foundation in Turin, Italy. This is work done with my colleagues André Panisson, Daniela Paolotti and Ciro Cattuto about the impact of news exposure on collective attention in the United States during the 2016 Zika virus epidemic. As an introduction, the main idea of this study relies on the concept, which is well known, that behavioral responses during epidemic outbreaks can influence the spread and dynamics of the disease.
Awareness during epidemics is a key concept in epidemic modeling and, as shown in the picture as an example, during the Middle East respiratory syndrome (MERS) coronavirus outbreak in Seoul, South Korea, in June 2015, the behavioral response of the population was quite relevant to the outbreak. There's a large number of theoretical studies that tackle this concept and try to incorporate behavioral responses into epidemic models of infectious disease spread. But actually there are fewer empirical studies, mainly due to the lack of data and measures of awareness during outbreaks, which is something that is hard to quantify; even more difficult is to quantify behavioral changes during epidemics. Here I'm just citing a few key references on the subject, which has become quite prominent in the literature, especially in the past 10 years. Most of the modeling approaches in epidemic modeling incorporate media exposure into the model structure through some kind of media function. So it's pretty clear that media play an important role in raising public awareness during outbreaks, and the typical assumption of these models is that media coverage increases as the incidence of the disease, the number of new cases, grows in time, and at the same time the susceptibility of individuals decreases as the media coverage becomes more alarmist. So the more cases there are of an emerging disease, the more alarmist the media coverage gets, and then people tend to stay home and reduce their contacts, thus damping the spread of the disease. There's a good review I mention here, in BMC Public Health, about how many models consider this type of effect of media exposure. In our study we focused on a specific epidemic outbreak, the 2015-2016 Zika virus epidemic.
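To make the modeling assumption described above concrete, here is a minimal sketch (not from the paper; all parameter values and the specific media function are made up for illustration) of an SIR model where the effective transmission rate is damped by a saturating function of incidence, a common choice in this literature:

```python
# Minimal sketch (not from the paper) of a common assumption in the
# media-awareness literature: the effective transmission rate decreases
# as incidence (a proxy for alarming media coverage) grows.
# beta_eff(I) = beta0 / (1 + alpha * I) is one illustrative choice.

def sir_with_media(beta0=0.5, gamma=0.2, alpha=50.0, days=200, dt=0.1):
    S, I, R = 0.99, 0.01, 0.0  # fractions of the population
    peak = 0.0
    for _ in range(int(days / dt)):
        beta = beta0 / (1.0 + alpha * I)  # media-damped transmission
        new_inf = beta * S * I * dt
        rec = gamma * I * dt
        S, I, R = S - new_inf, I + new_inf - rec, R + rec
        peak = max(peak, I)
    return peak, R  # epidemic peak and final size

peak_media, size_media = sir_with_media(alpha=50.0)
peak_plain, size_plain = sir_with_media(alpha=0.0)  # no media response
print(f"peak with media response: {peak_media:.3f}, final size: {size_media:.3f}")
print(f"peak without:             {peak_plain:.3f}, final size: {size_plain:.3f}")
```

Comparing the two runs illustrates the qualitative effect the models assume: the media-damped epidemic has a lower peak and a smaller final size than the same epidemic without a behavioral response.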
The Zika virus is a mosquito-borne infection: it's a virus transmitted by Aedes mosquitoes, so it's vector-borne, although transplacental and sexual transmission can also occur. The infection is mild and most infections remain asymptomatic, 80% of the cases, but actually it can lead to severe complications, especially during pregnancy, because it causes microcephaly in newborns, and there is no vaccine or specific treatment. Worldwide attention to the Zika virus started in 2015, when a first cluster of microcephaly cases in Brazil was associated with a Zika virus epidemic, and from 2015 onwards the virus spread to the rest of the Americas and reached the US, reaching in the end 47 countries in two years. The WHO declared the epidemic a public health emergency of international concern on February 1, 2016, and cases in the United States were first reported by the CDC at the beginning of 2016. The United States are a special case for the epidemic because only a few states could report local transmission, because of the presence of the vector: especially Florida and Texas were the states where most of the local cases were reported, although all states in the contiguous United States reported cases imported from abroad due to travel. So the research questions of our study were mainly: how can collective awareness and attention patterns toward a public health threat such as the Zika virus be captured by digital sources in general, and how does media coverage shape public attention during epidemics? We tackled these questions by combining different data sources. First of all, the Wikipedia page view data of 128 selected Zika-related articles, mainly the Zika virus page and all its translations in 94 languages, and these page views were geo-referenced in the US, down to the level of the city.
Then we combined this information with a number of web news items, about 100,000 web news items citing the word Zika and the United States, from about 7,000 news sources in 2016; this is data available from the GDELT project. Next, we also considered mentions of Zika in TV programs that were aired in the United States in 2016. This is data available from the TV News Archive, taken from the closed captions of TV news programs aired in the United States. And finally, we linked these media coverage and attention patterns to the epidemic data, the weekly Zika virus case counts reported in the United States by the Centers for Disease Control and Prevention. Just to give an overview of the temporal evolution of these four data sources, we see here in the four panels the daily Wikipedia page views of the pages under study, the TV mentions of the word Zika, and the news items mentioning Zika over time. We can see a clear pattern across all three time series, with a huge spike at the beginning, in February, corresponding to the WHO declaration of an international health alert, while after March, more or less, the attention on Wikipedia decreases; and even though in summer there is again news covering the Zika outbreak, especially due to the Olympics in Brazil that were also raising public health concerns, the attention measured by Wikipedia page views remains low and doesn't get back to the peaks of the beginning. At the same time, it's very interesting to see that the Zika cases reported by the CDC, the red curve, are actually increasing over time, reaching a peak exactly in summer 2016, with a very different pattern compared to the attention and news coverage time series.
This is clear from correlations: if you measure the correlation of the time series, the Wikipedia page views, the news items and the TV mentions are highly correlated, with a Pearson correlation coefficient higher than 0.7, while the correlation of the three time series with the Zika virus incidence is almost negligible, very small. There is only a weak correlation with news items, and the correlation with the Wikipedia page views is actually negative. So this suggests that the attention measured by Wikipedia is not much driven by the epidemic incidence, that is, by the epidemic profile of the disease. And this picture remains the same if we look at the 50 states. All the Wikipedia page view time series were highly correlated across the 50 states of the United States. Here I'm showing a cross-correlation table where we can see, basically, the Wikipedia page view time series in each state and its level of correlation with the Wikipedia page view time series of all the other states. We see that the correlation is above 0.8 for all the states. So basically there is a high correlation across states, and also a correlation with the national timeline, which is above 0.8. So the attention measured by Wikipedia page views was synchronized over time and in space across the United States. There is only one case, quite special, of Montana, which is the only state for which there is a slight correlation with the Zika virus cases reported in that state, while for all the other states the correlations between the attention patterns and the Zika virus cases reported in that state are not significant. So it really seems that, also at different spatial resolutions, the attention is not much driven by the number of cases reported. We also looked more specifically at attention patterns in cities of the United States.
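The correlation analysis just described can be sketched in a few lines. The series below are synthetic stand-ins (the real data are the Wikipedia page views, TV mentions, news items, and CDC case counts; the shapes and numbers here are made up to mimic the early attention spike and the summer incidence peak):

```python
import numpy as np

# Synthetic weekly series mimicking the patterns described in the talk:
# attention spikes early, incidence peaks in summer.
rng = np.random.default_rng(0)
weeks = np.arange(52)

news = np.exp(-(weeks - 5) ** 2 / 20.0)             # early media spike
wiki = 0.9 * news + 0.05 * rng.standard_normal(52)  # page views track news
cases = np.exp(-(weeks - 30) ** 2 / 60.0)           # incidence peaks later

def pearson(x, y):
    """Pearson correlation coefficient between two series."""
    return float(np.corrcoef(x, y)[0, 1])

print(f"wiki vs news:  {pearson(wiki, news):.2f}")   # high, as in the study
print(f"wiki vs cases: {pearson(wiki, cases):.2f}")  # near zero or negative
```

On data shaped like this, attention and media coverage correlate strongly while attention and incidence do not, reproducing the qualitative finding reported above.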
So we looked at a higher granularity in space: we looked more specifically at about 800 cities in the US with a population larger than 40,000, and we ranked those cities by the relative attention to Zika virus pages compared to the total page views observed across all Wikipedia pages over the whole of 2016. So we compared how much the page views to Zika pages ranked higher or lower with respect to the global rank of Wikipedia page views for that city. In that case, some patterns start to emerge: for instance, if we look at the cities where the attention was the highest, the top 10% of cities in attention, we clearly see that they are mostly in Florida. So here we can see that at the city level there is a correlation with the risk of infection, because those places, especially the area of Miami for instance, were at high risk of local transmission of the disease and were where the largest number of cases was reported. At the same time, the cities with the lowest attention to the disease were cities quite far from the area most at risk, in the Midwest for instance. So, to conclude our study, what we did was to try to model the Wikipedia page viewership with a linear regression model that includes all these media signals, at the state level and week by week. What we wanted to model here is the number of page views to Zika-related Wikipedia pages, by week, in each state of the United States, and we want to predict this value by only using the TV mentions of Zika and the mentions of Zika in the news at the national level.
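The regression that is described in what follows, media signals plus exponentially damped "memory" terms, can be sketched schematically. All variable names, the memory timescale, the scaling exponent, and the synthetic data below are assumptions for illustration, not values from the paper:

```python
import numpy as np

# Schematic sketch (names and numbers assumed, not from the paper) of a
# linear regression of weekly page views on media signals plus "memory"
# terms obtained by damping past coverage with an exponential kernel.

def memory_term(signal, tau):
    """Convolve a weekly signal with exp(-dt/tau): recent coverage
    counts fully, older coverage is exponentially damped."""
    out = np.zeros_like(signal, dtype=float)
    for t in range(len(signal)):
        dts = t - np.arange(t + 1)          # lags 0..t, in weeks
        out[t] = np.sum(signal[: t + 1] * np.exp(-dts / tau))
    return out

rng = np.random.default_rng(1)
weeks = 52
tv = rng.gamma(2.0, 10.0, weeks)    # national TV mentions (synthetic)
news = rng.gamma(2.0, 20.0, weeks)  # national news items (synthetic)

# Design matrix: raw signals, their memory-damped versions, intercept.
X = np.column_stack([tv, news, memory_term(tv, tau=3.0),
                     memory_term(news, tau=3.0), np.ones(weeks)])

# Synthetic target standing in for page views rescaled by a super-linear
# power of state population, generated from the same features plus noise.
true_w = np.array([0.5, 0.3, 0.2, 0.1, 5.0])
y = X @ true_w + rng.normal(0, 1.0, weeks)

w, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary least squares
r2 = 1 - np.sum((y - X @ w) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R^2 on synthetic data: {r2:.2f}")
```

Each week is predicted from that week's media signals and the damped history of coverage, with no autoregressive term on past page views, matching the modeling choice described in the talk.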
So what we want to do is to model the Wikipedia page viewership, not as a time series analysis in a sense, but week by week: predict the number of page views based only on the number of mentions of Zika on TV programs and the number of mentions of Zika in the news. To explain the modeling approach in more detail: what we do is to model the Wikipedia page views scaled by the state population, because of course the volume of page views is proportional to the population of each state, and actually we find that this relation is likely super-linear. We predict this number as growing linearly with the media signals, as a linear combination of the TV media coverage, the news at the national level, and also the news at the state level, where news at the state level means news that mentions Zika and a specific state. So of course we predict the page views to grow with the number of mentions of Zika in media outlets. At the same time, as we observed from the time series, the attention to Wikipedia pages decreases over time and does not increase again even though the news coverage was increasing during summer. This suggested to us that there is some kind of memory effect, and this is also something that is known in the literature: people look at Wikipedia pages once, when there's something new, but then they don't go back to those pages again because they already know about the matter. So there is some kind of memory effect that we incorporate into the model by including an exponential kernel that basically damps the response to news over a certain time frame, where the memory kernel decays exponentially with a timescale tau that gives us the average timescale over which the memory effect takes place. And here we consider two different components of the memory effect: a memory effect on the TV signal
and a memory effect on the news signal, and we do this week by week, separately, and state by state. So what we do is the prediction state-wide, using k-fold cross-validation with k equal to 5. What we found in the end is that the model performs very well across all the states: if we consider states with a population larger than one million individuals, the model is able to predict quite well, with an R-squared of 0.76 on average, and it gets higher for instance in Texas or Illinois, where it predicts the number of page views week by week very well. As I said before, this does not take into account previous weeks of the same time series; it's based only on the number of news and TV items that we see in each week separately. So the model gets pretty accurate, but what is most interesting to see is that the model is already pretty accurate when we only consider TV or web news together, without the memory effect: the average R-squared is already 0.6. Basically, we compared all the combinations of the features: we trained our model by including the features that I described before one by one, and then by combining them together. A model that includes all the features performs only slightly better than a model that includes only the TV signal, the web news, and the memory on the web news or the memory on the TV. So in the end we see that the media signal is highly informative, already at the very basic level, about the page view dynamics at the state level; the TV and news sources at the national level are already highly informative. To conclude my presentation: what we have seen is that Wikipedia page view data can be used to measure attention during a public health threat, and this was already initially studied by some previous works. What we have done here is to do it at higher spatial granularities, and we've seen that at all spatial granularities attention seems to be mainly driven by
media outlets, and more in general the correlation with the incidence of the disease was found to be weak. This is contrary to the typical assumptions of several theoretical models, which actually expect the attention to grow with the incidence of the disease. Of course, we need to consider several peculiarities of this specific study, especially the fact that the Zika virus, as I mentioned, is a vector-borne disease that usually has mild symptoms and low transmissibility, so it is a very peculiar infection, and other diseases like flu or measles or Ebola, for instance, could potentially lead to very different dynamics of attention patterns. At the same time, it's important to note that we looked at attention patterns; what we didn't do is make a link with behavioral changes. This is something that is missing in our case, and it would be very interesting to do with some additional data sets at hand. And of course I would like to acknowledge my colleagues at the ISI Foundation again, and also colleagues at the Wikimedia Foundation who helped us in setting up the problem and the study, and also thanks to the Wikimedia Foundation's open access policy for the data. There is a preprint available: if you are interested in knowing more about the study, you can find it at the link here online. Thank you very much for your attention, and I will be happy to answer questions from the audience. Thank you very much.

Thanks for a great presentation, Michele. I'm going to ask Jonathan first if there are questions from the channel or from the room; I have a few, and I'm going to keep them for later. No questions from the channel at the moment. Then I'm going to start coming in with some of my questions, something we discussed during the making of the study, Michele. So first off, I guess it's a question about something you mentioned in passing, the relation between population and page views being super-linear; that's something we discussed during the early
version of the study, and I was wondering if you could expand on that, and also add some thoughts maybe about the relation between population but also issues such as internet penetration and connectivity: to what extent can we take, basically, population as a good indicator of what level of traffic we will observe for these topics coming from a given state or country? Yeah, sure. So I would say, regarding internet penetration, yeah, of course in the United States we can assume that it's high, or kind of almost equal across states, at the resolution we are looking at, we think. The idea of looking at the scaling with the population comes from the fact that it's been observed in several important studies, especially those led by Luís Bettencourt, that several of these human activities, like crime or the number of patents or other indicators of human activity, scale super-linearly with population size at the level of cities. And naturally we tested this, I would say quite extensively, and found that this holds true also for Wikipedia page views, and we used that to inform our prediction model, in the sense that we target in our prediction the Wikipedia page views rescaled by population with a super-linear scaling factor. Although this was not the aim of our study, it would actually be very interesting to investigate more whether this is true not only for the United States but also across countries, because then, as you mentioned, there are other factors that might play an important role, and I'm sure they do play an important role. So I'm not sure this holds for countries where the internet penetration rate varies a lot across cities, for instance, or is very low in general, and this should also be considered as something to investigate, depending on other features like the type of access to the page or things like that. But yeah, I would say in the end the fact that the page views scale super-linearly with population is
interesting, in line with a huge literature on the subject about the scaling of human activity with population size. Cool, thank you. I think I have time for maybe one more quick question from Chris, and we'll save more for the second Q&A. Thank you, I'll ask that; I'll just read what I put here in chat. I was wondering if there was any information about how viewers of these pages on Wikipedia on Zika are actually interacting with or using that information, and if there are any specific behaviors known, what factors might be associated with those kinds of interactions or might be predictive of them. As an example, one thing I thought could happen is someone could continue to click through a link on there, whether that's an internal link to Wikipedia or some reference that's cited on a relevant page. Right, we haven't used that kind of information; we really looked only at the number of Wikipedia page views. I know there is information about user behavior, so it's potentially available, and it would be very interesting to understand and model that kind of interaction during outbreaks. It would be very interesting to somehow discriminate why people actually visit the page, and for sure the main reason for visiting those pages also changes during the outbreak: at the very beginning it will be led by interest in something new, and then towards the end of the outbreak there will be other reasons, maybe more health-related. So there's a lot to explore in this area, I would say; we really looked at a very simple question, in a sense. Thank you.
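As an aside on the super-linear scaling point discussed a moment ago, here is a toy illustration (entirely synthetic data, with an assumed exponent of 1.2) of how one can test for super-linear scaling by fitting a power law in log-log space:

```python
import numpy as np

# Toy illustration (synthetic data) of testing super-linear scaling:
# fit log(views) = a + b * log(population) and check whether b > 1.
rng = np.random.default_rng(2)
pop = rng.uniform(1e5, 1e7, 200)                        # populations
views = 1e-3 * pop ** 1.2 * rng.lognormal(0, 0.1, 200)  # assumed exponent 1.2

b, a = np.polyfit(np.log(pop), np.log(views), 1)  # slope = scaling exponent
print(f"fitted scaling exponent: {b:.2f}")
```

A fitted exponent above 1 indicates super-linear scaling: larger populations generate disproportionately more page views, which is what the speaker reports observing for Wikipedia traffic across US states.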
I have a quick shameless plug that sounds super relevant to this question. It turns out that we just started, a few weeks ago, a new project that looks precisely at this question. One part is characterizing the usage of citations, when readers follow citations and external links in general, but there's one part specifically, with a team we're cooperating with, that's looking at the role of Wikipedia as a gateway to medical and health-related information. So I think we're going to have some solid answers to that question, looking specifically at these patterns of interaction, not for a specific outbreak but in general, the interaction with articles in the context of medicine and health-related information, so stay tuned. Alright, thanks again, Michele; again, please stick around for my additional questions at the end of the session. Now we've got to transition to Jane; I think Jane is the one who's going to be presenting, am I right? Yes. Alright, fantastic. Can you guys hear me? Yes. I'll start.

Hello everyone, my name is Jane Im, and today I'll be presenting about deliberation and resolution on Wikipedia: a case study of requests for comment. I'll first introduce the team members. I'm a newly starting PhD student at the University of Michigan, and I did this work while at Korea University. The other members are Amy Zhang, PhD student at MIT CSAIL; Christopher Schilling, member of the Wikimedia Foundation; David Karger, professor at MIT CSAIL; and Jonathan Morgan, member of the Wikimedia Foundation. Today I'll be talking about requests for comment, so what are they?
A request for comment is a major and common process used by Wikipedia editors for requesting input from uninvolved editors in order to solve content-related disputes. So it's basically a way of getting help from outside in order to solve a content dispute that's not being resolved locally, and the request for comment process consists of initiation, discussion, and one of three outcomes, which I'll be explaining shortly. Let's start with an example. Here is a request for comment started by an editor about an unresolved dispute over whether the lead paragraph of Donald Trump's page should say he's the current president or not. After this RFC has been initiated, we can see editors coming in and writing down their opinions, along with reasons and sources. After this deliberation has run fully, we can see an uninvolved editor, who we call the closer, come in and formally close the request for comment. The closer evaluates the consensus of the discussion and writes it down as a closing statement. In the gray box at the top right, we can see that the closer here has evaluated that the discussion's consensus is to include that Donald Trump is the current president on Donald Trump's page. Here we can see that this request for comment has been closed and the deliberation has been met with a consensus, with the dispute being resolved. This is a very ideal case of a request for comment's ending, but we found out that it does not happen for all RFCs. A request for comment can meet one of three endings, and the example we just showed is the case where the request for comment is formally closed by a closer and the dispute is resolved. However, there are cases when a request for comment is not formally closed, like this one here. This is a request for comment about whether to enlarge a section on policies on Bernie Sanders' page, and it is currently left without a formal closure. Among requests for comment like these, however, there are some that are still informally ended, usually by
a participant or the initiator, who thinks that there's already an obvious consensus and there's no need for further discussion or formal closure, and here we can see that the dispute is also resolved. The problematic case is when the request for comment is left stale, ended and closed by nobody. In this case the dispute is not resolved and the deliberation is left stale. For example, this is part of a request for comment that was left stale, and we can see an editor referring to the stale RFC as "another sad Wikipedia fail". We found out that actually many requests for comment meet these sad endings: roughly a third of requests for comment are stale, without any closure. We also found that many of those that do get formally closed do not get closed in a timely fashion: the average period a request for comment stayed open up to the formal closure was about 45 days, and this is 1.5 times more than the time period a request for comment is allowed to be open by default. So it appears that many requests for comment are stale. Why are they a problem? First, they can be discouraging to editors: if a request for comment never gets closed when they have put a lot of effort into it, it basically means wasted labor. Second, they can be a problem for productivity, as editors involved in requests for comment may wait on the outcome before further editing. So the next natural question is: why do many requests for comment remain stale? To answer this question we did a number of things, which I'll be presenting in the rest of the talk, and we believe this constitutes the first study investigating the process and issues of requests for comment. I'll continue by explaining how we got the data. First, we had a quantitative study where we gathered and analyzed 7,316 requests for comment from the English-language Wikipedia, from 2011 to 2017, and we built models predicting request for comment outcomes using this data. Second, we had a qualitative study where we interviewed frequent
closers, and we also inspected 40 randomly chosen stale requests for comment from the data set, in order to find out why many RFCs remain stale. We also consulted with two members of the Wikimedia Foundation and discussed this study on Wikimedia's research mailing list. To further describe the RFC discussion and closing data set: we first used the revision history of talk pages, provided by the MediaWiki API, and the edits of a bot that manages requests for comment, to gather the URL of the original page where each request for comment was started. However, one problem is that threads on Wikipedia get archived as time passes, so we iterated through the archives to find the current version of the URL of the RFC, and using the URL we retrieved and parsed the request for comment's content using libraries like Wikichatter. In total we grabbed about 7,300 requests for comment, and we extracted initiator, participant, and closer information, comments, and initiating and closing statements; we also kept the reply structure intact, using Python libraries. Now I'll provide a description of requests for comment over time, including their issues, using the data set that I just described. First, here's a graph showing the number of requests for comment initiated each month from 2011 to 2017. Although we can see a sharp increase in 2011, we assume that's because that's the point when the bot managing RFCs became active. So overall we can see a steady volume of requests for comment being initiated, about 60 to 120 per month. We also found that closers are, first, more experienced than initiators and participants: for instance, they have a higher average number of edits made on Wikipedia. Second, we found that there is a smaller number of closers compared to initiators and participants; this may imply that not everyone closes. So we found that there's a steady volume of usage of requests for comment, with experienced editors as closers. Now let's get back to the main question, the
major issues we found in the RFC process. To recap: first, we found that many requests for comment, a third of them, are stale; second, we found that among RFCs that are formally closed, many are not closed in a timely fashion: it took about 17 days from the last comment up to the formal closure. This is a graph showing the timelines of all the RFCs in our data set: the yellow line shows the time between the opening and the last comment, while the blue line shows the time from the last comment up to the formal closing statement, if there is one, and we can see crazy outliers here. We can see RFCs that took a long time to close after the last comment, and we can also see RFCs that just dragged on and on without being formally closed, and among them there were many RFCs that were stale. So why are many requests for comment stale? Now I'll explain the reasons we found behind the stale RFCs. To recap the methods: first, we conducted interviews with 10 frequent closers on English Wikipedia, and second, we did a qualitative analysis of 40 randomly selected stale RFCs from our data set. In total we found five reasons behind stale RFCs. The first one is related to problems with initiators and initial proposals. Among the 40 RFCs, we found that some of them contained too-vague initial proposals, making it hard for newcomers to further participate in the RFC, and the interviewees also mentioned that they found unclosed RFCs, some of them due to having no clear question in the initial proposal. In a more severe case, we found initiators using biased words in the initial proposals, trying to solicit participation in an unfair way; editors will not participate further in this case, making it more likely for the RFC to become stale. The second reason is related to the behavior of participants. Among the 40 RFCs, we found some of them having excessive bickering, which led to long and complicated threads with lots of back-and-forths, which made it hard for newcomers to
come and examine the content and further participate. In a more severe case, we found that some participants became sock puppets, which means that they created multiple accounts in order to sway the discussion in their favor, and interviewees mentioned that they would not close when they were suspicious that there was sock puppeting going on. The next case is related to a lack of interest or expertise among uninvolved editors. Among the 40 stale RFCs we examined, we found that some had only small participation from new, uninvolved editors. This implies that if an RFC does not gain much interest from outside editors, it's more likely to go stale, and interviewees mentioned that when they find an RFC that they believe not many editors care about, they sometimes just pass it by and do not close it, thinking they can't create an impact on the RFC anyway. This category also includes the case when the request for comment requires a certain amount of background that not many closers have. The interviewees mentioned that they sometimes do not close an RFC because they feel they do not have the expertise needed to understand and close the request for comment; this implies that if a request for comment is in an area that not many closers are familiar with, then it's more likely to become stale. The next reason is related to the RFC being too complicated or contentious. The interviewees mentioned that they sometimes find requests for comment containing a huge number of comments, where feelings are running very high, and they told us that in this kind of situation they will not close it and will leave it to someone who is more experienced, or a closer with more authority. Interviewees also mentioned that there were sometimes cases when they just couldn't understand the request for comment because it was too complicated. The last reason is related to interpersonal issues and Wikipedia politics: just like anyone else,
Wikipedia editors also have interpersonal issues, and interviewees mentioned that they sometimes will not close an RFC because they have a bad relationship with the initiator or a participant.

So up to now I have covered the findings from the interviews and the qualitative inspection of the RFCs. Now I'll explain how we incorporated these findings into building models for predicting a request for comment's outcome. Our goals in building the models are twofold: first, we want to understand the features that can predict whether a request for comment will go stale; second, we want to help initiators and participants prevent stale RFCs, and considering that we want them to be able to take action beforehand, and that features change over time, we also need to build timely models. By building a model I mean building a predictor, which requires feature selection, and for this step we use the five reasons we found through the qualitative study; after that, we train classifiers using those features.

In total we came up with 61 features, grouped into eight categories. The first is related to the initiator's experience. During the qualitative study we found that problems with initiators and initial proposals, for example the proposal being too vague or the initiator using biased words, can negatively affect the RFC's outcome, and we thought this was potentially connected to the initiator's level of expertise. We therefore calculated features like the number of edits made by the initiator on Wikipedia and whether the initiator is an admin.

Next, we considered participants' interest. We learned from the qualitative study that if an RFC does not gain much interest from editors, it is more likely to become stale, so we calculated features like the number of participants in the RFC and the ratio of participants who are new to the talk page where the RFC is, thinking these reflect participants' level of interest.

Next, we considered participants' experience. We learned from the qualitative study that the behavior of participants, such as sock puppeting and bickering, can impact an RFC's outcome, so we considered features like the age of participants' Wikipedia accounts and the number of edits made by the participants on Wikipedia.

Next, we considered the size and shape of the discussion. We learned that if an RFC is too long or too complicated, with lots of back-and-forths, it can scare away closers, so we calculated features like the number of comments in an RFC and the average depth of replies per comment.

Next, we considered contentiousness. We also learned during the qualitative study that if an RFC is too contentious, it can scare closers away, so we calculated features like the number of reverts and votes in the RFC, and weighted reciprocity.

Next, we considered the tone of participants' discourse. We learned that behavior like bickering can impact an RFC's outcome, so we calculated various tones, including hostility, anger, cognition, and certainty.

We also considered the initial proposal's tone and length. We learned that problems with initial proposals, for example being too vague or too short, can negatively affect an RFC's outcome, so we calculated features like the number of words in the proposal, along with the same tone-related features I mentioned previously.

Last on the list, we considered the popularity of the RFC and its topic. We learned from the qualitative study that if an RFC suffers from a lack of editor interest in its topic, it is likely to go stale, so we considered features like the number of words in the RFC, thinking it reflects the popularity of the RFC, and features like the number of revisions made to the talk page where the request for comment was started, prior to the RFC's initiation, thinking it reflects the popularity of the RFC's topic.

So, using these 61 features, we built four types of models: logistic regression, adaptive boosted decision trees, random
forests, and support vector machines using a radial basis function kernel. The baseline here is just predicting "closed" for every RFC, and it reaches 67% accuracy. We can see that the adaptive boosted decision trees perform the best overall, except on recall, reaching about 75% accuracy; this is about an 8% improvement over the baseline performance. We also built classifiers using the features of each category separately, because we wanted to know which category was the most important, and we can see that the size and shape of the discussion was the most important category, followed by participants' experience and participants' interest.

These are the top 14 features in the ADT model trained on all the data, and we can also see the correlation between each feature and closure; the ones in red boxes are the ones I'm going to talk about. First, we can see that all features related to the size and shape of the discussion were included in the top 14. Interestingly, the number of comments and the average reply depth of comments had a negative correlation with closure, while the average number of replies had a positive one. This may imply that if a thread is too complicated, with lots of back-and-forths, it might scare away closers; however, a certain level of interest, which can be signaled through the average number of replies, can help an RFC avoid staleness. We also found that the number of participants, which is related to participants' interest, was included in the top 14 and had a positive correlation with closure, again implying that a certain level of participant interest can be beneficial to an RFC's outcome. Lastly, we found that almost all features related to participants' experience were included in the top 14, all of them with a positive correlation. This may imply that experienced participants help an RFC avoid staleness; on the other hand, it may imply that experienced editors know from experience which RFCs are likely to be closed, and participate in them.
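The model comparison described here can be sketched with scikit-learn. This is only an illustrative sketch: the feature matrix below is synthetic stand-in data, not the actual 61 RFC features, so the accuracy numbers will not match the ones reported in the talk.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the real data: 500 RFCs x 61 features, with a
# binary label (1 = formally closed, 0 = stale). The actual features are
# the ones described in the talk (initiator experience, discussion shape, ...).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 61))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# The majority-class dummy mirrors the baseline of predicting "closed"
# for every RFC; the other four are the model types used in the study.
models = {
    "baseline (majority class)": DummyClassifier(strategy="most_frequent"),
    "logistic regression": LogisticRegression(max_iter=1000),
    "adaptive boosted decision trees": AdaBoostClassifier(),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
}

for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: accuracy {acc:.2f}")
```

Cross-validated accuracy is a reasonable stand-in here for the train/test evaluation the talk describes; on real data one would also report precision and recall, since the classes are imbalanced.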
So, up to now we trained and tested the models using our entire data set. However, considering that one of our goals is to help initiators and participants prevent an RFC's staleness, the next question we have to answer is: how soon after an RFC is initiated can we predict the likelihood of closure reasonably well? For this, we used all 61 features that I previously explained, plus a new feature, the number of days since the last comment as of the current time, and we built models using the features immediately after initiation, after 1 week, after 2 weeks, and so on up to 11 weeks.

This is the result of the timely models we built. The x-axis shows the number of weeks after RFC initiation, while the y-axis shows the accuracy level. The pink and purple lines stand for the accuracy of the baselines of predicting "closed" for every RFC and predicting "unclosed" for every RFC, while the other four lines stand for the accuracy of the four types of models we built. The first takeaway from this graph is that we can predict with above 70% accuracy as early as 1 week after initiation, with the ADT model and random forests performing the best. Second, we see the baseline of predicting "closed" for every RFC decreasing, and the baseline of predicting "unclosed" increasing, over time. This is because at every time point we only consider RFCs that are not yet closed; as time passes, some requests for comments become closed and we discard them from the data set. This makes the ratio of unclosed RFCs grow, pushing up the baseline of predicting "unclosed" for every RFC. By the sixth week it's a 50-50 chance for both baselines, and our best models improve over the baseline by over 15%, with random forests and logistic regression performing the best.

The implication of the timely models is that they show which features are important at a given time point, along with the prediction of the outcome, so initiators and participants can take action using these top features.
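The shifting baselines can be illustrated with a toy rolling evaluation. The close weeks below are made-up values, not the real data set, but the mechanics follow the procedure described above: at each weekly time point, already-closed RFCs are discarded, so the share of eventually-closed RFCs among the remainder shrinks.

```python
# Each RFC records the week it was formally closed, or None if it went stale.
rfcs = [{"close_week": w} for w in (1, 1, 2, 3, 3, 4, 6, 8, 10)] \
     + [{"close_week": None}] * 5          # None = the RFC stays unclosed

close_baseline_by_week = []
for week in range(12):
    # Only RFCs not yet closed at this time point remain in the data set.
    open_now = [r for r in rfcs
                if r["close_week"] is None or r["close_week"] > week]
    will_close = sum(r["close_week"] is not None for r in open_now)
    base_close = will_close / len(open_now)   # accuracy of "predict closed"
    close_baseline_by_week.append(base_close)
    print(f"week {week:2d}: open RFCs {len(open_now):2d}, "
          f"'closed' baseline {base_close:.2f}, "
          f"'unclosed' baseline {1 - base_close:.2f}")
```

Running this shows the "closed" baseline falling and the "unclosed" baseline rising week by week, which is exactly the crossover visible in the graph described above.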
For example, let's say that after 2 weeks the models show that participants' expertise level is crucial for an RFC to become closed. Participants can see this and invite experienced editors into the discussion to increase the likelihood of the RFC becoming closed and avoid staleness.

To summarize our contributions: first, we created a new comprehensive data set of requests for comments, and anyone can access and download it using this link. Second, we found that roughly a third of requests for comments do not get closed at all, remaining stale, and many are not closed in a timely manner. Third, we conducted a qualitative study presenting insights from the closers themselves as to why many requests for comments remain stale. And last but not least, we built new models to help predict which requests for comments are likely to go stale, including timely models that help initiators and participants take action. A paper on this work was accepted to CSCW 2018, and anyone can download it using this link; these are my and Amy's email addresses. Thank you for listening to this presentation, and I would be happy to answer questions.

Thank you. Thanks for the great talk, Jane. I'm going to pass it on to Jonathan for questions, if Jonathan is still around and can hear us. Well, while he's coming back: I really appreciated the design recommendations, or policy recommendations, towards the end of the presentation; I think they're very clear and actionable. One general question I have, which is similar to what we've seen in the past, is that there's obviously different participation in these governance spaces from newcomers as opposed to experienced editors. All of the data that you have points to a higher likelihood of closure when more experienced editors are participating, and a negative effect of length. To me, these two data points really suggest that we're creating barriers for newcomers to
participate in these conversations. I think the second point is very much related to the work that Amy has also been doing on the importance of summarization for people who are not already familiar with an existing discussion. So I'm curious in general about your thoughts on participation from newcomers in this specific type of discussion, and whether you have additional insights into what is desirable; I mean, there are questions about at what point somebody's tenure on Wikipedia makes it appropriate to participate in these discussions.

Oh, so your question is about my thoughts on newcomers' participation in requests for comments, correct? Yes. So, from the interviews, it felt that the interviewees believed anyone is allowed to participate in requests for comments. However, from the study's results, I thought it would be great if there were a way for newcomers to get feedback, maybe from more experienced editors, on how to participate in requests for comments. During the interviews, there was an interesting point that editors learn from other experienced editors while participating, by looking at the way more experienced editors use reliable sources; newcomers seem to learn by watching them while participating in the discussion. I thought that if there were a way for newcomers to get more active feedback on how to participate in requests for comments from more experienced people, maybe that would be a great way to improve the outcomes of a lot of requests for comments. That was my general thought.

Yeah, what Jane was saying is basically even more pronounced when it comes to the role of closers: newcomers may feel concerned about participating in an RFC, but even more so about taking the lead to actually try and close one, which right now anyone can do (it's not something reserved for admins), though the barrier is there for people who are new to the process.

Thank you. I know that Jonathan, by the way, is having some technical problems coming back in, so we'll try and see what happens. I
want to open up the stage for others. So, Chris, if you're around and have additional questions; and I'll check IRC.

I did have one thing I could add to that question of yours, actually. The other thing that I think is acting as a barrier to participation, either in terms of closure or just getting involved with RFCs in general, is simply understanding guidelines and policy, and how to engage in, not necessarily argumentation, but discussion around them. To the extent that we can encourage people to participate in RFCs that are perhaps not as guideline- and policy-heavy, that is a possible gateway for people to get started, because there are certainly proposals that don't rely on a really heavy or nuanced understanding of editorial policies and guidelines. Alternatively, we could find a better way to teach people about policies and guidelines, because I am very reluctant, as an English Wikipedia editor, to just throw people into those pages; to the extent that we can present those in a more accessible way, those are two ways I think we can help people become more involved in this discussion and proposal process.

There are currently no questions on IRC. Sorry, I dropped out suddenly; I have a question for the speaker if no one else does. Are there any other questions from the room? It looks like there is one question from the YouTube live chat, Jonathan; I just put it in the chat log, and I'm happy to read the question, or you can. This question comes from James Salsman: are there any features, other than the hostile tone of the initial proposal, that have to do with the initial proposal, e.g. cognitive tone or affective tone?

Well, we tried out all the tones for the initial proposals as well, but it turned out that, other than hostility, none was included in the top 14, so hostility seemed to be the most important tone of the initial proposal.

Excellent. And personally, I'm always curious about design directions.
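As a toy illustration of how tone features like these can be computed from a proposal's text (the study presumably relied on a standard tone lexicon; the word lists below are invented stand-ins, not the actual dictionary used):

```python
# Toy lexicons standing in for a real tone dictionary (e.g. LIWC-style
# categories); the word lists here are illustrative, not the study's.
TONE_LEXICONS = {
    "hostility": {"wrong", "ridiculous", "vandalism", "disruptive"},
    "certainty": {"clearly", "obviously", "definitely", "always"},
}

def tone_scores(text: str) -> dict:
    """Fraction of words in the text matching each tone lexicon."""
    words = text.lower().split()
    if not words:
        return {tone: 0.0 for tone in TONE_LEXICONS}
    return {tone: sum(w.strip(".,!?") in lex for w in words) / len(words)
            for tone, lex in TONE_LEXICONS.items()}

proposal = "This edit is clearly vandalism and the sources are ridiculous."
print(tone_scores(proposal))  # {'hostility': 0.2, 'certainty': 0.1}
```

Each score then becomes one column of the feature matrix for the initial-proposal category, alongside the proposal's word count.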
One thing, and I apologize if this was asked while I was booted out of the meet: when it comes to helping Wikimedians and potential closers assess how to support a discussion that seems to be winding down without consensus, or try to move it in a more positive direction with machine learning models, having the model output its predictions is one thing, but there has also been a lot of interest recently in making interpretable models that allow end users to understand something about the decision-making process, and in making that information available to them in a way that lets them make more productive decisions. I wondered if you had any thoughts about that in this context.

So, I think that goes along with the implications of the timely models that I explained. The timely models not only show the likelihood of whether an RFC will go stale at a certain time point, they also provide insight into which features are important at that time point. In the example I showed in the presentation, after one or two weeks the timely models show that participants' experience is the most important feature for an RFC to become closed; participants can see that, realize they need more experienced editors, and maybe send invitations throughout Wikipedia to more experienced editors to get their neutral suggestions on the RFC and improve the likelihood. So I think that's a great implication of the timely models.

And have you given any thought to how you might evaluate the effectiveness of this model in use cases like that? I see. To evaluate, I guess if I were to, I would first try to find RFCs where this model, this tool, wasn't used, then also get another data set where the model was used, and try to compare those over time. Yeah, I think there are many cases of RFCs out there where the people participating, or the initiator, are really interested in getting it closed,
and they might post to the administrators' noticeboard asking for participation, but may not get the kind they're hoping for. If that's the case, it would be great to be able to find them, or for them to find this work, and think about additional strategies for getting input. That's great, thank you.

This is just me thinking aloud: as far as I know, we've never tested these kinds of interventions on talk pages or deliberation-related pages such as RFCs, AfDs, RfAs, or whatnot, but I'm wondering if one could test this with a model like this plugged into a bot that would post on the talk page at a specific point in time, saying "we have high confidence that this RFC is not going to close unless we bring in more domain experts, or unless we diversify the participation," or any of the recommendations presented in your results. I'm just wondering, and this is a big question, not just empirical but also from a community standpoint, whether a bot intervening at such a stage, to nudge or redirect a discussion with the stated goal of maximizing the chances that it gets closed, is something desirable. We've never done that; I'm just thinking aloud that this is something a bot could potentially do, if we find a community willing to experiment with it.

Yeah, I think that would be a really great use of the models we built. Amy and I found that bots are one of the main tools used on Wikipedia for sending out notices, and we also know that a bot is currently used to notify some random people about RFCs, but if this model were used to get the right audience to participate in the RFCs, I think that would be a great use case. Yeah, so there is this bot you can subscribe to; you tell it the maximum number of requests you want per month or so, and the bot will randomly go through and alert you about an RFC, and we haven't dug into the details to look at how
effective that bot is, but it would be interesting to see. Another thing is that, right now, when an RFC is looking to get closed, there isn't a very good mechanism for publicizing that. There is a requests-for-closure page, and currently there are a few editors, maybe one major editor, who actually goes through and manually posts those to that noticeboard; maybe that could be a more standardized process, because if this person stopped doing it, it would probably stop happening.

Oh, and there's one more question. Yeah, one more question; I'll read it for archival purposes. This is another one from James Salsman: did you measure the number of incoming links to the RFC section? I suspect that a lot of RFCs die because they aren't advertised in WikiProjects, the Wikipedia central noticeboards, etc.

Yeah, that's actually a great point. We figured this feature might be a really important one, because we noticed that people really try to advertise RFCs, and if some RFCs can be advertised more actively than others, maybe that is a major reason why some go stale. Unfortunately, for this study we weren't able to measure the number of incoming links, mainly because we couldn't find a really clean way to do it: we found advertisements in various places, but we couldn't find a really clear rule to extract a really clean set of the advertisements, so that was the main reason we couldn't measure the number.

It sounds like looking at the subscribers to that bot that does initial notifications, at the distribution of the types of editors subscribing to it and receiving these notifications, which I presume is mostly skewed towards experienced editors, is going to give us some insight into how people discover an RFC; but I agree, that's a very important point.

Sweet, we're almost going to close. I want to check if there are any additional questions, and I do want to check with Michele, if
he's still around, just to close this, because I didn't get a chance to ask him before. So, Michele, give me a sign. Oh, you're still there. So, one thing that I find fascinating about that study, and also very relevant, is this divergence between incidence and attention. That's something we'll probably be studying more and more, given the current way in which information propagates over various channels and social media. As I mentioned to you recently, there was this case of a hoax about a nonexistent Ebola outbreak in Italy, which apparently is making the rounds on Italian social media and also being used for political purposes. I'm just curious whether you think the method you used here in the case of Zika could be meaningfully adapted to study the spread of hoaxes related to public health threats, in the absence of any actual incidence in the areas where these hoaxes are spreading.

Yeah, thanks for the question; that's a very good point. I think this specific case, of hoaxes related to infectious disease outbreaks, hasn't been investigated much, and it would be very relevant to study. I think nobody really has a clear idea of the impact on attention of this type of misinformation, and definitely, as our study shows, attention is driven significantly by what people read on the web, at least in the web news that we measured. It would be nice to have other data sources, as you mentioned, such as other social media mentions of the disease, and to see whether these are more significant drivers than the actual epidemic spread. This would be very interesting to study as a follow-up to our work. That's great, thank you. Thank you.

Awesome. With that, I think we're almost at time, so again, thank you to all of our speakers, thanks for joining us today, please remember to share a copy of the deck so we can upload it and share it with our audience, and see you all next month for the next showcase. Bye, everyone.