Hello, my name is Aitolkyn Baigutanova, and I will present our work, titled "Longitudinal Assessment of Reference Quality on Wikipedia," on behalf of my co-authors. Our work focuses on Wikipedia's verifiability policy, which ensures that people using the encyclopedia can check that its information comes from a reliable source. The burden to demonstrate verifiability lies with the editors who add and restore material. In our work we refer to verifiability quality as reference quality, and we define two metrics to measure it. The first is the reference need index, which represents the percentage of sentences missing citations. The second is the reference risk index, which considers the proportion of non-authoritative sources used in a revision.

For the first metric, we provide the Citation Detective tool, which builds on previously published work introducing the Citation Need model. It works as follows: it retrieves the article content, then segments and embeds the article. It then feeds the word embeddings and section embeddings into the Citation Need model, which labels each sentence as needing or not needing a citation. Finally, the result is stored in a database. The reference need index is defined as the proportion of sentences in a revision that are missing references, among those that require a citation. So we first identify the set of citation-requiring sentences using the Citation Need model, and then compute the proportion of those sentences that lack references.

For the second index, reference risk, we utilize the perennial sources list, a community-maintained, non-exhaustive list of sources whose reliability and use on Wikipedia are frequently discussed. This list has five categories: blacklisted, deprecated, no consensus, generally unreliable, and generally reliable. The reference risk index is defined as the proportion of unreliable references in a revision.
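Both indices are proportions and can be sketched in a few lines. This is a minimal illustration under assumed inputs, not the authors' implementation: the per-sentence flags (`needs_citation`, `has_citation`) stand in for Citation Need model output, and `unreliable_domains` stands in for the blacklisted and deprecated entries of the perennial sources list.

```python
def reference_need(sentences):
    """Proportion of citation-needing sentences that lack a reference.

    `sentences` is a list of dicts with boolean `needs_citation`
    and `has_citation` flags (assumed format for this sketch).
    """
    needing = [s for s in sentences if s["needs_citation"]]
    if not needing:
        return 0.0
    missing = [s for s in needing if not s["has_citation"]]
    return len(missing) / len(needing)


def reference_risk(references, unreliable_domains):
    """Proportion of a revision's references drawn from unreliable sources.

    `references` is a list of source domains in the revision;
    `unreliable_domains` is a set of blacklisted/deprecated domains.
    """
    if not references:
        return 0.0
    risky = [r for r in references if r in unreliable_domains]
    return len(risky) / len(references)
```

For example, a revision with two citation-needing sentences of which one is uncited gives a reference need of 0.5; a revision where one of its two references comes from a deprecated domain gives a reference risk of 0.5.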
So we compute the proportion of unreliable references among all references present in a given revision, where unreliable references are those from blacklisted and deprecated sources in the perennial sources list.

Next, I will introduce the three datasets we used. The first is the reference history dataset, which tracks occurrences of references that were added to the perennial sources list. We also have the random and top datasets, consisting of 20,000 randomly sampled pages and the 10,000 most viewed pages, respectively. Here we present the sizes of our three datasets; the period we consider spans 2010 to 2020.

Now I will present our results, starting with the evolution of reference risk. As we can see from this plot, for both the top and random datasets, the reference risk score remained below 1%. We can also notice that after the creation of the perennial sources list in 2018, the reference risk score decreased in both datasets. Turning to the evolution of the reference need score, we can see that in both the top and random datasets it steadily decreased over the ten years. We can also observe that articles related to culture tended to have better citation coverage, meaning a lower reference need score. In 2020, the mean reference need score reached nearly 30% and 38% for the top and random datasets, respectively.

We next consider the lifespan of risky sources to evaluate the community's effort in maintaining reference quality. We define lifespan as the time elapsed, in number of days, between the addition and removal of a reference. As we can see from this plot, the median lifespan of risky references decreased more than threefold once they were added to the perennial sources list. This result may indicate that the labeling of perennial sources encouraged editors to remove unreliable references more quickly. To sum up, reference need is decreasing, and reference risk remains stably low.
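The lifespan measure above can be sketched as simple date arithmetic plus a median. This is a hedged illustration, not the paper's pipeline: the `(added, removed)` date pairs are assumed to come from a reference history dataset, and in practice one would compute the median separately for references removed before versus after their source was added to the perennial sources list.

```python
from datetime import date
from statistics import median


def lifespan_days(added: date, removed: date) -> int:
    """Lifespan of a reference: days elapsed between addition and removal."""
    return (removed - added).days


def median_lifespan(records):
    """Median lifespan, in days, over (added, removed) date pairs."""
    return median(lifespan_days(a, r) for a, r in records)
```

Comparing `median_lifespan` over references removed before a source's listing against those removed after it would show the more-than-threefold drop reported above.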
Reference quality does differ by topic, and community efforts, such as the creation of the perennial sources list, play a key role in maintaining reference quality on Wikipedia. Future directions include extending our work to other language editions and considering a global reliability index, because our computation of the reference risk score is currently limited to the coverage of the perennial sources list. You can refer to our full paper via this link, where we also show some additional experiments, as well as to our GitHub repository, where we share the datasets and some of the code that we used. We are open to further discussion and suggestions, so you can contact us through this email. Thank you for listening.