In 2020, when the Nobel Prize in Chemistry was announced, millions around the world googled the term CRISPR, and their top hit was likely its Wikipedia article. In general, Wikipedia's science articles are the first thing that appears in search engine results, making the site a central node in the transfer of academic knowledge to the public. In this research, we propose that Wikipedia can play an even bigger role, by serving as a source of historical knowledge in its own right. This work is the product of a collaboration between Omer Benjakob, a journalist and philosopher of science, and myself, a biologist by training. We've been working on this for a few years now. Our first study showed how two Wikipedia articles on another Nobel-winning field, circadian clocks, managed to accurately reflect the field over two decades, updating as it underwent scientific revolutions. We then expanded to another case study, this time large-scale research on COVID, where we harnessed the power of bibliometrics on Wikipedia. And today, we want to share our latest study, on CRISPR gene editing, which we also used to consolidate a broad methodology for how to do this type of research in general. It can be found on bioRxiv, and we hope to see it peer-reviewed very soon. In the meantime, we are also completing additional case studies on other interesting topics. Now, the underlying philosophical theory here is that encyclopedias can be seen as a reflection of their time. Just as Diderot's Encyclopédie reflected the technology, the culture, and the worldview of the Enlightenment era, Wikipedia is a present-day reflection of modern human society. So if Wikipedia is indeed meeting its goal of representing the consensus on scientific knowledge, it raises the question of whether this was also true historically: can Wikipedia articles document shifts in science over time? To address this question, the CRISPR field is just perfect. Initially, it was a basic science finding.
You can see here the entire article from its birth in 2005; the name CRISPR itself was only coined in 2002. Today, of course, it's a hot scientific topic. It won the Nobel Prize, and it also has real social impact: it has its own movie on Netflix, and it became something of a pop icon. And all this growth happened during the lifetime of Wikipedia. So how can we use Wikipedia to track this growth? First things first, and most intuitively, we have the view-history function. This view is amazing, because we can track changes made to a single line, the same section of text, or the entire article, at a resolution and convenience that are hard to imagine when comparing texts with traditional historical approaches. Here we can see, for example, the exact moment in 2013 when a section title changed from "Possible applications" to "Applications". This is the CRISPR gene-editing revolution, the very thing the field received the Nobel Prize for. So in our work, we combine this kind of qualitative reading, much like a traditional historical approach, with quantitative analysis. We developed an automated tool that is free to use and can be found on our GitHub. This tool can first sieve through Wikipedia's massive body of articles to identify a corpus of related ones. We search for a term, for example CRISPR, and return only the articles on Wikipedia that have the term either in their title or in the title of one of their sections, which is expected to exclude articles with only minor or incidental use of the term. Here, out of 720 mentions of CRISPR, we identified 51 significant articles. The tool also scrapes Wikipedia articles for their metadata, such as the date of creation, the article size, and the references. Analyzing this data is then combined with in-depth reading of the revision history, as we showed before.
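To make the sieving step concrete, here is a minimal sketch of the filtering logic in Python. The article records are hard-coded stand-ins (our actual tool fetches titles and section headings from Wikipedia), so the titles, sections, and dates below are purely illustrative:

```python
from datetime import date

# Hypothetical article records: in the real tool these fields are
# scraped from Wikipedia; here they are hard-coded for illustration.
articles = [
    {"title": "CRISPR", "sections": ["History", "Applications"],
     "created": date(2005, 6, 30)},
    {"title": "Gene knockout", "sections": ["Methods", "CRISPR-Cas9"],
     "created": date(2004, 2, 11)},
    {"title": "DNA", "sections": ["Structure", "History"],
     "created": date(2001, 11, 1)},
]

def sieve(records, term):
    """Keep only articles with the term in their title or in the
    title of one of their sections (case-insensitive)."""
    term = term.lower()
    return [r for r in records
            if term in r["title"].lower()
            or any(term in s.lower() for s in r["sections"])]

corpus = sieve(articles, "CRISPR")
print([r["title"] for r in corpus])  # DNA is excluded: only an incidental mention
```

The point of matching on titles and section titles, rather than full text, is exactly what the talk describes: an article that merely name-drops CRISPR in passing never makes it into the corpus.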
This is what's called a thick-description approach: contextualizing the quantitative findings to create a sort of archaeology of the articles. And now that we have discarded the less relevant articles, we can do this in-depth, content-dependent work much more efficiently. For example, we can create a timeline that documents the growth of the field based on the date of birth of the articles. We can see that the CRISPR article opened in 2005. Then in 2012, articles were created for Jennifer Doudna, one of the Nobel laureates, likely following the publication of her work on genome editing, and for genome editing itself, which also opened that year. Gene knockout existed prior to CRISPR, right? There were other methods. But if you dig deep into the edits, you see that it was only later edited to include CRISPR in a significant manner. So we also created an automated version of WikiBlame, with which we can see how CRISPR permeates into new bodies of knowledge. Gene knockout, as we saw, was present before CRISPR, but only after 2012 was it connected to CRISPR. Same goes for designer baby: the article existed as a theoretical entity and referred to CRISPR only once this became possible, after the 2018 affair in China. Anti-CRISPR is a technology that's based on CRISPR, so its article was born much later, with the term already in it. This shows us the historical dynamics of how CRISPR permeates into new fields and concepts. And because it's automated, we can compare timelines between different fields and see different models of growth. We can build networks and see how different connections were made over time. Here's a comparison of these timelines between CRISPR, the circadian clock, and the coronavirus. As we said, our tool also scrapes all the references, so we can start doing bibliometrics across fields, comparing which important publications were used when and where, according to their type, be they scientific publications or even websites.
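The automated first-mention scan can be sketched in the same spirit. Here the revision histories are hard-coded stand-ins (the timestamps and text snippets are invented for illustration; the real workflow reads each article's full revision history), but the logic is the one described above: walk the revisions in order and record when a term first appears.

```python
from datetime import date

# Hypothetical revision histories: (timestamp, text snippet) pairs.
revisions = {
    "Gene knockout": [
        (date(2006, 3, 1), "Knockouts are made via homologous recombination."),
        (date(2014, 5, 20), "Knockouts can now be generated with CRISPR-Cas9."),
    ],
    "Designer baby": [
        (date(2007, 1, 10), "A designer baby is a theoretical concept."),
        (date(2018, 11, 27), "The affair in China used CRISPR on embryos."),
    ],
}

def first_mention(history, term):
    """Return the date of the first revision whose text contains the term,
    or None if it never appears (a WikiBlame-style scan, automated)."""
    term = term.lower()
    for when, text in history:
        if term in text.lower():
            return when
    return None

# Order articles by when CRISPR first permeated into them.
timeline = sorted((first_mention(h, "CRISPR"), title)
                  for title, h in revisions.items()
                  if first_mention(h, "CRISPR") is not None)
for when, title in timeline:
    print(when, title)
```

Sorting the first-mention dates yields exactly the kind of permeation timeline discussed in the talk, and running the same scan for different terms lets us compare growth patterns across fields.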
And the dream is that if we gather enough case studies, we can start to model different patterns across different fields. We treat these Wikipedia articles as living documents, which allow us to perform a content-dependent archaeology of knowledge at a yet unprecedented resolution and to apply quantitative metrics to our analysis. I want to thank my collaborator and long-time friend, Omer Benjakob; of course, all our students and collaborators; the LPI in Paris; and the Wikipedia workshop team for the stage. Thank you for your attention.