Okay, and we're live. Let me let the settings finish loading and wait just a few seconds for any attendees who see the notification. I'll talk a little extra long introducing you, to give people time to drift in here at the start. But okay, I think with that we're rolling. Welcome back, everybody. It's very much appreciated to see you. As I've heard people say at events like this before, I'm glad to see that you decided that day one wasn't a waste of your time and you're back for day two. So thanks very much. We have another really exciting day's worth of talks lined up for you here today. Commencing with... I'm getting an echo of my own voice. Sorry, one second. I do not know where this echo is coming from. Oh, Christophe, do you have Crowdcast open on your computer? I think you're watching your own broadcast on tape delay. Let me see. I need to disconnect here. How about that? There we go. Okay, that is what it was. Cool. All right. So yes, while we get the screen sharing back. Good. Okay. Here we are. So, commencing with our second keynote speaker. We'll kick off every day with a keynote, as you've seen on the schedule. And we have with us this morning or afternoon, depending on your time zone, Christophe Malaterre of the University of Quebec in Montreal, who is Canada Research Chair in Philosophy of the Life Sciences there and professor in the Department of Philosophy. He has done some fantastic work over the last few years.
It is really, really a pioneering group working there at UQAM on the use of topic modeling for purposes of history of philosophy, philosophy of science, history of philosophy of science, et cetera. And he'll be talking to us today on the topic modeling of multilingual non-parallel corpora, so how we might think about applying machine translation to a philosophy of science corpus. And so with that, I'll give you the floor. Thanks so much.

Thank you. Thank you, Charles and Luca, for having me with you. It's been a great conference so far, so I'm really happy to be with you today. My talk will be about exploring multilingual non-parallel corpora, especially with topic modeling, and with a special focus here on a philosophy of science corpus. This is actually a problem that we stumbled upon when we built our corpus from philosophy of science journals. We were not expecting to have to face a multilingual corpus, but we did. So we had to find solutions, and what I'm going to share with you today is some of the solutions that we implemented. My talk is going to be quite heavy on the methodological side, but I will also share with you some of the results that we got. So the background question is: how can we map disciplines through time? In particular, how can we identify the research topics of a discipline such as philosophy of science, if we're interested in the history of philosophy of science? And how can we investigate changes of this discipline through time? Of course we can use expert analysis, and this is the usual methodology of close reading, by which we look at texts in detail and with expert knowledge, make sense of all that literature, and are able to reconstruct a history of the discipline with its main research themes opening and closing through time.
But of course it is very time consuming, especially as the corpus becomes larger and larger. So another approach is to use computational text mining methods. This is something that we also saw yesterday, in the Cody presentation for instance, in which different approaches were used to investigate a large corpus. One of the possibilities is to use topic modeling algorithms over very large corpora, as a form of distant reading, to make the main topics that are present in the corpus emerge and to see how they evolve through time. One of the advantages of these methods is that they can tackle very large corpora, huge amounts of text. Another advantage is that they're bottom-up, they're data-driven. So they may help set an empirical basis for what might otherwise be informal claims. And this is what we intended to do on the full-text content of eight major philosophy of science journals from the 1930s up to 2017. But sometimes, maybe even more often than we think, corpora are multilingual. In our particular case, we found out that 6% of the articles were actually published in German, Dutch or French. That is not so much as a percentage of the whole corpus, but the non-English articles still represented about 44% of articles before World War II. So the first option, when you have such a corpus, could be to exclude all the non-English articles and just focus on the English articles. And this is what we did in the first study. But then we thought that we were probably missing something in the pre-World War II period, in which we had so many non-English texts. We wished to include all the articles, including these non-English articles, in a subsequent study. But how can we do this? When you tackle multilingual corpora, there are actually two types of multilingual corpora that you need to be aware of, and on which we found work in the literature.
There are what are called parallel corpora and non-parallel corpora. Parallel corpora include expert translations. Texts are available in at least two languages, or several languages, and there are typically gold-standard translations between all the texts. An example is the proceedings of the European Parliament, which are available in several languages and are perfect translations of one another. So we find a lot of work on multilingual corpora that uses corpora such as the proceedings of the European Parliament. Another form of parallel text includes comparable texts that are not expert translations: texts in different languages that are supposed to roughly exhibit the same distributions of themes or subjects, though they're not exact translations of one another. This is what you find in Wikipedia, for instance, where articles or entries in French and English talk about the same things but are not necessarily exact translations of one another. And then there is a whole other class of multilingual corpora, the class of non-parallel corpora, which include texts that are not aligned. One typical example is articles from journals that accept publications in different languages. In this case, the articles are never translated into several languages. Each article is found only once, in one language. So in our case we had to tackle a non-parallel corpus. So what do you do if you want to do some topic modeling? What types of solutions are there in the literature for multilingual corpora, generally speaking? We've seen that a lot of things have been developed, in particular for parallel corpora. Topic modeling algorithms have been developed to carry out topic modeling across parallel texts by aligning the topics at a sentence or document level.
We were somewhat puzzled to find this out, but the interest of doing this type of topic modeling is actually to help improve machine translation through these parallel topic models. So this is something that we had to be aware of, but it is not something that could help us with our problem. For non-parallel corpora, you need to have specific language-bridging solutions implemented. The first solution is to use what we can call advanced topic modeling algorithms that include such language-bridging solutions. Several such algorithms have been developed or proposed: some include multilingual dictionaries or other lexical resources directly inside themselves. Others use concept trees, for instance inferred from WordNet or based on user input. Others use a combination of partial text alignment and lexical resources, quite sophisticated stuff indeed. The objective is to be able to carry out a topic model on multilingual non-parallel texts without any translation in between, but using bridges like dictionaries and so on to make the connections between the topics in one language and the topics in another language. Another solution goes the other way: use what we call advanced machine translation together with monolingual, very simple or vanilla, topic modeling tools. This has also been investigated in the literature, and we found two ways of doing it. One is to do the machine translation on just the specific terms that are typically present in a term-document matrix. In other words, you do not machine-translate the entire corpus; you just filter out all the words and send the words for translation once they've been aggregated across your whole corpus.
But another option that has been investigated more recently is to do the machine translation of the complete texts, to have a full-text translation. This has been assessed recently in the context of parallel texts by de Vries and colleagues, looking specifically at the European Parliament corpus. It has also been assessed by some teams for studies based on Linguistic Inquiry and Word Count. We found this a quite attractive solution to implement, but our case is still that we did not have parallel texts to check the translation against. So we had to devise ways of checking the quality of the translation even without having the gold-standard translation to check against. As I said, we opted for the second solution, especially because in our mind it provided an advantage that the other solutions did not, which was that it gave access to the article content in English. I don't know about you, but I can read French, this is no problem; my German is very rough and my Dutch is nonexistent. So having access to the full text was a great advantage for making sense of some of the results of the analysis. This is one of the main advantages that we see in this solution. We used machine translation with Google Translate, following what had been done recently by colleagues, also for consistency, and to show that this type of solution, which had been tested on a parallel corpus, could also be implemented on a non-parallel corpus. And we used a very simple, plain vanilla LDA topic model, the most basic form, which is very well known and very robust. Then on top of that, we did further analysis with ad hoc code, and I'll go into that as well. So remember that our corpus is non-parallel. Contrary to what had been done before with multilingual corpora and machine translation, we did not have the possibility of checking the quality of the translation algorithmically against reference translations.
So we had to devise a new way of assessing this translation quality problem. In the talk here, I hope you understand now, there are two intertwined questions: a methodological question that we stumbled upon when we wanted to answer a history of philosophy question. The methodological question is: how can one tackle non-parallel multilingual corpora and evaluate the machine translation quality? And this is where we devised what we call a semantic topology preservation test. And then there is a history of philosophy question, a content question so to speak: what are the research topics investigated in the philosophy of science, and how have they changed through time? On this, our main results are a complete topic model of the complete corpus, and new insights from the non-English texts, not only pre-World War II. The structure of my talk will be very classical: I will present the data, then the methods, then the results, and then we'll go to a discussion period depending on how much time we have left, and I'll be happy to answer your questions.

So the data itself. As I said, it's a philosophy of science corpus that includes eight major journals in the philosophy of science: the BJPS, Erkenntnis, the European Journal for Philosophy of Science, International Studies in the Philosophy of Science, the Journal for General Philosophy of Science, Philosophy of Science, Studies in History and Philosophy of Science Part A, and Synthese. You can see here the publication periods and the number of articles in the different languages. Most articles, about 15,900, are in English, but there were still quite a lot in German, some in Dutch and some in French. The German articles were mostly from Erkenntnis pre-World War II and from the Journal for General Philosophy of Science from the 1970s through the 1990s. Articles in Dutch and French were mostly from Synthese, typically before the 1940s or 1960s.
So all in all, this is a corpus that includes nearly 17,000 full-text articles. This is how the number of articles evolves over time. You see in shades of orange here the English articles of the different journals. You see how Synthese came to be one of the journals with the highest number of publications today. But you also see in blue the non-English articles, how significant they were before World War II, and then how some of them also appeared in the 1970s to 1990s; these were articles from the Journal for General Philosophy of Science. So before World War II, the non-English articles are essentially in Synthese and Erkenntnis, quite a lot of them, and later on in the JGPS. So that's about the corpus itself.

Now the methods. We implemented a research design with two main stages. Stage A is concerned with the translation, and stage B concerns the topic modeling itself and the content of the corpus. Stage A includes all the issues of translating the non-English articles into English and the assessment of the translation quality, especially for the purposes of bag-of-words textual analysis. Stage B is the topic modeling of the entire corpus and its analysis, from both a synchronic and a diachronic perspective, in comparison with the previous topic modeling that we had done on the English-only portion of the corpus. This is the overall research design. In orange, you see all the different steps involved in the translation and translation quality assessment; in blue, all the topic modeling work; and in gray, the methodology followed for our previous topic model. We'll go into some of the details here, step by step, just to make sure that you're fully aware of what went on behind the scenes. So what did we do for the machine translation itself? We took all the journal articles and their metadata, which we had organized into a data frame.
We used automatic language detection to detect the article language. We then split the corpus into four subcorpora, English, German, Dutch and French, and each non-English subcorpus was sent to Google Translate in chunks of about 25,000 characters, as required by Google Translate. The translation results were then reassembled into articles, and the whole of it was assembled back into a data frame. We carried out a manual quality assessment. For each non-English subcorpus, 10 texts were randomly selected, and for each we inspected the first 500 words of the original document and of the translation. In particular, we scored the texts for three types of problems that might have an impact on computational textual analysis. We found out that there were spelling issues in the original texts, mostly resulting from OCR and encoding issues. These spelling issues in the original text might also have induced issues in the translation, so they were present in both. We looked for inaccurate terms that were introduced by the translation, translation mistakes so to speak. And we looked at OCR and encoding issues that were present in the original text and that were corrected through machine translation, so improvements in the text quality through machine translation. But that was done only on sample texts, and we wanted to have some form of quality assessment over the entire corpus, something that could be implemented algorithmically. So to provide a systematic translation assessment over the whole non-English corpus, we chose to compare the relative distances between documents before translation and after translation. The rationale is that documents that are close to one another in the word vector space before translation should also remain close to one another after translation.
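The distance-preservation rationale can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the toy documents, the whitespace tokenizer, and the helper names (`bow_vectors`, `distance_list`, `pearson`) are all hypothetical, and no dimensionality reduction is applied. It builds bag-of-words vectors for a tiny corpus before and after translation, computes the pairwise Euclidean distances in each space, and correlates the two lists of distances, a Mantel-style statistic.

```python
import math
from collections import Counter
from itertools import combinations


def bow_vectors(docs):
    """Turn each document into a word-count vector over the corpus vocabulary."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return [[c[w] for w in vocab] for c in counts]


def distance_list(vectors):
    """Euclidean distance for every pair of documents, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(vectors, 2)]


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: four French "documents" and English translations.
originals = [
    "la logique et la science",
    "la logique de la science",
    "le temps et l'espace",
    "l'espace et le temps physique",
]
translations = [
    "logic and science",
    "the logic of science",
    "time and space",
    "space and physical time",
]

d_orig = distance_list(bow_vectors(originals))
d_trans = distance_list(bow_vectors(translations))

# A value close to 1 means documents kept their relative distances.
score = pearson(d_orig, d_trans)
print(round(score, 3))
```

Because each corpus is only compared with itself, the two vocabularies never need to be aligned, which is what makes this kind of check usable without reference translations.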
If documents do stay close in this way, then bag-of-words algorithms should provide similar types of results whether they are run on the original texts or on the translated texts. This is what we call the topology preservation test, because it means measuring how similar the structure, or the topology, of the document-term space is in the original corpus and in the translated corpus. So we devised this topology preservation test. More specifically, the test consists in constructing the document-term matrices of the three subcorpora in their original languages, so Dutch, German, and French, and the same thing for the three translations. We constructed the document-term matrices directly, without any pre-processing of the text. The dimensionality was therefore extremely high, so we reduced the dimensionality of all six document-term matrices with singular value decomposition. We then calculated the document-document distance matrices within their respective word vector spaces for all six document-term matrices. We used Euclidean distance here. And then we measured the similarity of the original and the translated distance matrices for each subcorpus. Here we used matrix similarity measures such as the Mantel coefficient, the Procrustes coefficient, and the RV coefficient. And we used these coefficients as indicators of the translation accuracy for bag-of-words analysis.

Now the methods for the topic modeling itself. Here this is more straightforward. The first stage typically is to do corpus pre-processing, once you have assembled the English and all the translated subcorpora together in the data frame. We did word tokenization with part-of-speech tagging and lemmatization, to reduce the number of word variants, with the well-known TreeTagger package and the Penn Treebank tag set. We did word filtering here, removing stop words and also filtering based on frequencies.
The filtering resulted in a reduced lexicon of about 24,000 terms across all articles of the corpus. After the pre-processing, we implemented the topic modeling itself. As I said, we used the LDA algorithm from Blei and colleagues, in one of the classical Python packages, with Gibbs sampling. And we did this with 25 topics, as in the previous topic modeling that had been done. The advantage is that this is a fairly coarse-grained view that we found quite suited to describing very general trends in the discipline over nearly a century. But of course, much more detailed topic modeling can be implemented. The results obtained from this topic modeling stage are the 25 topics, which are probability distributions over the lexicon, and the probability distributions of the topics over all articles in the corpus. Once we had this, we did a topic interpretation by examining the most probable words in each topic and by retrieving the articles in which a given topic was in turn the most probable. We also grouped the topics into clusters on the basis of our own expert knowledge, so as to facilitate the handling of the topics. And we also used topic correlation within corpus documents to do this manual clustering. In a subsequent step, we compared this synchronic topic model with the previous topic model, which was on the English texts only. To do this, we assessed the similarity among the new topics and the similarity of the new topics with the previous topics. We used a pairwise Euclidean distance between their respective probability distribution vectors, over all words that were shared between the two lexicons. The result of this step is a distance matrix between the new topics and the previous topics that shows the alignment or non-alignment of the topics and the effect of including the non-English subcorpora in the study.
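The topic comparison just described can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the topic names, words, and probabilities below are invented, and the helpers `topic_distance` and `distance_matrix` are hypothetical names. Each topic is represented as a probability distribution over its model's full lexicon, and distances are computed only over the words that both lexicons share.

```python
import math


def topic_distance(topic_a, topic_b):
    """Euclidean distance between two topics over their shared words.

    Each topic is a dict mapping every word of its model's lexicon
    to the probability of that word in the topic.
    """
    shared = sorted(set(topic_a) & set(topic_b))
    return math.dist(
        [topic_a[w] for w in shared],
        [topic_b[w] for w in shared],
    )


def distance_matrix(new_topics, old_topics):
    """Nested dict: rows are new topics, columns are previous topics."""
    return {
        new_name: {
            old_name: topic_distance(new_t, old_t)
            for old_name, old_t in old_topics.items()
        }
        for new_name, new_t in new_topics.items()
    }


# Hypothetical toy models; "gene" and "law" are not shared and are ignored.
new_topics = {
    "evolution": {"selection": 0.5, "species": 0.3, "cause": 0.05,
                  "effect": 0.05, "gene": 0.1},
    "causation": {"selection": 0.1, "species": 0.05, "cause": 0.5,
                  "effect": 0.3, "gene": 0.05},
}
old_topics = {
    "evolution": {"selection": 0.4, "species": 0.4, "cause": 0.05,
                  "effect": 0.05, "law": 0.1},
    "causation": {"selection": 0.05, "species": 0.05, "cause": 0.5,
                  "effect": 0.3, "law": 0.1},
}

matrix = distance_matrix(new_topics, old_topics)

# For each new topic, the previous topic at the smallest distance.
best_match = {new: min(row, key=row.get) for new, row in matrix.items()}
print(best_match)
```

Reading the matrix row by row gives the best previous match for each new topic; reading it column by column gives the best new match for each previous topic, and the two directions need not agree.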
As a sixth step, we did a diachronic analysis and also a by-journal analysis. That was simply done by adding publication years and journal metadata to the model and aggregating the topic probability distributions over articles either published in specific time periods, or published by specific journals, or even both. The result of this step is a diachronic topic model that shows how the topics evolve through time, as well as journal profiles, and even journal profiles evolving through time. I will show all of this. We also did further comparisons with the previous topic model on the journal profiles and the diachronic views. These are more qualitative comparisons, not algorithmic, but this led us to investigate in more detail the pre-World War II period, for which the proportion of non-English articles was significant, and I'll share some of those results with you also. Right, so this is for the methods.

So now the results. What do you get first for the machine translation? Very simply, the output of the machine translation was the English translations of the three subcorpora. The peculiarity was that we had a high number of OCR and encoding issues, especially in the pre-1960s documents. We found out that the documents that we had gotten from JSTOR were typically plagued with question marks and with OCR issues. So, for instance, what should actually read "L'Association Française" comes out garbled, and likewise for the other examples on the slide. When we looked at the question marks, this is something that we could do algorithmically. We could count the number of question marks, either inside words or outside words, in the English corpus or in the non-English subcorpora. And we found out, as shown here, that the translation actually removes on average 80% of the question marks found in the text.
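Counting question marks inside and outside words is easy to sketch in Python. The version below is a hedged illustration only: the regular expression, the helper names `count_question_marks` and `reduction_rate`, and the two toy strings are assumptions made for the example and are not taken from the study. A question mark is treated as "inside a word" when it has a word character on both sides, which is the typical trace of a badly encoded accented letter.

```python
import re

# A question mark with a word character directly before and after it.
INSIDE_WORD = re.compile(r"(?<=\w)\?(?=\w)")


def count_question_marks(text):
    """Count question marks inside words and everywhere else."""
    inside = len(INSIDE_WORD.findall(text))
    total = text.count("?")
    return {"inside_words": inside, "outside_words": total - inside}


def reduction_rate(before, after):
    """Share of the question marks in `before` that are gone in `after`."""
    n_before = before.count("?")
    n_after = after.count("?")
    return (n_before - n_after) / n_before if n_before else 0.0


# Hypothetical example: a damaged French sentence and an English rendering.
original = "L'Association fran?aise de logique. La th?orie est-elle vraie ?"
translated = "The French Association of Logic. Is the theory true?"

print(count_question_marks(original))
print(count_question_marks(translated))
print(round(reduction_rate(original, translated), 2))
```

Separating the two counts matters because a question mark outside a word may be legitimate punctuation, whereas one inside a word almost always signals an encoding or OCR problem.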
So, we didn't know exactly where these question marks were reduced and how, and this is why we looked at the texts with some close reading and some manual quality inspection. This is an example of comparison here, on just a very simple paragraph of a text by Louis Rougier, La Relativité de la Logique, published in Erkenntnis in 1939. This is the original text that we got from JSTOR and, as you see, there are many issues here with question marks. And this is the machine translation. As you can see, there were issues present in the original text, which would have impacted the topic modeling if we had done a topic model just in French, that were corrected by machine translation. There were others, like here, that were not corrected, that stayed. And there are still some others, like "proposition" here, which was translated as "proposals". This is not really accurate. This is what we call a translation mistake. So you have here the three types of anomalies that we identified: anomalies that were present and got corrected, anomalies that were present and stayed, and anomalies that were not in the original but were introduced by the translation. We quantified all three types of anomalies over the sample texts. What we found is that, on average, machine translation left about 3 words per hundred words of anomalies uncorrected in each subcorpus. Translation introduced about 1.4 words per hundred words of mistakes or distortions. And machine translation corrected about 9.4 words out of 100 words. So, all in all, it's a net benefit, if you will, of about 8 words per hundred (9.4 corrected minus 1.4 introduced) in the quality of the text. So we were really positively surprised by this machine translation step, at least from this qualitative perspective. Now, we looked at the topology preservation test itself.
These are the results of the different matrix similarity coefficients. Here you see the Mantel, Procrustes and RV coefficients for the three languages. As you can see, the results are extremely good: over 0.98 for French, and in the range of 0.9 for Dutch and German. So not only did we find, on the sample texts that we studied, very good translations and even improvements in the quality of the text, but overall it also seems that the machine translations preserved extremely well the ways the documents are grouped together in their different vector spaces. So that gave us a very good level of confidence in this machine translation step.

Now, the results of the topic modeling itself. Once we were confident about the machine translation, the results are first a set of topics. A set of 25 topics was found by the model. A topic is nothing but a probability distribution over the lexicon. Here you see the top 10 words, which are the words that are the most probable in this particular topic, the first topic. And the name here is the name that we assigned to that particular topic, once we had interpreted the topic on the basis of the set of words and on the basis of the top articles in which the topic was most likely. So here you see the 25 labels with the 25 bags of words, the top 10 words. Don't forget that in the LDA topic model, a topic is not just a bag of words, it's a probability distribution over these words. So some of these words are actually much more likely to be found in given topics. Another way to look at the topics is to look at how they relate to one another in documents. This network represents the topics: the nodes are topics, and the thickness of the edges is proportional to the correlation of topics within documents. You see that some topics tend to be often found together in specific documents throughout the corpus.
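One way such a network could be computed is sketched below in Python. This is a minimal sketch under assumptions: the document-topic probabilities and topic names are made up, the helper names are hypothetical, and keeping only positive correlations as edges is an illustrative choice rather than the rule used for the figure. Each topic is a column of the document-topic matrix, and the Pearson correlation between two columns gives the weight of the edge between the two topics.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def topic_correlations(doc_topics, topic_names):
    """Correlate every pair of topics across documents.

    doc_topics holds one row per document; each row lists the
    probability of every topic in that document.
    """
    columns = {
        name: [row[i] for row in doc_topics]
        for i, name in enumerate(topic_names)
    }
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(topic_names, 2)
    }


# Hypothetical document-topic matrix for four documents and three topics.
topic_names = ["language", "truth", "evolution"]
doc_topics = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
]

# Keep positively correlated pairs as edges; the weight sets the thickness.
edges = {
    pair: r
    for pair, r in topic_correlations(doc_topics, topic_names).items()
    if r > 0
}
print(edges)
```

In this toy data only the two formal topics end up connected, which mirrors the kind of grouping the clusters are meant to capture.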
So, here we see a first cluster of topics that are about formal themes, like language, mathematics, truth, sentence. We can imagine that this is something about philosophy of language, logic, typically philosophy of mathematics. And this is exactly what we find when we go to the texts in which these topics are the most present. Another set of topics is more epistemology oriented: it concerns knowledge, arguments, or scientific theory. There is a third cluster here that concerns confirmation, the problem of induction, experiment, and then probability. Somewhere in between there is a single topic here, which is more about agents, game theory, agent decision, these types of things. Another cluster here, and I'm going in order, A, B, C, D, E, is about philosophy of biology, with the topic evolution, and philosophy of mind and the neurosciences, with three different topics: one about perception, one about mind, and another one about the neurosciences specifically. We have three topics somewhere in the middle here that are maybe some of the core topics in the philosophy of science, about explanation, causation, and property; in property there are a lot of articles about, for instance, supervenience and emergence, this type of thing. Then we have a little cluster here that is more about philosophy of physics, with one topic that is more philosophy of quantum physics, but with some relativity also found in there, as well as another topic that is more about atoms and chemistry, which we named particles. And then finally, a cluster that is more of a social-historical nature. It includes the topic classics, in which one finds investigations of classical authors, for instance Galileo or Newton. And then history of philosophy, in which you have articles that are about philosophy of science but that involve classical authors such as Kant or Descartes.
And then there is a topic social, which involves more the social aspects of science, typically. So this is one way of looking at the topics. I'm going through them very quickly. What we built is a topic-map visualization tool that makes it possible to look at all the details, because there's just so much information here that I cannot convey the whole of it in the presentation. This is the map: you see a scatterplot of all the documents with their dominant topic. Once you click on a particular document within a specific cluster, you get the details of the article here, with its topic probability distribution, and you can also select which of the topics you want to display in this scatterplot. And then you can also look at topic details. If you choose one topic here, you see a word cloud of that particular topic, with all its key top words. The top words are again listed here, together with the other topics in which each word is present. So the word "selection" is one of the top words of the topic evolution, but it's also present with a much smaller probability in the topics experiment, or probability, confirmation, or agent decision, for instance. So this browser can be used to really look at the details of the topic modeling results. Now, just remember, we wanted to see how to improve our previous topic modeling with the addition of the non-English texts. And so one of the core questions that we had is: how do the new topics compare to the previous topics? This is a heat map based on the distance that we measured between the old topics here and the new topics. The topics are still arranged by cluster, and the color shades are also representative of the clusters. As can be seen, the smallest distances are in red here. There is a fairly good alignment of the new topics with the old topics, but some differences, right?
So the introduction of this 6% of non-English text did change the topic model a bit. There was a good alignment for nearly all topics, except one topic here, explanation, that did not fully match the previous explanation topic. On the other hand, that particular new topic matches better this other topic here about game theory, probably because of models being involved there; we would have to investigate that in more detail. But on the other hand, the previous explanation topic was a better fit for this one than for any other here. So if you look from the new to the old, you don't get a good pairing; if you look from the old to the new, the pairing is maybe a bit better. What we also see is that the mapping is not exactly one to one. For instance, concerning the cluster of confirmation and probability, we previously had two topics and the new topic model includes three topics. But there is a good matching of the previous probability topic to two new topics here. The same thing for philosophy of mind and the neurosciences: there were two topics and now there are three. Philosophy of physics, on the other hand, shrunk a bit in the representation, probably meaning that the added non-English texts did not have much philosophy of physics in them. And on the other hand, they increased the number of social-historical types of topics, because the new topic model includes four such topics whereas there were only three in the previous topic model. So a fairly good alignment, but still some changes. Now, the topic evolution itself. If we look at the diachronic view of the topic model, you see how the topics evolved through time. You see on the x-axis the different time periods, from 1930 up to 2017. The y-axis here is the probability of the topics being found in articles of the given time period.
And the right axis here is the number of articles per time period, shown by these little dotted lines. The dotted line is the total number of articles and the gray dotted line is the number of English-only articles. So you see that it was really pre-World War II that there were some major additions, as well as here, through the Journal for General Philosophy of Science. So we can analyze the trends even at this large scale. There was a relative significance of topics related to history of philosophy up until the 1960s, with a decreasing trend. And then there were some ups and downs for the social aspects of science, typically. Concerning philosophy of language, there was a strong decrease of the language topic that can be seen here. This is probably linked to the significance and then the decrease of logical empiricism, which we also see in the corpus here. Topics such as confirmation were really significant in the 1960s, but then decreased. There was a slight increase of the topic probability, also in the 1960s, and then stagnation. An increase in epistemology-related topics can be seen in all this area, and it is pretty clear through the topics arguments and knowledge; we need to look in more detail at which types of articles were behind that. In terms of philosophy of biology, there was a presence of philosophy of biology before the... I said before the 1940s, but actually before the 1950s, and of mind here, then a decline, and then more development, especially in the 1970s, in the philosophy of biology and the neurosciences. And there is a relative constancy here of this single topic, agent-decision, all throughout the corpus, slightly increasing from the 1940s on.
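As an editorial illustration, a diachronic view of this kind can be sketched in a few lines of Python. The sketch assumes that each article comes with a publication year and a topic probability distribution, and that the value plotted for a period is the mean topic probability over the articles of that period. The ten-year period length, the function name, and the toy data are assumptions made for the example, not details given in the talk.

```python
from collections import defaultdict

def diachronic_view(articles, period_length=10):
    """Mean topic probability and article count per time period.

    `articles` is an iterable of (year, {topic: probability}) pairs.
    Returns {period_start: (mean_topic_probabilities, article_count)}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for year, distribution in articles:
        start = year - year % period_length  # e.g. 1938 -> 1930
        counts[start] += 1
        for topic, probability in distribution.items():
            sums[start][topic] += probability
    return {
        start: (
            {topic: total / counts[start] for topic, total in sums[start].items()},
            counts[start],
        )
        for start in sorted(counts)
    }

# Toy data, invented for illustration only.
ARTICLES = [
    (1935, {"language": 0.8, "history": 0.2}),
    (1938, {"language": 0.4, "history": 0.6}),
    (1972, {"language": 0.1, "biology": 0.9}),
]

VIEW = diachronic_view(ARTICLES)
for period_start, (means, n_articles) in VIEW.items():
    print(period_start, n_articles, means)
```

The article count returned for each period corresponds to what the right axis shows, and it is worth keeping next to the means, since an average over a handful of articles is much noisier than one over several hundred.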
And concerning the other two clusters, there is a relative significance of topics related to the philosophy of physics, yet a slight decreasing trend in these same topics since the 1970s. Things are a bit different depending on which topic you look at, but these are the general trends that we observed. And there is a regular increase of themes related to causation, explanation, and property all throughout the corpus. So these are the broad trends that we can get through the topic model. We can also explore all this in detail in the web interface that we've built, looking at the different time slices and also at specific topics. And so as to be able to gather more insights, because this is again extremely rich in terms of data, we used the journal metadata to calculate the journal profiles. This is where we see the topical distribution through all the different journals, the journals being here on the x-axis: Erkenntnis, Synthese, the BJPS, Philosophy of Science, EJPS, ISPS, JGPS, and SHPS Part A. And we see that the profiles are somehow different. Erkenntnis and Synthese are quite heavy on the formal side, philosophy of language and logic, and also on epistemology. On the opposite side, journals such as ISPS, JGPS, or SHPS Part A have a much higher proportion of the social-historical types of topics. But you could look at the details here, and you see that some of the journals tend to put the emphasis on some topics more than others. As I said, we also wanted to compare things before and after adding the non-English corpora. So this is a comparison of the diachronic view of the topic model done on the complete corpus and the previous topic model on the English corpus. Of course, one of the main things was the addition of time periods here and here, earlier publications that were not present.
What we've seen is a slight shift here of the social-historical topics, which were at that level here but increased at that time period, and also here in the seventies. This is probably due to the inclusion of the non-English texts. So they were somehow of a different nature than the average English-only document. We did similar comparisons on a journal basis. Here you see the journal diachronic views on the complete corpus; this is the first row here. So we see Erkenntnis, Synthese, and JGPS by time periods when they were published. Erkenntnis was published, as you may know, before World War II, then was interrupted by the war and only recovered in the 1970s. Synthese was published before World War II, got interrupted and then published again, and the JGPS only started in the 1970s. When we compare the new topic model with the previous one at the level of journals, we see such changes as the addition of new time periods, but we also see that, for Erkenntnis, the topics on philosophy of logic and language increased in share compared to where they were before. On the other hand, the trend went the other way with Synthese, in which there were few social-historical topics being present, and there were more of these topics in the subsequent modeling. And then we also observed some changes for JGPS. But we thought that it was very interesting to look at the pre-World War II period, especially in Erkenntnis and Synthese, as we saw that some of the impacts of adding the new, non-English texts were the strongest before World War II. So we looked at author publications, and by bringing in authors we were able to compute the average contributions of authors during specific time periods. In this particular case, this is for all articles published before World War II, up until 1941.
And so this depicts the contribution of each author to a given topic for that period through his or her articles. For instance, what is really noticeable here for Erkenntnis is the importance of the bluish, formal topics, philosophy of language, mathematics, and logic, with a strong contribution by Reichenbach, by Carnap, Neurath, Frank, Schlick, Hempel, but also by the Polish logician Ajdukiewicz. This is what we see here. These are the strong contributions of, well, the founders of the Vienna Circle, typically. So it is somehow not surprising that we get this type of picture. What is maybe a bit surprising is that some of these authors also contributed to the more general topics called philosophy and history. This is in particular the case, not so much for Carnap, but for an author like Reichenbach. Reichenbach is all over the place: you see him here, here, also in philosophy, and a little bit here in history. Neurath also contributed strongly to philosophy and history. To get more details on that, one needs to look at the articles also, and this is what we did here. For instance, this is a sample of 10 articles by some of the main contributors to Erkenntnis during this pre-World War II period. They're just listed in alphabetical order here. These are the contributions, and these are the topics and their probability distributions. We see, for instance, that the logician Ajdukiewicz, in an article entitled Language and Meaning, has a high proportion of the topic language here. And we can see, for instance for Neurath, that we have three articles here. Two of them, these two here, are very heavy on philosophy of language. They are about protocol sentences and about radical physicalism and the real world.
On the other hand, this other article about ways of the scientific worldview is really not heavy on these formal topics, but much more heavy on history and philosophy, so the general types of topics. Looking at Synthese brings a totally different picture with a totally different set of authors, much more heavy on the philosophical and historical general types of topics, a little bit on the formal topics, but really not so much. It is definitely not the same type of picture as Erkenntnis in this pre-World War II period, with a radically different set of authors. We see authors such as Schoenmakers and Groot. They were actually quite prolific authors who contributed much to the topics philosophy and history, with very diverse articles on matter, on thinking, on cosmogony, but also on beauty and on sin, among other things. One of the early collaborators of Synthese, not the founder but an early collaborator, Schoenmakers was a mathematician, but he was also a member of the Theosophical Society, like the founder of the journal. And actually, many of the collaborators of Synthese at that time were also members of the Theosophical Society, as I found out. It was the case for the Dutch philosopher and mathematician Gerrit Mannoury, who is here and who contributed most to some of the formal topics. But that was also the case of the German biologist and philosopher Hans Driesch, who is here, as you can see. And also of other contributors like Krusman, I'm looking at my notes here. Krusman was also a biologist, an admirer of Driesch, and he too was a member of the Theosophical Society and also a member of the International Society for Significs, which gathered several others of these early contributors. So a very different set of authors, probably with a different mindset and a different approach to philosophy of science, is depicted here when we look at the details of the pre-World War II Synthese.
Again, the same type of exercise can be done by looking at some of the articles and at how they contributed to the different topics. As I said, for instance, Schoenmakers contributed a very diverse range of articles on thought, on beauty, on sin at that time. Krusman contributed articles, for instance, on organisms in society, with a little bit here on the biology topics, and so forth. I'm moving on to the third journal that we thought was interesting to look at in the pre-World War II period, the third journal published at that time, which was Philosophy of Science, published in the English language. Here, again, a different picture with a radically different set of authors. We can see a picture which is still more dominated by these historical, philosophical, general types of topics, but with a bit more of the formal philosophy of language and logic topics, though definitely not as much as in Erkenntnis. Some of the main authors included Malisoff here. Malisoff was the founder and the first editor of Philosophy of Science, and he himself contributed a lot to the early Philosophy of Science. This is different from what we saw in Synthese, where the founder did not contribute to the early Synthese. Malisoff is present here in many other topics: he's written about philosophy of physics, about many different things. He's somehow all over the place, so he addressed a very broad range of questions. You can see also the topic particles and even the topic of quantum mechanics here. Other figures also emerge if you look at some of the strongest contributors, for instance David Miller, who contributed much to philosophy, as well as Charles Hartshorne, sorry for the pronunciation, a known philosopher of religion and metaphysician of the time.
On the other hand, logicians such as Louis Kattsoff here, or Henry Smith, also contributed to philosophy of language and logic-related topics. As you can see, it's maybe a bit more cluttered here: there are more contributions, and they are more scattered throughout the different topics. And the same exercise can be carried out here, looking at different articles and how they contributed to the specific topics at that particular time, to see the richness behind the model itself. So all in all, the addition of the non-English texts shows a pre-World War II philosophy of science that is somehow much more nuanced than what we had before, when we just looked at the English texts. It also shows the very strong specificity of the three journals, Erkenntnis, Synthese, and Philosophy of Science, with very different sets of authors all throughout the period. So I'll just take a few minutes to wave in the direction of the discussion and bring up some topics, especially concerning the methods, but also the results. What do we make of the usefulness of machine translation for bag-of-words analysis? Well, our results, both the manual inspection and the topology preservation test, provide good reasons to trust machine translation for bag-of-words analyses such as topic modeling. Here, we had chosen Google Translate services for consistency with previous studies by de Vries, but other machine translation services also offer equally valid solutions; this has been tested by Reber. We found that machine translation is also of great help for fixing OCR-related or encoding issues. This comes as a free benefit of the machine translation. And interestingly, it also preserves word ordering. So it should be adequate for other analyses, not just bag-of-words analysis: analyses that rely on word ordering, such as collocation, co-occurrence, sentiment analysis, or even word embeddings.
The topology preservation test that we proposed is of course only a necessary condition for reliable translations, not sufficient per se. So it does not guarantee that the translation works, but it provides something comforting somehow: it increases our level of confidence that the translations work well. And this is actually one of the things that we can get if we have no reference translation to check the translations against, which is the case for non-parallel multilingual corpora. So we believe that this is an interesting way of systematically checking whether the translation went well or not so well, without having to look at all the details. About the corpus itself, what we saw is that we got a more accurate topic modeling, and that was the motivation for including the non-English texts. Of course, a comment that can be made is that we only looked at eight journals and that philosophy of science is published in many other places, many other journals, sometimes more specialized ones depending on scientific disciplines, like philosophy of physics or philosophy of biology, especially as time went on and more and more journals got founded. There's also a lot of philosophy of science published in numerous monographs and edited volumes. So this is just a warning: of course, the results should be interpreted in light of this corpus-related limitation, since we only looked at eight journals. But we believe that the representativeness of the selected journals, which are among the most central journals publishing general philosophy of science, lends confidence that the topical trends we observed did indeed capture meaningful disciplinary patterns, at least at this level. But of course, comments can be made on the corpus itself. On the topic model, there are different comments that can be made.
Of course, we could have chosen different algorithms, but we don't believe this would have changed much. One of the things that does change the models a lot is the number of topics. So choosing the number of topics in the topic model is something really crucial. Here we deliberately chose 25 topics so as to facilitate comparisons with the previous English-only topic model. And as I said, this also has the advantage of offering a fairly coarse-grained view, which suits the purpose of sketching a disciplinary portrait over the course of more or less eight or nine decades. But for the previous topic model, we had also tested different types of models. By trial and error, we did topic models with different values of the number of topics, some with a smaller number, like 12 or 15 topics, up to 50, 100, 150, or 200 topics. And in the end, we settled on 25. Of course, finer-grained topic models, with 100, 150, or 200 topics, would offer much more detail. This is what we've done in a previous study, the one that we published in HOPOS. But the choice of the number of topics, the choice of granularity, ultimately depends on your research questions. What is also key is being able to interpret the topics. And of course, one should always bear in mind that, even though LDA topic modeling is a generative model that builds up topics from the corpus, expert knowledge is always needed to interpret the topics and the results of the model. Then a final comment about the philosophy of science itself. The translated texts, as we saw, mostly affect the early decades of the diachronic topic model and somehow the topical profiles of the three journals Erkenntnis, Synthese, and JGPS. As we saw, these texts amounted to 6% of the total corpus, but the share rose to 54% before World War II. Of course, topic modeling only provides a descriptive view of the topical content of the corpus; it cannot explain the observed facts.
This is again an area that a researcher has to fill in with specific knowledge of the field, or that can lend itself to further investigation, because there is a wealth of ways in which the data can be further investigated. As possibilities here, the changes themselves may stem from different factors. They can be researcher-driven, they can be journal-driven, they can be driven by disciplinary dynamics, by extra-disciplinary dynamics that are still within science, or by extra-scientific factors such as funding policies or broader historical and sociological factors. So understanding the whys behind the changes in topic probabilities definitely requires further investigation beyond the topic model. So just to conclude, now that my time's up, what can we say more generally about computational text-mining approaches in philosophy? We believe they're extraordinarily powerful for studying large corpora. And here, we wanted to show that they can even be implemented on non-parallel multilingual corpora with the help of machine translation. Many different types of analysis are possible. Topic analysis is one, but depending on the metadata that you add to the topic analysis, you can also do diachronic analysis, journal analysis, author analysis. You can also do conceptual analysis by focusing on specific words. You can do author network analysis, and so forth. They're useful in a descriptive way: they describe what is in the corpus. They're also useful in a heuristic way, potentially by pointing to areas worthy of further investigation through classical close-reading methods, for instance. I think they can also have a justificatory usefulness, in the sense that they provide an empirical grounding to claims that may otherwise be quite informal.
So with that, let me thank my co-author, Francis Lareau, Martin Leonard, who designed the website, and two other students here who helped with the German and Dutch texts, as well as the publishers, institutions, and agencies.

Fantastic, thanks so much. Questions have been absolutely pouring in, so I'm gonna get right to it. There are 11 in the Q&A box, so I'll even preemptively ask you to try to be a little quick so we can see if we can get through them all. The top question should be fast enough; the top question is from me. So just to be clear, those error and improvement rate estimates that you were mentioning, that's just you had people manually sit and read through a selection of these articles, right?

Correct.

Okay, and someone else, Rose Travis, also added a comment to that question. So how in some cases were you evaluating, I mean, the question of translation error can be kind of difficult in a philosophical context, right? Knowing what the right translation is. Were they all pretty obvious, or were there some judgment calls?

No, no, they were not obvious, and it's more to get an order of magnitude of where we were. With some code, we were able to assess the number of question marks that were eliminated through the machine translation. So that gave us an indication that, yes, machine translation eliminated a lot of the question marks, but did that really improve the text? This is where we had to look at some of the details. What was really easy was to track problems in the original text that were corrected: a question mark inserted in the middle of the text because of an encoding issue, which was eliminated in the translation. That was, most of the time, not ambiguous at all. The translation mistakes were more difficult. For the one I showed, someone could say, well, yes, it's not exactly the same word, so therefore the meaning is gonna be different.
But we were, let's say, conservative, I would say. So we probably attributed more translation errors to Google Translate than would be meaningful. But the key point was that, by looking at these randomly chosen excerpts, we were able to really understand where the machine translation made some improvements and where it potentially introduced some mistakes. But the numbers that we measured manually are just based on some sample texts, and they're just, I would say, orders of magnitude.

Sure, sure. All right, next question from Stephan Hesbryk, who asks: I'm curious whether there were any papers containing none, or at least not highly representing any, of the 25 topics. So for example, wondering about the apparent absence of chemistry, right, in the topic model.

Sorry, what about chemistry?

Well, so there's not really a topic for chemistry. So is there a way to look at papers, were there some papers that seem to not carry a very strong signal for any of the topics in the topic model? Did you see any of that?

Yes, but I mean, it's true that the topic distribution per article can be a bit flat, but typically LDA tends to make it so that some topics are more represented in articles. So for instance, chemistry would be in the topic that we call particles, because there is talk about atoms, about molecules. So looking at the details of that particular topic, you would find articles in chemistry. But we did not look systematically at all articles that would have a rather flat distribution of topics.

Okay, next question coming in. So have you evaluated other methods? Why the choice to use traditional bag-of-words representations instead of, for example, transformer embeddings that might seem a bit more context-aware, which would be kind of in line with some of the other ideas that are in the talk?

Again, here, LDA works really very well.
We didn't see the need to implement some more sophisticated word embedding, machine learning type of tool here. I'd say the advantage of the vanilla LDA is that it's very simple. It's complex somehow, but compared to others, still a very simple algorithm. It's very well proven. It's been used in many different studies. So in terms of acceptance, it's extremely well accepted by other scholars. The point also is that we had previously done the study using the original LDA, so we wanted to be consistent to be able to compare. And the other point is that de Vries, when they tested machine translation and topic modeling on parallel corpora, also used the original vanilla LDA. And we wanted to be able to compare our results to their results, or contribute to the same type of work here. So this is why we didn't look further. But, I mean, we tested different types of models on our computers, different ways of doing topic modeling, even the diachronic ones. They work, and there are probably some improvements, but I don't think they're very significant improvements. Maybe something like word embeddings could be used for other studies, but we chose not to here for the reasons I just mentioned.

Sure, sure. Next up, a really great question from Susan Hunston, who will be our keynote speaker tomorrow at the start of the day. So how would you assess the potentially negative impact on the discipline of philosophy of science of the move toward publishing only or mainly in English? Is there a way to approach that question with this kind of analysis?

This is not something that is shown in the topic model. You know, the topic model shows certain trends. Now, the trends that we observe are not necessarily caused by a shift to English only, right? So we have to be careful here.
We only observe certain probability distributions, but we do not have access to the causes behind the shifts that we see. This is something that would need to be investigated. And the field is actually much more complex than these eight journals that we have selected. So there is only so much that we can say about diachronic changes. Probably what we can say is that there was a style of doing philosophy pre-World War II, or even up to the 1950s and 60s, that changed afterwards. I'm not sure it was only because it was in English. We would have to see if similar changes would be the case in, let's say, French-only corpora or German-only corpora. But that did change. And here some other authors, like Richardson, for instance, or Gary, have proposed that the field has also professionalized itself, and that it therefore went through a significant transformation phase. That was probably a very significant factor in the change that we observed in the topic distribution, together with editorial policies, let's be clear, because editorial policies have a strong impact that we see not on the overall diachronic picture but on the journal profiles, right?

Great, thanks. Now back to some more technical questions. So Petrovich, actually our next speaker, asks: did you experiment with other distance measures in addition to Euclidean distance, cosine distance, for example?

Yes, we did, with similar results.

Cool, all right.

So we used the Euclidean distance here for consistency throughout, but the results were similar, especially for the previous-topic to new-topic similarity measures, where we used the Euclidean distance. We did this on the word vectors, but we also did this on the distribution patterns over articles, and the results were also similar. Not exactly the same, but very, very similar. So that didn't change.

Great, great, thanks. Question from Luca Rivelli. Oh, this is cool.
So what about measuring topology preservation on round-trip machine-translated text, from the original language out and back? Did you mess with that at all?

No, we didn't do that. We could do that. We thought about taking the English text through translation into another language and back, especially because we found out that some of the earlier publications in English also included some question marks, some OCR or encoding issues, though much, much fewer than in German or in French. But we didn't do that. What we thought would be really interesting would be to do the topology preservation test on the corpus of de Vries, let's say on the European Parliament corpus, because there is a gold-standard translation and also a machine translation. But to implement the topology preservation test, we would need to have access to the entire raw corpus, and we didn't have access; we didn't ask for access. But that would be interesting, because in that particular case we would be able to have the results of the topology preservation test together with their own similarity metrics. But for de Vries, the similarity metrics that they implemented were between the machine translation and the human translation in English. So they were comparing English to English, machine translation to expert translation, but they were not comparing the machine translation or the expert translation to the original text. This is what the topology preservation test does.

Sure, sure. Okay, next question, about cleaning, from Stefan Reiner-Selbach, who asks: did you look into some of the newer methods (I was actually Googling this during the talk), machine learning methods like the Python library AutoCorrect, to parse through some of these OCR errors, sort of as typos, as a pre-processing step?

We looked at them a posteriori, but we did not implement them.
It was a bit unfortunate, actually, but we found out about the errors and the amount of errors somehow afterwards, through the translation, and found out that the machine translation corrected a lot of them. But of course, there are many tools available to patch these issues. We thought that was something interesting to point out. This was somehow our initial mistake, not to have found out about this earlier; but in the end, we found out that machine translation was able to fix it quite significantly, and in ways that would actually be very satisfying for bag-of-words approaches, or even more than bag-of-words approaches. But yes, thank you for pointing this out: there are other methods for correcting these issues.

That's really funny. One of these cases of serendipitous discovery from these tools, yeah. Question from Stefan Lindquist, who asks: I'm curious about the methodological question of how to settle on a number of topics. So I know that interpretability is important, but are there advantages perhaps to presenting one's results at multiple levels of grain?

Yes, you could do that. And this is what we're thinking about doing with the tool that we showed online, the web-based visualization. The thing is, a topic model is already extremely rich in the amount of information. If you show two topic models of different scales, then you need to explain even much more, and potentially you need to investigate much more. Or you need to do a high-level one and then a very fine-grained topic model to explore only some very specific topics. And this is something that we've done, for instance, when we did a topic model of over a hundred topics of just the journal Philosophy of Science. When you do this, you see some very fine-grained topics that are really about, for instance, some models of explanation.
So you see a topic about, for instance, how the DN model, Hempel's DN model, developed in the 50s and 60s, and then its popularity went down. And then there is another topic about much more general causal modeling, which goes up and then goes a bit down. And then there is another one about the mechanistic model of explanation, which starts going up in the late 1990s, 2000, and is still on the rise. So you see this type of detail. So I would recommend doing a fine-grained topic modeling if you want that particular type of detail: if you want to investigate, for instance, what the different research topics in philosophy of science were concerning, say, explanations or models or causation, and then which were some of the key articles, appearing at which point in time, that you can relate them to. Presenting a fine-grained and a coarse-grained model simultaneously is also possible, but you'd need a mapping tool somehow to be able to investigate, see all the detail, and find out what is really interesting in each one of these models. So they do not lend themselves to publication in traditional articles.

That, yes, I can imagine, and have some experience with, yes. Question coming in from Arlie Belevo, who writes: do you have any plans to test this with non-European languages? Do you know of any corpora that might be available for that?

We haven't planned to do it. We'd be very happy to know that others would like to do it. I mean, we know that some multilingual topic modeling has also been done on non-Occidental languages, but we haven't done it ourselves.

Okay, another question from Petrovich, who asks: did you check the overlap between the sets of authors for each journal, some kind of way to measure that as a quantitative analysis?

No, this is a good point.
We did not implement a metric to do this systematically, but some things can already be seen in these hierarchical diagrams. You see one figure here, Carnap, who just appears in Philosophy of Science, and this is the later Carnap, in the 1930s, after he emigrated to the United States, and before that he was here, right? So you see some of the overlaps here, but only qualitatively. We did not implement a metric, but that would be a good thing to do, yeah. You have to be aware that working on authors is still a lot of work and needs a lot of curation, because authors may be spelled differently in different articles, and you also have the problem of multiple authorship. So it needs a lot of work.
Sure, sure. Actually, hang on, I'm going to piggyback there with a clarification question. For these diagrams, and I guess that's why the time period is a little delimited as well, did you actually sit down and clean these author lists by hand?
Yeah.
Okay, wow, that's a lot. One last question, from Stefan Hesbuergen, and we might actually have time for one more if somebody wants to add one to the box. Regarding monographs versus journal articles, do you think there's some way to produce an estimate of how both forms of publication might have developed during the period in question?
How would you get access? For the journals, there are ways of doing it. Getting access to the monographs or edited volumes would be, I think, quite a lot of work. I don't spontaneously see any easy solution to this. Something like Google Scholar, maybe, or a search on the Bookfinder database.
Because they're multilingual, there would be a massive amount of work not only to retrieve the titles of all the publications but also to retrieve their content and be able to sift through it. Computationally, too, that would be quite significant. But it's true: here we would also need a view of what is happening in the book portion of the philosophy of science. Unfortunately, I don't have any easy solution. So if you have one, I'd be really happy to hear about it.
Yeah, there's always a data access question here. With that, let me let the broadcast catch up a little and see if anyone has one final question. If you have a quick one, we could get it in; we have a minute left in the time slot. This has actually lined up very well. Failing that, let me go ahead and thank everybody. I think we'll call it there. Very nice on timing. Thanks very much. This was a fantastic talk. You can go back and look at the chat later; it's been active, and everybody is passing on many thanks. Fantastic stuff. And I'm looking forward to being able to play with that website at some point too. That's going to be really fun.
Thank you again, Charles and Luca.
Oh, you're very welcome. We're very happy that everything's been going so well. We'll be back in five minutes with our next talk. Thanks very much.
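The author-overlap question raised in the Q&A could be made quantitative in several ways. What follows is only a minimal sketch of one option, a Jaccard index over normalized author names. It is not the speakers' method: the answer above states that no such metric was implemented and that the author lists were cleaned by hand. The function names, the crude "last name plus first initial" key, and the toy author lists are all assumptions made for illustration.

```python
import unicodedata


def normalize_author(name: str) -> str:
    """Map spelling variants of a name to one crude key: 'lastname_initial'.

    Hypothetical rule for illustration only; real curation needs far more care.
    """
    # Strip accents so that "Müller" and "Muller" produce the same key.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    ascii_name = ascii_name.replace(".", " ").strip()
    if "," in ascii_name:
        # "Last, First" form
        last, first = (part.strip() for part in ascii_name.split(",", 1))
    else:
        # "First Last" form: take the final token as the last name
        parts = ascii_name.split()
        if not parts:
            return ""
        last, first = parts[-1], " ".join(parts[:-1])
    return f"{last}_{first[:1]}".lower()


def author_set(articles: list[list[str]]) -> set[str]:
    """Collect normalized author keys; every co-author of an article counts."""
    return {normalize_author(a) for authors in articles for a in authors}


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared authors divided by all authors; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy data with invented names: each inner list is one article's author list.
journal_a = [["Doe, Jane"], ["Anna Müller", "J. Doe"], ["Smith, Bob"]]
journal_b = [["Jane Doe"], ["Muller, Anna"], ["Lee, Chen"]]

overlap = jaccard(author_set(journal_a), author_set(journal_b))
print(f"Author overlap (Jaccard): {overlap:.2f}")
```

The initial-based key deliberately over-merges (two different authors sharing a last name and first initial collapse into one), which is exactly the kind of case where the manual curation described in the answer would still be needed.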