And now we will start with Ville-Pekka Kääriäinen from the University of Helsinki, which is also a member of the READ-COOP.

Okay, hello, my name is Ville-Pekka Kääriäinen and I'm from the University of Helsinki. I'm a PhD researcher in history and I'm really excited to be here today to talk about my journey with Transkribus and tell you about my research.

Let's start by looking at my dissertation and its topic. My working title is "How modern states emerged in the periphery of the Swedish Empire: the state-building process in the parish of Iisalmi", and the time period is the 17th century.

Let's start by looking at this map. Usually only Swedish people, and not even Finnish people, know that there was a period when Sweden controlled major areas across the Baltic Sea, from Livonia to Germany. How was it possible? Sweden was a sparsely populated area and there weren't any significant resources, so the answer lies in the state-building process. Traditionally it has been seen that constant war made Sweden strong; as the sociologist Charles Tilly has said, war made the state and the state made war. Usually state-building is approached from the macro level by studying kings and how they issued regulations and things like that. However, like other comparative researchers, I approach state-building from below. In my doctoral thesis, I study the relationship between the local community and the state. The emphasis is on the local officials, which is often referred to with the concept of personal agency. I have marked my research area on the map. There is the parish of Iisalmi. It was quite a big area and there were around 5000 people living there during the 17th century, so a very sparsely populated area.

Let's take a look at my source material and how I use Transkribus with it. My primary source material is the court records, and my aim with Transkribus is to make 100% correct transcriptions of them. For my secondary material, I have different approaches.
For the complaints of the common people to the king, I just run the model and it gives me a roughly good transcription, so I can easily browse through the material. However, for the letters from the officials and the tax records, I use keyword spotting: you run the model and then you search with keywords. Of course, this is a somewhat unreliable method, since you cannot be 100% sure that it will find everything you want. But I can count on my model to recognize important words like Iisalmi, which was my research area. In total, I have around 2500 pages of text in Transkribus. Most of them are double pages, so I think it's quite a lot of data for one researcher, and I couldn't have done this without the help of Transkribus.

In this slide, I have summed up my experience with Transkribus. Let's first take a look at the graph. Here you can see data from 16 models. Actually, I have made 62 models. The blue line represents the size of the training set and the yellow one is the character error rate on the validation set. I think my experiences are quite similar to those of others. Already the first results were promising. However, there are several hands in my material and the handwriting also changes over time, so there have been challenges. After a certain point, the character error rate has stayed at around 5%. This doesn't mean that my model hasn't improved. On the contrary, I have noticed that its ability to read handwriting outside of its training material has improved a lot.

Now, let's look at what you can do after Transkribus. After processing my material, I have built this database. Here you can see a simplified example from one court case. This court case is about premarital sex, and you might wonder what that has to do with state building. My answer is that it has everything to do with state building, because before the 17th century there aren't any cases like this in Finnish court records.
It doesn't mean that there was a sudden change in people's behavior, and it doesn't tell us that there was a change in social norms. Instead, it tells us how the church and the state tightened social control.

In my database, every court case is one record. For every record, I take down several details, for example the metadata and the transcription from Transkribus. Then I make a rough and short translation. Then I categorize the case and take down every person mentioned in the case. In this case, there's also some other information: it says that the woman wasn't at the court, so she was absent. It also says that they had a child together, which is not often the case. And it tells us how gossip was a very important lead for the court. It is also said in this case that they tried to make the man marry the woman, but he didn't want to do that.

Now that you know how I built my database, I can show you some statistics that are quite easy to pull out of it. On the left-hand side, you can see the top 10 most active persons in my data. This is something that I haven't noticed anyone doing with court records before. The most active person was the sheriff, in Swedish länsman, and he was at the court almost 500 times. The most active persons are lay members or officials, but if we go down the list, there are also regular persons. What is even more important, I can take one individual person and search for his or her court history. This is really important because usually researchers use only one case when they try to figure out what the motives were and things like that. But with my database, I can see the court history of one individual.

On the right-hand side, you can see the kind of statistics that were really popular in the 1990s. Nowadays they are not that popular anymore, and many have questioned how reliable this method is.
But I think it's important, so that we get the big picture of what happened at the court and whether there are changes over the time period. From here we can see that, for example, violence was quite low and the most common crime was that people didn't show up at court. The second biggest category was moral offenses, like the case I showed you earlier.

Let's move on to another example. Here you can see how I've mapped the data. This map is about land ownership cases. It's a heat map, so the darker the red, the more disputes about land ownership there are. How have I built this map? First, I tagged every place name in the court records with Transkribus. Then I normalized the tagged places. I have around 1000 individual places. Then I had to locate them. This was really hard work, because after 300 years the place names are slightly different now. The easy part was to map them with a program called QGIS.

So, conclusions. Why would I recommend Transkribus? We are at the Transkribus User Conference, so it's important to talk about this. Maybe the question is more why you should not quit Transkribus. For me, the most important thing is data management, because all the things I have shown here today you can do without Transkribus. You don't need Transkribus if you want to build a database. You don't need it for the statistics or the visualization. However, court records are maybe the most used source material in Nordic history, but nobody has tried to do some of the things I have done here. And I think it has something to do with Transkribus. Historical data is very complex, so it's hard for one researcher to get hold of everything. But with the help of Transkribus, you get more structure. And when you have more structure, you can build more complex data structures much more easily. Efficiency is also a big factor, because without Transkribus I couldn't have handled so much data.
In the end, I also want to highlight the importance of Transkribus when it comes to learning. I started to use Transkribus because I thought that by some miracle it would tell me what's in my source material. My dream shattered quite quickly, and I noticed that the AI is stupid and makes stupid mistakes. But since we have to teach the AI with these 100% correct transcriptions, we become masters of the old handwriting.

It was nice to share my journey and my initial results. If you have any comments or questions, I'm happy to answer them. And we can also continue with the topic afterward. Here is also my initial try at social network analysis. Thank you.

Thank you very much for this insightful presentation. Are there any questions from the audience? Dominik, only a short one.

So the question was, am I going to give my transcriptions to the National Archives? At some point, yes, but at the moment it's most of what I have done for my PhD and I'm not on funding, so that's why I'm still holding on to them.

Any more questions?

Yes, while correcting the mistakes, I tagged every name. There are around 20,000 names tagged, that is person names, and maybe more than 10,000 place names as well. Yes, I exported them and then I did the normalization with Excel and with the help of some short Python code. I'm not a great coder.

Any more questions?

And also, one thing I tagged was the first letter of every court case, so I can split the text into cases, because there isn't any indication of where one case starts and another ends. So that was useful.

Any more questions? Otherwise I would have one. How was your experience with the scholarship that you got from the READ-COOP?

You mean like the free pages to use?

Yeah, was it easy to apply for the scholarship, for example? Did you get help from us when there were questions on your side?

Yes, it was easy. I have applied, and I also teach how to use Transkribus at the University of Helsinki.
I have asked maybe three times for free pages, and they gave them to me. So it has been easy.

Good. Right. If there are no more questions?

It's funny, because it was said in the beginning that we are part of the cooperative, but actually I don't know anyone else doing research with Transkribus in Helsinki. It's quite a big university and people are doing their own stuff. But you can say that I'm sort of the Transkribus guy at the university.

So the question was, when I built the database, did I use the tagging? No, at that point I just did the work manually. It's quite hard, because the names are always a little different during that time period, so you have to normalize a lot. That's why it's not such an automated process, and you still need manual labor.

Yes, good question, and actually I have a lot of people with the same names, because in Finland during that time it was common that the eldest son got the name of the father. So there are many people with the same names, but luckily it's quite a small area, so I can keep track of who is who from different things. It can be mentioned in the court case that this was the father and this was the son. So I can separate them, not always, but most of the time. Good question.

Good. Then we've got one more question.

Yes, I've recently gotten into these maps. I like to make them.

Okay, so I think we'll move on. Thank you very much for your presentation.
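
The name normalization and the top 10 count mentioned in the talk could be sketched roughly as follows in Python. This is only an illustrative sketch and not the speaker's actual code: the lookup table `VARIANTS`, the helpers `clean`, `normalize` and `top_persons`, and all example names are invented here, and it assumes the tagged person names have already been exported from Transkribus as a plain list of strings.

```python
from collections import Counter
import unicodedata

# Hypothetical lookup table: spelling variant -> normalized name.
# All names are invented for illustration; a real table would be
# curated by hand, for example in a spreadsheet.
VARIANTS = {
    "oluff kauppinen": "Olli Kauppinen",
    "olof kauppin": "Olli Kauppinen",
    "oloff kauppinen": "Olli Kauppinen",
    "pär rytkönen": "Pekka Rytkönen",
    "peer rytköin": "Pekka Rytkönen",
}


def clean(raw: str) -> str:
    """Unify the Unicode form, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFC", raw).casefold()
    return " ".join(text.split())


def normalize(raw: str) -> str:
    """Map a tagged name to its normalized form.

    Names missing from the lookup table are returned in cleaned form,
    so they can be reviewed manually later.
    """
    key = clean(raw)
    return VARIANTS.get(key, key)


def top_persons(tagged_names, n=10):
    """Count appearances per normalized name; return the n most frequent."""
    counts = Counter(normalize(name) for name in tagged_names)
    return counts.most_common(n)


if __name__ == "__main__":
    # Invented sample standing in for an export of tagged person names.
    sample = [
        "Oluff Kauppinen",
        "olof  kauppin",
        "Pär Rytkönen",
        "Oloff Kauppinen ",
    ]
    for name, count in top_persons(sample, n=2):
        print(f"{name}: {count}")
```

A lookup table keeps the manual decisions explicit, which fits the point made in the answers that normalization still needs manual labor; it does not by itself separate different people who share the same name.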