Right. Florian will prepare the presentation of Annette Larner from the Aarhus City Archives in Denmark. You're also a member of READ-COOP, as far as I know, and you will be talking about saving the Danish cultural heritage and engaging new users all across Denmark. The stage is yours. Thank you very much. It's such an honor to be here today, so thank you for having us. My name is Annette Larner. I'm from Denmark, from Aarhus, which is the second largest city in Denmark. I work at the city archives, where we are currently hosting and owning a national project using Transkribus. I will be talking about that today, and about how we are engaging new users across all of Denmark in this citizen science project. The outline for today is, first, I'll give you an introduction to the Retro Digitization Project, henceforth Retro. It is the Danish approach to AI-powered text recognition and transcription. We don't really have any other large-scale projects in Denmark currently using Transkribus, so we are, as you might say, a front-runner in this. I'll be talking a little bit about the sources that we're using and some of our results. But more importantly, I'll be talking about who our partners and our participants are. Who are they, how are we engaging them, and how are we training them? I'll also be talking a little bit about what's next after this particular project. So that is the outline for today. A couple of weeks ago, Flo contacted me on LinkedIn and said, I just noticed a big spike in users in Denmark. Why might that be? And I said, well, that's because two weeks ago we hosted about 100 second-year history students from the University of Aarhus at the archive. We wanted to introduce them to Transkribus, and we wanted to introduce them to using and reading old handwriting. It was really quite an interesting day. They came at nine o'clock and they left at three.
At first we gave them an introduction to the program. Then we gave them some training pages and let them loose on the program. It was quite fun, actually, because some of the students thought the challenge was really interesting. Their initial reaction when they saw this text from the 1940s was, this is like my great-grandmother's handwriting, and I can't read it. And these were history students. You might say, well, history students will be working with handwritten documents from the 18th century, 19th century, 20th century, maybe even further back than that, and it's a really good idea that they are actually able to read the original sources. So when they encountered 1940s documents and weren't able to read them, they were like, oh, my goodness, how are we going to proceed as history students? I will talk a little bit more about that later. The Retro Digitization Project actually launched in 2017, not 18, as a collaboration between the Organization of Danish Archives and the City Archive in Aarhus. It was an effort to digitize council records and parish records from the early 1800s up until the early 1900s, and to commemorate the 50th anniversary of the council reforms in Denmark in 1970, where 1,098 councils and parish councils were joined together into 277 councils. That has now been further reduced to 98 councils. What we wanted to do was digitize, scan and transcribe council records and minutes from this period, from the mid 1800s up until the early 1900s. And we wanted to make these records widely available to everybody, regardless of whether or not they were able to read handwriting. Our aim is to digitize between 1.5 million and 2.5 million documents. That is a huge task, and we can't do it all by ourselves. So what did we do? First of all, the material that we've chosen was town council and parish council records and minutes.
The reason why we did that was because these sources will help us understand how local democratization developed, basically decentralized democracy happening in all these different councils all over Denmark. The material is also a massive catalog of geographical place names, names of people, local decision making and records of local interests. Why did we put roads in a certain place? What happened to poor relief? That sort of thing. It's also a unique collection of sources regarding our decentralized democracy in Denmark. As for the organization, this particular project is owned by the Organization of Danish Archives, which is ODA, and our archive in Aarhus. We have a project manager, and then we have three coordinators and IT experts. We also have an advisory board from the Danish National Archives, Aarhus University and the Royal Library in Denmark. They have been invited into this project to validate that the work we're doing has scientific validity. We initially received funding from the Organization of Danish Archives and Aarhus City Archive when we first started out in 2017. Then last year we received funding from various Danish foundations to make a three-year project, a bit more of a unified project. The money will mostly be used for buying credits from Transkribus. When we invite archives from all over Denmark, we want as many local archives as possible. The way that the Danish archive system is set up is that we have a national archives and then we have Paragraph 7 archives, which are local and city archives. We are mostly working with local, city and town archives. We've invited as many as we could possibly find, and some of them thought it was a great idea and came back and said, we might actually like to hear more about this. So we said, well, everyone who wants to participate is very welcome to. They will receive support from our IT experts and our coordinators.
And we put on workshops for the various archives but also for some of the volunteers. There are some terms and conditions for participating: these archives must scan all of the records themselves, and they must add them to the project website themselves. We can't do all of it for them, because we are looking at thousands and thousands of records and thousands and thousands of pages, and we just can't lift that task all by ourselves. So we are expecting and hoping that they will scan their records themselves. If they can't, if they are a tiny little local archive, then we can help them with the scanning. They must all sign up to use Transkribus. That is a prerequisite, because that is what we got the money for. This project is heavily based on volunteers. We can't do all this transcribing ourselves, so we really do rely on excellent volunteers. Some archives have lots of volunteers, some have few and some don't have any, and it's up to the local archives to recruit volunteers to engage in this project. We're currently working with 60 volunteers across Denmark. Some of these volunteers are very active, some of them are not very active. It's also based on seasons. We did have some nice, warm days in Denmark this summer, and the activity fell. Now we can see that people are starting back up again as we're heading into autumn and winter. The way that we reward the local archives is that we give them credits as and when they need them. So we are starting to hand out credits now to the various archives, and that's really great to see.
Next, you can see some of the archive participants that we're working with. I don't know how familiar you are with Danish geography, but I can tell you that this is a very wide variety of places, from the very north of Denmark to the very south of Zealand. So it's really, really good, and we're very happy with the archives that we're working with. This is a picture from our website, which is a counter for how much we've done. It is not updated. We've actually done a lot more, so we need to update that one, but it's just an example of how we are promoting how far we've come on the project. This is a status just for the Aarhus City Archives. You can see here that we have currently scanned 51,000 pages. We have manually transcribed just over 8,000 pages. We've machine read about 29,000 pages and proofread about 12,000 pages. That is about a quarter of all the records and pages that have been supplied so far across all the different archives. Some of the problems that we are encountering are problems with segmentation. We have found that thorough segmentation before machine reading is absolutely critical to get the best results. But I'm sure all of you are very aware of that. So one of the interesting parts, maybe for you, is, does it work? Do the models work? The models that we're using are a mix of models created by the Norwegian National Archives and the Royal Danish Library. I've got about three slides coming up to look at how our results are. The records and minutes from 1870 to 1950 are our main focus point, because that is where we have most of our material. So we trained a model using eight and a half thousand pages. The material comes from Aarhus, Faxe, Nisvill, Gentofte and the Royal Library. It works very well on most non-Gothic handwritten material. We have both Gothic handwritten material and old-style handwritten material.
And it definitely prefers the old-style handwritten material. We have a margin of error of five to seven percent, so we're very pleased with that. We've also got a Danish Gothic print model. You can see we've used about 3,000 pages of training data. It's very comparable to OCR programs or models. It works on most printed texts that we've got, and the margin of error is very, very low. We also have a collaboration with a researcher at Aarhus University, a historian called Nina Kofel. She received a lot of money from the Carlsberg Foundation to work on administrative writing in the 18th century. She, alongside some student helpers, trained a very, very good 18th century model that we also use in our projects. She found that the margin of error that they encountered was six to seven percent, and they were also very pleased with that. Gothic handwriting in Danish is incredibly difficult to read. I know that because I worked a lot with it when I did my PhD, and I didn't have Transkribus. I really wish I had, because it would certainly have saved a lot of time. And then we're also testing out a generic 19th century model. Like I said, it's an experiment, because we are bringing in different types of texts, different types of material from the 19th century, mixed. Mixed is both Gothic and old-style handwriting and 18th century administrative writing. The rate of error is eight to 10 percent, so it really could get better. We have found that a more specialized model for the material that we are using is better. But we would actually like to train this generic model more, so we do need a lot more training material for this to work. We are in the process of doing that. So, crowdsourcing and training new users. Like I said, we have about 60 volunteers throughout Denmark. The way that we are recruiting them is actually not through us as such. It's through the local archives who are participating in our project. Like I said, some of them have lots of volunteers.
Some don't have very many. It's up to the local archives to recruit volunteers into the projects that they are working on. Each local archive will work on local material, because the local material is there in their archive. So within the framework of city, town and parish council records, they get to decide which material they're looking at. They also then get to give this material to their volunteers to work on. So we are not actually fully engaged in the material that they are working on per se, but we are keeping it all together, like an umbrella. The way that we're doing it is that we have a brilliant coordinator called Mia, who has been traveling throughout Denmark to the different local archives and setting up workshops. She's done about eight workshops where local archives from that local area or region have come in and been trained in using Transkribus. Then they tried using it themselves afterwards, while Mia was still there at hand to help out. The idea was that they would get proficient enough to then go and train their own volunteers. We also have weekly gatherings at the Aarhus City Archive. This is one of the ways in which we are trying to give back to our local volunteers in Aarhus. Usually our reading room is open Tuesday, Wednesday and Thursday, but we have opened our reading room on Mondays exclusively for our VIP transcribers, and we always make sure that there's cake and coffee available for them. We saw that there weren't very many people coming in during the summer, but now with autumn coming up, more and more people are coming in, and they are having such a lovely time sitting around these tables, helping each other. It's just a really, really nice atmosphere. We are also looking at doing online workshops, but in our experience online workshops don't work as well as being there physically.
It can sometimes be a bit difficult to know exactly what pace you should set when you're doing an online workshop, whereas when you are with people, they can much more easily put their hand up and say, sorry, can you just repeat that? We are also working on making vlogs, video logs or video blogs, in Danish, because one of the challenges that we have is that our volunteers are mostly seniors and their English isn't great, and as you all know, Transkribus is in English. So that is quite a challenge. We have translated the guides into Danish, but sometimes that just isn't enough, so we're working on making video tutorials. We're also working with the history department at Aarhus University, like I said, introducing students to Transkribus, because some of them will be using the program in their assignments going forward. So, some of the experience we have with training new users when we're talking about seniors: when they come in, they are incredibly excited. They think this is amazing. They love transcribing. They've been transcribing for years and years when they've been researching their own heritage and their family history. So generally speaking, they are very interested in this type of work. They have an ability to read old-style handwriting and Gothic handwriting, so they are a treasure trove for us. They have the time, unless it's summer and they want to be in the garden. And they are generally very interested in preserving the local history of the area where they live. That's one of the reasons why we're very keen to give them material from their local area, because they are more engaged in doing that kind of work than if we gave them something from Aarhus. They might not understand why they should transcribe things from Aarhus when they live in Zealand. Some of the challenges are that, like I said, the program is in English, and they're not always very proficient in English.
So that really does require quite a lot of hand-holding. They are also not always particularly digitally literate. As I'm sure most of us will agree, it's not the easiest program to use, especially not when you first encounter it. It does take some time to get used to, because the buttons are small and they can sometimes be difficult to figure out. The segmentation is arduous, as I'm sure we'll all agree. Most of our volunteers just want to transcribe, because that's what they've been used to. That's what they've been doing for years. They've been transcribing in Word and Excel and that sort of thing. So they're like, why can't we just continue using Word? Why do we have to start with segmentation, why do we have to do proofreading, and why do we have to do all these things? So that's our task then, to say, well, this is the future, and you're part of building the future. And then they're like, oh, wow, that's really cool. And actually, a funny story. On Monday, we had one of these VIP gatherings of our volunteers, and an older guy came in who had never been part of this project before. I think he had just heard about it and thought, this is interesting. So he came along, and he'd never used Transkribus before. We have this Facebook page, our user Facebook page for all of our volunteers. He wrote a post on Facebook afterwards saying, this is really exciting. I've downloaded Transkribus and I can't wait to get started. And then yesterday, I saw that he had commented on his own post, saying, goodness me. It's like going from a 1960s Ferguson tractor to a four-by-four SUV. This is exciting. So that was really cool. Now, our experience with training new users when we're looking at students. The positives are that they're excited. These are history students. They fancy themselves as historians.
Most of them will probably never work with history, but we won't tell them that. So they do have an interest in history. They see great value in working with Transkribus. They see great value in gathering a large pool of handwritten material that is then digitized and used for the future. They are digitally capable, even though they still think that the program is a bit difficult. And they are very good at English. So those challenges are eradicated. However, unlike our seniors, they have a really hard time reading the handwritten material. Like I said, even the material we gave them from the 1940s was actually too hard for them to read. So what they did was, they spent about 15 to 20 minutes trying to figure out what all the letters said. And then they thought, screw that, we are just going to hit the transcribe button. And their brains exploded. They were mind-blown, absolutely mind-blown. One of the things we heard from the students was that the program's interface is difficult and that it does take time to learn. In this day and age, things need to come fast. People are impatient. So it's maybe something to look into, making it even better. But I can see it's already improving so much. It's wonderful. Something that's really important is staying in touch with our volunteers. We do that by blogging on our website. The blogs consist of updates about a new version of Transkribus, the expert client. Or, when do you need to start downloading a new Java version? How do you do that? That can sometimes be difficult. We also blog about interesting stories in the material. For example, if a family has been applying for poor relief, we tell this story in an anecdotal way. Or we talk about how we are progressing with the project. Or if Mia has been to a workshop somewhere, she writes about her experience with that workshop.
We also have a Twitter page where we post project promotion content. This is more of an outreach for people like yourselves. If you want to see how the project is progressing, that's where I will be creating content. I've also been tweeting from this conference, so there are some lovely pictures on that. We write newsletters to participating archives. Those are not for the volunteers but for the archives, to try and tell them, this is how it's going. This is where we are seeing ourselves in the future. We would like to maybe reach out to more archives. Would you like to join? And so on and so forth. We also have a strong presence on our user-focused Facebook group, like I just mentioned. It is a group where we are available to help, so they can write with different questions that they might have. If they can't figure out a certain word, they take a picture of it and upload it. Most often some of the other volunteers will have answered the question before we've even seen it. So there's a really nice community feel on this Facebook page, where they're all helping each other and encouraging each other, and they really do get a sense of togetherness, that they are part of a larger vision for this nationwide project. It's lovely. We also post news about workshops and events on there. So what's next? Right now I am working on a new project, a Nordic project, where we will be inviting archives, and maybe even the National Library of Norway, we're hoping, to be part of a large-scale digitization effort of documents that pertain to our democratization efforts from the 1800s and onwards, maybe even the 1700s. We're not quite sure yet. We will be engaging ourselves and the Greenlandic, Norwegian, Swedish, Icelandic and Faroese archives, and we're really hoping to get a lot of money from the A.P. Møller Mærsk Foundation in due time.
So that's a little bit from us, and I'm very happy to take questions. Thank you. Perfect. Thank you very much. Let's go to the questions directly. Here we go. Yes, we basically ask our volunteers to use the Transkribus expert client, but failing that, the Lite program is definitely also an option, and we are looking into becoming better at teaching the Lite version, because we have mostly been focusing on the expert client. So we are planning another round of workshops in due course, and we will be promoting the Lite version. Adding to that, that would solve your language problem, because the Lite version can be in the Danish language as well. We have it in a Dutch version, so I imagine that if you nudge Flo, you can put in the Danish translations as well, and they can do a pre-translation for you so you only have to correct it. Very good. I'm very happy to hear that. I guess that's already on my list. I heard your talk, so let's see how fast we are. The voice from the off. If anybody was wondering, it was Florian, who's taking care of these things. Good. Any more questions? Here we go. Fourth row. Sarah? Thank you. I just have a question about the generic model you mentioned earlier in your presentation, the generic model that you were building. How much data are you using to do that, or how much are you planning to use? Because I think the picture was a little bit blurry from here. So you're asking what our material is? Well, I am not an expert on the actual program itself. That's somebody else. But the mixed material, is there a pointer? I think it's not working at the moment, but yeah, there would be. As you might be able to make out, at the very top it said Gothic handwriting from the 1800s, which contains material from the 18th century administrative writing model that Nina Kofel from the University of Aarhus has trained.
And we are also mixing it with material from parish councils called Elsted, Todbjerg, Mejlby, Beder-Malling and Brabrand. So they are different records, like books, that we then scanned and transcribed. And we just sort of put the models together, ran the models together. Did that answer your question? Okay. Okay, any more questions? That does not seem to be the case. Also online, no questions, Florian? Nope, okay. Yeah, I think then we end here. And I think another thing that we learned from your presentation is that you should never underestimate the power of coffee and cake. Absolutely. So here you go. Thank you.