So, okay, time flies like an arrow, fruit flies like a banana, as the Marx Brothers said. So let's hurry on to our next panel, panel number four, as I said, which is entitled The Transcriber's Workflow: Inside, Outside and Beyond the Platform. The format we are going to see here is so-called lightning talks, which are ten minutes long. So yeah, we're a little pressed for time, and whether we have questions or not depends a little on how well you keep to time. I think it's completely fine if we postpone the questions to the coffee break as well. If you see that time is getting scarce, we can always do that to have a bit of a buffer. The first speakers will be Milan von Lange and Caroline Keiser, who are both with the NIOD Institute for War, Holocaust and Genocide Studies. The title of their talk is From Variation to Validation: Digitizing NIOD's War Letters from 1935 to 1950 Using HTR, so a very serious topic, and one that's also very interesting in terms of the technological approach that you're taking. The floor is yours. You can use either this microphone or the one that's on the table.

So I'll use this one?

Yeah.

Yeah, thank you for the introduction. So I'm Milan, this is Caroline, and we work together at the NIOD Institute, as Andy just told you. Our colleague Annalise couldn't be here, but she's joining online, so hello Annalise. We are working on an interdisciplinary project at NIOD in which we're digitizing a collection of historical war letters: personal correspondence from the period before, during and after the Second World War in the Netherlands and its former colonies. We're going to present the project very briefly, say something about the collection, and then ask you, our audience, a question. Caroline, can you maybe first say something about the collection itself?

Yes, of course.
The collection is 30 meters long, and it has many letters, but we also have agendas, medals, LPs, newspaper cuttings, and so on. It was all created in the period 1935-1950, and the letters were written by different people in various situations. We have letters written by collaborators and letters from resistance people. We have letters written by children, by elderly people, by men, women, citizens, refugees. Yeah, well, quite a range.

Yeah, and the project that we're currently working on is not only a digitization project. We call it a hybrid project, as the project team includes an information analyst, an archivist, and a historian. We also call it a hybrid project because we're reflecting on the implications of what we're doing: of digitization, of making transcriptions, of using HTR, but also of annotating metadata in these very heterogeneous materials. And we're investigating the implications for historical research as we go through the project, reflecting on archival practices and on the choices and decisions we make along the way. We do this together with several partners, among them the WARLUX project from Luxembourg. I think Nina Janz is also here and presenting tomorrow, with students from different universities as well. And we're now also involved in a small network for digitizing war letters using Transkribus that we started with these people, and that has already held several workshops and meetings. So yeah, what did we do? Caroline, can you maybe take over the microphone?

Yeah, I can. What we did: we scanned and preserved the collection. We did a bulk upload to the Transkribus server, which went very smoothly thanks to Rutger van Koord. I don't know, Rutger, if you're here online, but thank you. We created an HTR model, and our current character error rate is 5%, which already sounds pretty good to us.
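
For readers less familiar with the metric mentioned here: the character error rate is usually computed as the edit distance between a reference transcription and the model output, divided by the number of characters in the reference. The following is a minimal sketch of that common definition, not the project's own evaluation code; the function names and the sample strings are made up for illustration.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions and substitutions needed to turn one string into the other."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in a 20-character line.
reference_line = "brief uit Amsterdam!"
model_output = "brief uit Amsterdan!"
print(character_error_rate(reference_line, model_output))
```

One wrong character in a 20-character line corresponds to a CER of 5%, which gives a feel for what that figure means on a page of text.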
Yeah, and I have to say that for most of the collection this indeed works relatively well. But from manual evaluation of the results, we also discovered that, because of the huge variation within the collection, the character error rate is quite inconsistent across different parts of it. So we have quite mixed results and differences in HTR performance due to the extremely heterogeneous nature of this collection. This is caused by differences in handwriting, as Caroline just mentioned. We have letters written by children, sometimes of a very young age. You can imagine that their handwriting is not the best, but we don't have enough material from these children, for example, to train a model that is also good at recognizing their handwriting. We also have to deal with letters that were written in times of paper scarcity, which means that the quality of the paper is sometimes really bad. Sometimes letters were written on anything at hand, like chocolate wrappers or toilet paper. We also have those in the collection. It wasn't used, but it was used to write on.

So basically the question that we want to ask you is: what should we do with this heterogeneous collection? Should we accept a more or less inconsistent error rate across the collection? Should we retrain? Should we use volunteers to manually correct transcriptions? That's the question that we want to give back to you. We have also created a QR code with which you can find some more on the project. This one has no information about vaccination status, but you can find some more. And we hope we have some time for a small discussion. Thank you.

Thanks a lot for keeping excellent time. And for that, you're getting marks. No, not for that.
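
One way to make the uneven error rate described in this talk concrete is to compute the character error rate separately for each part of the collection instead of as a single overall figure. The sketch below assumes that a manually corrected sample exists for each part and that every sample carries a group label; the labels ("adult", "child"), the function names and the sample lines are hypothetical and are not taken from the project.

```python
from collections import defaultdict


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def cer_by_group(samples):
    """Return {group: CER} for an iterable of
    (group, reference, hypothesis) tuples.

    Errors and characters are summed per group before dividing, so long
    lines weigh more than short ones within a group."""
    errors = defaultdict(int)
    characters = defaultdict(int)
    for group, reference, hypothesis in samples:
        errors[group] += edit_distance(reference, hypothesis)
        characters[group] += len(reference)
    return {group: errors[group] / characters[group]
            for group in characters if characters[group] > 0}


# Hypothetical ground-truth sample; labels and text are invented.
samples = [
    ("adult", "abcd", "abcd"),
    ("adult", "abcdef", "abcdeg"),
    ("child", "abcd", "axcx"),
]
for group, rate in sorted(cer_by_group(samples).items()):
    print(f"{group}: {rate:.0%}")
```

A breakdown like this does not answer the question of whether to retrain, accept the variation, or bring in volunteers, but it shows which parts of a collection pull the average up and how much material is available for each.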