Dear colleagues, dear fellow Transkribus users, thank you so much for attending my lightning talk on transcription revision. Anyone who has ever dealt with paleography or tried reading a historical document, regardless of the tools employed, has inevitably taken at least one additional glance at the folio to verify that one particular letter is indeed a character and not a random ink smudge. Where the traditional pencil-and-paper approach allows for doubt and variation, Transkribus is unyielding. Every glyph, as long as it sits on the baseline, must be recognized, every distance calculated, every character inferred. You have to admire the confidence, but you can't help but wonder: how accurate is the transcription?

And it works. The Lagos Project, conducted here in Innsbruck at the Department of Classical Philology and Neo-Latin Studies, uses Transkribus to process nine diary volumes comprising over 2,000 pages of handwritten ancient Greek text. Our processing pipeline may look familiar to you. Starting with a 50-page manually transcribed dataset for version one, we have gradually built up the ground truth dataset by processing additional pages and manually correcting the output, which is a meticulous and slow-paced task. The most recent Lagos version seven model boasts a 45% increase in training set size compared to the previous version. However, the training yielded only a 0.4% improvement in CER. Were there mistakes in the training process? Is this a case of garbage in, garbage out? Was the new data insufficient? While we prepare for the next transcription revision sprint, I couldn't help but wonder: does the newer version actually perform better?

Here's a glimpse at our source. Apologies if it looks all Greek to you. On the right side, you see two transcription variants of the same page. The top one was produced by the older model, while the bottom one is brand new. Another difference is that the top variant has been revised: I have personally checked this page, among hundreds of others, as part of our continuous revision efforts. Well, you can see that there's been a mistake. While the new model has successfully amended my mistake in line three, it has also erroneously deleted a character in line seven and made a complete mess of lines nine and ten due to baseline inconsistency. Clearly, there is a benefit to training new models with additional data, but since revision takes a lot of time and resources, is it possible to improve or even expedite the process?

Our team members are highly skilled paleographers and scholars of the Greek language, and it has been demonstrated that scholarly supervision improves the accuracy of automatically generated output. At the same time, human error is still a factor. What if we tried using a spell checker? It sounds rather appealing, especially considering that dictionary-based algorithms partially imitate the manual revision approach. However, our source text presents a number of challenges. The diary's author was quite adept at wordplay and code-switching, often engaging in transliteration and inventing ancient Greek words for 19th-century objects and concepts like pantaloons, omnibus carriage, or musk raid. And since this approach does not take the digitized image into account, we risk introducing a lot of false positives and false negatives, ultimately lowering precision and recall.
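(As an illustration of the dictionary-based idea just mentioned, and not part of the Lagos pipeline: a minimal Python sketch of such a check. The tiny word list, the accent-stripping normalization, and the example line are all assumptions made for demonstration; the point is only to show why the author's invented or transliterated words surface as false positives.)

```python
# Illustrative only: a naive dictionary-based spell check over a transcribed line.
# The lexicon below stands in for a real ancient Greek word list; any token not
# found in it is flagged, which is exactly how a correctly transcribed but
# invented word ends up as a false positive.
import unicodedata

lexicon = {"και", "ο", "λογος", "του"}  # hypothetical, tiny stand-in lexicon

def normalize(token: str) -> str:
    """Lowercase and strip polytonic diacritics so lookup is accent-insensitive."""
    decomposed = unicodedata.normalize("NFD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def flag_suspects(line: str) -> list[str]:
    """Return tokens absent from the lexicon (possible recognition errors)."""
    return [tok for tok in line.split() if normalize(tok) not in lexicon]

# A coined word for a 19th-century object is flagged even if transcribed correctly.
print(flag_suspects("καὶ ὁ λόγος τοῦ πανταλονίου"))  # hypothetical example line
```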
Lastly, there's grammatical error correction, a long-standing NLP challenge that has attracted numerous data scientists and developers. Unfortunately, here we are again at a disadvantage. While several resources exist for the Greek language, ranging from source text databases to annotated treebanks, Greek is still considered a low-resource, or underrepresented, language. In other words, the Greek corpora available for language model training are much smaller than, for instance, English datasets. The metrics on the slide are clear: the F-score for the state-of-the-art English model is over 70%, while the same metric for its Greek counterpart is almost 20% lower. This particular quote from a GEC survey article, written by leading experts in the field, resonated with me. If probability never achieves grammaticality, much like Transkribus's CER never really reaches zero, can we even consider semi-automated solutions without scholarly supervision?

For now, the Lagos team bravely accepts manual transcription revision as the primary solution within our project. Not only does it allow us to handle challenging cases and address other revision tasks, such as named entity tagging, but reading the diary text is also, honestly, a lot of fun. However, I don't rule out the possibility of testing several existing solutions, such as transformer-based models, spell checkers, and GEC corpus annotation tools, to see whether they can be applied within our project. With that, let me thank you for your attention and open the floor for questions, discussion, and brave ideas. Thank you.

Well, thank you for these insights. Who would like to take the microphone and ask a question? Do we already have a question from online? No one? Well, there was one, but... well, you just lured them away.

Are there any ways of finding more material to make these models bigger, as you just mentioned? There must be people out there in this world working with the same type of sources as you are.

There must be, but then again, we have to be wary of whether such material would fit our specific timeframe, our specific historical context. And again, we're dealing with an artifact written by a single author, and it's quite unique, right? As has been mentioned this morning, historians are always quite specific when it comes to their material. That's why... I mean, we do have access to just nine volumes of this diary. If we ever uncover more by sheer chance, they could definitely be used, and that would be a welcome contribution, but otherwise, not for now. Thank you.

Thank you. Could you elaborate on how you use the mT5 model in your project?

Thank you for the question; we don't, yet. To expand a bit further, this is just an idea I've come across in a single scholarly article, by Ekaterini Korre and John Pavlopoulos. They ran some tests on the Greek National Learners Corpus with mT5 transformer models, and the results on this slide, the mT5 text-to-text figure, are actually from their research. 52% sounds quite all right, so it's worth a test anyway. But then again, the GNC, the Greek National Learners Corpus, is based on the modern Greek language, whereas in our case we have ancient Greek with an asterisk, meaning that there are several variants: medieval Greek, Byzantine Greek, and modern words thrown into the mix. So fine-tuning is the answer, yes. Thank you.
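(Again purely as an illustration of the text-to-text setup discussed in that answer, not the Lagos setup: a sketch of one mT5 fine-tuning step using the Hugging Face transformers library. The model checkpoint, the placeholder training pair, and the learning rate are assumptions.)

```python
# Hypothetical sketch: one text-to-text GEC fine-tuning step with mT5, in the
# spirit of the Korre & Pavlopoulos experiments; not the project's actual code.
import torch
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")

# One (erroneous, corrected) pair; real pairs would come from revised transcriptions.
source = "gec: <line as produced by the recognition model>"   # placeholder text
target = "<the same line as revised by a human editor>"        # placeholder text

inputs = tokenizer(source, return_tensors="pt", truncation=True)
labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy
loss.backward()
optimizer.step()

# At inference time, the fine-tuned model would be asked to rewrite a noisy line:
generated = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```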
I'm wondering about the law of diminishing returns on this error correction thing. At some point, a human has to check the output of the machine before they can release it. And although you're not going to get to 100% accuracy, are you getting close to the point at which there's no point in trying to get any further, because you have to check and verify it anyway?

Thank you for the question. All right, let's lift the curtain a little bit. Our current CER is 7.4%, which is just below the margin Transkribus suggests for text recognition. It's all right, but it's not good. Can we really push it further? We hope we can. We have eight additional volumes of the diary to process, revise, and add to the dataset; then we rerun the training, enrich the model, and hopefully achieve a better result. Thank you.
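(One final illustration for readers unfamiliar with the metric quoted in that last answer: character error rate is conventionally computed as the Levenshtein edit distance between the recognized text and the ground truth, divided by the length of the ground truth. The Greek example strings below are hypothetical.)

```python
# Minimal sketch of a character error rate (CER) computation:
# edit distance between hypothesis and reference, divided by reference length.
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance via dynamic programming."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution / match
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """CER as a fraction of the reference length (guarding against empty input)."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)

# Hypothetical toy example: one deleted and one substituted character.
print(f"{cer('παλαιογραφία', 'παλαιγραφια'):.3f}")
```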