All right. My talk is about the three layers of typo correction: autocorrect, spell checking, and grammar checking.

I've been working on eBooks since 2009, and I've done over 700 of them. I've written tons of posts about everything to do with eBooks, and since 2002 I've written a massive amount about LibreOffice, helping people out.

Earlier this year, in May 2023, there was this user, Cyprius, a Romanian user, who was saying that the Romanian autocorrect in LibreOffice is horrible. That's what started all this. He eventually got his changes put in, so now they're in LibreOffice. It's great.

If you ever want to upgrade LibreOffice, these are the folders you might want to visit. You have the autocorrect files, and you have the dictionaries, which are the .dic and .aff files. For grammar checking, just go to LanguageTool. LanguageTool is the one you want to help.

Here's a quick example: a few sentences about bad typos. Autocorrect happens as you're typing, spell checking is the red squiggly, and grammar checking is the green squiggly.

Autocorrect should cover words that don't exist, the easy mistakes you make as you're typing. Spell checking should focus on knowing all the valid words, so that it only underlines the words that don't exist. Grammar checking is for when you use a correct word, but in the wrong context.

With autocorrect, you want a one-to-one match. With spell checking and dictionaries, when you right-click a word, it can show you a list of many corrections. With grammar checking, if it points out an error, you really only want one or two selections to choose from.
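To make that contrast concrete, here is a minimal Python sketch of the first two layers. Everything in it is made up for illustration: the tiny `AUTOCORRECT` table, the `WORDS` list, and the function names are assumptions, not LibreOffice's or Hunspell's actual data or API, and the suggestions simply come from `difflib` in the standard library.

```python
import difflib

# Hypothetical autocorrect table: each typo maps to exactly ONE replacement.
AUTOCORRECT = {"teh": "the", "sory": "sorry", "dont": "don't"}

# Hypothetical (tiny) spell-checking word list.
WORDS = {"so", "sorry", "for", "the", "bad", "typos", "were", "very", "and"}


def autocorrect(word: str) -> str:
    """Replace a known typo as you type; leave every other word untouched.

    The lookup ignores case, and this sketch does not restore capitalization.
    """
    return AUTOCORRECT.get(word.lower(), word)


def spell_suggestions(word: str, limit: int = 5) -> list[str]:
    """Return [] for a valid word, otherwise a list of close valid words."""
    lowered = word.lower()
    if lowered in WORDS:
        return []
    return difflib.get_close_matches(lowered, sorted(WORDS), n=limit)


print(autocorrect("teh"))           # one-to-one replacement
print(spell_suggestions("typoes"))  # several candidates for the user to pick from
```

The design point is the return type: autocorrect gives back a single word with no questions asked, while the spell checker gives back a list and leaves the choice to the user.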
Autocorrect should always be correcting actual typos. It gets so frustrating when you're typing on your phone and it autocorrects a word, and you think, no, I wrote the correct word, and it "corrected" it and messed it up. That's what this Romanian user was talking about. He was saying, I have to keep undoing the stupid autocorrect. Some users get so frustrated that they disable the whole thing completely. And it's like, no, no, instead we should be making the autocorrect much better.

If autocorrect makes a correction for you, it had better be 100% correct; there should not be errors in there. With spell-checking dictionaries, it's okay if it makes some wrong suggestions, because you can always just select the one you wanted. With grammar checking, if you have way too many green squigglies all around, it can get really annoying. But if one type of error keeps coming up again and again, you can just say, hey, don't tell me about this rule ever again, and you won't see those squigglies anymore. It's very easy to disable the rules that frustrate you.

Now some quick examples. Autocorrect should focus on invalid words and common typos. Here are some examples, with the typos on the left and the corrections on the right. Again, as you're typing, you just want it to correct these quickly.

Spell-checking dictionaries are the red squigglies. You want it not to underline actual words, and you want it to underline words that don't exist. On the slide I wrote a misspelling of "mistake": it sounds the same, but there's no such word, because "mistake" is spelled a different way. And when you right-click a word, you want it to give you actual English words.

Grammar checking is very similar, with green and blue squigglies.
Take "I stood in line for an our." The person actually meant "hour," as in time. Or "I runs away from the dog." Well, "runs" is an actual word, but in that sentence you can only use "run" or "ran." And the checker has no idea which: both "run" and "ran" work in that sentence. So again, it's up to the user to pick which one they want.

Okay, so here's autocorrect. I covered a lot of the top part already. What you want to avoid with autocorrect is going from valid words into other valid words, and changing spelling and accent differences. Between British and American English, for example, you don't want it swapping one for the other. That would get very frustrating for a user.

Here was the example that the Romanian user gave. There are words in Romanian that look the same except for an accent. I'll try my best: "pastor" and "păstor." One means "reverend" and one means "shepherd." The autocorrect was swapping between these two words, and both are correct in Romanian. The next one is "valvă" and "vâlvă": one means "valve" and one means "uproar." In English, you kind of have "color" and "colour." Of course "color," the American spelling, is so much better.

So autocorrect should focus on common typos. Here are a few types: transposition, apostrophe, capitalization, spacing, single and double letters, homonyms, and accents.

A transposition is where two letters are easily flipped. Here are a bunch of examples. The words along the left don't exist, but as you're typing, it's very easy to make these little mistakes, and that's the type of thing autocorrect should be fixing for you. You should pay special attention to endings, because it's very easy for people to make typos there as they type. For example, "-ing" is used in a lot of words in English.
Someone might accidentally type "-ign," so you might want to add a lot of those to autocorrect. Then there are prefixes, which get put before a word. "Post-" is an example, or "pre-," which is used in many words. Someone might accidentally type "per-" and make a typo there.

English also has "ie" and "ei" in many words. Again, you might want to pay very close attention, because someone might accidentally flip those letters. In English there's this thing they say, "i before e, except after c," and they teach it to kids learning how to spell. But of course it's a lie: as you get further on, there are words that are exceptions. So you can't just mass-correct this type of thing. Each language is going to be different. If you want more, see that link.

Apostrophe errors: it's very easy to miss one, and it's good if autocorrect can add that apostrophe for you.

Capitalization errors: take "Mike," like my name, or a lowercase "i." If autocorrect fixes those, that's okay, but usually that's better handled by spell checking.

Spacing errors: this usually happens with two extremely short words. Here are some examples, like "he was" or "is the." You just want to add the space back in. Whoops, you accidentally forgot the space, and boom, autocorrect fixes that for you. Wow, that's helpful.

Single and double letters: again, the words on the left are incorrect; there are no such words. It's very easy to accidentally write one letter when the word is actually spelled with two, or the opposite. Here are some examples. Pay special attention when a word has more than one double letter. If we look over here, "committee" is a perfect example: it has "mm," "tt," and "ee" in it.
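Two of the typo patterns above, transpositions and dropped double letters, are mechanical enough to sketch in Python. This is only an illustration under stated assumptions: the function names and the `valid_words` set are hypothetical, and a real autocorrect list would still need a human to review every entry. The filter at the end applies the rule from earlier in the talk that autocorrect must never turn one valid word into another.

```python
def typo_candidates(word: str) -> set[str]:
    """Generate likely typos: swapped neighbours and doubles typed once."""
    candidates = set()
    for i in range(len(word) - 1):
        first, second = word[i], word[i + 1]
        if first == second:
            # Double letter typed only once: "committee" -> "commitee"
            candidates.add(word[:i] + word[i + 1:])
        else:
            # Transposition: two neighbouring letters flipped
            candidates.add(word[:i] + second + first + word[i + 2:])
    candidates.discard(word)
    return candidates


def autocorrect_entries(word: str, valid_words: set[str]) -> dict[str, str]:
    """Keep only the candidates that are NOT valid words themselves."""
    return {
        typo: word
        for typo in typo_candidates(word)
        if typo not in valid_words
    }


# "from" is a transposition of "form", but it is a real word, so it is skipped.
print(autocorrect_entries("form", {"form", "from"}))
print(sorted(typo_candidates("committee")))
```

For "committee" the candidates include "comittee," "commitee," and "committe," one for each of the three double letters.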
You have to be especially careful when you're dealing with words like that. Many people make the mistake of writing only one of those letters. So it's a good pattern to look for: words that have double letters inside them. It's very easy for humans to get those wrong.

Then in English there are words where you think you just add "-ing" to the end, and all of a sudden a second letter appears. Take "transfer": you would think it's "transfer" plus "-ing," but no, an extra "r" pops in out of nowhere. As you're looking through words, you've got to think about that too.

Homonyms: the typo sounds the same, or very similar, but it's completely misspelled. Here are some examples. Again, the words on the left don't exist, but if you read them out loud, they sound exactly the same. These are common typos that humans just make, and autocorrect is very good for correcting that type of issue. Pay special attention to endings like "-ant" and "-ent," or "-ance" and "-ence." They make the same sound, but the vowel is different, and people make these little typos. If you want more, visit that URL.

Accent errors: there's not too much of this in English, which barely has any words with accents. In Romanian, the accent issue becomes a lot bigger. In English you kind of have "café," "naïve," "naïvely," and "résumé" versus "resumé," where some people like one accent and some people like two. Those are very few words in English, so I'm not the greatest with this.

But I have some examples here from the Romanian user. He said, okay, this word, "tacuta," doesn't exist in Romanian, so that's definitely wrong. But what it should be depends on the accents. Look, you can see the two-accent version and the one-accent version right here.
One word, "tăcută," means "silent," and the other, "tăcuta," means something like "the silent one." In Romanian they have endings where an "a" means feminine singular and an "a" with an accent on it means indefinite. Again, every language is going to be crazy and different.

Anyway, those were some examples. That was his issue with the LibreOffice autocorrect: he was so frustrated that he wanted to disable the whole thing. And as I was saying, I'm not too familiar with the accent stuff, so you guys are probably going to know more about it in your own languages.

Okay, spell checking and dictionaries. They should focus on showing you the invalid words and underlining uncommon words or spellings, and when you right-click a word, you want very high-quality suggestions. I would avoid including extremely rare words and spellings.

Here's an example of red squigglies. Look, that whole sentence is full of mistakes. The other one is "I am wearing clotes on my back." Whoops, you accidentally typed "clotes" right here. If you right-click the word, you want it to give you "clothes."

Another example: a misspelling of "underlined" should itself be underlined, because there's no such word. When you right-click it, you get these five words showing up, and you can choose which one you want. And there's an example I took from LibreOffice. The word "cardplayers" gets a squiggly underline, so I right-clicked it, and it gave the suggestions right there.

Spell checking and dictionaries are a balancing act. Do you want to list every single valid English word? Or do you want it to be helpful at catching actual typos? Do you want all the words, including the rarest and most obscure ones?
Or do you want your dictionary to include only the most popular words and the most popular spellings? It's a very tough balancing act.

Here's the example from before. "Clotes" is actually a word. And I'm like, what the hell is this? It doesn't exist in most English dictionaries. It means any of several plants related to the burdock, such as cleavers or butterbur. I don't even know what many of those words mean. No normal human uses it: 99.99% of people will never use "clotes." They mean "clothes." So you wouldn't want "clotes" to not be underlined.

Here are a few more examples. There's this word "pollusion." Most normal people mean "pollution," and yet it's in some Shakespeare play, meaning "allusion." There's "cheesewood," which is a type of Australian tree. Most people would be talking about "cheese" and "wood," which are much more common words. And then there's a very rare alternate spelling, almost identical to "calendar," for some sort of bird. Most people mean "calendar," as in what you look at to see the date. They wouldn't want that weird bird.

So, spell-checking programs. You have these .dic and .aff files, and every language has them. You have to find them online and do your updates. Hunspell is pretty much what you would be using, although there are different programs for this.

In English, there are two dictionaries: SCOWL and the Proofing Tool dictionary. The first one is done by Kevin Atkinson, and Marco Pinto takes care of British and the other spellings. Of course the American one is much better.
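For anyone who has never opened one of these files, here is a rough Python sketch of reading a Hunspell-style .dic listing. It is deliberately simplified and rests on assumptions: the sample entries and flag letters are invented, the function name `parse_dic` is hypothetical, and real dictionaries add details this ignores (the encoding and the meaning of each flag are declared in the matching .aff file, and slashes inside words can be escaped).

```python
def parse_dic(text: str) -> dict[str, str]:
    """Map each word in a Hunspell-style .dic listing to its affix flags."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # The first line of a .dic file is an approximate entry count, not a word.
    if lines and lines[0].isdigit():
        lines = lines[1:]
    entries = {}
    for line in lines:
        # Ignore optional morphological fields after the first whitespace.
        head = line.split()[0]
        # "word/FLAGS": the flags refer to affix rules defined in the .aff file.
        word, _, flags = head.partition("/")
        entries[word] = flags
    return entries


# Invented sample; the flag letters only mean something alongside an .aff file.
SAMPLE_DIC = """\
3
clothes
color/S
calendar/SM
"""

print(parse_dic(SAMPLE_DIC))
```

Seen this way, updating a dictionary mostly means adding, removing, or re-flagging lines in the .dic file, which is why deciding whether a rare word belongs there is an editorial choice rather than a technical one.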
In Romanian, there's a project called Rospell that takes care of the Romanian dictionaries. For every language you have to look it up, and each one is taken care of by different people. German is going to have something else, and Dutch is going to have something else. But you'll be looking for those .dic and .aff files that you want to update.

Here again are the spell-checking programs, several of them over the decades. The latest is Hunspell. László Németh works on that; I don't know if he's here right now, but he's floating around the conference. Hunspell is pretty much the one you want. If you want more information, visit the URL.

Here are some useful tools. There's Google Ngrams. They scanned in something like every single English book, and you can see trends over time. Here are a few examples. Way back when there was the word "renegado," and now it's "renegade." You can see these charts of usage over time: back in the 1820s "renegado" was being used, and now it's completely unused. Nobody uses "renegado"; everybody uses "renegade." Google Ngrams is just so awesome for seeing the usage of words over time.

Another example: a LibreOffice user was saying, look, the word "briar" gets a red squiggly underline. There's a difference between American and British usage. Here's the American usage, and you can see the "-ar" spelling has been a little more popular since, let's say, the 1910s. If you look at the British usage, the "-ar" spelling has always been more popular. This tool is just amazing. Again, depending on your language, you'll have to look at that stuff.

You can also see the rise of new words. The word "biopharmaceutical" was only invented very recently. You can see that it started out in the 1960s.
But very recently, wow, it has become one of the most popular words.

And then, of course, back to American and British: "color" versus "colour." You can see that in American English, back in the 1830s, they started to spell it correctly. And then there's British, which I don't have set up right now. But who cares about the British?

A few years ago there was a talk about upgrading, I think, the Czech dictionaries. So check that one out.

Oh, and here's an example. There was this word "boustrophedonic." It's an extremely rare word for writing where one line goes left to right, the next line goes right to left, and then it goes back to left to right. I was like, what the hell is this extremely rare word? Of course LibreOffice cannot do this, but it was used in ancient Greece or something like that. Sometimes you just stumble across such cool words and get sucked down the Wikipedia wormhole.

Okay, grammar checking. You want to catch valid words used in the wrong context, and punctuation errors. Those are what you want the squigglies to show up on. There's LanguageTool, of course. It's open source, and it's integrated into LibreOffice now. So go help them out; it's open source. There's another tool called Antidote, which is just French and English. That's one of the best ones I've come across in all these years. And then of course there's Grammarly. I felt sick putting them on my slide, but they're one of the more popular grammar-checking tools as well. But help LanguageTool.

Here are the types of errors: "a" and "an" errors, commonly confused words, and so on. I would recommend watching that talk from FOSDEM. That's the guy who created LanguageTool.
It's from when he first started LanguageTool, and in it he goes into much more detail on what a grammar checker does.

Here are some examples. In English you have these "a" and "an" issues. "An hamburger" should be "a hamburger." "An owl" is correct, and so is "an hour."

Commonly confused words: you have "there," "their," and "they're," three different ones in English. Again, these are valid words, but used in completely the wrong spot.

Capitalization issues: if you put an exclamation point here, then the next word needs to be capitalized, and it's a very easy mistake to leave a lowercase letter there. Or "I visited bill gates." Whoops, Bill Gates is a famous guy, and you forgot to capitalize his name. LanguageTool is very good at catching this type of stuff.

Punctuation errors: "Did I see you there." If you put a period at the end, whoops, you meant a question mark. In the next example, you put one quotation mark and forgot the closing quotation mark.

Parts of speech: "I are going to the store." No, that's how a caveman talks. You meant "I am going to the store."

Duplicates: this is when you accidentally type the same word twice in a row. It's very easy for a human to do, especially at the end of one line and the beginning of the next. LanguageTool would catch that and underline it. "I went to the the store": you meant just one "the."

Now, how I quickly proofread eBooks. I've done hundreds of books, so I came up with much faster ways of doing this. You have one-by-one spell checking, and then I use this list-based spell checking. I call it a spellcheck list. With it you can check the entire book in one shot.
You get a searchable, sortable list with the word, the word count, and the language. You can say, hey, only show me the misspelled words, and you can do a case-sensitive search. It makes it so much faster to check an entire book. I've been using these methods for more than 10 years.

Here's an example. I opened up 1.6 million words, a huge amount of journal articles, and boom, I get this nice list along the right side. I can say, hey, find all words with a hyphen in them, and boom, they all show up in the list. I can say, hey, find all words with this "é" with an accent, and boom, I can see them all in the list. Show me all the capitalized words, and you get a small list. This was 1.6 million words boiled down into a nice small list right here. It's so much faster for proofreading eBooks.

Foreign words: if you sort alphabetically, A through Z, all the Greek words start showing up, and the Chinese words start showing up, at the bottom. Wow, it's so much easier to tag languages and mark them.

Again, I can say, hey, only show me the misspelled words. On the left is all the words, and on the right is all the misspelled words in the book. Instead of crawling through 300 pages in LibreOffice and finding them one by one, you can see them all in one short list.

If you sort alphabetically, your eyes can pick out similar words sitting next to each other. "Rothbard" is a person's last name, but then you can see, whoops, two typos here: two different spellings of "Rothbard," and two of "Malone." These patterns just pop out once you sort all the words alphabetically.

If you sort by word count, similar things pop out: this guy's name was used 100 times, but this other spelling was only used twice. So now you might want to pay special attention. Whoops, you meant to spell it the other way. It's very easy to see in list form.
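The list-based approach described above can be sketched in a few lines of Python. This is not the tool shown in the talk, just a minimal illustration of the idea; the function names, the sample text, and the small `valid` word set are all assumptions made for the example.

```python
import re
from collections import Counter

# A word is a run of letters, optionally joined by hyphens or apostrophes.
WORD_RE = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def build_word_list(text: str, valid_words: set[str]) -> list[dict]:
    """Boil a whole book down to one row per distinct word."""
    counts = Counter(WORD_RE.findall(text))
    return [
        {"word": word, "count": count, "misspelled": word.lower() not in valid_words}
        for word, count in counts.items()
    ]


def only_misspelled(rows: list[dict]) -> list[dict]:
    """Keep only the rows whose word is not in the dictionary."""
    return [row for row in rows if row["misspelled"]]


def matching(rows: list[dict], pattern: str) -> list[dict]:
    """Case-sensitive search, e.g. '-' for hyphenated words or 'é' for accents."""
    return [row for row in rows if re.search(pattern, row["word"])]


def sort_alphabetically(rows: list[dict]) -> list[dict]:
    """Sort A to Z so that near-identical spellings end up side by side."""
    return sorted(rows, key=lambda row: row["word"].casefold())


def sort_by_count(rows: list[dict]) -> list[dict]:
    """Sort rarest first: a name used twice next to one used 100 times stands out."""
    return sorted(rows, key=lambda row: row["count"])


text = "Rothbard wrote a well-known book. Rothbard met Malone. Rothbrad replied."
valid = {"rothbard", "wrote", "a", "well-known", "book", "met", "malone", "replied"}
rows = build_word_list(text, valid)

print([row["word"] for row in only_misspelled(rows)])
print([row["word"] for row in sort_alphabetically(rows)])
```

Sorted alphabetically, "Rothbard" and "Rothbrad" land next to each other, which is exactly the pattern the eye picks up in the list view.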
List-based grammar checking would be the same thing in list form. It just doesn't exist in any tools that I know of right now, but it would be so much more helpful to have. LanguageTool has a rough version: if you use their standalone tool, it kind of gives you a giant list of all the grammar errors in the book. But it's nothing as neatly organized as this. I wish it would nest them, categorized by the type of error or by the suggestion it's actually giving you. That would be so much better, because you could say, hey, correct all of these in one shot. Antidote is the closest that I've seen. It does categories along the side. That's the closest I've ever seen in real life.

Okay, and yeah, thank you. That's the end. I just have some extra stuff at the end if you need extra resources. I've been writing about this stuff for 12 years, so there's a lot of extra information here. If you're interested, you'll see it in my slides.

Any questions? Anything? Yes.

No, this person is writing in this way. This one is a very common change that has been made. And we want to prove some of this in the textbook, which of course it was. So I'm sorry to be procrastinating this work or this work for some years, but I'm actually thinking into the rotation. So I'm very happy to see all of these nice things, especially trying to combine it with your wrist. So I hope you can focus on the work.

Yes, yes, I'm looking forward to it. Yes, thank you.

Thanks. It's very interesting. It reminds me: in the Netherlands, I see efforts to put as many words in the dictionary as possible. Troubles that I sometimes have difficulties with. I sometimes have been picking the right for some words and then you gave the example of Troubles in English.
And so when we actually get found, which we have big scenarios, we deliver it to my words. That's really helpful. And for the last words and and

Yes. So what happens is, again, with the spell-checking dictionaries it's always this balancing act: do you include these rare words? Language is always changing. Every year there are new words and new terms that get introduced, and some become a lot less popular. That's why I showed the graphs from Google Ngrams. Let me go back to "renegade" and "renegado." If we were in the 1830s, it was about 50/50: people would use either spelling. But nowadays no one is using "renegado"; "renegade" is the spelling. You can pick word after word and see these trends over time. And 10 years from now, maybe a given spelling becomes more popular or less popular. So the spell-checking dictionaries always need to be kept up to date and maintained. It's a very hard balancing act, and it always has to be tweaked and updated. Yep. Thank you.

So, one last question. Oh, yep.

So our cheat would be very weird at this point. And shouldn't we want the dictionary not to call it that because it gives an invalid ground. Is what should change the word

Yes, yes. This is why the three layers have to work together. If we include this word "cheesewood," this extremely rare word, then grammar checking would need to check it. With LanguageTool, I submitted the example of "AI," as in artificial intelligence, capital A-I. There is this lowercase word "ai," which is a three-toed sloth, in Brazil or somewhere like that. And I'm like, who the hell would be talking about the three-toed sloth?
Everybody is going to be talking about artificial intelligence. So what they wanted was for lowercase "ai" to stay in the dictionary, but for the grammar checker to underline it. You'll get a green squiggly instead of a red one. So again, it's a balancing act across all three layers, and a very hard one. Yep. Okay. Thank you.