Welcome back everyone. Today we'll talk about optical character recognition from images using gImageReader on Windows. Please subscribe to get updates on new videos and help this channel grow. So the first thing we want to do is actually go get gImageReader for Windows. You can find gImageReader on GitHub, at github.com/manisandro/gImageReader. This is where the source code for gImageReader is located, but you don't have to know anything about compiling source or coding or anything like that. All we need to do is go to this page, which I'll link below, and scroll down, and you can see the screenshot and then the installation section. We have a couple of different options here. There's the source installation: if you know about programming and want to build this package yourself, you can download the source. What we're going to be using today is the Windows download from the releases page. They also have packages for different versions of Linux. There's also, apparently, support, and of course you can contribute if you can code; you can send them a pull request. So let's go for Windows, because we're on Windows here, and go to the releases page. Here we have gImageReader, and the current version is 3.3.1, which is actually relatively new, from July this year. So what we need to do is scroll down, and if you're on Windows there are a couple of different options. Basically what you're going to want is anything with .exe on the end. And if you're watching this and you don't know what to download, most likely, if you're on a newer computer, gImageReader 3.3.1 qt5 x86_64 is the one that you're going to want. So if you don't know which one to choose, choose this one, x86_64. Everything else is basically source code or for the Linux version. So we download that, and I already have it downloaded here, so I'm going to double-click on it to install, pretty much like normal.
Do you want to allow this app to make changes to your device? Yes. And then, welcome to the installer, click next. And then this is the GNU license, so you might want to take a look at that. GNU licenses are interesting, but that's a different topic. Then the version that you want: standard will install the English interface, and international will have translations. As far as I understand, there are quite a few different languages this app has been translated into. So if you want something other than English, install the international package. This isn't for the languages of the documents. This is for the interface of the program. So I'm going to keep English. Click next. Install for everyone on this computer. Next, the location that you want to install to. Install. So now we're installing, and this will install the app plus, basically, Tesseract as the back end. Then click finish. So now we have the app installed, so we can go to the start menu and open gImageReader. So now we have the gImageReader app running. I'm going to shrink the window so you can see it a little bit better. Now gImageReader is interesting for a few reasons, but what I'll talk about first is how we can actually get a file into here. So there's files, and if you click the down arrow, you have the paste option and you have the take screenshot option. So imagine that you have two screens, and you have a document on one screen and you need to OCR that document. It was a scan from a scanner, an old book or something like that. As you're going through, you can take a screenshot, or copy and paste a screenshot of that image, into gImageReader. And then they also have the acquire tab, and this is where you can set up scanning documents. I don't have a scanner installed, but you can scan directly into gImageReader.
So if you are taking a physical document and trying to scan it, I would recommend that way, or you can copy and paste, which is also relatively easy. Now I have a Korean PDF. I'll just open that up. It's a Korean PDF of a paper. Now this PDF itself doesn't need OCR because, as you can see, I can select the text. This is not an image; it's a correctly made PDF. This PDF, however, does have Korean and English, and it's formatted in a slightly complicated way: one column with multiple languages, and then two columns. But it's not really too bad because the text is fairly clear. A little bit fuzzy, at least to me, but fairly clear. The only complicated things are maybe these headers, the footer, and then possibly the columns. So this is the document that I have, and I've already created a screenshot of that document. You can see the screenshot is also a little bit fuzzy, but I think if you scaled it up it would probably be a little bit better. So I have a screenshot of this document, and in this one I cannot select the text. So this would be a good candidate for OCR: I can't select it, and I want to know what it says. So I have my image, the file that I want to analyze, already. Notice I don't have the entire PDF. We don't have a way to just import the entire PDF into gImageReader. We're doing scans one at a time. Now, my source document is English and Korean. You can see it says "recognize all, English". We also have the OCR mode, but basically right now it's set to only English. So what I need to do first is go to the settings menu. I can click on manage languages, and this is where you'll set the languages that you want to try to detect. There are many different languages. All of these should be the same as default Tesseract OCR. I don't know if this program has actually changed the models at all, but I don't think so.
So here I'm trying to do Korean, and I don't need Korean vertical. Korean can also be written vertically, but I don't need that because I know this is a modern document that's not going to have it. So if I click apply, I'm going to run into a problem: the following files could not be downloaded or removed. Failed to create directory for tessdata files. So it's trying to download the tessdata files, which are basically the model of the language that you're trying to detect. The reason for this is that I have installed this inside a kind of protected system directory, and we don't have write permissions to that directory. We don't necessarily want to open this application as an administrator that does have write access to that folder. So instead of doing that, I'm going to change the path for the tessdata. They even have a hint here: you don't have write permissions to system folders; you can switch to user paths in the settings. Okay, let's do that. I'm going to close the tessdata manager and go to preferences. Our problem here is C:\Program Files\gImageReader\share\tessdata. We don't have write access to this folder, and I don't want to start gImageReader as administrator just to get access to it. So instead of system-wide paths, we're going to click the drop-down and choose user paths. Now it shows C:\Users\Joshua\AppData\Local\tessdata. For this user, we will store the language definitions inside this folder, so it will be user specific. If you're sharing this machine with, let's say, everyone in a group, and everyone has their own login, then the files that you download will only be available to this particular user, and you would probably want to work with system-wide paths. But if everyone uses the same account, or it's just you, user paths are no problem. Okay, so now we can click OK.
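The choice between system-wide and user paths can be sketched in a few lines of Python. This is only an illustration of the idea, not how gImageReader decides internally; the function names `is_writable_dir` and `choose_tessdata_dir` are made up for this example, and the folders you pass in are whatever your own installation uses.

```python
import tempfile
from pathlib import Path


def is_writable_dir(path: Path) -> bool:
    """Return True if a file can actually be created inside `path`."""
    if not path.is_dir():
        return False
    try:
        # Creating a throwaway file is a direct test of write access;
        # the file is removed again as soon as the block exits.
        with tempfile.TemporaryFile(dir=path):
            pass
        return True
    except OSError:
        return False


def choose_tessdata_dir(system_dir: Path, user_dir: Path) -> Path:
    """Prefer the system-wide folder; fall back to the per-user folder.

    Hypothetical helper: mirrors the "system-wide paths" versus
    "user paths" choice described above.
    """
    if is_writable_dir(system_dir):
        return system_dir
    # The per-user folder may not exist yet, so create it on demand.
    user_dir.mkdir(parents=True, exist_ok=True)
    return user_dir
```

Calling `choose_tessdata_dir` with a folder under Program Files and a folder under the user's AppData would, for a non-administrator account, normally return the per-user folder.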
"No Tesseract languages are available for use. Recognition will not work." That's because we've now moved the path where the Tesseract language files are kept, so click OK. Now we need to go back to manage languages. And now we don't even have English, because English was installed by default in the old location, so we need to select English, because I have an English document that also has Korean. Okay. So then we apply. Now it's going out, and you can see it says downloading eng.traineddata, kor.traineddata. So it's downloaded the files, and then we click close. And then I'm just going to redetect languages, just to make sure. So now we should have the languages that we need installed. You can go back to manage languages anytime and work with any of the languages that you want. Remember, these are being downloaded from Tesseract OCR directly, so that's the default training for the language. If you have a very specific book or font, it might not work very well. You'd probably have to make your own trained model, and we'll talk about how to do the training in a different video. Okay. So for now, I need to go back to my document. I'm going to take another screenshot just so you guys can see. Plus, it's a little bit easier. So I'm going to open the Snipping Tool in Windows. I want to get all of this page, so I'm going to make it a little bit smaller. That looks about right. In the Snipping Tool, my mode is just a rectangular snip. Then I'm going to click new and select what I want. I'm going to leave that number off. And then I have the image copied. If I need to, I can just click copy. I can minimize this now. Back in gImageReader, it will recognize that I have something in the clipboard, so I can just paste. And now I have my document inside gImageReader. So I have my one pasted document, and now I need to say recognize all as English (en_US), or Korean, or multilingual. In this case we want multilingual, because we have both English and Korean going on.
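As a rough sketch of what is going on underneath: Tesseract looks for one `<code>.traineddata` file per language in the tessdata folder, and multiple languages are combined with a plus sign, for example `eng+kor`. The helper names below are hypothetical and only illustrate those two conventions; this is not gImageReader's own code.

```python
from pathlib import Path
from typing import Iterable, List


def traineddata_name(lang: str) -> str:
    """File name Tesseract uses for a language model, e.g. 'kor.traineddata'."""
    return f"{lang}.traineddata"


def missing_languages(tessdata_dir: Path, langs: Iterable[str]) -> List[str]:
    """Return the language codes whose model file is not in `tessdata_dir`."""
    return [
        lang
        for lang in langs
        if not (tessdata_dir / traineddata_name(lang)).is_file()
    ]


def language_spec(langs: Iterable[str]) -> str:
    """Join codes with '+', the form Tesseract accepts for several languages."""
    return "+".join(langs)
```

With only the English model present, `missing_languages(folder, ["eng", "kor"])` would report `["kor"]`, which is the situation the language manager fixes by downloading the file. `language_spec(["eng", "kor"])` gives `eng+kor`, the multilingual setting used in the next step.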
So I'm going to select that. I just clicked recognize all, English plus Korean, and now it's processing. And then we have non moon. Not exactly correct. So again, I would probably retrain this model if I cared about this line. The second line, let's say mobile eight. Okay, that's okay, but it added a space. Notice I can edit this text directly as I'm working on it. And then hook. It's not mom. I think that's a little bit off, but still close. A saw shouldn't have a space. So with these little things, it's mostly getting the spaces wrong, and it's really, really close. There's like one character off. But then notice what happens here. We have Korean text, and then we have an English word inside the Korean text. Well, what is this doing? It's reading the English text inside the Korean sentence as a number. So that's not very good. That's not what we want. But, you know, better than nothing. I can see the English text in the image, so I can just go in and type, for example, Bluetooth. So I can edit it on the fly and make sure it's okay. And then give on, correct. And then MMORPG was also detected as a number. And then we have, I think this was also part of the G that it kind of detected. It wanted to try to make it Korean, and it just couldn't figure anything out. And then we have way. And then I'm missing the second line, so I would need to add that in. Okay. Now let's see what it's doing when it's just English by itself. So, design of Bluetooth based MMORPG game in minutes. Well, that worked pretty well. Okay. So on the Korean side, there were some things a little bit wrong with it. You can kind of see, oh, it's not correct. And it detected that little star there. And then it thought that this was also Korean, so it came up with numbers. I think that's the issue here. Okay. And then we start getting into the body. And the text so far, this was fairly big text, fairly clear.
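Since English words inside Korean sentences tend to come back as runs of digits, one way to find the places that need a manual fix is to flag long digit runs on lines that contain Hangul. The sketch below is a heuristic for illustration only. The function name and the three-digit threshold are assumptions, and real numbers in the text (years, page numbers) will be flagged too, so treat the result as a review list, not an error list.

```python
import re
from typing import List, Tuple

# Precomposed Hangul syllables (U+AC00 to U+D7A3).
HANGUL_RE = re.compile(r"[\uac00-\ud7a3]")
# Assumption: runs of three or more digits are worth a second look.
DIGITS_RE = re.compile(r"\d{3,}")


def flag_suspect_numbers(text: str) -> List[Tuple[int, str]]:
    """Return (line number, digit run) pairs found on lines containing Hangul."""
    suspects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not HANGUL_RE.search(line):
            continue  # pure English lines are not affected by this problem
        for match in DIGITS_RE.finditer(line):
            suspects.append((line_no, match.group()))
    return suspects
```

Each flagged position is somewhere to compare against the image and retype the English word by hand, as was done with Bluetooth above.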
You'll notice that the PDF itself is relatively fuzzy, and the image is, I think, even a little bit more fuzzy. Once we started getting into the smaller text, that's when things really got to be a problem. So here we have Doa, and this is what it looks like. Yo, yo, yeah. I can barely even see it. It's a bit of a problem whenever you're trying to recognize all, because it's trying to guess which language it's currently looking at, English or Korean. So that's not really what we want. And I can tell you that it looks like this detection just didn't work at all. When we get to the English, it also didn't really work very well. So I'd say the very clear, big stuff worked pretty well. Not perfectly, especially where English was mixed with Korean, but pretty well. So I'm going to go back, and instead of detecting all languages, let's go ahead and just try to detect Korean. And it's asking me for a Korean dictionary, so let's go ahead and install that. Okay, no Korean spelling dictionary. All right. Well, whatever. I just want to detect only Korean. Now, if I recognize all as Korean, then all of this English is also going to be messed up, just like it was before, but we'll probably have better luck with our main text. Instead of recognizing all, I want to select just the text that I'm looking for. So here all I've done is click and drag over the text that I'm interested in. And now the button has changed from "recognize all" to "recognize selection". And I've already told it that this is Korean. Let me make sure that I'm getting all of it, with enough space around it. Okay. So now I recognize the selection as Korean. And that is way better. I can already tell that's way better. I think the second character is still wrong, but it's so much better than it was before.
Now, the thing about PDFs especially is that when you process them you'll sometimes get these newline characters, and one nice feature that I like about this tool is that you can remove spaces and strip line breaks on the selected text. So I just select the text and then strip line breaks. And now we have it basically how we want it. Again, it probably needs a period in there. Notice we still have numbers detected instead of the English text. So Bluetooth, MMORPG: anytime a different font was used, it inserted numbers. That could be a problem, but it's not used very often, so it's probably okay in this case. So now we have a pretty decent-looking OCR of the Korean text, even though the image quality really isn't very good. But, you know, it's not old-style Korean or handwritten or anything like that. It is still a computer font, so it's fairly clear, which is why we get a pretty good OCR. So now let's just go back here and see we have our next base. Let's select the next line, the next paragraph. And then we need to switch to English. I don't think English will make too much of a difference, but yeah, English, and then, not perfect, right? So "abstract", kind of. And then "with the rapid growth of recent wireless mobile computing application technology and handheld mobile devices", it kind of struggles a bit. Actually, Korean turned out way better than the English did. I'm not sure if it's because of the font or what it is exactly, but Korean turned out way better. And this is where the spelling dictionary comes in, because now we can just right-click and fix anything that was wrong: not "handball", "handheld". Yeah. And "abstract", right? So now we can just right-click and fix a bunch of these. I think if the font was a little bit clearer, we would probably get a better OCR off of this. But that's basically how you do it. So gImageReader has a lot of really nice features.
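To make the strip line breaks step concrete, here is a minimal sketch of one way such a cleanup could work: join the lines inside each paragraph with a single space, and keep blank lines as paragraph boundaries. This is an assumption about reasonable behavior, not a copy of what gImageReader does internally, and the function name is made up for the example.

```python
import re


def strip_line_breaks(text: str) -> str:
    """Join wrapped lines into paragraphs, keeping blank-line separators."""
    # A blank line (possibly containing spaces) marks a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, collapse every run of whitespace, including
    # newlines, into a single space.
    joined = [" ".join(paragraph.split()) for paragraph in paragraphs]
    return "\n\n".join(joined)
```

Running it on OCR output that was wrapped at the column width gives back one line per paragraph, which is usually easier to proofread and paste elsewhere.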
What I like about it is that you can actually see the text you're working on as you're working on it. Whenever we were doing something from the command line, we just ran OCR on all the documents, and then, whatever result we got, we had to open up both documents and compare them. This kind of does that for us. We can very easily select the things that we're actually interested in working with and then really get that text perfect, if that's what we're trying to do. Again, what you're trying to detect, versus what the model has been trained on, might make a difference for your document. So I'll show you in another video how to actually make Tesseract OCR models, because if you're doing something like handwriting especially, this probably won't work very well for you. It might work okay depending on how clear the handwriting is, but especially for older documents it may not work as well, just because Tesseract has been trained with, basically, older books, which are also typed. So we'll talk about training in a later video. This is how to get started with gImageReader, which I think is a really nice tool. Very basic, very simple to use. You can get up and running fairly quickly. There are different acquisition methods, but I think the biggest problem that I see with it right now is that I don't see a way to just import, let's say, an entire document or an entire PDF. You will have to make screenshots or single-page acquisitions. But that's kind of the point. This is really made for getting the text converted perfectly with OCR. That's why they give you this editor here. One thing I didn't talk about is saving the output. With this output pane, while you're working on it, you can clear the output (don't click that unless you really want to), and you probably want to save the output, which you can save as a text file. So this is how you work with the document.
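If you do have a trusted copy of the text to compare against (for example, the selectable text from the original PDF in this video), the comparison step can be partly automated. The sketch below uses Python's standard `difflib` module; the function names are made up for illustration, and the similarity ratio is only a rough indicator, not a formal OCR accuracy metric.

```python
import difflib
from typing import List, Tuple


def similarity(reference: str, ocr_text: str) -> float:
    """Rough 0.0 to 1.0 similarity between the reference and the OCR output."""
    return difflib.SequenceMatcher(None, reference, ocr_text, autojunk=False).ratio()


def changed_spans(reference: str, ocr_text: str) -> List[Tuple[str, str, str]]:
    """List (operation, reference piece, OCR piece) for every difference."""
    matcher = difflib.SequenceMatcher(None, reference, ocr_text, autojunk=False)
    return [
        (tag, reference[i1:i2], ocr_text[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replacements, insertions, deletions
    ]
```

The list from `changed_spans` points at the same kinds of problems that were fixed by hand above, such as an extra space or a single wrong character.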
This is how you import your screenshots or the images that you're trying to detect the characters on, and then you can basically just have a bunch of pastes, OCR them, make sure the OCR is perfect, and then move to the next paste. So that's it for today. Thank you very much. Thanks for watching. If this was helpful, please like and subscribe. Also, please consider supporting us on Patreon. Your support lets us focus on making better tutorials for everyone.