Okay, so I'm going to present on OCR, recognizing text in images. First, something about me: my name is Alison Riana Rios, and I'm from Bolivia. I did my master's in design and multimedia production, I'm a computer engineer, also from Bolivia, and I specialized in data science. Currently I'm an organizer of the Violet Disco Chabama community and an instructor for programming and networking with the Cisco Networking Academy. So let me explain a bit about computer vision. It's one of the main disciplines of artificial intelligence, and it has great impact thanks to all the advances in algorithms, in hardware and software, in processing capacity, and in data transmission. We can share information in many ways, but the most popular way to show information to other people is through images, so this field is one of the main disciplines of artificial intelligence right now. Text carries meaning directly, but how do we get a computer to find that meaning in an image? So let's see how it works. A large amount of data is required to analyze and recognize images. We share many images and pictures; the information is already out there, on computers, cell phones, TVs, and other devices, and we need that large amount of data to analyze the images. Two technologies are used for this: deep learning (DL) and convolutional neural networks (CNNs). How does it work? We have a huge amount of image data, far too many pixels to handle directly, so we pass the image through several layers, each layer extracting and comparing features, and at the end we have a trained model that can tell us something interesting about the image. So what are the goals?
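As a rough illustration of what one layer of a CNN does, here is a minimal sketch in plain Python (no NumPy or deep-learning framework): a small kernel slides over a grayscale image and produces a feature map. The 3x3 vertical-edge kernel and the toy image are illustrative choices, not from the talk.

```python
# Minimal sketch of the core operation inside a CNN layer:
# a small kernel slides over a grayscale image (2D list of
# pixel intensities) and produces a feature map.

def convolve2d(image, kernel):
    """Valid convolution (no padding) of a 2D image with a 2D kernel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            # Sum of elementwise products over the kernel window.
            acc = sum(
                image[y + j][x + i] * kernel[j][i]
                for j in range(kh)
                for i in range(kw)
            )
            row.append(acc)
        out.append(row)
    return out

# A vertical-edge kernel applied to an image whose left half is
# dark (0) and right half is bright (255): the response is strong
# wherever the window straddles the boundary between the halves.
image = [[0, 0, 255, 255] for _ in range(4)]
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]
feature_map = convolve2d(image, kernel)
```

In a real CNN many such kernels are learned from data rather than hand-written, and their outputs are stacked layer after layer, which is the "layers of images" comparison described above.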
Image processing makes it possible to improve the quality of images for later use or interpretation. Each image is a large collection of pixels, and we can do different things with it. One is eliminating defects in the image. Another is improving its properties: if an image doesn't carry the proper information, we can try to enhance it. And we can also add information to an image; with some of the AI tools available now, you can take a small image and extend it outward, generating content around it. What are the applications of this computer vision? One is identification. Say I have a picture, or a video, which is just many images in sequence. We can identify what kind of content is in that image: whether there is a person, an animal, a plant, maybe a desk or a computer. The computer compares the image against the large amount of data it was trained on, and from that comparison it can identify whether the thing in front of it is a person or something else. The next is detection. Once the system says there is a person or a desk in the image, we can locate it and work with it: detect the person and determine, say, whether it's a woman, a man, or just a mannequin. The next is recognition. Once we have identified our person and detected that the person is, probably, a man,
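The "compare against the data we have behind" idea can be sketched as a toy nearest-neighbor classifier: each image is flattened to a vector of pixels and labeled with the label of its closest training example. This is a deliberately simple stand-in for the trained models real systems use; the tiny 2x2 "images" and their labels are made up for illustration.

```python
# Toy identification by comparison: label an unknown image with
# the label of the most similar training image (nearest neighbor
# by squared Euclidean distance over flattened pixels).

def flatten(image):
    return [p for row in image for p in row]

def distance_sq(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def identify(image, training_set):
    """training_set: list of (image, label) pairs."""
    vec = flatten(image)
    best_label, best_dist = None, float("inf")
    for example, label in training_set:
        d = distance_sq(vec, flatten(example))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Made-up 2x2 training "images": bright patches labeled "person",
# dark patches labeled "background".
training = [
    ([[200, 210], [190, 205]], "person"),
    ([[10, 20], [15, 5]], "background"),
]
label = identify([[180, 200], [195, 210]], training)
```

A bright query patch lands closest to the bright training example, so it is identified as "person"; real systems replace raw pixel distance with learned features, but the compare-to-known-data structure is the same.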
we can try to recognize parts of that person: maybe the face, the body, the hair, things like that. So we can recognize many things in one image, and once we have that recognition we can say, okay, that person's eyes are probably there, the nose and mouth in those positions; but that's still not enough to recognize who somebody is, to attach a name like Joe. So another application is restoration: we have the person, it's a male, we have the face, so now we can try to restore the image and make it clearer. Those are the main applications of artificial vision, and there's a great range of possibilities there. How does recognition work in practice? I think the capture is the main thing, because we first need to get an image; that's the basis for everything we process afterwards. So first the capture, then the image preparation: however the photo was taken, we prepare the image, removing shadows, adding more light, adding color if it's needed. Then, once the image is cleaner, we extract the region of interest, the ROI. Say I want to take notes from a whiteboard, so I just take a picture of the board: I have to extract the regions of interest, which is the text on the board. I don't need to recognize the board itself, only the text. Some regions contain text, others are probably drawings or other things. Then, with those ROI regions, we apply another filter: the noise removal.
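The pipeline described here (capture, preparation, ROI extraction, noise removal, interpretation) can be sketched as a chain of functions. The stage bodies below are deliberately trivial placeholders in plain Python, standing in for the OpenCV and Tesseract calls used in the demo later, just to show the shape of the flow.

```python
# Skeleton of the OCR pipeline from the talk: each stage takes an
# image (2D list of grayscale intensities) and feeds the next one.

def prepare(image):
    # Placeholder preparation: stretch contrast to the full 0-255 range.
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    span = (hi - lo) or 1
    return [[(p - lo) * 255 // span for p in row] for row in image]

def extract_roi(image, top, left, height, width):
    # Keep only the region of interest (e.g. the text on the board).
    return [row[left:left + width] for row in image[top:top + height]]

def remove_noise(image):
    # Placeholder denoising: snap every pixel to pure black or white.
    return [[0 if p < 128 else 255 for p in row] for row in image]

def interpret(image):
    # Placeholder interpretation: report the fraction of "ink" pixels,
    # where a real system would run character recognition instead.
    flat = [p for row in image for p in row]
    return sum(1 for p in flat if p == 0) / len(flat)

captured = [[30, 30, 30, 30],
            [30, 200, 200, 30],
            [30, 200, 200, 30],
            [30, 30, 30, 30]]
roi = extract_roi(prepare(captured), 1, 1, 2, 2)
result = interpret(remove_noise(roi))
```

Each stage only has to agree on the image representation, which is why the real pipeline can freely swap in better preparation or denoising filters without touching the rest.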
Maybe the image isn't clear, maybe somebody erased part of the text; there are many possibilities. Once I've removed the noise, the blurred parts of the image (and there's code for that too), we do the interpretation of the results: that letter is probably an A, or an O. That interpretation stage is critical for text recognition; if the text isn't interpreted well, it probably won't make any sense. So, step by step. The capture is obtaining the images: I have some text and a cell phone, a camera, or another device, so I capture it and obtain the images. The preparation works like this: say the image contains numbers. As a person I can see there are three numbers, but the computer first has to filter: is this dot, this pixel, part of the text or part of the background? We make the computer work out what each part of the picture is, and once it has that information, we start processing. Then comes the extraction of the regions of interest, the ROI: we keep only the text, like here where we see just "service trailer brake system" and nothing else. I don't need to understand anything above, behind, or below it, only this text. So I take that part, prepare it, and start recognizing the text there; that's the detection and segmentation of the image. Then, with the ROI, I have all the information I need to understand, so I cut out all the parts that contain text, and after that comes the recognition, the identification of the text.
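The per-pixel text-versus-background decision made during preparation is, at its simplest, a global threshold: pixels darker than a cutoff are treated as ink, the rest as background. Here is a minimal sketch in plain Python; OpenCV's cv2.threshold does the same job on real images, and the cutoff value 128 is just an assumed choice.

```python
def binarize(image, cutoff=128):
    """Classify each pixel as ink (1) or background (0).

    image: 2D list of grayscale intensities, 0 (black) to 255 (white).
    Dark pixels are assumed to be text on a light background.
    """
    return [[1 if p < cutoff else 0 for p in row] for row in image]

# A light background (230) with a dark stroke (20) down the middle:
# only the stroke is classified as ink.
scan = [[230, 20, 230],
        [230, 20, 230],
        [230, 20, 230]]
mask = binarize(scan)
```

Real scans rarely have such clean lighting, which is why OpenCV also offers adaptive thresholding that picks a local cutoff per neighborhood instead of one global value.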
Here, with a long text, we just use the little rectangles to mark which information the computer's intelligence is going to try to understand. After that, the text probably has some noise on it, so comes the noise removal: maybe I start with a bad image, not clear enough, so we clean up the text. Sometimes it's not necessary, sometimes it is; that's the noise removal. After that we have the text and we can do the interpretation of the results. We take only the part we want to recognize, run the analysis of the text, and ideally it gives back the same words. If the text isn't understandable enough, it will probably return something different, but there are ways to do better: stronger filters, or additional filters, to clean up the image and the information in it. So that's how we analyze images and the text in them. Which tools am I using for this? OpenCV, Tesseract OCR, which is from Google, and Python, obviously. So, a little demonstration of what's possible. I took this information from the conference's web page; here I have the menu: schedule, sessions, tutorials, speakers, keynotes. In this part I say: read that image, and filter it, because I don't need colors here, so I convert it to black and white. Then I separate the information that is part of the words from the information that isn't a word, the background.
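The noise-removal step can be sketched as a median filter: each pixel is replaced by the median of its 3x3 neighborhood, which wipes out isolated specks while keeping solid strokes. Plain Python again; cv2.medianBlur is the OpenCV equivalent used on real images.

```python
import statistics

def median_filter(image):
    """3x3 median filter; border pixels are left unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [image[y + j][x + i]
                      for j in (-1, 0, 1)
                      for i in (-1, 0, 1)]
            out[y][x] = statistics.median(window)
    return out

# A white patch with a single black speck of noise in the middle:
# the median of the neighborhood is white, so the speck disappears.
noisy = [[255, 255, 255],
         [255, 0, 255],
         [255, 255, 255]]
clean = median_filter(noisy)
```

This is why denoising is "probably not necessary sometimes": on a clean scan the median of every window already equals the pixel itself and the filter changes nothing.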
That's where the threshold comes in: it decides which part of the image is text and which part is not. Then I set some configuration options and print the information it extracted, and here we have the results. As you can see, it gives back the same information we have in the image, and it works: the input is an image, not text, and out comes the text. Another example: I took this one from Wikipedia, some information about Bragg. There I defined a ROI, the part of the text I'm interested in, and did the same thing: converted to black and white, then applied a blur, and then recognized the text for only that region. Here is the first region I wanted, and here is the second. I ran different recognitions: this one with a config option, and this one only on the grayscale of the ROI of the second part, and that's how each run reports its recognition. The first one, the text in the ROI, works very well. For my language, I think that if there were symbols outside Spanish, or outside the Latin alphabet, it would probably output something different, but here that's not a problem. The other run gives the same result; it only differs in showing more or less space between words. And what can I do then? I can put that text into an image: take the information from one place and add it to another image, and that's what I did here. I used this image, took all the text inside this part, and, well, it probably doesn't keep the same meaning because the lines are cut off, but it works.
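Since the two recognition runs differed only in spacing, a useful trick when comparing OCR outputs is to normalize whitespace before checking whether two runs agree. A small sketch (the sample strings are illustrative, not the demo's actual output):

```python
def normalize_ws(text):
    """Collapse every run of whitespace (spaces, tabs, newlines) to a
    single space, so OCR outputs that differ only in spacing compare
    as equal."""
    return " ".join(text.split())

# Two hypothetical runs of the same region: one with extra spaces
# and a line break, one without.
run_with_config = "Service  trailer\nbrake   system"
run_from_gray_roi = "Service trailer brake system"
same = normalize_ws(run_with_config) == normalize_ws(run_from_gray_roi)
```

Normalizing like this keeps the comparison about the recognized characters rather than about layout artifacts of each preprocessing path.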
I could probably configure it to take less information from the image, only some parts of the text; it works with coordinates, x and y, so I could say: take all the information up to this point, and get a different, more complete slice. And here, I took all the text it recognized and put that text onto another image. It's an image of Bragg; I could probably use another image, or several, and I'm essentially copying the text out of one image and pasting all of it into another. There are many other possibilities; these are just some examples I can show you. So we've seen possibilities like extracting information and pasting it somewhere else, but what kinds of applications can this text recognition have? One is digitization (I think I misspelled it on the slide). Like the earlier example of the image of the board: we can understand all the words on the board and turn them into real, digital text, in a better form, for Word, for PDF, whatever; there are many possibilities. Another is translation, and there are apps like this one that use text recognition in place: you point your phone's camera at a word, take a picture, and it's translated into your language. That part is awesome because it's easy, it's fast, and that's how it works. Conversion is almost the same: with enough data to understand the text, we can convert it. What if we have signs, not language but signs, that say stop, or keep moving, or a warning? We can convert those images into text.
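Copying recognized text into another image boils down to writing glyphs at given coordinates; OpenCV's cv2.putText does this on pixel arrays. As a testable stand-in, here is the same idea on a character canvas, where the "image" is a grid of characters and we paste a string at a given (row, col):

```python
def paste_text(canvas, text, row, col):
    """Write `text` onto a 2D character canvas at (row, col),
    clipping anything that runs past the right edge."""
    out = [line[:] for line in canvas]
    for i, ch in enumerate(text):
        if col + i < len(out[row]):
            out[row][col + i] = ch
    return out

# A blank 3x10 canvas of dots, with the recognized word pasted
# into the middle row starting at column 2.
blank = [["."] * 10 for _ in range(3)]
stamped = paste_text(blank, "Bragg", 1, 2)
rendered = ["".join(line) for line in stamped]
```

The copy of the canvas and the clipping are the two details that matter in the real pixel version too: drawing must not corrupt the source image, and text placed near the edge must not run off it.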
If a person doesn't know what a sign means, maybe in a different country where many of the signs are unfamiliar, they can take a picture and see what's going on. That's better for us; I think it helps bridge some of the differences between countries. Another application is security. This one is implemented here in my country, for example: inspection of license plate numbers. Some cars can't go into the center of the city, because there's a very large number of cars, and if they all drove into the center it would probably cause chaos. So we have this kind of security, the inspection of plate numbers: if your plate number means you're not supposed to be there, there's a fine you have to pay. In the example, the car is at some distance from the camera, but the system still recognizes the plate and flags that this number probably isn't allowed to be here; it takes a picture, and the ticket goes out by mail. Another one, like I said, is about signs: a stop sign, for example, for somebody who can't see very well; the system can announce that it's a stop sign, maybe with an alarm, something like that. And reconstruction: in these two examples, the stop sign is blurred or damaged in a way that makes it hard to read, so we can reconstruct the image and show that the "STO" is a stop sign, which helps a lot. These are just some examples of what machine intelligence can do with vision recognition and text recognition. We've seen all the steps: the capture, the preprocessing, the ROI, and finally the analysis, and then we can combine them in many ways.
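A plate-restriction check like the one described can be sketched as a simple rule applied to the OCR'd plate string. Everything here is made up for illustration: the plate format, the "restricted last digit per weekday" rule, and the function names are assumptions, not the actual scheme used in any city.

```python
import re

# Hypothetical rule: on each weekday, plates whose last digit is in
# the listed set may not enter the city center.
RESTRICTED_DIGITS = {
    "monday": {"0", "1"},
    "tuesday": {"2", "3"},
    "wednesday": {"4", "5"},
    "thursday": {"6", "7"},
    "friday": {"8", "9"},
}

PLATE_PATTERN = re.compile(r"^\d{3,4}[A-Z]{3}$")  # made-up format

def is_restricted(ocr_plate, weekday):
    """Decide whether an OCR'd plate may enter today.

    Returns None when the OCR output doesn't look like a plate at
    all (noise, a misread), so a human can review it instead of the
    system issuing a ticket on bad data.
    """
    plate = ocr_plate.strip().upper().replace(" ", "")
    if not PLATE_PATTERN.match(plate):
        return None
    last_digit = [c for c in plate if c.isdigit()][-1]
    return last_digit in RESTRICTED_DIGITS.get(weekday, set())

verdict = is_restricted("1234 ABC", "tuesday")
```

The None path matters in practice: OCR at a distance misreads plates, so an enforcement rule should distinguish "allowed", "restricted", and "couldn't read it" rather than forcing every output into yes or no.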
So, some conclusions. OCR, the recognition of text in images, removes the need for manual data entry, and it increases the efficiency of processes, saving time and resources. It's faster, so of course it saves time; and as for resources, all we really need is a camera and some algorithms. It also provides accessibility to different kinds of content, and precision for decision making. We've seen many examples here, and there's room to make it better: better recognition, better capture of places, people, things; it has many possibilities. And yeah, I think that's everything I can tell you about this recognition of text in images, how it works, and what the possibilities are. Thank you so much for giving me this opportunity to share it with you. Here are some of my social networks if you want to ask me anything. Thank you very much.