Okay, again in this video series we are looking at the Google Capture the Flag for 2018. Again, shout-outs to LiveOverflow and John Hammond, because they're the ones that brought this to my attention. They didn't actually contact me; I just watched their videos, liked them, and thought I'd go through this myself.
Today we're going to be looking at this one right here, "OCR is cool". Again, I've made a script that automates this. If you go to my GitLab page, gitlab.com forward slash metalx1000 forward slash capital CTF, you can download my scripts, which automate it and give you the flag, and then you can walk through those and see how I did it.
Before I did this one, I actually watched John Hammond's video on it. He did well, but I went a different route, just to be different, and I'll show you why. The challenge even gives you a hint if you read the story here: "Caesar once said, don't stab me." That is a reference to what you need to do to solve it.
If you don't know what a Caesar cipher is, it's basically a letter-shift code. If I write you a message, I shift all the letters by a certain number. So if I say shift by 3, I count A, B, C, D: any time there's a D in my letter, it's supposed to be an A, and so forth. There's actually a package on Linux systems called bsdgames, and once you install it there's a program called caesar that will shift letters for you. That's how John Hammond did it, which was awesome, and I'm glad to know about this program.
But I didn't want to make a video that is the same as his. Plus, although we are going to be using some OCR tools and other things that aren't installed on a default system, I wanted to see if I could do the Caesar shift without that program, using tools that are built into pretty much every Linux distribution. So I googled shifting letters, and instead of using the caesar program, my script just uses tr, which is for replacing characters, together with printf to do the shift. We'll look at that in a moment. I just did that to be different from him, and to make it one less package that people running my script need.
So let's switch over to the shell. I'm going to move into the OCR folder from my GitLab project and run my script. There are already some files there; I don't know if I made the script clean things up. You run it, and it says "One second, please." Right now it downloads the image, resizes the image, runs OCR on it, and makes a few corrections that I worked out manually, and then it gives you the result. That's why it takes a second, and why I gave it that message.
So let's look at my code. I'm going to use Tesseract to decode the image. Right now my script, at the beginning, removes any PNGs and just dumps any error output to /dev/null. If you don't have Tesseract installed, my script will check for that and tell you to please install it. It says you'll probably want to use apt, if you're on a system that runs apt, to install tesseract-ocr, and then it exits with 1, meaning that it failed to run because you need that installed. Anyway, after that we make sure that we clear out any PNGs that exist, then we download the zip file from the website and unzip it, and then we say "One second, please" again.
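The dependency check described above could look roughly like the following. This is only a sketch, assuming a POSIX shell: the helper name require_cmd and its message text are made up for illustration and are not taken from the actual script. Only the tesseract command and the tesseract-ocr package name come from the video.

```shell
#!/bin/sh
# require_cmd is a hypothetical helper, not part of the script shown in the video.
# Usage: require_cmd <command> <package-hint>
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        # Report the problem on stderr and signal failure to the caller.
        echo "Please install $1 (on apt-based systems, probably the $2 package)." >&2
        return 1
    fi
    return 0
}

# In a real script the check would stop execution early, for example:
#   require_cmd tesseract tesseract-ocr || exit 1
```

Returning a status from the function and leaving the exit to the caller keeps the helper reusable for other tools, such as ImageMagick.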
Everything else here runs a little slow. ImageMagick also needs to be installed and may not be by default, but it is on a lot of systems. Here I did something a little different from John Hammond. He scaled it to 300: he took the image and scaled it up to see if he got better results. He got close, but he still had to manually change a number of the characters.
But real quick, let's open the OCR image with display. This is the image you get, and if you look at it, you can see it's just a jumble of letters, and it's an image. So we have to either type this stuff out manually or use OCR to convert it to text, and then we can run our script on it.
Looking through this, it's kind of obvious. Again, this is a beginners' capture the flag: all the flags start with CTF and then have a code inside curly braces. Right here you can see VMY, a curly brace, some letters, and a closing curly brace. So right away we know that this is our code, and we already know that it's a Caesar cipher.
So here I can count through the alphabet from V: W, X, Y, Z, A, B, C. Counting on my fingers from V, wrapping around to C, is seven, so we already know that our Caesar cipher is a shift of seven. At this point you could do it by hand, but of course we don't want to do that; we want to automate it. To check, if I do M, I go N, O, P, Q, R, S, T. Yep, that's our second letter: C, T. The next letter should be F, so from Y: Z, A, B, C, D, E, F. Yeah, so it's a seven-shift cipher. You could just write this down manually, it's not that hard, but we're going to try to automate it as much as possible. That's our code right there, and all we really care about is deciphering these words right here, okay?
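The finger counting above is just modular arithmetic on letter positions, and it can be sketched in a few lines of POSIX shell. The function name shift_between is hypothetical, and the sketch assumes both arguments are single uppercase letters.

```shell
#!/bin/sh
# shift_between is a hypothetical helper name.
# Usage: shift_between <cipher-letter> <plain-letter>   (both uppercase A-Z)
shift_between() {
    # printf '%d' "'X" prints the character code of X (POSIX printf behaviour).
    from=$(printf '%d' "'$1")
    to=$(printf '%d' "'$2")
    # Add 26 before taking the remainder so wrap-around (V -> C) stays positive.
    echo $(( (to - from + 26) % 26 ))
}

shift_between V C   # prints 7
shift_between M T   # prints 7
shift_between Y F   # prints 7
```

All three letter pairs from the known CTF prefix agree on the same shift, which is the same sanity check the counting did by hand.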
So, looking back at my code: I'm using ImageMagick to scale it up to double the size, double the resolution. I tried a bunch of different sizes for this image and found that 200 did a pretty good job. I also went to the Tesseract website and searched for the best size, and they said that Tesseract works best with a DPI of 300, so I changed the density to 300. I don't know how these two affect each other when you set both in one call like this, but I ran that and output it to the same file. Then I dump the output of Tesseract into a variable called txt. I was very, very close: there were two characters that I just could not get to convert correctly.
Even though I know it's a seven-shift cipher, or whatever you want to call it, shifting seven letters, I wanted to cycle through all of the shifts. That is what John Hammond did too, but again I did it differently. He used a loop with the caesar program; for looping, I'm using seq up to 25.
Another thing, and I'm just picking on John at this point: in his video he used caesar and cycled from 0 to 26, because there are 26 letters. But really he only needed to cycle from 1 to 25, because a zero shift changes nothing, and if he goes all the way to 26 he's back at the beginning again. So he did two cycles that he didn't need to. It doesn't make a difference; I'm just giving him a hard time about that.
So here I am saying seq 25, so I'm shifting by 1 through 25. I'm echoing that number and the text that we grabbed from Tesseract, and piping the text into tr. Inside the tr call we run a subcommand, printf, with that loop variable, which is our shift amount, and that printf output is itself piped into another tr. The outer tr command then takes all the capital and lowercase letters and shifts them by that number.
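To make the loop above concrete, here is a small sketch of brute-forcing every shift with tr and seq. It is not the command from the video: instead of building the tr arguments with printf, it swaps in a simpler idea, cutting a rotated alphabet out of a doubled one with cut. The function names rotate, caesar_shift and brute_force are made up for this example, and the only ciphertext used is the VMY prefix read off the image.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not the tr/printf one-liner from the video.
alphabet=abcdefghijklmnopqrstuvwxyz
ALPHABET=ABCDEFGHIJKLMNOPQRSTUVWXYZ

# rotate <n> <alphabet>: print the 26-letter alphabet rotated left by n (0..25),
# by cutting 26 characters out of the alphabet written twice.
rotate() {
    echo "$2$2" | cut -c"$(( $1 + 1 ))-$(( $1 + 26 ))"
}

# caesar_shift <n>: read text on stdin and shift every letter forward by n.
# Non-letters (braces, digits, spaces) pass through unchanged.
caesar_shift() {
    tr "${alphabet}${ALPHABET}" "$(rotate "$1" "$alphabet")$(rotate "$1" "$ALPHABET")"
}

# brute_force <text>: print all 25 non-trivial shifts, one per line.
brute_force() {
    for n in $(seq 25); do
        printf '%2d: %s\n' "$n" "$(printf '%s\n' "$1" | caesar_shift "$n")"
    done
}

brute_force 'VMY'   # the line for shift 7 reads CTF
```

Running only shifts 1 through 25 matches the point made above: a shift of 0 or 26 would just print the input again.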
I know the tr line looks a little confusing, and the caesar program is a lot easier to understand, but again, I wanted to use a program that is built into every Linux system out there. So that's what I did, after some googling; I did not know how to do this originally.
Now, if we look back at our original image. When you're using an OCR program (optical character recognition), you're taking an image and trying to convert the characters in that image to actual text characters on your computer. You can program your OCR to work with different fonts, but here I'm just using the default Tesseract. What I was realizing is that you always get this: the letter O might come out as the number zero, a lowercase O might come out as a capital O or the other way around, and the same with P's, which might be capital or lowercase. It's hard to tell. You can usually clean that up by running the text through spell checks and stuff like that when you're working with actual words, but we're working with a long string. There are words in there, but they're all jumbled together.
So I did notice that when I ran it through Tesseract in my code here, a lot of letters were uppercase, and if we again