This is a cipher message. This is what they wanted me to tell my audience. Someone is trying to make contact with me. Each number represents a letter: A is 1, B is 2, C is 3. You can hear him too, right? G, U, R. T, H, E. A shift value of 13. It's a ROT13 algorithm.

Today we're gonna be talking about passwords, and specifically how I built an AI that can crack them. Thank you to AI Camp for sponsoring this video. If you're new here, my name is Josh Beasley, and I'm a recent graduate of Yale University, where I studied computer science. Now I live out in LA, where I'm active-duty military, and I make videos about college, tech, and anything else I'm interested in.

Before I start blabbing on about neural networks and adversarial training loops, I thought I would give you all a little 60-second introduction on how you go from a string of characters that represents a password to an AI that can crack them.

So first off, when people say AI or machine learning, they mean a lot of different things. But this project focused specifically on one type of machine learning called supervised learning. In supervised learning, you have a lot of data, and attached to that data are ground-truth values that you want the AI to predict given the data. So if I wanted to build an AI that would recognize handwritten digits, the input data would be the actual handwritten digits and the output data would be what each digit actually is. So this would be a 7 and this would be a 9. Supervised learning involves feeding a program a bunch of these handwritten digits, letting it spit something out, and then telling it whether it's right or wrong. Over time, if you have a good model and you do this enough times, the computer will begin to catch on to some of the patterns in the data and begin getting a higher and higher percentage of predictions correct.

Now, I used this word "model," and that's the tricky thing about AI, because you're not explicitly coding a program. You're actually coding a model that has a bunch of attributes that can be adjusted
based on whether your prediction is correct or incorrect. There are a ton of different models to choose from; the one I used for this project is called a neural network. I won't get into the details, but this type of model is supposed to simulate neurons in the brain. So you have a bunch of neurons, and when you feed data through them, they either fire or don't fire. Neurons are connected to other neurons in these things called layers, and the neurons that are connected to each other each have a weight associated with that connection. Now, that weight is one of the attributes of the model that I was talking about, which you can wiggle up and down to improve your prediction accuracy. And as you feed the model a bunch of data and it wiggles all these attributes up and down, that is what is called training. And I think that was definitely more than 60 seconds.

If you're interested in learning some of these AI techniques, I highly recommend checking out the sponsor of today's video, AI Camp. AI Camp is an immersive, project-based summer camp for motivated students under the age of 18. It allows them to explore the field of AI before going to college and getting locked into a career. You'll have the opportunity to work in a small group of like-minded students to build and deploy an impressive AI product, like those seen here, with no prior coding experience necessary. Testimonials and success stories from alumni really speak for themselves, but your experience could lead to an internship at a top startup in Silicon Valley. The best part is that about 60% of AI Camp students are on scholarship and 10% attend completely free. I really encourage you all, if you're interested, to check out the scholarship application at the link in the description. It can all be completed in one sitting.

Back to the project. I coded this AI as part of my senior computer science thesis at Yale, and I'll just say it was very involved. In fact, I had to build three completely different models and run a bunch of
different experiments, and it's all detailed in the little thesis booklet that I printed out. But we're not going to get into that. I'll save you guys the research and say that out of the three different models I tried, the recurrent neural network was the one that actually performed the best.

Recurrent neural networks are best used for sequence data, and in the case of passwords, that means sequences of characters. So all the network does is take in a single character and then predict what the next character should be. As it receives more and more characters, it begins to build up somewhat of an internal memory, and that internal memory is what we're going to use to predict passwords.

So we have a recurrent neural network. Now we need a data set. Luckily, cybersecurity is a really, really hard thing to do right, and there are tons of leaked password data sets online. Now all you have to do is feed your recurrent neural network thousands upon thousands of passwords. It slides a window of characters through the text and predicts what the next character should be, and as it does, it builds up that internal memory and develops its own sense of what the structure of a password should be. And then all you have to do is wait a couple of days, because training takes a really, really long time, even when I was using the Yale supercomputing cluster.

After all that training, you can test your model on a reserved portion of those passwords, just to make sure that it's performing up to par, and then start cracking. It's literally as easy as supplying the recurrent neural network with a single character and then letting it run for as long as you'd like, generating as many characters, and therefore as many passwords, as you'd like. At the end of the day, you have a text file with a ton of potential passwords that you can use to start cracking.

Most password-cracking techniques rely on a dictionary of passwords that they modify, switch around, and then basically run through, testing each one to see if it works. Now, this
model is a great way to expand upon an existing password dictionary and ultimately improve the success of cracking a given password.

So me personally, I trained the model on 80% of the data. That's what we call the training set. I reserved 20%, which is the testing set, and then I let it run for a while and generate thousands upon thousands of passwords. Then I checked how many of the generated passwords lined up with passwords in the testing set that it had never seen before. And it actually cracked quite a few. You can actually see here some of the passwords the recurrent neural network was able to crack, based on training on different sizes of data sets.

And that, my friends, is how I built an AI that can crack passwords. If you enjoyed this style of more informative video, talking about some cool, you know, CS, tech, and AI things that I'm interested in, I'd love to hear what you thought in the comments down below. As always, my Instagram DMs are open if you have any questions. Drop a like on the video; it helps me and supports the channel a lot more than you think. And if you're new, please subscribe, and I'll see you all next time.
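
The evaluation described in the video (train on 80% of the passwords, hold out 20%, then count how many generated passwords match held-out ones the model never saw) can be sketched roughly as below. This is a minimal illustration, not the thesis code: the function names `split_passwords` and `crack_rate`, the toy password list, and the hard-coded `generated` list standing in for the network's output are all assumptions made for the example.

```python
import random


def split_passwords(passwords, train_fraction=0.8, seed=0):
    """Shuffle the passwords and split them into (train, test) lists."""
    shuffled = list(passwords)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def crack_rate(generated, test_passwords, train_passwords=()):
    """Return (matched, rate) over test passwords absent from training.

    Test passwords that also occur in the training data are excluded,
    so a match means the generator produced a password it never saw.
    """
    held_out = set(test_passwords) - set(train_passwords)
    matched = held_out & set(generated)
    rate = len(matched) / len(held_out) if held_out else 0.0
    return matched, rate


# Toy, made-up data purely for illustration (not from the thesis).
toy_passwords = ["password1", "letmein", "qwerty12", "dragon99", "sunshine",
                 "monkey7", "iloveyou", "abc12345", "football", "welcome1"]
train, test = split_passwords(toy_passwords)

# Stand-in for the trained model's output: a real run would sample these
# candidates from the network instead of hard-coding them.
generated = ["password1", "letmein", "sunshine", "hunter22"]

matched, rate = crack_rate(generated, test, train)
print(f"train={len(train)} test={len(test)} "
      f"matched={sorted(matched)} rate={rate:.0%}")
```

Using sets makes the comparison cheap even when the generated file holds a very large number of candidates, and subtracting the training passwords keeps duplicates in a leaked data set from inflating the score.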