 This has been such an absolute wealth of knowledge. Are you like totally cool with me putting this conversation on YouTube? Is that weird? I'll hang out. I lost, I feel like I lost your audio for one second. What was that? Can you hear me? Yeah, I can hear you. I will caveat that by saying that I am in no way a machine learning professional expert anything. I have just picked this stuff up. So maybe, I don't know, cut out that chunk and put it on at the beginning and say, like, if there are children in the room, kick them out. Hey everyone, it's John from the future. I just wanted to give you a little bit of background as to what this video is. So a few weeks ago, I was asked to join in on a panel to discuss artificial intelligence and machine learning within cybersecurity. And a few of you that may know me know that I do not usually focus on artificial intelligence. So I reached out to my friend Pat who is much, much bigger in that scene than I am. And I asked him like, hey, can I get a little bit of help? Can you help me, I don't know, figure out what stance I could take for this panel? Could you just help me understand everything that you know? And Pat gave me honestly like a crash course in artificial intelligence and machine learning. And it was awesome and incredible. And I kind of wanted to share that conversation with you because I thought it was just fascinating. So it's a high level thing. It's just talk. It's literally me taking notes and trying to absorb everything within the moment. And I didn't think, hey, I'd put this on YouTube after the fact, but I asked him once the conversation was over, like, look dude, are you cool with this? And he's like, yeah man, that's fine by me. So I want to share with you. I hope it's really cool. I hope you get some interesting thought provoking nuggets out of it and I hope you enjoy. So let's roll the conversation. See you there. So first of all, there is a difference between machine learning and AI. 
So I can tell you about what I've seen on the academic side and what I've read research-wise. And there's a difference between that and what you see in industry. And it also depends what industry you're looking at. So as you know, things get kind of twisted and distorted once people are trying to get grant money or seed funding or whatever. So the true meaning of machine learning versus AI kind of gets lost, especially in the field of cybersecurity. It's a lot less developed than you think it is. But before I get into any of that, what exactly are you doing? So from, did I send you a screenshot of that email or no? I don't think I did. No, you just sent me a little blurb. So you got tapped for a panel on AI and machine learning. They said, hey, machine learning and artificial intelligence are here, but they aren't quite ready for the mainstream. It's one thing to know AI is the technology of the future. It's another thing to think that the future is already here. Learn how tech leaders view and make use of these technologies, what AI and ML can do for you and your business, and where they fall short for now. So it's a panel discussion, just me and some other dudes talking about making the jump to AI. Okay, so what exactly are you expected to do here? It sounds like it's a conversation. Okay, yeah, I mean, I can give you an overview of machine learning versus AI and I can probably set you up with some interesting questions to ask. It's an incredibly deep and complicated field of study, and I'd say maybe 40% of it sees use nowadays in industry. You maybe get closer to the 70 or 80% mark if you're dealing with very high-tech AI companies or privately funded labs who are really pushing the edge, which you only get with companies that are already loaded. So like Google, Facebook, they got tons of money to spend so R&D is not a problem. Tesla, they're doing stuff there that's pretty cool.
Sure, so yeah, I kind of put together this little framework of what I want to talk about. So the first thing I'm gonna say is, what is AI and what is machine learning? There's a big difference there. Machine learning, it's almost hierarchical. So it all starts with statistics. It's just making sense of data using the mathematics of statistics, and machine learning builds on top of that. And then I would say AI is built even higher on top of that, but AI is very abstract, right? AI really refers to, it's artificial intelligence. So it really refers to a machine system that can make decisions. Whether that's an artificial general intelligence, yeah, that's a field of research that is very young, but we're nowhere near having artificial systems that are aware, truly aware of their surroundings or even aware of text. They can draw context, but anyways, I don't wanna get too far into the weeds there. So it's all built on statistics. And the most simple thing you can think of is, I have a notebook here that I'm just gonna kind of draw on. The first thing you would ever see is just basic regression. So you have something like this, X's and O's, and you wanna say, okay, how do I describe in general the tendency of what the X's are doing? You draw a line through it, that line is pretty close, it approximates, and you can use that to approximate what's gonna happen in the future. That regression has all sorts of applications. You can use all sorts of different types of curves. So you have polynomial regression, and quadratic regression, which is just a subset of polynomial regression. It's just, how complicated do you want that function to be? So what I just showed you is your basic line equation, y = mx + b. Then quadratic has some parameter on x squared, and then you can have all sorts of trailing terms, so ax² + bx + c, with c being your constant.
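That X's-and-O's sketch translates almost directly into code. A minimal sketch of fitting a line and extrapolating with it; the data points here are made up for illustration:

```python
import numpy as np

# Hypothetical scatter of points with a roughly linear trend
x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([0.1, 1.2, 1.9, 3.2, 3.8, 5.1])

# Least-squares fit of y = m*x + b (a degree-1 polynomial)
m, b = np.polyfit(x, y, deg=1)

# Use the fitted line to approximate what happens in the future
prediction = m * 6 + b
```

Swapping `deg=1` for a higher degree gives you the polynomial and quadratic variants the speaker mentions.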
Then you have higher-level polynomial stuff, and then you have logistic regression. A logistic curve really just kind of looks like that: it levels off at some point. Well, it doesn't truly level off; as whatever's on your X axis goes to infinity, this also keeps going, but it very slowly approaches a ceiling. So you can use it as a threshold. But long story short, there's lots of regression that goes on there. And that can be used for a lot of things. Really you just see it in data science nowadays. It's like, all right, I have a bunch of data, I want to make decisions, how do I do that? And it gets used a bunch in SOC operations, security operations center operations, right? Because just using basic data science techniques, which are pretty much just basic regression techniques, you can do long-tail analysis of your events. You're looking for low-frequency events a lot of times; all the high-frequency stuff, those are a lot of times false positives or just noise. And you want to see the one or two actions that really mean something. So you drive down into that. And then you can also use data science in threat hunting to kind of close the loop, to automate events or actions you have taken, on a really basic level. That's where a lot of the machine learning you see in SOCs is now. That's not to say there isn't cooler stuff being done. You see behavioral analysis, they call it behavioral analysis, or SOAR, which again is really just more behavioral analysis. They're trying to establish a baseline: what is normal user activity? Let me make a note of that so we can come back to it. And then, okay, so what exists when you start to get into the world of machine learning? One of the big considerations is data sets. That really gets broken down into two things: what features do you have?
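The long-tail idea is simple enough to sketch in a few lines: count event frequencies and keep only the rare ones. The event names and counts below are invented for illustration:

```python
from collections import Counter

# Hypothetical stream of SOC event types; names are made up
events = (["login_success"] * 500 + ["dns_query"] * 300 +
          ["firewall_allow"] * 190 + ["new_service_installed"] * 2 +
          ["lsass_memory_read"] * 1)

counts = Counter(events)

# Long-tail analysis: sort by frequency and keep only the rare events,
# since the high-frequency stuff is usually noise or false positives
tail = [event for event, n in counts.most_common() if n < 10]
```

Here `tail` ends up holding the one or two low-frequency actions that actually mean something.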
So what individual, I mean, features is the best word for it. It's like, what items of interest are you looking for? What things can you say: machine learning algorithm, tell me more about this specific feature, this event or instance I have seen. So it's features versus how much data you have per feature. This is the M versus N idea. You want the amount of data you have to be much, much, much larger than the number of features you have. So if you're trying to understand what's going on and, let's say, you have 10 features and you only have 50 total data points, that's what you call not a well-posed problem. Statistically speaking, whatever machine learning technique you use, even neural networks and more advanced stuff, it's not gonna be a good prediction. It's gonna be underfitted. It's not gonna be a good prediction of what happens in the future. That's in contrast to the other idea, which is overfitting, which brings up this interesting point with machine learning: it's super finicky. If you don't really understand the data that you're working with, which goes back to the fundamentals, do I have a good knowledge of really just basic statistics, coupled with security, which is quite rare in this field, I think. If I don't have those things, then I can't really determine whether I have a good data set. And the data set has to be pre-processed. It has to be cleaned. It has to be cleaned with respect to statistical considerations, removing biases, things like this, making sure that you have good data versus the features you want. If you have a lot of repeated data, that could be a problem in itself, but I could go on ad infinitum as to all the things you can do to clean data. But I don't really wanna go into that here.
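You can see that finickiness directly: fit the same small, genuinely linear data set with a line and with a needlessly complicated polynomial, and the complicated one collapses on unseen data. A toy sketch (the data is fabricated):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 8)
y = 2 * x + rng.normal(0, 0.05, size=8)  # truly linear data plus small noise

# Overfit: a degree-7 polynomial through 8 points hits every training point...
over = np.polyfit(x, y, deg=7)
# ...while a degree-1 fit matches the real structure
good = np.polyfit(x, y, deg=1)

# Evaluate both on an unseen point outside the training range
x_new = 1.5
err_over = abs(np.polyval(over, x_new) - 2 * x_new)
err_good = abs(np.polyval(good, x_new) - 2 * x_new)
```

The overfitted model memorized the noise, so its error on the new point is far worse than the simple line's.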
There's one technique, as I start to move into this machine learning stuff, that does work for low-data data sets, where there are a lot of features relative to the data: something called support vector machines. If you think of a prediction into the future, it applies essentially a variance, a measure of how good your projection is. And you can actually kind of tweak that. So yeah, you have X's like this, and you wanna fit that using a support vector machine. It might fit something like this, because you're starting to get into these higher-order techniques. But what it's gonna do is provide you with a distribution and a variance for each of those data points. So it kind of tells you, okay, given the data I know, this is, one, how well it's represented by this line, and two, it's gonna tell you how likely it is that this is a good data point, because not all of your data is good, you have some outliers, and you end up training on outliers as well. And then it's also gonna give you a good idea moving into the future. You can say, hey, this point of prediction, how well do you, you being the support vector machine, think this holds up? That doesn't really get used so much in cybersecurity anymore. So I'm gonna move on to another idea, something called fuzzy logic. You might hear about that. That was actually super popular in like the 70s. And I had professors who did PhDs in it. It was like the precursor to neural networks, right? Well, I mean, the idea of a neural network, an artificial neural network, came around in the 60s. They had the initial first AI boom, where everyone got excited and they thought they were gonna solve artificial general intelligence in a summer, no joke. And it didn't happen.
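As a rough sketch of the max-margin idea behind support vector machines: the per-point confidence described above is closer to what probabilistic methods give you, but the core SVM mechanic, finding a separating boundary with the widest margin, can be shown with a minimal linear SVM trained by sub-gradient descent on the hinge loss. The 2-D points and labels are made up:

```python
import numpy as np

# Toy two-class data: one cluster per class (made-up points)
X = np.array([[1.0, 1.0], [1.5, 0.5], [2.0, 1.5],      # class +1
              [-1.0, -1.0], [-1.5, -0.5], [-2.0, -1.5]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

# Minimal linear SVM: hinge-loss sub-gradient descent with L2 regularization
w = np.zeros(2)
b = 0.0
lr, lam = 0.1, 0.01
for epoch in range(100):
    for xi, yi in zip(X, y):
        if yi * (xi @ w + b) < 1:     # point inside the margin: push it out
            w += lr * (yi * xi - lam * w)
            b += lr * yi
        else:                          # point safely outside: only shrink w
            w -= lr * lam * w

preds = np.sign(X @ w + b)
```

After training, the sign of `X @ w + b` recovers the class labels, and the margin condition `y * (x @ w + b) >= 1` is what "support vectors" sit on.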
Then 10 years went by, people got sad, and the funding dried up. And then fuzzy logic came on the scene and they're like, oh look, freaking cool, fuzzy logic. That's gonna fill the gap. And people ran with it for like 10 or 15 years. But really, fuzzy logic is just a bunch of fancy if statements. Yeah, you might see it here and there. I was extremely underwhelmed when it was finally presented to me. You'll see fuzzy logic billed as making decisions under uncertainty, but really what it's doing is chopping up that uncertainty into a bunch of if statements. And then you can kind of tune how you chop it up and the decisions you'll make on those if statements, but it's very deterministic. If you're dealing with an actual system that's modeled mathematically, it's what you call linearizing. So you'll represent a very non-linear system, say you had something like this little arc, describing how your data will move into the future. And I'm drawing this all in two dimensions, so it's kind of hard to imagine what that means; when you deal with cybersecurity data sets, this'll be, yeah, 10,000 dimensions, and I can't draw that on a piece of paper. But what fuzzy logic is essentially doing is picking a bunch of points and linearizing, just representing it that way. So where you draw a bunch of points, it draws straight lines between them. The more points you draw, the closer approximation you'll get to this more funky-looking function, but at the end of the day, it's a bunch of if statements. So, okay, fuzzy logic, it's not really anything that you need to be concerned with. Now, what I've been talking about was fitting data with curves.
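The "pick a bunch of points, draw straight lines between them" picture is easy to demonstrate: here's a piecewise-linear approximation of a curve, where adding sample points shrinks the error, exactly as described:

```python
import math

def piecewise_linear(f, xs, x):
    """Approximate f(x) by connecting sample points (xs, f(xs)) with lines."""
    for x0, x1 in zip(xs, xs[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return (1 - t) * f(x0) + t * f(x1)  # straight line between samples
    raise ValueError("x outside sampled range")

f = math.sin                                     # stand-in non-linear "system"
coarse = [0, math.pi / 2, math.pi]               # few sample points
fine = [i * math.pi / 8 for i in range(9)]       # more sample points

x = 1.0
err_coarse = abs(piecewise_linear(f, coarse, x) - f(x))
err_fine = abs(piecewise_linear(f, fine, x) - f(x))
```

Under the hood the selection of which segment applies is literally an if statement per region, which is the point being made about fuzzy logic.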
That's important for prediction, but before I start digging more into neural networks: there are two main classes of problems that you can solve with machine learning. One is prediction, or projection. So, how do I look into the future and say something about where I will be, given new data coming in that I didn't know about? Something that you can kind of take your hands off and let run, and it'll be able to predict the future. That's one class of problems. The other is a classification problem. So it's actually recognizing something. So you have prediction, and I'll give you these notes when I'm done, prediction versus classification. And I think honestly, they're both highly relevant to cybersecurity. Prediction is very useful because it's like baselining user activity, right? If you see that this user always logs on at 7 a.m., they always log on roughly two to five times a day, they always roughly log on from these same IP addresses which are geographically correlated to these locations, okay, now you have a pretty good prediction of what they're gonna do in the future. And when you see them log in from China, you're like, unless they're taking a holiday in China, that's a problem. And so, yeah, Microsoft has this, but I'm not gonna throw darts at any specific company so I don't get sued. But people will say, okay, set up my fancy tool for a week, and after a week I'll be able to give you a baseline of user activity. Of course, you can see where this is gonna have holes, right? Your system admin is gonna have a lot more variation in the systems they're logging into. Your basic security administrator is gonna have a whole bunch more variation in the systems they're logging into. So you kind of have to go in there and look at the false positives being flagged.
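A cartoon version of that baselining idea, with an invented two-week login history: establish the user's normal hours, then flag logins that sit far outside them.

```python
import statistics

# Hypothetical baseline: one user's login hours over two weeks
login_hours = [7, 7, 8, 7, 7, 8, 7, 7, 7, 8, 7, 7, 8, 7]

mean = statistics.mean(login_hours)
stdev = statistics.stdev(login_hours)

def is_anomalous(hour, threshold=3.0):
    """Flag a login whose hour is more than `threshold` deviations off baseline."""
    return abs(hour - mean) / stdev > threshold

normal = is_anomalous(7)   # a 7 a.m. login is ops normal
weird = is_anomalous(3)    # a 3 a.m. login trips the baseline
```

The holes the speaker mentions show up immediately: a sysadmin's baseline has a huge spread, so the same threshold either floods you with false positives or misses everything.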
You really have to have somebody with security know-how look at it and think critically about, one, the machine learning that is being applied in the background, and two, the security concepts, and say, is this a false positive? Should I just take this off the board completely so I can stop looking at it? Anyways, you don't really see too much of that, to be honest. You know, you get these really fancy packages, they cost a whole bunch of money, you think they're gonna work, and you don't realize that with complexity comes increased complexity for administering the tools. So you can't abstract away all of the complexity to a vendor, and that's the direction we are heading right now. I see this firsthand in my job: everybody wants the fancy new tool, but nobody wants to hire the analyst who now needs more skills, more knowledge, and a wider range of fields to be able to properly interpret those results, properly tune those results. I mean, you need at least one person there who can say, hey, let's tune out these false positives so that the standard SOC watch floor analyst doesn't panic, or so that they can just see something useful, because otherwise you get flooded with all these false positives, which again, that's ops normal for a lot of folks. Okay, so prediction versus classification. So prediction, think of that as user baseline activity. Classification is split into two chunks, based on how you learn a classification: supervised and unsupervised. So maybe I give an example of classification first. Say I have a classifier that identifies quote-unquote zero-day activity. Yeah, that's a good example. And that's really where a lot of the research is being done nowadays. You can also do that with prediction, but you can also do it with classification. And it might make more sense if I describe exactly how classifiers work, like what the different types of classifiers are.
So you train a classifier by feeding it like a metric ton of data. And there are two ways to train a classifier. One is by supervised learning, another is by unsupervised learning. So if you haven't kind of already caught on yet, I've just started talking about neural networks. So you can do regression techniques for prediction, you can do neural networks for prediction. It's not really feasible to do regression for classification. That's where neural networks really make their money, is with classification. So you have supervised learning and you have unsupervised learning. So I'm just gonna draw out a really simple neural network. Oh, let's see. Okay, so that's a simple neural network. It looks like kind of garbage and I'm gonna draw some more stuff on there. But it is a single layer neural network. And this is your, so that's a little bit more detail. So you have your inputs, which is the data you have, that giant data set you have and you're shoving those all through your neural network and you have, you know, a lot of neural networks will have wider layers. Like I'm not really gonna get too far into architecture here, but you can have all sorts of wonky architectures for neural networks. And that gets you all sorts of interesting results. I'll talk a little bit about those, but this is just a single layer neural network and it has outputs. Neural networks are statistics. What happens is you have a back propagation algorithm, which just runs a neural network in reverse by looking at gradients and it's basic statistics applied over and over and over and over again. Now, with those inputs and outputs, you have two ways to train that data, train that neural network to make like classification decisions or predictions about that data. And that is by supervised learning and by unsupervised learning. 
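Here's roughly what "basic statistics applied over and over again" looks like in practice: a from-scratch, single-hidden-layer network trained by backpropagation on the classic XOR problem (a pattern no straight line can separate). This is a toy sketch, not how you'd build one in production:

```python
import numpy as np

rng = np.random.default_rng(42)

# XOR truth table: inputs and the target outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 8 units, one output unit
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

def forward():
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

_, out = forward()
initial_loss = ((out - y) ** 2).mean()

lr = 1.0
for step in range(5000):
    h, out = forward()
    # Backpropagation: apply the chain rule layer by layer, in reverse
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

final_loss = ((out - y) ** 2).mean()
```

Running gradients backwards through the network drives the loss down, and the outputs typically settle near the XOR truth table; the same mechanics scale up to the deep networks discussed later.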
What that means is, with supervised learning, your data coming in, you normally have to split your data set into three chunks. You have your training data, the data you are going to train your neural network on. You have your validation data, the data you're gonna use along the way to check whether that neural network is working or functioning properly. And then you have your test data, which you don't touch at all. At the very, very end, once you've trained your neural network, then you apply it to the test data and you see how well it performs against data it's never seen before. Now, actually splitting up that data set into those three chunks has its own considerations and requires some skill. What were the three? I heard training and testing, but I think you said training twice. Training, validation, and testing. Okay, okay. I thought you said testing twice. Yeah. No, it's training, validation, and testing, and you'll see different numbers. Sometimes it'll be like a 60/30/10 split or a 70/15/15 split or whatever. Yeah. Anyways, supervised learning: for those inputs coming in, when you're training, you have labels and you already know what the output should be. So you have to tag all of your training data with what you expect the output to be. That sounds horrifying. And it is, because when you have huge data sets, I have been in the room before where you've got a guy with 10 years and a PhD in computer vision, and he just clicks on a box, repeatedly, for three hours, just tagging his data. And nobody else can do it for you. A computer can't do it for you, because a computer doesn't know how to do that; you're trying to teach the computer how to do it. So there's nothing else you can do there. Well, there have been attempts to get third parties to do this.
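The three-way split itself is mechanically simple. A sketch of a 70/15/15 split over 100 stand-in samples:

```python
import random

random.seed(0)
data = list(range(100))      # stand-in for 100 labeled samples
random.shuffle(data)         # shuffle so each chunk is representative

# Train on the first chunk, tune on the second (validation),
# and touch the third (test) only once, at the very end
n = len(data)
train = data[: int(0.70 * n)]
val = data[int(0.70 * n): int(0.85 * n)]
test = data[int(0.85 * n):]
```

The "considerations and skill" part is everything this sketch skips: stratifying by class, avoiding leakage between chunks, and so on.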
I think Amazon did it where they'd pay like one cent for every image you classified or whatever. Was that like the Mechanical Turk stuff? Is that right? Yeah, exactly. Yeah, it was Mechanical Turk. So that's what they were doing: building supervised data sets. Supervised learning trains to find structure that you tell it is there. Unsupervised learning is where it gets really interesting. It finds structure in unstructured data automatically. So you're basically telling the computer, hey, there's interesting stuff here, find it for me. And you have some knobs and switches that you can turn and flip there. A clustering algorithm is one example of this. You have a bunch of data, it's an unsupervised algorithm, but you tell your clustering algorithm, okay, there are probably four nice clusters of things. So then it will look for structure and find you four clusters or five clusters or something in that ballpark. And it will make its own correlation decisions. Mind you, we're going back to statistics now; it's all statistics. How does this data relate to other data that is being ingested? And it will group things based on correlations that it finds. And it's a black box method. You know, the terminology white box, black box is very well known in machine learning and artificial intelligence, and neural networks are all what you'd consider black box. Once you've trained a neural network, there's really no way to open it up and look at those trained layers and say, ah, yes, this is why it's not working the way I wanted it to. Or, ah, yes, this is why it's working the way it's supposed to. That is cutting-edge research right now: figuring out, hey, which of those nodes in those layers can I set to zero? Which ones actually contain the important information? Can I actually extract information from it as it's training? The answer is yes. To what extent?
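A minimal k-means clustering sketch of that "tell it how many clusters, it finds the structure" idea, on two made-up blobs of unlabeled points:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two obvious blobs of unlabeled points; made-up stand-ins for event features
data = np.vstack([rng.normal(0, 0.5, (50, 2)),
                  rng.normal(5, 0.5, (50, 2))])

# Minimal k-means: you only tell it "there are k clusters", it finds them
k = 2
centers = data[rng.choice(len(data), k, replace=False)]
for _ in range(20):
    # Assign each point to its nearest center...
    dists = np.linalg.norm(data[:, None] - centers[None, :], axis=2)
    labels = dists.argmin(axis=1)
    # ...then move each center to the mean of its assigned points
    centers = np.array([data[labels == j].mean(axis=0) for j in range(k)])
```

No labels were ever provided; the grouping falls out of the correlations (here, plain distances) in the data itself.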
I don't know, that's very cutting edge right now. But unsupervised learning is where you'd want to head if you were trying to do classification in the security setting. You don't have time to label all of the things that could happen. In fact, the sheer quantity of labels you would need grows almost exponentially in the cybersecurity domain. If you think of just five basic cybersecurity attacks and all the different ways they can manifest on all of the different networks that exist in the world, using all the different protocols, and then you say, okay, get somebody in there and write all of those down, you're already like, this is infeasible. It's not humanly possible. And there's not just five possible attacks out there; you're getting a new one, it feels like, every day. But anyways, unsupervised learning is where it's at right now, and that is probably where you'll find interesting things in cybersecurity. In fact, yeah, there's a guy at the Academy right now who's going for a Fulbright fellowship who I was talking with about what he can do research-wise for cybersecurity. And he wanted to find malicious activity just by having access to lower-level information in a computer. So maybe he just has access to all of the logs an operating system produces for itself. If we're talking about Windows, there's a bunch of logs that are created outside of the EVTX logs. So maybe he just has access to those and all of the EVTX logs, and then you can run unsupervised learning algorithms against that. Anyways, again, we're in the domain of research right now. And you're gonna hear people talk about this like their company is the one that poops gold, and it's not true. All right, let's keep going on with neural networks. So there are a couple of different types of neural networks. You've got convolutional neural networks.
So convolutional networks are popular, but I don't really see their value in cybersecurity. They are used for image processing. You've maybe heard the term deep learning; that's also been a great buzzword. Deep learning just means, well, it's easier if I describe the basic functionality of a convolutional neural network first. So you need to talk in terms of low and high frequency. Let me see if I can quickly find a picture on Google. It's pretty well known, the low versus high frequency data in an image. Of course, I just see a bunch of ugly graphs. You can actually see it with a, oh, that's a good one. Let me share my screen real quick. Okay, that's a good example of low versus high frequency information. So your low-frequency information is the meat; it's 98% of your information. It's actually how compression algorithms work. Your low-frequency information is most everything; your high-frequency information is the fine details. Of course, that is a sliding scale. Now, when you talk about compression algorithms, JPEG 2000 was one of the early image compression algorithms out there. It's lossy, and it's lossy for a reason, because of this technique called principal component analysis. It's just a matrix technique where you look at the eigenvalues. Whenever you hear eigenvalues, just hear "the important bits." You break your data out into a matrix and you can see how much of that information is actually important. And it turns out you can chop out like 90% of the eigenvalues and still retain all of the information a human being can visually process. And that's the idea behind it. They were just like, hey, why don't we just lob off 90% of the data, and now we have a compression algorithm. And it worked, it worked really well for a while.
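The "lob off 90% of the eigenvalues" idea can be sketched with an SVD, where singular values play the "important bits" role. The "image" below is a fabricated low-rank pattern, and to be clear, real JPEG-family codecs use transforms like the DCT and wavelets rather than literally this; it's the truncation idea in miniature:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake 40x40 "image": a smooth low-rank pattern plus a little fine detail
base = np.outer(np.linspace(0, 1, 40), np.linspace(0, 1, 40))
image = base + 0.01 * rng.normal(size=(40, 40))

# SVD factors the matrix; singular values rank how much
# information each component actually carries
U, s, Vt = np.linalg.svd(image)

# Keep only the top 4 of 40 components ("chop out 90%") and rebuild
k = 4
compressed = U[:, :k] * s[:k] @ Vt[:k]

# Almost all of the picture survives the 90% cut
rel_error = np.linalg.norm(image - compressed) / np.linalg.norm(image)
```

Storing the `k` kept components takes a fraction of the original numbers, which is the compression.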
Obviously that's not gonna be great for, I don't know, data files you need that aren't images, but it could probably be a hilarious demonstration. But yeah, you're certainly not gonna be running Doom on that. Anyways, back to convolutional neural networks. They're not just a single neural network like I showed you here; they're typically chains of neural networks. So now we're starting to get into this more dynamic idea of chaining things together. What convolutional neural networks do is extract high-frequency information out of images, all of the really fine little details that distinguish images from each other. We don't care about that low-frequency information. You look at the low-frequency information between penguins, you can't tell which penguin's which. You can't tell if that's Frank and that's Greg. It's impossible. But if you just look at the high-frequency information, cut out all the low-frequency stuff, you can actually tell the difference between them, without having an image that's really visible to a human being. But for the computer, it makes perfect sense, right? So with convolutional neural networks, as you continue to pipeline them together, you can extract higher and higher frequency information, and you're really just narrowing the scope, narrowing the scope, narrowing the scope. And all of a sudden, yeah, okay, now I can tell the difference between different species of blue jays or whatever. Or I can figure out when there's a red light, like my self-driving car can figure out where there's a red light, that kind of stuff. But that's convolutional neural networks. It really just applies to images for the most part. Then we have something called recurrent neural networks, which you've probably heard about. And that brings up an interesting point: it doesn't just matter what data you have, right?
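Extracting high-frequency detail is literally what a small convolution kernel does. A hand-rolled sketch on a tiny fake image containing one edge (strictly speaking this is cross-correlation, which is what deep-learning libraries actually implement as "convolution"):

```python
import numpy as np

# Tiny grayscale "image": a flat region with one bright vertical edge
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A 3x3 edge-detecting (high-pass) kernel, the kind a conv layer often learns
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

# Plain 2-D "valid" convolution, written out by hand
h, w = image.shape
out = np.zeros((h - 2, w - 2))
for i in range(h - 2):
    for j in range(w - 2):
        out[i, j] = (image[i:i + 3, j:j + 3] * kernel).sum()
```

The flat (low-frequency) regions produce zero response, and only the fine detail, the edge, survives; stacking layers of this is how the pipeline keeps narrowing the scope.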
So you have a giant pool of data, and all the neural network approaches need a lot of data. And then you need to convert that data into something that a computer can ingest. But flat data isn't the only type of thing a neural network can operate on. You can actually structure your data into graphs. And the nodes can be important pieces of information. They can even represent dynamic processes; a node could even be a description of a computer, an endpoint, as long as you can adequately describe it in a way that a computer will understand. And then the edges connect them. So I'll just draw out a simple graph right now. So that's a simple graph. It's not connected in a cycle; there's not a limit cycle there. That is what you call an undirected graph. When you see something like that, with no arrows, it means all the nodes are communicating bidirectionally, equally back and forth with each other. But each of those nodes can represent a process. It can represent just straight-up data, but you can actually build incredibly elaborate graphs in higher dimensions than just on this piece of paper. In fact, they're actually using this to model how our brains work in three dimensions. This gets used in social dynamics too. This was like the whole Cambridge Analytica thing, right? So actually this is kind of interesting and relevant for cybersecurity, because this is information security. Social opinion dynamics is a very, very hot-button topic and people are getting quite good at it right now. So you can model your data not just as flat data like it's in a database, but you can also model it with structure.
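In code, an undirected graph like the one sketched here can be as simple as an adjacency map; the host names below are hypothetical:

```python
# An undirected graph as an adjacency map: each node maps to its neighbors.
# Nodes here are hypothetical hosts; an edge means "talks to".
edges = [("workstation", "file_server"),
         ("workstation", "dns_server"),
         ("file_server", "backup_host")]

graph = {}
for a, b in edges:
    # Undirected, so record the link in both directions
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)
```

The node values can be anything, a number, a whole record describing an endpoint, which is the "simple versus complex graph" distinction coming up next.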
And since we've started to figure out how you can model your data as a structure, with your data represented in a graph, we've also started to say, huh, okay, what happens if I have data for, let's say, 300 million Facebook users, the Cambridge Analytica thing, and you can start to make profiles of who has opinions on what topics. And then you can make lines connecting them, so edges connecting those nodes, the node being a person, the value at the node being their opinion on a certain topic, and you can figure out, hey, what are the opinions of all of their friends? Those are the lines connecting them. And then you start to ask interesting questions. Like, okay, I have this really elaborate graph of 100 million people, I've trained my neural network on it, and I can see the interactions between all of these nodes and which interactions weigh heavier. So now I can start to say, okay, what happens if I tweak this guy's opinion over here? Will it cause cascading effects? And that's actually quite well understood now. And those are a lot of the techniques that were used in the previous election. You call it data science, but in reality, there is a lot of actual machine learning going on under the hood there. Now those graphs, you have simple graphs, where the nodes themselves are a single number, usually an integer. Then you have complex graphs, where each node can be more complicated, so it can represent a computer or a series of differential equations. And that's an open field of research: studying steady states, like what that graph looks like long-term if I tweak a node on a complex network. So yeah, that has big implications in social opinion dynamics. Now, once you start structuring data in a graph like this, you start to enter the world of recurrent neural networks, right? You start to see cascading effects because of changes in that network, so let's train on it.
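The cascade question can be toyed with in a few lines. This is a DeGroot-style averaging model, far simpler than anything described above, but it shows one nudged opinion propagating across the whole graph (the four-person friendship graph is invented):

```python
# Toy opinion dynamics: each person's opinion drifts toward the average
# of their friends' opinions (a simple DeGroot-style update).
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
opinions = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0}

opinions[0] = 1.0  # "tweak this guy's opinion over here"

for _ in range(50):
    new = {}
    for node, friends in neighbors.items():
        avg = sum(opinions[f] for f in friends) / len(friends)
        new[node] = 0.5 * opinions[node] + 0.5 * avg  # keep half, absorb half
    opinions = new
```

Person 3 never met person 0, yet their opinion moves anyway; that's the cascading effect in miniature.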
And it kind of just starts to roll out into the future a bit. So it feeds forward, it rolls out into the future a bit while it's training and says, hey, what are the follow-on consequences of this? So now you're starting to see deeper into the research in neural networks. It's kind of all building on itself. Recurrent neural networks are a bit more recent than convolutional networks, convolutional networks more recent than plain neural networks, neural networks more recent than support vector machines, and then regression analysis is the beginning, right? So recurrent neural networks, yeah, they could have implications in cybersecurity. I don't know where, but you would need a better understanding of the activities that occur on a normal network and the follow-on consequences of actions. And that sounds really abstract, but really, you would need a way to represent that algorithmically so that a computer could understand it. The last thing I wanna touch on for architectures is generative adversarial networks, so GANs. So you can actually train neural networks against each other. What they found was, you train a neural network, you get a certain percent validation. So this neural network gives you good results, and "good results" is a loaded thing as well. Whenever you talk about a false positive rate, you also need to talk about false negatives. Most people just talk about false positives. So they'll say, hey, my neural network gives you the correct output 99% of the time, so a false positive rate of 1%. And you go, okay, but how many times does it say nothing's wrong when something actually is? That's a false negative, right? So it says no bad guys here, but there's actually a bad guy on your network. That's a false negative. And that's actually really important.
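The false-positive versus false-negative distinction is easy to make concrete. The counts below are invented to show how a detector can "look 99% accurate" while still missing most real intrusions.

```python
# False-positive vs. false-negative rates from raw detection counts.
def rates(tp, fp, tn, fn):
    fpr = fp / (fp + tn)  # benign events wrongly flagged
    fnr = fn / (fn + tp)  # real attacks the tool stayed silent on
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return fpr, fnr, accuracy

# 10,000 events: 9,890 benign handled correctly, 100 benign flagged,
# 2 attacks caught, 8 attacks missed. (All numbers made up.)
fpr, fnr, acc = rates(tp=2, fp=100, tn=9890, fn=8)
print(f"accuracy={acc:.4f}  FPR={fpr:.4f}  FNR={fnr:.2f}")
# accuracy is ~0.99, yet 8 out of 10 attacks slipped through (FNR = 0.80)
```

This is the trust problem described above: the false positives flood your console where you can see them, but nothing surfaces the 80% of attacks the tool never mentioned.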
That has much bigger implications in genetics research, really, than in cybersecurity, but it will matter in cybersecurity as people start to do this more, because right now it's kind of under a veil. It's up to the end user to ask, hey, what is the false negative rate of your fancy AI tool? That's my trust relationship with it, right? Because I can't see when it's wrong. When it's a false negative, I can't see that. I can see false positives because they flood my SIEM or whatever and I can whack those down, but I can't actually see when there's a false negative. And since I can't see them, I can't tune for them. So that is a pure trust relationship. Is that something that needs to be considered now? I honestly don't know, because most companies are keeping proprietary secrets, so you wouldn't really know. Generative adversarial networks are interesting because you actually have two neural networks, and one of the neural networks is an opposer. It's an adversary. And this could have interesting implications in cybersecurity. It's easiest to describe with images. One neural network says, okay, you fed me an image, and I will classify this image as something. So it classifies the image and says, oh hey, that's Greg the penguin. The other neural network says, no, you're wrong, that's not Greg. And you just have this back and forth, right? So you have an adversarial neural network that tries to determine when the original neural network makes a mistake. Make sense? Yeah. You also see this with deepfakes. You can have an adversarial neural network that tries to determine when something is a deepfake or not. And what that actually leads to is much, much more realistic deepfakes.
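The two-network back-and-forth can be shown with a deliberately tiny GAN: a one-parameter-pair "generator" tries to mimic samples from a normal distribution, while a logistic "adversary" tries to tell real samples from generated ones. All the numbers are toy values; real GANs are deep networks with far more careful training, and convergence is not guaranteed in general.

```python
import numpy as np

# Toy GAN: generator g(z) = w*z + b tries to mimic samples from N(3, 1);
# adversary d(x) = sigmoid(a*x + c) tries to tell real from fake.
# They take turns updating, the back-and-forth described above.
rng = np.random.default_rng(0)
w, b = 1.0, 0.0      # generator parameters
a, c = 0.1, 0.0      # discriminator (adversary) parameters
lr = 0.01

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -60, 60)))

for _ in range(2000):
    real = rng.normal(3.0, 1.0, 64)
    z = rng.normal(0.0, 1.0, 64)
    fake = w * z + b

    # Adversary's turn: push d(real) up and d(fake) down
    # (gradient ascent on log d(real) + log(1 - d(fake))).
    p_real, p_fake = sigmoid(a * real + c), sigmoid(a * fake + c)
    a += lr * (np.mean((1 - p_real) * real) - np.mean(p_fake * fake))
    c += lr * (np.mean(1 - p_real) - np.mean(p_fake))

    # Generator's turn: change w, b so the adversary calls fakes real
    # (gradient ascent on log d(fake), chain rule through fake = w*z + b).
    p_fake = sigmoid(a * (w * z + b) + c)
    w += lr * np.mean((1 - p_fake) * a * z)
    b += lr * np.mean((1 - p_fake) * a)

# The generated mean drifts from 0 toward the real mean of 3.
print(round(b, 2))
```

The adversary never has to "win"; its pushback is what drags the generator's output toward the real distribution, which is exactly why adversarial training produces deepfakes that even the adversary can't flag.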
That's actually how deepfakes work at a 10,000-foot view: these neural networks keep butting heads with each other until the final result is one where even the adversary, which is a neural network, can't find the difference. And like we've already talked about with high-frequency versus low-frequency data, a human being was left in the dust a decade ago when it comes to figuring out whether an image is real or not. And now we've gotten to the point where we can start to leave a computer in the dust at determining whether an image is real or not. So for cybersecurity implications, not really on the image side, but if that can be formulated in a way where you can train your prediction or classification, like, is there a bad guy on my network, hey, is there weird behavior going on, if you could train that adversarially, you could probably get a better result at the end. I don't really know; I haven't really seen anything going on in that research for security. It honestly seems like machine learning and artificial intelligence in cybersecurity is a faint ghost of what it is in the rest of the world, with perception engineers, self-driving cars, AI-driven whatever at big companies like Facebook and Google. You see them doing all sorts of things that are very cutting edge, but they're also so far to the right of what the rest of the world can really do on an average day. And then even then, in cybersecurity the barrier for entry is so high because you already have to have cybersecurity knowledge. So basically you take your pool of people who know anything about cybersecurity, and then you whittle it down to the people who also probably have a bachelor's or master's degree in computer science with a focus in machine learning and AI, and you're left with, what, five people? So yeah, that's basically what happens.
So what you've ended up seeing is this deluge of, hey, look at my fancy AI/ML tool that I polished really nicely. Please don't look under the hood. But also, even if you did look under the hood, would you know what you're looking at? The answer is quickly becoming no, not at all. Now, I've talked about machine learning; all of this has been machine learning. So what is AI? You see that word get tossed around a lot. The way I like to think of it is that AI is decision making. And sure, okay, you might say, well, machine learning makes predictions. Yes, machine learning makes predictions. It can also classify things, but it is a very, very, very narrow process. You get one input, you get one output, and that output is very singular. It's like, is this the specific type of penguin I was looking for? Is this the specific type of user behavior I was looking for, based on logon activity? You're not gonna see something more complex just coming out of pure machine learning. What AI attempts to do is operate in an environment. So what does that mean? It means you have to have a computer system that can make not just one decision, but multiple decisions that have cascading effects on each other, and it has to be able to make those decisions in the presence of uncertainty. And then also, for your AI to be effective, its outputs need to cover all of your edge cases. So what does that mean? Well, AI can actually be a whole bunch of pipelined-together machine learning techniques, right? You can set up a bunch of different machine learning techniques, neural networks, whatever, to capture a bunch of different types of inputs.
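One way to picture "AI as pipelined-together ML" is several narrow models each looking at one slice of an event, with a decision layer on top that refuses to act when confidence is too low. The two "models" below are hand-written stubs standing in for trained classifiers; the event fields and thresholds are invented.

```python
# Sketch of pipelined ML with an edge-case escape hatch. Each stub model
# returns (label, confidence); the decision layer combines them and
# defers to a human rather than force a low-confidence call.
def logon_model(event):
    # Stand-in for a trained logon-behavior classifier.
    return ("anomalous", 0.92) if event["country"] != "US" else ("normal", 0.85)

def process_model(event):
    # Stand-in for a trained process-context classifier.
    return ("anomalous", 0.70) if event["proc"] == "cmd.exe" else ("normal", 0.95)

def decide(event, threshold=0.8):
    votes = [logon_model(event), process_model(event)]
    if any(conf < threshold for _, conf in votes):
        return "defer-to-human"   # edge case: don't force a shaky guess
    if any(label == "anomalous" for label, _ in votes):
        return "alert"
    return "allow"

print(decide({"country": "RU", "proc": "explorer.exe"}))  # alert
print(decide({"country": "US", "proc": "cmd.exe"}))       # defer-to-human
```

The interesting design choice is the deferral branch: covering the edge cases is less about adding more models and more about knowing when none of them should be trusted.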
You can have a certain number of outputs, but you need to make sure that you're not gonna accidentally feed data into your AI system that forces it to make a decision on something that has like a 10% chance of success, or a 10% chance of being understood correctly. Those are what you'd call your edge cases. And that's incredibly difficult to do, especially since you don't really know what's going on under the hood. So typically, when I see AI in advertising, I laugh and I say it's not real, it's just marketing. Real AI, again, is very cutting-edge research. Let's see. I haven't really talked about natural language processing, but that's also just neural networks. That has implications security-wise. Basically, your inputs to your neural network would be text, and then your neural network groups that text and tries to provide context to itself. It tries to understand that text through context. And how does it understand that context? It understands because you fed it enough data. So it basically says, hey, have I seen this grouping of words before? Oh, okay, well, since I've read six billion books, I can look back at those books and see where else it has appeared. And then I can also look at, hey, what was the title of the book it appeared in, and what did the surrounding paragraphs, those groups of words, look like? And yeah, it sounds pretty trivial, but honestly, we've gotten to the point where you have natural language processing algorithms that can write journal articles that look pretty damn good. And you have natural language processing algorithms that can do subtitles in real time, like I saw the other day on one of the calls I was on. And so what does that mean? That means you're gonna get automated social engineering. That's what's gonna happen. You're just gonna have people doing automated spear-phishing attacks, automated social engineering.
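The "have I seen this grouping of words before?" intuition can be shown with a toy co-occurrence index over a four-sentence, made-up corpus. Modern NLP models learn this from billions of documents with neural networks rather than counting, but the idea of recovering meaning from surrounding context is the same.

```python
from collections import defaultdict

# Toy context index: for every pair of adjacent words in a tiny invented
# corpus, record which words surrounded it. "Understanding through
# context" reduced to its simplest possible form.
corpus = [
    "please reset my password today",
    "the admin will reset my password",
    "my password expired again",
    "reset the server tonight",
]

contexts = defaultdict(list)
for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 1):
        pair = (words[i], words[i + 1])
        # surrounding words = everything else in the sentence
        contexts[pair].append([w for j, w in enumerate(words) if j not in (i, i + 1)])

# Where has "reset my" appeared before, and what surrounded it?
print(contexts[("reset", "my")])
```

Even at this scale, the index "knows" that "reset my" tends to co-occur with "password", which is the seed of the contextual understanding being described.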
It's gonna be phone calls that run through the entire social-engineering process, from the time your victim picks up a phone and says hello to the time that they're, I don't know, handing over their admin credentials, or handing over their login credentials, or clicking on an email you tell them to click on, and by "you", I just mean a computer. The voice problem is rapidly going away. That's also natural language processing: how do you make the voice sound human-like? I don't know if you've called in to, say, Micro Center lately, but the voice sounds remarkably human. Now, I don't know if it's a robot, maybe it's actually a human, but they're getting quite good at it. The same goes for writing spam emails. You're gonna get spam emails that read incredibly well. And what do we tell people nowadays to help them figure out what is a spam email? You say, hey, look for spelling errors, hey, look for improper formatting. But very soon, and I say very soon, right now it can already be done, but very soon it will be achievable by the layman. Get on GitHub, pull down a repo, script-kiddie it basically, and then you have your own automated phishing email sender. That's gonna be a boon for cyber criminals. The same way that you can go on GitHub and find your own self-driving car kit, as Gerald did, and then put a camera on your car and take it on the highway at 70 miles an hour. Of course, the GitHub repo says, warning, don't do this unless you know what you're doing, but the implicit message is, yeah, well, a bunch of people are gonna do it anyway, so what the hell? So yeah, you'll see natural language processing becoming a big deal, especially since most breaches start with social engineering. Yeah, that's a really good question. That's the whole list of stuff I had. That was incredible. You got any questions?
So I think the conversations that are gonna be really interesting to have on the panel are the stuff about the phishing email automation, because that's an immediate danger, and the behavior analysis that you'll see in your antivirus or in your firewall. Do you see any products that are literally advertising that? And I know you mentioned, hey, whenever I see AI in a commercial I tend to laugh because it's just not there yet, but are there products that are already doing that, that are trying to stake that claim in the ground? I mean, yeah, there are loads of them. I don't really do security acquisitions, so I can't name a lot. I can name a few for you, but I'm not giving them a recommendation or anything; it's just what I know off the top of my head. Yeah, Microsoft is doing it. They have this ATP, Advanced Threat Protection package, where they say that they can identify threats. Well, the first thing they're claiming is that they have this elaborate security center that is mostly automated. All of the email traffic that would go into Office 365, for example, like Exchange, all of that email activity is gonna be filtered using their proprietary security algorithms, and then all that data is gonna be fed into their security center, where it's aggregated from all of their customers worldwide. So again, back to the problem of big training sets: they have a massive training set, a data set right there, because they have a lot of customers. And then they say, okay, well, we're gonna use that feedback to actively improve our Advanced Threat Protection, our ability to detect adversarial activity. They make the claim that they can detect zero-day attacks and find the zero-day actors. A lot of people make that claim. My completely unknowledgeable opinion on the matter is that the only zero-day activity you're gonna find is really through logon analysis or very aberrant user behavior. Okay, great.
Technically speaking, you've found a zero-day actor if somebody uses credentials to log in from China or from a geographically disparate place, but is that really zero-day the way we think of it? No. Zero-day, the way we think of it, is a novel exploit. What they can also do is find DLLs or executables that run out of the particular context that they should. You see this with Windows Defender nowadays. Maybe you've talked to Caleb about it: he gets a user on a box, or excuse me, he gets an initial foothold on a box, but every time he tries to spawn a cmd.exe or PowerShell, it's like, oh, I'm sorry, you're trying to spawn cmd.exe or PowerShell from an incorrect context, and it immediately kills it. Is that detecting zero-day activity? I mean, theoretically, yes, but it's not gonna tell you the zero-day exploit. It's just gonna tell you that somebody is trying to do something in an improper context. Did that require machine learning to do? No, that really just required some intelligent, security-minded coder coding that in. Yeah, but they're gonna advertise that as AI and machine learning, because nobody really asks those questions. And at the end of the day, if it works, it works. And Windows Defender is working quite well nowadays compared to Microsoft's stormy past with security. What other things are being done machine-learning-wise? Going to security conferences, I think there's a company called Red Canary or something like that. Maybe it's Red Card; no, I think it's Red Canary. Yeah, so you see machine learning being done for threat intelligence. So all the things I mentioned about social opinion dynamics, things that can be used against an unsuspecting populace to do information warfare: they can also do natural language processing, and they can do things like sentiment analysis or behavior analysis on traffic through ISPs.
So they have a contract with an ISP and they say, hey, give me your NetFlow data. And then they correlate that NetFlow data with other open source intelligence feeds: hey, what domains, what IP addresses have been doing what adversarial activity? And they also work with their clients to get feedback data from them as to what adversarial activities they've seen. And they use it to score domains on their trustworthiness. How good is that? I honestly don't know. I'm hesitant, because that's gonna have a lot of false positives, first of all. And that's one of those things that are quite hard to weed out, because what they're providing you is just, say, a dashboard. They're just providing you with a way to say, hey, give us a domain and we'll tell you if it's trustworthy or not. It's really hard for you to tune that tool because, again, they've used neural networks, and a neural network is a black-box algorithm. You can't open up the hood and start tweaking it. So you pretty much get what you get with them. That doesn't mean they're not actively trying to improve their product, but it's not very friendly from the user perspective. And honestly, if somebody put a gun to my head and said, you have to start an AI machine learning company tomorrow, the best way to do that is to figure out a simple approach. Don't hit it with a sledgehammer; involve the user in the process. You know, I have a friend who wanted to use neural-network-style approaches to help out fishermen in Iceland. He's from Iceland, and he wanted to help them figure out where the fish are. And the first thing they said was, who the hell do you think you are, you 26-year-old, whatever? I've been fishing for 30 years. Of course I know where the fish are. And it's like, okay, whoa, whoa, whoa.
Like, I never said you didn't know what you're doing. How about instead I provide this product where you can feed it input data, and it can change its behavior based on the extra training data you give it? Which brings up this other important point of online versus offline machine learning, right? So offline is like: you have a giant pool of data, you feed it through your neural network, and then you've trained it and it can make decisions into the future. That's really not the direction the world is heading. You see, like, Amazon uses a recommender system. That's a neural network. It determines what products you'd be likely to buy next, and it keeps updating on what you do. That's an online algorithm, because if you ever notice, you click on a blender and you buy a blender, and then they think you're becoming a connoisseur collector of blenders. And yeah, so it doesn't have a great ability to make the common-sense judgment calls that a human being would make. You're like, okay, no one in their right mind is gonna buy more than one blender a year. But they can't train the recommender algorithm to do that. If a recommender algorithm knew that, it would be closer to AI. It would be able to make contextualized claims about things; it would be much broader in scope. But they don't really need that, because people aren't gonna stop using Amazon just because they got recommended 50 different types of blenders. And so what you would wanna do is go back to this fisherman and say, okay, hey, Mr. Fisherman, I have something that will help point you towards the fish. You don't always have to use it. You pay for it monthly. And when you find fish and you have great fishing spots, you plug in the date and time and how many fish you caught into this algorithm, and it will only improve, and it'll improve with your skill.
So the better you are, the better this algorithm will become. And no one else will be able to see the data; you just benefit from it yourself. Well, that's a lot easier of a sell, because now you're bringing in the expert and their opinion to help change something online. So that's an online algorithm. Same thing with the security professional, right? Now you've got a security analyst, maybe a tier-three analyst, who knows their stuff. They've been on the watch floor for a while. They want that kind of control. They wanna feel like they're using their expertise. They don't want it all abstracted away from them. So if you have this product that just says, oh, domain bad, then, you know, they're gonna be hesitant to trust it over their own judgment, over their own analysis. And you do that long enough, long enough that you find false positives, which you're gonna find. Then after a while, people get tired of it and they stop using it. You just see it: when people feel like they can't work with the tools they've been given, and they can't change them to suit their needs, they just stop working with them. So yeah, I kind of went off on a tangent there. But yeah, I said Microsoft, I said Red Canary. It's a shark market. I think I saw recently that something like 75 or 80% of the machine learning AI startups are gone within two years. They come up, they get their seed funding, they try their best, and they either get bought up or they fail. And it's really quick, which means it's not a buyer's market. You don't want to be in that position. You don't want to be holding a tool when the company goes out of business. So, yeah. Cool, man, anything else? No, I took a ton of notes. So this has been awesome. Thank you for doing this. I appreciate all of it, even the whole background and everything. Okay. Yeah, all right, cool.
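The "improves with every report" pitch is just online learning: update the model incrementally with each new observation instead of retraining over the whole history. A minimal sketch of the fisherman's version, with invented spot names and catch numbers:

```python
# Online learning, fisherman's edition: the "model" is a running average
# of fish caught per spot, updated in place with every reported trip.
class SpotModel:
    def __init__(self):
        self.trips = {}   # spot -> (number of reports, mean catch)

    def report(self, spot, fish_caught):
        n, mean = self.trips.get(spot, (0, 0.0))
        n += 1
        mean += (fish_caught - mean) / n   # incremental mean update
        self.trips[spot] = (n, mean)

    def best_spot(self):
        return max(self.trips, key=lambda s: self.trips[s][1])

model = SpotModel()
for spot, caught in [("north-bay", 12), ("reef", 40), ("north-bay", 18), ("reef", 30)]:
    model.report(spot, caught)
print(model.best_spot())   # reef (mean 35 vs. 15)
```

The same shape applies to the security analyst: each confirmed true or false positive feeds back into the model, so the expert's judgment keeps tuning the tool instead of being abstracted away from it.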
I'll send you, actually it's not a lot. This has been such an absolute wealth of knowledge. Are you like totally cool with me putting this conversation on YouTube? Is that weird? I'll hang out. I lost, I feel like I lost your audio for one second. What was that? Can you hear me? Yeah, yeah. I will caveat that by saying that I am in no way a machine learning professional, expert, anything. I have just picked this stuff up by being in an environment surrounded by people who do that stuff in an academic setting, not really in an industrial setting. So there is a decent chance that 75% of what I told you is totally wrong, but I don't think so. So maybe, I don't know, cut out that chunk and put it at the beginning and say, like, if there are children in the room, kick them out. Yeah, I mean, if you're cool with it, I think that would be really cool. Sweet, thank you, thank you. Yeah. Awesome. Well, this has been fantastic. I pinged Gerald kind of on the same conversation to see if he would have anything to offer, because I am by no means an AI dude or machine learning lover, but this helps. Yeah, I mean, he'd be a good resource. I know he's put together these pipelines before, and so he's got more of an eye towards, hey, how difficult is it actually to set this type of stuff up? How hard is it to actually put something in production? Because I didn't even touch on what it's like to put something in production. That's a totally different nightmare. But anyways, I just pasted my notes. It's just kind of a framework of what we talked about, and it's roughly in temporal order. So maybe just start from the top and work your way down, and it'll be the same order as what we talked about in the video. Awesome, awesome, awesome. Thank you so much, dude, I appreciate it. Cool, man. It took a little bit of time, but thanks so much. All right, thanks, John. Take care. Yep, appreciate it. Bye.