Welcome to Intro to AI: measuring similarity using Euclidean distance. I'll explain what that means a little bit later. Basically, what we're looking at today is how to measure the similarity between two things. I'm not talking about two objects and their length, width, and height, even though I have a ruler on the screen. What I'm talking about is the similarity between, for example, people. If we were trying to determine whether two people are compatible, or who one person is most compatible with in a group, how would we measure that? We're going to do that by measuring distance.

I'm going to show you what that means. Taking a look at what's on the screen now, you see that there is an arrow at 10 centimeters and an arrow at 20 centimeters. There are inches up above for my fellow Americans. Just by thinking about this, you can see the distance between those is 10 centimeters. So it stands to reason that something at 15 would be closer to 10 than something at 20 is. We're basically just measuring distance. We're going to see how this works in AI and machine learning. This is kind of a precursor to clustering, which hopefully I'll get to some day.

So let's take a look at some code and what we're going to do with this particular problem. Let me make that text a little bit bigger for you, just in case; I'm not sure how big your screens are. You'll see here that we're trying to determine the compatibility between people, and we've got three questions. Now, we could have a hundred questions. We could have a thousand questions. It doesn't really matter. But whatever number of things we're measuring, we need to measure them in a consistent way first. So I went on the internet, I googled like most people do, and I found this thing on Quora. I'll put the link down below.
It's here as well. The three things that most highly correlate with compatibility are the ones you see here on the screen. One: I like horror movies. Two: I want to go travel abroad on my own. And three: I want to go live on a sailboat. I'm going to throw it all away and live on a sailboat. What I've done here is add what's called a Likert scale, a scale of one to five: one is strongly disagree, two is disagree, three is neutral, four is agree, and five is strongly agree.

So let's talk about distance here. I've created three people, person A, person B, and person C, and right now I'm only looking at "I like horror movies." You can see here that person A strongly agrees, person B strongly agrees, and person C strongly disagrees. Just by looking at this, common sense tells you that, at least in this aspect, person A and person B are far more similar. They're far more compatible. But how do we do this mathematically?

What we can do is measure the distance. For example, person A to person B: we'll call this distance AB, and it equals person A at the zero index of the list minus person B at index zero. Then we'll do the distance from A to C, which equals person A at index zero minus person C at index zero. Then we'll print those out: distance AB and distance AC. And yes, I know that it's very easy to calculate in your head. We'll get to that in a second; we're going to add a few more features, or attributes. I'm going to go ahead and compile it. Not compile, I'm going to run it. I've been teaching a lot of Java lately.
So I'm thinking compilation. You can see here that the mathematical distance between five and five is obviously zero, and the mathematical distance between five and one is of course four. Looking at this, we can see that the lower number, zero, indicates a closer compatibility, at least in that particular area.

Now I want to do something here real quick. I want to reverse these just to show you something, so it's person C minus person A. Let me go ahead and just copy that. I want to show you what happens when we do this, and if you think about the math, it's pretty straightforward. Oops. Okay, let's go ahead and run that. If we run this again, you can see we get negative four, which is not what we want, because negative four is less than zero. That's not what we're looking for here.

So watch what I do here. We're going to end up squaring this, and I'll come back to why in a few minutes. We're going to square that, and then we want to take the square root of that as well. You'll see why in a minute. I think we can do math.sqrt; I'll have to import math here in a second. Let's go ahead and import that real quick: import math. Okay, let's go ahead and test that and make sure we get the zero and the four. Notice how they turned into floats, 0.0 and 4.0, which is actually more what we want anyway. So in this case you can see that this gives us basically the same answer we got before.

That's with a single attribute, which makes it pretty simple. So let's go ahead and add our second attribute, which is "I want to travel abroad on my own." Let's say person A is really adventurous and really wants to go ahead and do that. Person B is not so sure that's really the way they want to go. And person C is really adventurous and wants to do that.
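Here is a rough sketch of the single-attribute step described above, before the second answer is added. The variable names (person_a, distance_ab, and so on) are assumptions, since the video only shows them on screen; the values 5, 5, and 1 are the horror-movie answers read out earlier.

```python
import math

# Assumed names; each list holds one Likert answer (1-5) for "I like horror movies".
person_a = [5]
person_b = [5]
person_c = [1]

# Plain subtraction depends on the order and can go negative.
raw_ac = person_a[0] - person_c[0]
raw_ca = person_c[0] - person_a[0]
print(raw_ac, raw_ca)  # 4 -4

# Squaring and then taking the square root removes the sign.
distance_ab = math.sqrt((person_a[0] - person_b[0]) ** 2)
distance_ac = math.sqrt((person_a[0] - person_c[0]) ** 2)
distance_ca = math.sqrt((person_c[0] - person_a[0]) ** 2)
print(distance_ab, distance_ac, distance_ca)  # 0.0 4.0 4.0
```

With one attribute this is the same as taking the absolute value of the difference; the square-and-root form is used because it carries over once more attributes are added.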
Okay, so now it's getting a little more complicated. Instead of calculating everything manually, we're going to make a function to do this. But what we want to do now is take this distance, square it, add it to this distance squared, and then take the square root of that. And the same thing for the other pair: we take the distance here and square it, take the distance here and square it, add them, and take the square root.

If you think about it, that basically comes out to math.sqrt of the first difference squared plus the second difference squared. I'm not sure how best to write that out here. This is basically the same thing you would do with the Pythagorean theorem, and this is why we call it Euclidean distance. It's basically a squared plus b squared, where a is the distance here and b is the distance here. So we take a squared plus b squared and then take the square root. If we plotted this out on a two-dimensional Cartesian plane, that would give us the actual distance in 2D space.
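To make the two-attribute version concrete, here is a small sketch written out by hand, with no function yet. The variable names are assumptions; the answers (5 and 5 for person A, 5 and 2 for person B, 1 and 5 for person C) are the ones used in the video.

```python
import math

# Assumed names; index 0 is "horror movies", index 1 is "travel abroad on my own".
person_a = [5, 5]
person_b = [5, 2]
person_c = [1, 5]

# Pythagorean theorem: square each difference, add them, take the square root.
distance_ab = math.sqrt((person_a[0] - person_b[0]) ** 2 +
                        (person_a[1] - person_b[1]) ** 2)
distance_ac = math.sqrt((person_a[0] - person_c[0]) ** 2 +
                        (person_a[1] - person_c[1]) ** 2)

print(distance_ab)  # 3.0
print(distance_ac)  # 4.0
```

If each person is treated as a point on a 2D plane, these are the straight-line distances between the points.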
Okay, so let's go ahead and create a method to do this. So def, we're going to define a method, and we'll call it distance. It takes two people, so we'll do p1 and p2, short for person. What we need to do is iterate, so for i in range of the length of p1. Now, this assumes that the length of p1 and the length of p2 are the same. We'd call the running value sum, but I'll say total, because sum is already a built-in name in Python. So total plus-equals p1 at i minus p2 at i. I'll just leave it like that for a second. This is the same as person one's answer minus person two's answer. Now remember, it's squared, so we need to square that. It's the difference that gets squared, and we add that up into the total, and then we return the square root of the total.

Let's try that. We're going to say distance AB equals distance of person A and person B, and distance AC equals distance of person A and person C. Okay, let's go ahead and run it and see what happens. We got an error on total. Okay, so we need total equals zero; we have to set total to zero first.

So this gives us distances of three and four. For A and B, this minus this is zero, and zero squared is zero. Five minus two is three, three squared is nine, and the square root of nine is three. For the other one, five minus one is four, and four squared is sixteen. Two minus five is negative three, and that squared is nine, so sixteen plus nine is... what's that? Did I count that right? Did I do that correctly? That should be 16. Is that not correct? Plus-equals, person one, person two, total... that's right. Am I calculating the... oh, duh. It's person A minus person C. Got it. Okay, so five minus one is four, and that's squared; five minus five is zero. So this gives us the square root of four squared, which is four. My bad there.

Let's go ahead and do the last distance, BC. BC equals distance of person B and person C. We'll print that out. Let's go ahead and run that.
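A minimal sketch of the function as it is described above, assuming the parameter names p1 and p2 and the running value total from the narration. The exact on-screen code may differ in small details.

```python
import math

def distance(p1, p2):
    """Euclidean distance between two equal-length lists of numbers."""
    total = 0  # must be set before the loop; leaving this out caused the error
    for i in range(len(p1)):
        # Square each difference so the sign does not matter, then accumulate.
        total += (p1[i] - p2[i]) ** 2
    return math.sqrt(total)

# Assumed variable names; the answers are the two-attribute values from the video.
person_a = [5, 5]
person_b = [5, 2]
person_c = [1, 5]

print(distance(person_a, person_b))  # 3.0
print(distance(person_a, person_c))  # 4.0
print(distance(person_b, person_c))  # 5.0
```

Because the loop uses the length of p1, the same function keeps working when more answers are appended to every list.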
Okay, so you can see A and B are still the most compatible. A and B are pretty similar: they have a five and a five, and a five and a two. A and C have a five and a one, and a five and a five. So that kind of makes sense. Here we've got a distance of four, and here we've got a distance of three, so it makes sense that A and B are still the most compatible.

Now let's go ahead and add in "I want to live on a sailboat." Let's say we've got a one here, a five here, and I'll just try a four here and see what happens. Notice I've just added an attribute, but because I'm using the length here, it doesn't matter. I can add as many attributes as I want, as long as they're lined up and the scales match up. So let's go ahead and run that. Well, look at that. We kind of got a tie. Maybe those weren't the best numbers to have chosen. Let's go ahead and make this a two and see if we can get a little bit of difference here. Okay, there we go. We still see that, by a slight margin, A and B are the most compatible. Let's go ahead and put this down to a three and see if we get some better numbers. Okay, now we see that A and C are the most compatible, because their distance is 2.8, and then here we've got 4.6 and 5.0. Looking at the raw answers, it's not obvious which pair is more compatible, because the numbers are all over the place.

Now, something to check, just to see if your algorithm is working correctly: I would set some of these to the same values. Maybe that should be a one, since we're not using that one. And one. If we run this, we should see a zero distance here, since those two people are now identical, and the same distance between A and C as between B and C. I think that's probably four. Or maybe not. Oh, 6.9. Okay. Who knew?
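The sanity check can be sketched like this. The exact values typed in the video are not read out, so the lists below are an assumption that reproduces the reported results (a zero and two matching distances of about 6.9): two identical people with all fives and one person with all ones. The list names are made up for this example.

```python
import math

def distance(p1, p2):
    """Euclidean distance between two equal-length lists of numbers."""
    total = 0
    for i in range(len(p1)):
        total += (p1[i] - p2[i]) ** 2
    return math.sqrt(total)

# Hypothetical test values: two identical people and one who differs by 4 everywhere.
same_1 = [5, 5, 5]
same_2 = [5, 5, 5]
opposite = [1, 1, 1]

d_same = distance(same_1, same_2)
d_far_1 = distance(same_1, opposite)
d_far_2 = distance(same_2, opposite)

print(d_same)                                # 0.0
print(round(d_far_1, 2), round(d_far_2, 2))  # 6.93 6.93
```

The guess of four is the difference on a single question; with three questions the squares add up before the square root is taken.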
So you can kind of see, yeah, this one is the same. This one is different because it's 16 plus 16 plus 16, and the square root of that, and then the same thing for the other one: 16 plus 16 plus 16, square root of that. Is that right? Yeah, 16 times 3 is 48, and the square root of 48 is about 6.9. Yeah, that sounds about right. Hey, we're pretty close on that. So this is a way to measure compatibility.

Let me go ahead and put those back to where they were. It's a little bit more interesting, I think, with quite varied numbers. Now, what I'm going to do is assume that these people are my choices, and find the most compatible. So I'm going to put them into a list. I'm going to make a list called people, and we've got person A, person B, and person C. Then I want to make a person called me, and I'm going to put in my information. I like horror movies? No, I do not like horror movies at all. I want to travel abroad on my own? Yeah, I'm going to make that a four. And I want to go live on a sailboat? Oh god, yes. I want to make that a five.

So now what I want to do is find the lowest compatibility... or I should say the highest compatibility, which means the lowest distance. I'm going to say closest equals 0.0. Actually, no. Closest equals distance of person A and me. So I'm going to assume that person A is the closest, because it's possible. Then I'm going to do the following: for person in people, so I'm going to iterate through the list. Let's say d equals distance of person and me. And then if d is less than closest, we'll say closest equals d. And then at the end, when it's done, we'll say print "The closest person is" and then... this is actually just going to give us the number. We'll deal with that later.
I guess, if we have time. So it's print, then closest. Let's run that and see what happens. All right, so the closest person has a compatibility distance of 1.41, which is pretty good.

So let's go ahead and do the following. We'll say names equals person A, person B, person C. Again, we would normally do this with objects and things. Then, where we have closest equals d, what we've got to do is keep track of the index. So closest i equals zero; we're going to assume it's the first person in the list. I should probably make the starting distance use people at zero. Again, these are just lists; if you're not familiar with lists, this is probably not the video for you. Then we'll say closest i equals... oh, I can't do that, because the loop says for person in people. So instead it's for i in range of the length of people. Apologies, this is getting a little out there. So person equals people at i. There we go, and I'll fix that. Okay, let's try this again. In the print, let's see, it's names at closest i. I should have used objects here. All right, let's see if this works.

Okay. So according to my calculations, the closest person is person C, with a compatibility ratio, or compatibility distance, or whatever you want to call it, of 1.41, which you can see is much closer than those guys were to each other earlier. Let's see why that is. I have 1, 4, 5. This person has 1, 5, 4. So you can intuitively see that I am much closer in compatibility with this person than I am with the other people. That's kind of how it's done.

I know it's a little bit quick, and if you're not up on your lists and iteration, it might be a bit of a challenge to follow what I did there, so let me walk you through it real quick. The idea here is: how do we measure Euclidean distance
over a range of attributes. Now again, there are a couple of assumptions here. In this case, all the attributes use a similar range of numbers, and they're equally weighted. The weight of this is the same as the weight of this, which is the same as the weight of that. We're ignoring weights. And then we assign a numerical value to each of these, and it's a numerical value that makes sense, I think.

So we created three people and gave them certain attributes. We're just using lists here, nothing fancy. We measured the Euclidean distance. I probably should have called this function Euclidean distance, but that's okay. What we did was take the sum of the squares of the differences: this minus this, squared, plus this minus this, squared, plus this minus this, squared, and that whole thing square-rooted. That gives us the Euclidean distance. Again, if these were points in 3D space, it would give us the distance between those points in 3D space. This is why it's called Euclidean distance: it comes from the 2D Euclidean plane, which extends to 3D, and which can extend to any number of attributes.

And then here we just measured the distance manually. I'm going to comment that out. We just calculated it to test it. Then what we did here was make a list of three people, person A, person B, and person C, and the names line up at the same indexes: zero, one, and two. I answered the questions for myself, and these are accurate; these are my real answers. And so we're trying to find the closest person. We start out and we say, okay.
Well, it could be person A. So let's get that distance, and we keep track of the closest person's index, which is going to be zero. Then we iterate through each of these, and for each person we calculate the distance between that person and me. If that distance is lower than what I already have, then we say the closest is d and the closest index is i, to keep track of the index. Again, I should have done this with an object, but we live and learn. Then, once we've iterated through all the possibilities, we print out the result. This is a common pattern. If we run it, we get that the closest person is person C. And we looked at the numbers: 1, 4, 5 is clearly much closer to 1, 5, 4 than it is to 5, 2, 5 or 3, 5, 2, so intuitively it makes sense.

Now again, we could add as many attributes as we wanted, as long as there were comparable attributes for each particular person. And this gives us, like I said, the most compatible person based on these questions and these answers. So this is how we could find a potential match, or something that is similar. What's interesting is that once you learn this, you can use the same concept to look at images and how similar they are. Is image A similar to image B? Is image B similar to image C? This concept is extensible.

It's also something you would probably use if you're going to start doing clustering. One of the things we can do is measure this Euclidean distance and then use a technique, for example k-means, to find clusters and group by similarity. Now again, there are libraries that do this as well. I'm trying to show you this at the very raw, basic level, hopefully in a way that my high school students can understand: how we can do these things with just a little bit of math and a little bit of iteration. Okay. Anyway, that was that. Oops.
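Putting the walkthrough together, the whole search might look roughly like this. It is a reconstruction, not a copy of the on-screen code: the variable names (people, names, me, closest, closest_i) follow the narration, the answer values are the ones read out in the video, and which of 3, 5, 2 and 5, 2, 5 belongs to person A versus person B is inferred from the distances quoted earlier.

```python
import math

def distance(p1, p2):
    """Euclidean distance between two equal-length lists of numbers."""
    total = 0
    for i in range(len(p1)):
        total += (p1[i] - p2[i]) ** 2
    return math.sqrt(total)

# Answers on a 1-5 scale: horror movies, travel abroad alone, live on a sailboat.
person_a = [3, 5, 2]
person_b = [5, 2, 5]
person_c = [1, 5, 4]
me = [1, 4, 5]

# The two lists are kept in the same order so an index identifies a person.
people = [person_a, person_b, person_c]
names = ["Person A", "Person B", "Person C"]

# Start by assuming the first person is the closest, then look for anyone closer.
closest = distance(people[0], me)
closest_i = 0
for i in range(len(people)):
    d = distance(people[i], me)
    if d < closest:
        closest = d
        closest_i = i

print("The closest person is", names[closest_i],
      "with a distance of", round(closest, 2))
# The closest person is Person C with a distance of 1.41
```

As for libraries: in Python 3.8 and later, math.dist(p, q) from the standard library computes the same Euclidean distance, so it can be used to cross-check the hand-written function.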
Thank you for watching, and as I like to say, keep on coding.