Hello everyone. This week we are going to talk about searching algorithms. This class is called Introduction to Data Structures, and so far we have worked with some data structures: an array-based list and a linked list. We also talked about some abstract data types: a queue, a stack, an unordered list, and a deque. But a big part of data storage is being able to search it, so we will cover the basic search algorithms, and there really aren't that many. Then we will use our discussion of search to talk about a way of doing searching really fast with a particular data structure. But let's start out with really simple searching and how it works.

Throughout this week I'm going to be referring to the search handout. When I talk about the search handout, I mean this thing, so you should have a copy of it up somewhere. In it you will find the algorithms as well as the exercises that we will do during these lectures.

First of all, let's define what we mean by searching in the context of data structures and algorithms. A search algorithm determines whether an item is present in a collection of items. Easy enough: is it there or is it not, true or false. Many search algorithms in computing will also return the index of the item in a list. Where is it in your list? This is going to be an integer. Is it at index zero? Is it at index three, index four? Just knowing where it is in the list can be useful, right?
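As a small illustration of the two kinds of answer described above (is it there, and where is it), here is a minimal Python sketch. It leans on Python's built-in `in` operator and `list.index`, which the lecture itself does not use, and the helper names `contains` and `index_of` are made up for this example.

```python
def contains(items, target):
    """Answer the yes/no question: is target anywhere in items?"""
    return target in items


def index_of(items, target):
    """Answer the 'where' question: index of the first match, or None."""
    try:
        return items.index(target)
    except ValueError:
        # list.index raises ValueError when the target is absent
        return None


fruits = ["pear", "apple", "fig"]   # hypothetical data
print(contains(fruits, "apple"))    # True
print(index_of(fruits, "apple"))    # 1
print(index_of(fruits, "kiwi"))     # None
```

Returning `None` for a missing item mirrors the convention the handout's pseudocode uses for "not found".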
So that's all searching is: you're looking through a collection of items to find out if the thing is there, and the only collections we care about right now are lists. So we're going to talk about searching list data structures. There are really only three approaches, pretty simple and pretty intuitive, at least the first two are. The first is the sequential or linear search, which we'll talk about in this video, then a smart sequential search, and finally a binary search, which is probably quite different from any algorithm you've seen so far.

Okay, so what is a sequential search, or linear search? Well, it's how you look for a book on a bookshelf that's not sorted or organized in any way. If you want to find a book in this list of books, what would you do? You start at one end or the other, and you go down the list and say: is this it? Is this it? Is this it? You just go in order; you go sequentially, or in a line. That's the premise of sequential or linear search. When I say unordered list, I mean the values are in any order whatsoever. They're in the order that you put them into the list, whether you used appending or adding or inserting, and maybe you've removed stuff. The order of the items in the list is based on the methods that you called, not on their values. So there's really no better way to search an unordered list than to start at one end, at the start, and just go down the list.

So here's the algorithm for it. You can find this algorithm on your search handout, and it is written in pseudocode. This is not valid Python. This is not valid Java. This is not valid anything. It's pseudocode, but it tells you how the algorithm works. One of the things we're going to do in this module is get practice tracing algorithms, so when in doubt about how these algorithms work, go to the pseudocode. So what are we mostly interested in with these algorithms?
So what we're interested in doing is finding the best search. How do we determine how good a search is? Well, this is an algorithm, so we can apply a big-O analysis to it, and if you did the big-O analysis of this algorithm, you would very quickly find that it is O(n), where n is the number of things in the list.

What I will ask you to do when you're thinking about searching is to count the number of comparisons that we do as we search. Whenever you search, you have a target, the thing you're looking for. A single comparison is when you compare this target to an element in the list. That is one comparison. When you go to the next element of the list and compare the target to it, that's a second comparison. So when you are analyzing search algorithms, I'm going to ask you in your homework to count the number of comparisons you have to do between a target and an element in the list.

When it comes down to the algorithm, a single comparison basically equates to a single iteration of a loop. All of the search algorithms we're going to talk about this week run in a loop, so one iteration of the loop is equivalent to one comparison, and that's going to match up with the big-O, T(n) kind of counting that you do. It's the looping and the size of the list that are going to dominate how efficient these algorithms are. Let's not get wrapped up in that too much at this point. Let's do it by example, and let's count and make sure we understand how the different search algorithms operate. Like I said, they're mostly straightforward except for binary search, but let's make sure that we know them, because you have to know them, right?
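To make the counting rule concrete, here is a hedged Python sketch of a sequential search that also reports how many target-to-element comparisons it made. The loop follows the handout's pseudocode as the lecture describes it; the `comparisons` counter and the function name are additions for illustration. The list is a stand-in: 5, 332, 6, 8, 91 and the final 3 are values the lecture mentions while tracing, and the two 123s in between are placeholders, not the handout's actual data.

```python
def sequential_search_counted(a, target):
    """Return (index or None, number of comparisons made)."""
    found = None
    loc = 0
    comparisons = 0
    while loc < len(a) and found is None:
        comparisons += 1          # one loop iteration == one comparison
        if target == a[loc]:
            found = loc
        else:
            loc += 1
    return found, comparisons


# Stand-in list; the values at indexes 5 and 6 are assumed placeholders.
a = [5, 332, 6, 8, 91, 123, 123, 3]

print(sequential_search_counted(a, 332))  # (1, 2)
print(sequential_search_counted(a, 91))   # (4, 5)
print(sequential_search_counted(a, 22))   # (None, 8)
```

Each pass through the `while` loop performs exactly one `target == a[loc]` test, which is why counting iterations and counting comparisons give the same number.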
You've got to know how these different things work. On the third page, I believe it is, of your handout you will find this example list, and we're going to run through some examples on it. So switch over to your page three and let's do these examples.

So we had a brief technical failure there. I had intended to do this part on the whiteboard, so I will be doing this interactively in the slides instead, but I encourage you to follow along on the third page of your handout and do what I'm doing here. We're going to trace the linear search algorithm, also known as the sequential search algorithm, and it's really quite intuitive.

Here's the algorithm on the right, and this is on the first page of your handout. A is the list of things we want to search. In this case I'm saying that A is a Python array-based list, but this same algorithm will work just fine for a linked list, though you'll have to change it a little bit to handle .next and things like that. N is the number of elements in the list. Where am I getting that? I'm just getting it from this description right here. So what's N? How many elements are in A? I believe that there are eight elements in A. Target is the item that we want to search for. I'm going to be doing this first example here, sequential search of an unsorted list, and let's look for the number 332. Sorry, this is going to be a little dicey; I'm used to doing this on the board or on paper, not on a computer.

All right, let's look at the algorithm. Set found to None. Again, this is pseudocode, it's not Python. Set found to None and set the loc variable to zero.
These are all variables that we're dealing with here. While loc is less than N, so while the location is less than the number of elements in the list, and it's not been found (found is None, so it's not found yet), do the following. If the target we're looking for is equal to A[loc], set found equal to that location. So this is an algorithm that is not returning true or false; it is returning the index, where the item is in the list. You can very easily convert that to a true or false: is found an index, zero or greater, rather than None? Then it's true, you found it. If the target is equal to the item at this index, you found it, hooray. Otherwise set loc equal to loc plus one; in other words, move it up a slot.

So let's step through this, and our target here is 332. For those of you who struggle with your debugging, I know this exercise may be a little painful, but please follow this and make sure you can step through an algorithm by hand. It's an important skill to have.

While loc is less than N: zero is less than eight, so loc is less than N. And not found: yeah, we haven't found it. If target equals A[loc]: is 332 equal to A[0]? Loc is zero, so the location is basically right here. Is this element equal to the target? No, it is not. So what do we do? Set loc equal to loc plus one, so move the location up a slot to here.

Now we loop back around. Loc is still less than N and we still haven't found it. If target is equal to A[loc]: is 332 equal to A[1]? Why yes, it is; it is here. Set found equal to loc. My location is one, so found is one. And then we go back up to the while loop. It's important that you understand this when you're tracing loops, right?
I have found it, but now I go back up to my while loop and check the condition. While loc is less than N: it is, one is less than N. And not found: ah, but I have found something, found is one. So I exit my loop and I return found. That's it. It kind of climbs up the list sequentially looking for the target.

Let's do the next example. This time our target is 91, and we're going to reset our algorithm and trace again. You're like, "Dr. Lehmann, I know how this search is going to go." It's still really important that you get the skills of tracing down, of seeing how individual variables change as you go along. So we'll go through this one in detail, and for the next one we'll skip ahead a little bit.

Set found equal to None, set loc equal to zero; my target is 91. While loc is less than N, that's true, and not found, do the following. Is the target equal to A[loc]? No. So what do we do? Move a step: loc gets loc plus one. Loop back around. Loc is still less than N and it is still not found. Is the target equal to this? Nope. Move the location again. Loop back around; both of our loop conditions are still met, so we compare: is the target, 91, equal to this? Nope. Okay, we step. Let's shorten it up now. Is the target equal to this? Nope. Move again.

Our loop conditions are still met. Basically, "while loc is less than N" means: have you walked off the end of the list? If not, keep looking. And "not found" means: have you found it? Once you've found it, stop. That is maybe an important thing to call out: linear search stops when it finds the first instance of a thing. So the target is 91. Is A[loc] equal to the target? Yes, it is. So what do I do?
Set found equal to loc. Found is four. I exit my if/else block, come back up here, and while loc is less than N, it still is, but it is now found. So I'm done. I exit the loop, come down here, and return found.

But back to the point: let's say we were searching for 123. It would get here and it would stop. The search algorithm is going to stop when it finds the first occurrence of the target in the list, and that's okay.

So how many comparisons did we do? For 91, we first compared the target to 5, then 332, that's two comparisons, then 6, 8, and 91. So in total I did five comparisons to find 91, including the comparison to the actual value.

Now let's look at our next target, which is going to be 22. As you quickly scan this list, you'll see of course that 22 is not in the list. The computer doesn't know that. The computer is not omniscient; it can only perceive things one bit at a time, literally. So what's going to happen here? Let's count the comparisons again. You know how linear search works now. Loc is here, and our target is now 22. Is 22 equal to this? No. Move where we're looking up a spot. Is it equal to this? Nope. No, no, no, no, no. Okay, we're here.

So now what happens? We're at this last part of the loop, and let's say we're just getting ready to inspect location seven. We start our while loop: is loc less than N? We're at location seven, seven is less than eight, and we still haven't found it. If target equals A[loc]: it does not. Three is not equal to 22. Then loc gets loc plus one, so loc is eight. So loc is effectively saying "here," but there is no here, right?
But we come back up to the while loop. While loc is less than N: is eight less than eight? No, eight is equal to eight. Less than or equal to would pass, but that's not the condition. Is eight less than eight? No, so the loop stops. We jump down to the end and we return found. So when we return found here, what are we returning? None. It's not there; we didn't find it. So if I get None back from the search algorithm, I know the thing is not in the list.

So that's the premise of sequential search. It's very simple. You start at the beginning and you just climb your way up and you say: are you it? Are you it? Are you it? Yes or no, and if you found it, return the index of that thing. That's all there is to it.

All right, let's answer some of these questions. How do you know, in looking at this algorithm, when the target is not present? Well, when loc is equal to N, you know that the target is not present. That's the case that we just ran into here.

Second question, and you need to know the answers to these questions. Given this pseudocode, the linear search compares two objects for equality. What method must be defined in a Python class for it to be comparable with the equality operator? If you had a list of, say, bank accounts, and you want to search through it and ask whether A[loc] == target, and the list is full of bank accounts and your target is a bank account, what method do you have to define on the bank account class to compare them using the == operator? Well, that is the __eq__ method: underscore underscore eq underscore underscore.

If the data is sorted, can we improve the efficiency? Think about that for a second: if this list were in order, could we do better? Well, the answer is yeah, of course we can. That's why we have two other search algorithms, right?
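The bank account question can be sketched in Python roughly as follows. Everything about `BankAccount` here is hypothetical (the lecture names no fields), and treating two accounts as equal when their `account_id` matches is an assumption made for the example. The search is written with a `for` loop and an early return, which is a compact variant of the handout's while-loop pseudocode rather than a literal translation of it.

```python
class BankAccount:
    """Hypothetical class; the fields are invented for this example."""

    def __init__(self, account_id, balance=0):
        self.account_id = account_id
        self.balance = balance

    def __eq__(self, other):
        # Called when two accounts are compared with ==.
        if not isinstance(other, BankAccount):
            return NotImplemented
        # Assumption: accounts are "equal" when their IDs match.
        return self.account_id == other.account_id


def sequential_search(a, target):
    """Return the index of the first element equal to target, else None."""
    for loc, item in enumerate(a):
        if target == item:       # uses BankAccount.__eq__ for accounts
            return loc
    return None                  # walked off the end: not present


accounts = [BankAccount("A1", 50), BankAccount("B2", 75), BankAccount("C3")]
print(sequential_search(accounts, BankAccount("B2")))  # 1
print(sequential_search(accounts, BankAccount("Z9")))  # None
```

Without `__eq__`, Python falls back to identity comparison, so a freshly built `BankAccount("B2")` would never match the one already stored in the list.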
We'll talk about those in a little bit. Now, if you're filling in your search handout, we're on to the bottom of the second page. The slide and the chart are asking the same thing: how many comparisons are required when the target is present in the list? Again, a comparison is comparing the target to a single, individual element of the list.

In the best case, the target is in the list; where is that target going to be? In a sequential search, the best-case scenario is that the target is in the first slot, the slot at index zero. If it's in that first slot, you only need to do one comparison. So in the best case, it's one comparison.

In the worst case, let's suppose that the target is still in the list. What's the worst-case number of comparisons you have to do? Well, the best case is that it's at the beginning of the list and you do one comparison. The worst case is that it's at the end of the list. If you have n items in the list, that means you need to do n comparisons in the worst case.

Now suppose the target is not in the list. You're searching for something, but it isn't there. Well, how do you know, using sequential search, if an item is not in the list?
Remember the answer to our previous question: you know it is not in the list if loc is equal to n. What that is telling you is that you've got to go through the whole list to know that an item is not in it. So there's actually no distinction between the best-case scenario and the worst-case scenario here. In both cases, you have to go through n items to find out.

And that's the downside of sequential search. It's an immensely simple algorithm, but if you have a million items, 10 million items, a billion items, the only way you know for sure that something is not in the list is to go through and check all billion items. That could be slow, especially if you're doing that check a lot, if for some reason you're searching a lot. If it's only a couple thousand items, yeah, whatever, you're not going to notice that. But if you get into really big data, searching for something that's not there could be a problem.

So that's sequential search. Next video we will improve upon this a little bit with a smart sequential search, but we have to transform the data before we can do that. We'll talk about that next time.
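The best-case, worst-case, and not-present counts can be checked with a short Python sketch. The helper name `comparisons_needed`, the list size, and the use of `range` to build the list are all choices made for this illustration, not something taken from the handout.

```python
def comparisons_needed(a, target):
    """Count target-to-element comparisons made by a sequential search."""
    comparisons = 0
    for item in a:
        comparisons += 1
        if item == target:
            break                 # stop at the first match
    return comparisons


n = 100_000                       # illustrative size only
a = list(range(n))                # values 0 .. n-1

best = comparisons_needed(a, 0)          # target in the first slot
worst = comparisons_needed(a, n - 1)     # target in the last slot
absent = comparisons_needed(a, -1)       # target not in the list

print(best, worst, absent)               # 1 100000 100000
```

The absent case costs the same as the worst present case, which is the point the lecture makes: a sequential search can only rule an item out by looking at all n elements.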