There's a set of methods that you can use on arrays. Take sum: it exists both as an array method and as a function. The other thing Prabhu mentioned in the break is that sum is also a built-in function. If you start up a plain Python interpreter and don't import NumPy, there's a built-in sum you can use to total the values in any sequence, array or list. It will work on arrays, but it's a whole lot slower than the sum that comes out of NumPy, because NumPy's sum is specialized for numeric arrays. If you hit the question mark on sum here, it tells you it's coming out of NumPy. If I del sum and then look at sum again, that uncovers the original built-in: we shadowed it when we imported the one from NumPy, and deleting the NumPy one reveals the original. That one works, it's just slow on NumPy arrays. Let me make sure we have the right one back. Now, min: is it shadowed? No, so there are separate functions for min, and here we have it. If you do a.min(), you get the minimum value in an array; it takes the same arguments as the sum operation. There's also a special function called amin that is for arrays, so it doesn't shadow the original min; it defines its own version and takes the same arguments. min and max behave very much the same. The minimum value in here is zero, and in fact a.min() does find zero. Another handy thing: a lot of times you want to know the index of the minimum value.
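To make the method-versus-builtin distinction concrete, here's a minimal sketch (the array values are made up for illustration):

```python
import numpy as np

a = np.array([[0, 1, 2], [3, 4, 5]])

# np.sum and the .sum() method are specialized for arrays;
# the Python builtin sum() also works, just much more slowly on arrays.
print(a.sum())         # 15
print(np.sum(a))       # 15
print(sum([0, 1, 2]))  # the builtin, on a plain list

# min/max as methods, plus np.amin, which never shadows the builtin min
print(a.min())         # 0
print(np.amin(a))      # 0
print(a.max())         # 5
```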
You want that sometimes when you're indexing into multiple arrays: you find the minimum value in one and you want to get the corresponding values out of another. argmin and argmax will find that index for you. There are also a few statistical methods available. For the mean, a.mean(), you again specify the axis if you want it along a specific axis. Here we have some values, we call a.mean(axis=0) on our array, and indeed we've collapsed the rows, because we specified axis equals zero, and we end up with 2.5, 3.5, 4.5. a.mean() and mean(a) do exactly the same thing. Then there's also a function called average. It also calculates the mean, but it has an extra argument, so you can supply a set of weights and do a weighted average: it uses those weights to weight the individual rows, in this case, in the averaging process. There's standard deviation and variance as well. There are other methods that are useful too; you'll find you use clip quite a bit in different scenarios. clip lets you specify a low and a high threshold for your array. Imagine we have 1, 2, 3, 4, 5, 6, and we want all the values below three clipped up to three, and any values above five clipped down to five. Indeed, that's what a.clip(3, 5) does. There's also a clip function, so we can call clip(a, 3, 5) and it returns the same thing: any value below three set to three, any value above five set to five. Then there's the rounding method, a.round(). Note that it rounds half to even, so 1.5 and 2.5 are both rounded to two. You can also specify a decimals argument for the precision.
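A quick sketch of those statistical and thresholding methods (the arrays and weights here are my own examples):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# argmin/argmax give the index of the extreme value (flattened by default)
print(a.argmin())      # 0
print(a.argmax())      # 5

# mean along axis 0 collapses the rows
print(a.mean(axis=0))  # [2.5 3.5 4.5]

# average() adds a weights argument: here row 1 counts three times as much
print(np.average(a, axis=0, weights=[1, 3]))

# clip thresholds low and high values
print(a.clip(3, 5))    # [[3 3 3] [4 5 5]]

# round() uses round-half-to-even: 1.5 and 2.5 both go to 2
b = np.array([1.5, 2.5, 3.567])
print(b.round())             # [2. 2. 4.]
print(b.round(decimals=1))   # [1.5 2.5 3.6]
```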
And so when you say round to one decimal, instead of rounding off to an integer, it rounds off and leaves one place after the decimal point; you can specify however many places you want. There's one more method available on arrays: ptp, for peak to peak. It finds the minimum value in your array and the maximum value, and subtracts those, which you can basically think of as the dynamic range of your array. All right. Here's a bit of a review slide; we'll go through a few things. There are some basic attributes. I've left off some attributes here because they're more advanced; you don't need to deal with them unless you're dealing with the memory structure and that sort of thing. These are the basic ones you're going to use day in, day out. dtype, remember, is the type of the individual elements in the array. shape gives the dimension along each axis. size is the total number of elements in the array. itemsize is the number of bytes in a single element, nbytes is the number of bytes in the entire array, and ndim is the number of dimensions in the array: one, two, three, four, whatever it may be. Then there's a set of shape operations that morph the shape of your array. flat, flatten, and ravel are all similar. flat always returns a view, but it's not an array, it's an iterator. flatten always returns an array, but it's always a copy, never a view into your array. ravel might be either: it does whatever is most efficient from a memory standpoint, so it could return a view or a copy. reshape lets you specify a new shape, but if that shape doesn't have exactly the same number of elements as your original array, it's not going to work.
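Here's a short sketch exercising those attributes and the flatten/ravel distinction (the array itself is just an example):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# ptp() = peak-to-peak: max - min, the dynamic range
print(a.ptp())     # 11

# the day-in, day-out attributes
print(a.dtype)     # element type
print(a.shape)     # (3, 4)
print(a.size)      # 12 total elements
print(a.itemsize)  # bytes per element
print(a.nbytes)    # bytes in the whole array
print(a.ndim)      # 2 dimensions

# flatten() always copies; ravel() returns a view when it can
f = a.flatten()
r = a.ravel()
f[0] = 99          # does not touch a
r[0] = 99          # modifies a, since ravel returned a view here
print(a[0, 0])     # 99
```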
There is, though, a method called resize. Typically, when you create an array with, say, a hundred elements, it has a hundred elements; that's not going to change. However, the resize method lets you add extra elements, resize it to 110. There are gotchas here; this is not something that works all the time, and I'll show you. A is an array, and we can resize it to 15. That worked: it extended the array and added some extra elements on the end. The problem comes if I say B is equal to a slice of A, so B is a portion of it. Now B and A are pointing at the same location in memory with a different layout. What happens when you do a resize? NumPy says: I've allocated this chunk, but it's not large enough, so I have to free this memory location and allocate another one. But if you have multiple things pointing at that location, NumPy is not free to just go and delete that memory. So as soon as you have a view of the array and you try to resize it, it says it can't, because the array is referenced by other items. I haven't used resize very often, because it only works in a very small number of situations, and you end up with references to the same array quite a bit in NumPy. What I typically do instead, if I'm building up an array, is build up a list. If I'm reading a file, reading individual rows out of a data file to fill in an array, I read in each row and append it to a list, and end up with a list of rows.
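The resize gotcha above can be reproduced directly; this is a minimal sketch (the sizes are arbitrary):

```python
import numpy as np

a = np.arange(6)
a.resize(8)        # fine: nothing else references a's memory
print(a.shape)     # (8,)

b = a[:3]          # b is a view into the same memory
try:
    a.resize(10)   # numpy refuses: it can't free memory that b points at
except ValueError as e:
    print("resize failed:", e)
```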
And then at the very end, before I return the value, I just cast that list of lists to an array and end up with a 2D array that I can return. So resize is there, but not always that useful. swapaxes we didn't really talk about: it lets you swap two dimensions, so if you had something of shape (3, 4, 5) and you swapped the first and last axes, it would be (5, 4, 3). Then there's transpose, which lets you specify the order of axes you want to transpose; a.T is a shortcut for transpose. And a.squeeze removes any length-one dimensions from the array. fill and copy are there. We talked about coercion over here, being able to convert from one data type to another. NumPy handles big-endian/little-endian issues: you can tell it that, no, actually I want to byte-swap these values, and it will handle that for you. That's nice if you're going between machines with different byte orders, Motorola to Intel or whatever it may be. For complex numbers, we talked about asking for the real and imag attributes and taking the conjugate of the values. All right. There are also a few other methods we didn't talk about a whole lot, for saving out to a file. You can do a.dump(file), and that dumps a binary representation of your data straight to a file; dumps dumps it to a string. These are quick-and-dirty ways of saving your arrays out. The tofile method is a quick-and-dirty way of writing your array out to disk where you can specify the separator, so it writes out an ASCII set of data with space delimiters or whatever it may be, and you can specify a format.
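Here's a sketch of that build-a-list-then-convert pattern, plus the tofile ASCII output; the file name and the stand-in data lines are made up for the example:

```python
import numpy as np
import os
import tempfile

# Accumulate rows in a Python list, convert once at the end --
# far cheaper than repeatedly resizing an array.
rows = []
for line in ["1 2 3", "4 5 6", "7 8 9"]:   # stand-in for lines of a data file
    rows.append([float(x) for x in line.split()])
data = np.array(rows)                      # one cast: a 3x3 2-D array
print(data.shape)                          # (3, 3)

# quick-and-dirty ASCII output: space-delimited, %3.2f formatted
path = os.path.join(tempfile.gettempdir(), "demo_data.txt")
data.tofile(path, sep=" ", format="%3.2f")
print(open(path).read())
```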
If you have floating-point values and specify a format like %3.2f, it writes out each of your values with that format. These are not too high-tech ways of storing and saving data; there's a package called scipy.io that helps with reading and writing arrays to disk, and it's probably more useful in a lot of cases. All right, searching and sorting. We saw the where operation; a.nonzero() is basically the same thing. It just returns the set of indices anywhere the values are not zero. So if we do a = array([0, 2, 3, 0, 0]), a.nonzero() returns the indices of those two values, because they're the ones that are non-zero. Can you do a > 1? That creates a mask array, false/true, and you can call nonzero() on that, or on a < 1. That's basically the same thing as where on that same array. There's also sort, which does an in-place sort of the array; we saw list.sort yesterday and used it a whole lot. And there's argsort, which returns the list of indices in whatever the sort order ought to be, but doesn't modify your original array. If you take that list of indices and index with it, you remember this, fancy indexing, that gives you the same thing as your sorted array. So that's a handy way of doing it. searchsorted lets you take a sorted array and find out where values should go in that array, if you were going to insert a new value. That's a useful feature for histogramming, things like that. Then there are the element-wise math operations: we saw clip, we saw round. Cumulative sum and cumulative product are there too; we'll see those, or the equivalents of those, a little more in a few minutes.
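The search-and-sort operations above can be sketched like this (the arrays are small examples of my own):

```python
import numpy as np

a = np.array([0, 2, 3, 0, 0])

# nonzero() returns the indices of the non-zero values
print(a.nonzero())        # (array([1, 2]),)

# combine with a comparison: which indices have a > 1?
print((a > 1).nonzero())  # (array([1, 2]),)

# sort() sorts in place; argsort() returns the sorting indices instead
b = np.array([3, 1, 2])
order = b.argsort()       # [1 2 0]
print(b[order])           # fancy indexing gives the sorted values
b.sort()                  # now b itself is [1 2 3]

# searchsorted(): where would new values land in a sorted array?
print(b.searchsorted([0, 2.5]))  # [0 2]
```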
And then there's a set of reduction operations that reduce the order of the array, possibly completely. If you call them with axis equal to None, sum takes the whole array, no matter how many dimensions, and reduces it to one value. If on the other hand you supply an axis, it reduces the dimensionality by one by summing across that axis. The only two we didn't really talk about much are any and all. any returns true if any values along that axis are non-zero or true, and all is the equivalent of doing an and: all the values in the array have to be non-zero. So the question from the audience is: if you put strings in an array, will the string operations work? They'll work on the strings themselves, but not on the array. If I make an array here of 'hello' and 'goodbye' with dtype=str, then I have an array of strings, and I can ask: is 'hello' in a? That works. But I can't do a.upper() on this array or anything like that; it fails, because there is no upper method on an array. However, I can do a[0].upper(), or a list comprehension like [x.upper() for x in a]. That returns a list rather than an array, so if you want an array with those values, you'll have to cast it back. Does that cover the question? Okay. So how can we create arrays? We've seen a few ways: arange, and creating an array with the array conversion function that converts lists to arrays. With arange, as with range yesterday, you can specify a start, and then by default the stop is None, the step value is one, and the dtype is None.
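A quick sketch of any/all and the string-array workaround just described (the arrays are my own examples):

```python
import numpy as np

a = np.array([[1, 0], [1, 1]])
print(a.any())        # True: at least one value is non-zero
print(a.all())        # False: there's a zero in there
print(a.all(axis=0))  # [ True False] -- per-column "and"

# string methods live on the elements, not on the array
s = np.array(["hello", "goodbye"])
print("hello" in s)   # containment works on the array

# a.upper() would fail; use a comprehension and cast back to an array
upper = np.array([x.upper() for x in s])
print(upper)
```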
A dtype of None means NumPy should do the best it can to figure out what the dtype should be. So if you do a = arange(10), we have an integer array; if instead you do a = arange(10.), the array is float. a.dtype in the first case is int32, and in the second case float64. If I want to range from one to nine, I specify those: the lower end is inclusive, the upper end is exclusive. And if I specify a step value of two, then I get every other value. If I want to make sure my dtype is float32, I can specify that and I get an array whose dtype is float32. So there are a lot of ways of doing that. You have to be a little bit careful, though. I have an example here from zero to two pi, stepping by pi over four. We step up, and notice it's exclusive on the upper end: it doesn't include the 6.28 value. That's because it behaves the same way as range does on a list, inclusive on the lower end, exclusive on the top. The problem you run into is that floating-point math is never exact, right? So you'll run into cases where you specify this and sometimes you'll get that last value and sometimes you won't: it all depends on whether the accumulated steps compare greater or less than two pi. You're taking risks there. The way to get around it is to define the step, pi over four, as its own value, and then do arange from zero to two pi plus step divided by two. That adds a bit that's always going to be less than a full step, so now you always get that last value. That's kind of a pain, right? So arange is nice for creating simple stuff, but there are other functions for creating more typical things; we'll run into those.
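The endpoint trick just described can be sketched like this:

```python
import numpy as np

step = np.pi / 4

# Naive: whether the endpoint appears depends on floating-point rounding
a = np.arange(0, 2 * np.pi, step)

# Safe: pad the stop by half a step so the last value is always included
b = np.arange(0, 2 * np.pi + step / 2, step)

print(len(a), len(b))  # b has one more element than a
print(b[-1])           # ~6.283, i.e. 2*pi
```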
So, ones and zeros. For ones and zeros you just supply the shape of the array that you want; they work the same way. There's also the identity matrix, identity, which returns ones along the diagonal and zeros off-diagonal. Again the dtype is float64 by default. The old behavior of identity in Numeric was to return integers, and the same goes for ones and zeros, so if you need that behavior, if you're dealing with legacy code, then dtype=int will give you the same behavior as the old versions. There's this nice routine called empty that allocates an array without initializing it; basically it just grabs that space in memory. The place where that becomes handy is when you're calling into C or Fortran, which we'll do quite a bit of tomorrow. A lot of the time you actually allocate your data in Python and then hand it down to C or Fortran. You can do that with ones, right? We've just done it here, and it works just fine. But what's happened? You've allocated an array and then the algorithm had to go through and splat ones all the way across it, and you're about to overwrite all those ones anyway. There's no need to zero out all that memory or write ones across it if you don't have to. So empty looks pretty ugly when you create one and look at it, but if you're about to splat stuff over it, it's just fine; you don't worry about that. A typical thing is to create an empty array and pass it straight into a C or Fortran function to get filled up with values. There's also the fill method that we saw a little earlier. You can get the same effect by assigning into a slice if you'd like, but that's slightly slower. Now, a really handy set of functions.
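These creation routines look like this in practice (shapes and fill value are arbitrary examples):

```python
import numpy as np

z = np.zeros((2, 3))             # float64 by default
o = np.ones((2, 3), dtype=int)   # dtype=int recovers the old Numeric behavior
i = np.identity(3)               # ones on the diagonal, zeros elsewhere

# empty() skips the fill step: contents are garbage until written,
# so only use it when you're about to overwrite every element
e = np.empty((2, 3))
e.fill(7.0)                      # same effect as e[:] = 7.0, slightly faster
print(e)
```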
linspace is really a workhorse that gets used quite a bit, because linspace lets you generate an evenly divided array. This says: I want an array from zero to one with five evenly spaced elements, and both endpoints are included. So let's look at linspace real quick. You can go from negative pi to pi, and ask for eight evenly divided elements. Actually, you're often going to want to do one more than that, because you get both endpoints of your interval: if you want the samples spaced every pi over four or pi over eight, you need to specify one more than the number of divisions you want. Commonly used. There's a logspace function that gives the same sort of behavior, just on a log scale. And then there are these funny-looking little critters with an underscore at the end of them, r_ and c_, two of them, to try to make it handy to create arrays. This is a little bit of MATLAB envy, I guess: the one place where I've noticed a large difference in expressiveness between NumPy and MATLAB is in being able to specify arrays; the creation of simple arrays is really easy in MATLAB, and these try to provide that capability. You'll notice they use indexing: this is not a function call, it's a special indexing approach. If you do r_[0:1:0.25], it creates a row array from zero to one with a spacing of 0.25, basically the same thing as saying arange(0, 1, 0.25). However, if you come down here and use something like 5j for the step, what is that? It's a complex value. If you specify a complex stride, it's interpreted as the number of steps that you want to have.
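Here's linspace and logspace in a quick sketch (the intervals are my own examples):

```python
import numpy as np

# linspace includes both endpoints: 5 evenly spaced values from 0 to 1
print(np.linspace(0, 1, 5))  # [0.   0.25 0.5  0.75 1.  ]

# to get spacing of exactly pi/4 from -pi to pi, ask for one extra point
x = np.linspace(-np.pi, np.pi, 9)
print(x[1] - x[0])           # ~pi/4

# logspace spaces the exponents evenly: 10**0 through 10**2
print(np.logspace(0, 2, 3))  # [  1.  10. 100.]
```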
So with a complex step it behaves like linspace, and you can use either; for rows, linspace works just fine. c_ does the same thing, but builds an n-by-1 column array instead of a row array with n elements. The concatenation behavior is kind of nice here. Normally you have to call concatenate to join multiple arrays together, but if I want r_[a1, 1, 2, 3, a2], it takes those sequences and scalars and concatenates them together into one row. You can get the same capability from the concatenate function, and that was the standard approach in NumPy, but it's fairly verbose, and these r_ operators came about to simplify it. If I use c_ instead, it concatenates the pieces treating each of them as columns. Those are handy little functions; if you play with them, you can get used to their behavior. Then there's mgrid for creating arrays. mgrid creates two arrays, and they're useful when you're building a grid: you want x values and y values spread over some distance in a Cartesian plane, and then you're going to do some mathematical operation, so you need all the x's and all the y's. Here the x values run down the rows, 0, 1, 2, 3, 4, just duplicated, right? Row after row or column after column.
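A sketch of the r_ and c_ builders described above (the values are arbitrary):

```python
import numpy as np

# r_ builds row arrays with slice-like syntax
print(np.r_[0:1:0.25])      # like arange(0, 1, 0.25)
print(np.r_[1, 2, 3, 0:4])  # mixes scalars and ranges, concatenated

# a complex "step" means number of points, like linspace
print(np.r_[0:1:5j])        # [0.   0.25 0.5  0.75 1.  ]

# c_ stacks along the last axis: 1-D inputs become columns
print(np.c_[np.array([1, 2, 3]), np.array([4, 5, 6])])
# [[1 4]
#  [2 5]
#  [3 6]]
```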
Same for the y's: they're duplicated row after row. So mgrid gives you a mesh, a full two-dimensional grid of the size you specified up here. There's another one called ogrid, for open grid, and this only gives you the 1-D x's and y's. Where that pays off, as I show down here, is a mechanism called broadcasting: if you take an x array and a y array shaped that way and add them, it effectively does the outer add or outer product of all the elements. We'll go over that in a little more detail in a few minutes to show how it works. So a lot of the time you don't have to use mgrid; you can use ogrid, and that saves you on memory and on speed. There are also actual matrix objects in NumPy. a = mat('1 2 4; 2 5 3'): this is again a little MATLAB envy, where we're trying to create a matrix exactly the way MATLAB would. You can pass in a string with that format, and it creates the matrix a; if we print it out, we indeed have a matrix object here. And now if you do exponentiation, that's matrix exponentiation, not element by element. And if you ask for a.I, that gives you the inverse of the matrix. One more pointer: if you multiply a times a.I, this does a matrix multiply, because these are matrix objects, so hopefully you get the identity matrix when you multiply the matrix by its inverse. There's also this: you'll run into problems in electromagnetics where a lot of times you have a large array you're trying to create, with interactions between elements that are electric fields and magnetic fields and vice versa. When you're creating your array, the upper quadrant is going to hold the electric-electric interactions, or the elements that specify that. Another part of your array is going to be E interacting with H.
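The mgrid/ogrid relationship can be sketched directly; the grid size here is just an example:

```python
import numpy as np

# mgrid materializes the full 2-D grid of x and y values
x, y = np.mgrid[0:3, 0:3]
print(x)  # each row value duplicated across its row
print(y)  # each column value duplicated down its column

# ogrid returns "open" 1-D arrays that broadcast to the same grid,
# saving memory and time
xo, yo = np.ogrid[0:3, 0:3]
print(xo.shape, yo.shape)        # (3, 1) (1, 3)
print((xo + yo == x + y).all())  # broadcasting fills in the full grid
```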
Another section is H interacting with E, and then the bottom quadrant is H interacting with H. It's very common in a lot of problem sets to have regions of your matrix that are interpreted in different ways, or built from different parts of a problem. And so it's not uncommon to want to do something like this: I have an array a and an array b, and I want to set them up with a then b on one row, and b then a on the next, or whatever it might be. This approach lets you construct a matrix that way very handily: instead of having to pass in a and b, you can just refer to them by name in a string and they get built into blocks. Trying to do that with concatenate would be a lot more effort. On to the set of functions that are available on general arrays; we talked about the trig functions some. If you're taking the arctangent and you need to know which quadrant you're in, arctan2 takes x and y and returns the angle in the correct quadrant based on their signs. These are all common; they're basically the C functions from the underlying math library, we don't implement our own. There's a set of others here, all very common functions that you might need. The only one that's semi-different is hypot: if you give it x and y arrays, it gives you the Pythagorean theorem for them, the square root of x squared plus y squared. There are a lot of other functions available in NumPy; this is kind of a selection of them. If you do numpy-dot and then tab, you'll get a whole lot of functions, and it's useful to just look at that when you're in the lab this afternoon. A lot of these functions over here deal with asking: is this a complex value or not? And then there's a set here, actually I guess all of these, that are important because they deal with NaNs.
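A sketch of the block-matrix construction and those two special math functions; the matrices here are small made-up examples (note bmat looks up the names a and b from the calling scope):

```python
import numpy as np

a = np.matrix("1 2; 3 4")
print(a * a.I)              # matrix multiply by the inverse -> identity

# build a block matrix from named pieces, MATLAB-style
b = np.matrix("5 6; 7 8")
big = np.bmat("a b; b a")   # refers to a and b by name
print(big.shape)            # (4, 4)

# arctan2 keeps track of the quadrant; hypot is sqrt(x**2 + y**2)
print(np.arctan2(1, -1))    # 3*pi/4: second quadrant
print(np.hypot(3, 4))       # 5.0
```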
If we want to put a NaN in an array, I'll just show this. If we try to divide by zero in plain Python, it gives you a division-by-zero error. However, if we create an array with those values in it and divide it by zero, we get an array with NaNs and infinities: negative infinity, NaN, and positive infinity. When you go on the Linux machines, the numbers may print out differently; it uses whatever the printing convention for these values is on the platform you're on, but that's what those values are. You can use these functions to test for them. isfinite is false for all of them, because none of those are finite; isinf is true for both the negative and the positive infinity; and isnan is true for the middle one. So these are the functions that let you test for that, and this is one of the things you need to be aware of: you have to use these functions if you want to find out whether you have a NaN in an array. Before, we were doing things like B is an arange, and if we ask whether B is equal to 3, one of the values in there comes up true. But if you ask whether an array of NaNs is equal to NaN, say C is A divided by zero, they all return false, which is a bit disconcerting if you're looking at it. The deal is that this uses the floating-point comparison operation, and per the IEEE spec, comparing any value to a NaN compares unordered: NaN is not equal to anything, not even itself. NaN is actually represented as a special bit pattern in the floating-point standard, and isnan goes in and checks whether the value matches that bit pattern. So you have to use the isnan and isinf functions when you're checking for NaNs. Now, shape manipulation: there are a lot of functions for this.
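A sketch of the NaN behavior just described (errstate suppresses NumPy's divide warnings for the demo):

```python
import numpy as np

a = np.array([-1.0, 0.0, 1.0])
with np.errstate(divide='ignore', invalid='ignore'):
    b = a / 0.0
print(b)               # [-inf  nan  inf]

# comparisons with NaN are always False per IEEE 754...
print(b == np.nan)     # [False False False]

# ...so use the test functions, which check the bit pattern
print(np.isnan(b))     # [False  True False]
print(np.isinf(b))     # [ True False  True]
print(np.isfinite(b))  # [False False False]
```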
atleast_1d, atleast_2d, and atleast_3d return an array with at least that number of dimensions: if you have a 1-D array and you ask for at least 2-D, it does the indexing tricks we saw earlier with newaxis to make sure it has at least two dimensions. hstack, vstack, and dstack are nice functions. hstack takes a set of arrays and stacks them horizontally, next to each other; vstack stacks them vertically; and dstack stacks them depth-wise, along a third axis. The split functions do the reverse, splitting arrays apart. We won't go through all of these, just be aware of them. There are a few functions that come from a MATLAB legacy: fliplr will flip a matrix or an array from left to right, and rot90 will take the elements and rotate them by 90 degrees. These are nice with images, and there are a number of other functions listed there. Obviously we've kind of flown through some of those. There's a set of documentation available on the SciPy site: go to the SciPy site, go to documentation, and you'll see a list of places you can go. One of the handy ones is the "NumPy Example List With Doc" page, one of these links down here. A woman, I think she's in France, while she was learning NumPy just went down the list, worked with every function, and wrote a little example for each one. Then one day she said, hey, I've uploaded all of these to the website. That's what's cool about being part of an open-source community. I've only shown seven here, but I think she started with 90 functions and now it's up to 130 or 140. If you're curious about how to use one of these functions, one of the places you ought to go today when you're in the lab is that page: go down the list and start playing with some of those functions and look at how they work.
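Here's a sketch of those stacking and MATLAB-legacy helpers (small example arrays of my own):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(np.hstack([a, b]))        # side by side: shape (2, 4)
print(np.vstack([a, b]))        # stacked vertically: shape (4, 2)
print(np.dstack([a, b]).shape)  # depth-wise, a third axis: (2, 2, 2)

# atleast_2d promotes a 1-D array to two dimensions
print(np.atleast_2d(np.array([1, 2, 3])).shape)  # (1, 3)

# the MATLAB-legacy helpers
print(np.fliplr(a))  # [[2 1] [4 3]]
print(np.rot90(a))   # [[2 4] [1 3]]
```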
Some of the better documentation you'll find on NumPy is on that set of functions. We've already run through quite a few of these, so I'll go quickly. The arithmetic operations, as we've said, are element-wise: with two arrays, you just multiply the elements pairwise. There's also a function corresponding to each of these operators, so add will take the same values, and as we mentioned before, a += b can be written functionally as add(a, b, a), storing the result back into a. Those aren't commonly used, but they are there. Well, so what about speed? These slides are a little bit dated, but they still give you some idea of what kind of performance you can get with different operations and the effects of different things. Here we have the floating-point performance. There was a version of MATLAB on the machine at the time, and notice this is comparing against Numeric 23, so I expect MATLAB has updated quite a bit since then, and we've certainly updated quite a bit since then, but the values are at least indicative. MATLAB is always float64, or it was in this version, and with our float64, some places were faster, some places were slower, about the same performance. There's a caching effect happening here: for very short arrays there's a whole lot of overhead in setting up the element-by-element add and that sort of thing, so you don't get much performance. As we get into longer arrays, they fill up the cache and you get the maximum performance from accessing cache; it looks like we get the best performance when our vectors are on the order of a thousand to ten thousand elements. Then we start falling back off as we leave cache out here.
Out there, the arrays have to be fetched from main memory as you do an A plus B. But float64 is reasonably quick; it falls off some, and you can get a little more speed if you want by using float32. And that's just for a multiply of a vector. Then there are two functions in SciPy, ddot and sdot. These are highly optimized: they're the BLAS routines, the Basic Linear Algebra Subprograms, and sdot and ddot are a couple of those. If you use them, you notice you get much more speed, especially for large arrays, than with raw element-wise operations. So those are available for use as well. If you do an in-place multiply on an array, you can see the difference here in the performance of float32 versus float32 in place. You basically get a constant advantage, and it's fairly high, because you don't have to allocate a new array: you have only two arrays in memory instead of three to loop over. If you're writing A times B into a new C, you have to allocate C and keep all three arrays in memory; in-place, you've cut one of those out and removed the allocation, so you get some speed improvement there as well. As you're playing with your algorithms, you'll just have to experiment and figure out which forms work best to tune them and get better performance. We've seen the comparison operations between A and B. The A == B output here is from the last version of Numeric, I would expect; now instead of printing 1, 1, 0 it prints true, true, false, the same basic thing. There are also the equal functions: you can use any of the operations listed above to do your comparisons and logical operations. And there's also a set of bitwise operations.
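The in-place saving described above can be sketched like this (the array sizes are arbitrary; the timing difference only shows up on large arrays):

```python
import numpy as np

a = np.ones(1_000_000, dtype=np.float32)
b = np.full_like(a, 2.0)

c = a * b   # allocates a brand-new array for the result: three arrays live
a *= b      # in place: no new allocation, only two arrays in memory
print(a[0], c[0])  # both 2.0
```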
So if you want a bitwise or between A and B, you can use the pipe operator, or you can use the function bitwise_or(a, b), and it does the bitwise or of each pair of elements: 16 or'd with 1 gives you 17, 2 or'd with 32 gives 34, et cetera. All right, bitwise inversion: if you want to invert the array, this just inverts the bit pattern of each value and prints out the resulting values. And then the left-shift operator: this takes A, which is our original array, and shifts the values to the left by three bits, and you have the resulting values shown there. All right, so that pretty much concludes our discussion of arrays.
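To close, here's a sketch of those bitwise operations; the values are chosen to match the examples above:

```python
import numpy as np

a = np.array([1, 2, 4, 16], dtype=np.int32)
b = np.array([16, 32, 8, 1], dtype=np.int32)

print(a | b)                # bitwise or: [17 34 12 17]
print(np.bitwise_or(a, b))  # the same thing as a function

print(~a)                   # bitwise inversion of each element's bit pattern
print(a << 3)               # shift left by 3 bits, i.e. multiply by 8
```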