The slides are available through a link on the conference website, and there's also a set of exercises which you can get to by clicking on a link on the front page. Okay. First of all, thank you for coming. I have to admit to at least three conceits about this workshop. The first one is the words in the parentheses. The second conceit is that I actually think I can give a good accounting of APL in 16 expressions. The third conceit is that I think I can do it in an hour and a half. And the fourth conceit is that in the abstract of the workshop, I said that it would be approachable for beginners and yet thought-provoking for experts. There's no use in not setting high goals, eh? So here it goes. The first expression is probably the first APL one-liner ever written, this one here. Ken Iverson, the inventor of APL, had it in his 1962 book, section 1.4. Here is an example of using it: x is a four-element vector, and the expression applied to x gives a 1, ¯1, or 0 result for each element. Stated conventionally, it would be: if x is greater than zero then 1, else if x is less than zero then ¯1, else 0. So it computes what's called the signum of x, the sign of x, for real numbers x. Shortly thereafter in the book, Iverson used the second expression, which is x times the first expression. Stated conventionally: if x is greater than or equal to zero then x, else minus x. In other words, the absolute value of x, with no branching involved. What makes this work in APL is that Boolean functions like less than, greater than, equal to, and so forth have values that are 0 or 1 rather than true and false. The second thing in APL that makes this work is that functions apply to entire arrays. John Hughes in his talk referred to this as "whole value" programming. APL had this 50 years ago.
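The two one-liners translate almost directly into any language where comparisons can be treated as 0/1 and operations map over whole arrays. Here is a minimal Python sketch (the names `signum` and `absolute` are mine, not from the talk):

```python
# Python sketch of the two APL one-liners, using int(bool) for APL's 0/1
# comparison results and list comprehensions for whole-array application.
def signum(xs):
    # (x>0) - (x<0): 1 for positive, -1 for negative, 0 for zero
    return [int(x > 0) - int(x < 0) for x in xs]

def absolute(xs):
    # x times its sign: absolute value with no branching
    return [x * s for x, s in zip(xs, signum(xs))]

print(signum([3, -1, 0, -5]))    # [1, -1, 0, -1]
print(absolute([3, -1, 0, -5]))  # [3, 1, 0, 5]
```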
You know, Ken Iverson, the inventor of APL, was kind of an impish guy; he'd go around causing mischief. In the year 2000, at the annual meeting of the American Mathematical Society, we had a booth there, and Maple also had a booth. Maple is a computer algebra system. So he went to their booth and asked: how do you find the number of elements of a vector greater than 100? Now, he knew very well that they didn't have 0-or-1 comparison results; they had true and false. In APL it's immediate, right? How many elements of x are greater than 100? Plus reduction of x greater than 100. He left the Maple booth a few minutes later and they were still scratching their heads. And the last thing in APL that makes this work well is a very simple function precedence: right to left. The next expression is one for computing the average. If you were here this morning at Morten's talk, he also used this example. We call this a train, but you may think of it as a functional form; it has three functions. What it does is apply the left function and the right function to the argument or arguments, getting two results, and then apply the middle function. So average is the sum divided by the number of items. For example, if x is a vector of 20 random integers between zero and nine inclusive, then the average of x is 4.4. At this point I'll go to an APL session and do some examples. So x gets roll of 1e7 reshape 0; that is, x is a vector of 10 million random numbers between zero and one, and I take the average of x. I'm executing this to give you an idea of the speed of it. It's entirely interpreted; there's no compilation involved as yet. Okay, a further example: 0.1 plus 3 4 5 reshape of iota 60. Iota 60 is the integers from 0 to 59 inclusive. I reshape it into a 3 by 4 by 5 three-dimensional array, and then I add 0.1 to it, and that's the result. So now I'm going to find averages of its various subarrays.
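Both ideas in this passage, counting by summing 0/1 comparison results and the average as sum divided by tally, can be sketched in a few lines of Python (the function names are mine):

```python
# Counting with Booleans: Python's True/False sum as 1/0, so the APL
# +/ x > 100 idiom carries over directly.
def count_over(xs, threshold):
    return sum(x > threshold for x in xs)

# The average "train": sum divided by the number of items.
def average(xs):
    return sum(xs) / len(xs)

print(count_over([50, 150, 200, 99], 100))  # 2
print(average([1, 2, 3, 4]))                # 2.5
```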
By default, average applies to the leading axis, the leading dimension of x, so you get a 4 by 5 result. But we have a thing called the rank operator which allows you to apply a function to subarrays. Here I'm saying apply average to the matrices in x — that's what the rank 2 means — and getting a 3 by 5 result. So rank is a generalization of map, because map, I don't believe, allows you to control the size of the subarray that it applies to; but here I can specify a number which indicates the rank of the subarrays that I want to apply the function to. And it works for monadic as well as dyadic functions. For dyadic functions you can specify two ranks, the rank on the left and the rank on the right, or just one rank which applies to both arguments. You can say average rank 1, which means apply average to the vectors in x. And what do you think average rank 0 does? The average of one number is just that number itself, and this is true for every number in that three-dimensional array. That's one of the characteristics of APL: we try hard to get it to work consistently on edge cases. Even though it doesn't make much sense to find the average of a single number, it still falls out from taking care of the edge cases properly. Okay, I call this the index-of selfie. What's a selfie to the man on the street? You take a photo of yourself, right? So here I'm applying index-of to an array and itself. This symbol, which Morten also presented in this morning's presentation, we call index-of. It can apply to two different arguments; it doesn't have to be the same argument on the left and the right. What it does is find the index in the left argument of the first occurrence of each item of the right argument. So let's try an example. I'll just use quad A. Quad A is the letters of the alphabet. If I say quad A index-of some letters, it finds the index of each of those letters in the alphabet — the index of F, for instance.
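The effect of the rank operator can be sketched in Python with nested lists standing in for a 3×4×5 APL array (the helper `avg0`, which averages over the leading axis like the train does, is my own name):

```python
# x[i][j][k] = k + 5*j + 20*i, a 3x4x5 array with easily-predicted averages.
x = [[[k + 5*j + 20*i for k in range(5)] for j in range(4)] for i in range(3)]

def avg0(a):
    # average over the leading axis: a scalar for a vector,
    # elementwise over the subarrays otherwise
    if isinstance(a[0], list):
        return [avg0([sub[i] for sub in a]) for i in range(len(a[0]))]
    return sum(a) / len(a)

whole = avg0(x)                             # 4x5: average over axis 0
rank2 = [avg0(m) for m in x]                # 3x5: average of each matrix
rank1 = [[avg0(v) for v in m] for m in x]   # 3x4: average of each vector
rank0 = x                                   # average of a scalar: itself
```

Applying average at rank 2, 1, or 0 changes which subarrays the function sees, which is exactly the control that plain `map` lacks.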
And if it can't find an item, the result is the length of the left argument. So F is letter 5, O is letter 14, and so forth — the first letter's index is zero. And since space doesn't occur in the letters of the alphabet, it gives you the length of the left argument. And it works not just on strings, but on numbers, arrays, and boxed arrays. OK, so what I'm saying here is that x index-of x is a very useful calculation. The analogy is that it's like an ID number: for each item in x, it computes an ID number. And questions of identity on x are very often answered much more efficiently on ID numbers than on x itself. OK, so here are a few uses of x⍳x, x index-of x: finding the unique elements; the second expression is an efficient computation for an inner product of two matrices; the same expression is an efficient computation for an outer product of two non-simple vectors; and the last one is an efficient computation for using the key operator. I don't have the time to go into the details of each of these. But we can write these as equivalences, an algebra of functions, because you can write this on the left-hand side, then a match, then this on the right-hand side, and with the appropriate arguments it will tell you that they're the same. OK, Tx and Ty are relational tables: names, sex, country code, and age. And index-of works on this representation of relational tables, just like we saw for strings and the letters of the alphabet. So for example, Tx index-of Ty finds, for each row of Ty, the index of the matching row of Tx: this is row 0, 1, 2, 3, and then this is row 1, and so on. And the last two rows of Ty don't occur in Tx, so their indices are 5. And these are selfies: Tx index-of on itself, and Ty index-of on itself. Now, this way of representing relational tables is quite inefficient in APL, because each of these guys in the boxes is a proper array.
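The selfie x⍳x and its use for finding unique elements can be sketched in Python (a quadratic index-of, fine for illustration; APL's primitive is far faster):

```python
def index_of(left, right):
    # dyadic iota, origin 0: index of the first occurrence in `left` of each
    # item of `right`; items not found get the length of `left`
    return [left.index(r) if r in left else len(left) for r in right]

x = ['a', 'b', 'a', 'c', 'b']
ids = index_of(x, x)          # the "selfie": an ID number for each item
# an item is the first of its kind exactly when its ID equals its position
unique = [x[i] for i, d in enumerate(ids) if d == i]
```

Identity questions (uniqueness, duplicates, grouping) can then be answered on the small integers in `ids` instead of on the original data.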
Arrays in APL are self-describing, so each of the guys in a box has a 32-byte overhead of information about the array itself. You can imagine it's quite inefficient for a scalar number like 29 to carry a 32-byte overhead. A more efficient representation of a relational table is what's known as an inverted table. What you do is take all the values of column 0 and collect them together — in this case as one single character matrix; all the letters of another column collected together as a character vector; and so on. And there are very short functions for converting from one to the other. Invert will convert the nice-looking table into an inverted table, and to get it back you apply the inverse function, the uninversion. This kind of thing is very common; you check it out by using match. So x is this, and this expression just shows that an inverted table has the same number of items in each column. And this comma-bar each of x displays the inverted table x in a very nice form. This slide shows the space comparison between an inverted table and the ordinary table. Tx is the one with all the boxes; it takes 920 bytes. And x, the inverted table, takes 272 bytes. Then I do a 1,000-fold replicate of the table with all the boxes and invert it. You see the uninverted form takes 880,000 bytes and the inverted form takes 55,000 bytes — a factor of 16 in this case. Not only that, but you also get hit in time with the uninverted form. This is showing that in the inverted form, if I do a comparison on a column, it's 14.77 times faster than on the form with all the boxes. And in this example, the size factor is a factor of 40. So now, how do you find the index of each record? In the ordinary case, you just say Tx index-of Ty. Here we can't do that, because here index-of is looking for this whole thing as an item: it's looking for this item of y in x, and it's not going to find it, and so on.
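Here is a minimal Python sketch of inverting and uninverting a table (the sample records are invented for illustration; `invert` and `vert` mirror the talk's names):

```python
def invert(table):
    # collect the values of each column together: one list per column
    return [list(col) for col in zip(*table)]

def vert(columns):
    # uninvert: zip the columns back into one record per row
    return [tuple(rec) for rec in zip(*columns)]

rows = [('John', 'M', 'CA', 29), ('Mary', 'F', 'US', 35)]
cols = invert(rows)   # [['John','Mary'], ['M','F'], ['CA','US'], [29, 35]]
```

In Python the space argument is weaker than in APL, but the shape of the transformation, and the fact that `vert` inverts `invert`, is the same.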
But you can get index-of on inverted tables by first uninverting them. I'm a lazy typist, so I don't want to write uninvert; uninvert, to me, is vert. So if you vert x and vert y, you can find the index of the records. But suppose you have a huge table, maybe one with 100 million or even a billion rows. You're not going to have the space to vert it before doing the index-of. So here's a derivation of an efficient index-of for an inverted table. We saw on the previous slide that you can vert the inverted tables and then do the index-of, and it works. The second line just substitutes the definition of vert into the expression. Substitute is a word I've heard quite a few times these couple of days; I'm just substituting definitions. Now, the next line is the realization that this funny symbol, enclose rank negative 1, is just a scalar encoding. And I said that for questions of identity, quite often the questions are more efficiently answered on the ID numbers, which you get by index-of, than on x itself. In other words, I can use x index-of to get a scalar encoding which works just as well for questions of identity as the data itself, which is enclose rank negative 1. Here's an illustration of how it works on column 0. I pick out column 0, assign it to x0, and enclosing it at rank negative 1 I get John, Mary, Monica, Min, and Max. With x0 index-of x0 I get those numbers. Then for y, the same thing: I pick out column 0 of y, enclose the rows, and get that. And if I do x0 index-of y0, I get those numbers. These integers, when you're talking about identity, work just as well as these guys. And then this other stuff — you have to learn a bit more APL to know what it's doing. What happens is that if you do x index-of each, you get, in this case, a four-element vector of integer vectors. And the upward-pointing arrow mixes them, forming a four-row integer matrix.
Then I transpose it to get a four-column integer matrix, and index-of works the way we want on integer matrices. Here's an illustration of this intermediate result. Transpose of mix of x index-of each x, I get that; likewise for x index-of each y, I get this four-column integer matrix. On the left-hand side is how you can compute index-of if you have enough space: I vert each of the inverted tables and apply index-of. And here, you don't have to uninvert the table: you compute these indices and then use index-of. This is one of the advantages of functional programming that John Hughes talked about this morning: you can do formal manipulations on programs that give the same result, but this one is very much more efficient. The next one is the bar chart. The argument x, in this case, has six elements, which are integers, and I want to plot a bar chart of those numbers: 3 1 4 1 5 9. The way I did it is shown by the intermediate results on the right-hand side. I find the maximum of x, which is 9 in this case, and then the integers from 0 to 9 minus 1. Then I do what's commonly called an outer product — but I like to refer to it as a table, a function table, because everyone knows function tables: in school you learn addition and they give you an addition table, and multiplication, and so forth. So here we're computing a function table using greater-than as the function, getting 1s and 0s. Function tables are a good way of looking at functions you already know, and especially at functions that are new to you. For example, what does the function table of a commutative function look like? Yes, it's symmetric: when you transpose it, it will be the same. You can describe that to kids and they would get it, because you flip it and it's the same function table — that's what commutative means. So you can try it: you know greater than, greater than or equal to, equal to, and so forth.
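The bar-chart expression — a greater-than function table, indexed into the two-character vector ' *' — can be sketched in Python:

```python
x = [3, 1, 4, 1, 5, 9]

# function table x ∘.> (0..max-1): one row of 1s and 0s per element of x
table = [[int(v > i) for i in range(max(x))] for v in x]

# index the 0/1 table into ' *': the entire table is the subscript
chart = [''.join(' *'[b] for b in row) for row in table]

for line in chart:
    print(line)
```

Each element of `x` becomes a row of that many stars, padded with blanks to the maximum.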
Try function tables even on functions you know; I think you may learn something. I learned this expression for computing a bar chart when I was first learning APL — too many decades ago, more than I want to say. It's one line, and it was already in the APL milieu at that time; it's something everybody knows, right? And this expression drove home to me the point that in APL, the index or subscript can be an entire array. That wasn't obvious to me, but after looking at this you say, ah, yes. The next expression is one for computing all the permutations of the first n integers. Now in TDD, test-driven development, the methodology recommends that you start with the simplest case you can think of and go forward from there. But I like to start in the middle. The reason is that in the very simple case you may have particular properties that don't hold in the more general case, and then you have to backtrack and go at it again. So here are all the permutations of order 3 — you get them however you get them — and then try to generate all the permutations of order 4. You sit down and look at it, and here's something that comes close. What you do is add 1 to all these permutations and then catenate a 0 column, assign it to Q, and then use Q to index into 0 1 2 3, 1 0 2 3, 2 0 1 3, and so forth. If you collapse these together, you have the answer: those are all the permutations of order 4. This is one of what I call my benchmark problems, a personal set of benchmark problems — benchmark not only in the sense of timing, but a benchmark to test my own understanding and, more importantly, to test advances in the APL language. The thing on top is the best I could do in 1981, and it's pretty close to the best you could do in APL at that time. This thing is as bad as it looks, because it means go to.
So: go to — if 1 is greater than or equal to n, then go to line 0, that is, exit — and so forth; in these two lines it's looping around. But the idea is all there: it's using t, which is the permutation table of order one less, and then indexing into a vector. This year, the best I can do is the function on the bottom. Now, a little bit about notation. It's a recursive function, and these are what are called guards. For a guard, if the expression to the left of the colon is true, then you return as the result the expression to the right of the colon, up to the next diamond. So if n is 0, you return 1 0 reshape of 0; otherwise, you do the thing on the right. And the notation is that this upside-down triangle, del, means recursion. Having that way of specifying recursion means you can have recursion on anonymous functions: whatever name you assign to the function doesn't matter, because this symbol already indicates recursion. Going back to the previous slide: these vectors that I'm indexing into — 0 1 2 3, 1 0 2 3, 2 0 1 3, and 3 0 1 2 — are what I'm trying to generate on the bottom here. The intermediate results are: compute a function table using equal as the function, in other words the identity matrix, and to that apply grade down, rank 1. Grade down gives the indices you need to sort the thing in descending order, right? Because if you sort the first row down, the first index should be 0, and then the rest. In the second row, if you apply grade down to it, the first index should be 1, because you want to pick the one first, and then the rest; then 2, and the rest; then 3, and the rest. Of course, there are other ways of generating a matrix like this, but I live for expressions like this. And then finally you use Q — the Q from the previous slide — to index into this matrix, rank 1. Again, as I said before, rank is a generalization of map, because with this indexing function
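The recursive scheme — prefix a zero column to the shifted smaller permutations, then index into rows shaped like grade-down of the identity matrix — can be sketched in Python:

```python
# Recursive all-permutations, mirroring the APL derivation in the talk.
def perms(n):
    if n == 0:
        return [[]]                     # one permutation of order 0
    smaller = perms(n - 1)
    # row i is grade-down of row i of the identity: [i, then the rest ascending]
    rows = [[i] + [j for j in range(n) if j != i] for i in range(n)]
    result = []
    for row in rows:
        for p in smaller:
            q = [0] + [1 + k for k in p]        # add 1, catenate a 0 column
            result.append([row[k] for k in q])  # index into the row
    return result
```

For each leading element i, indexing the row `[i, rest...]` by the shifted smaller permutations produces every permutation starting with i.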
I can specify that it applies to vectors on the left and vectors on the right. And then this remaining stuff collapses the three-dimensional result into a matrix result. And it's not the last word: this is the 2016 version, and some of these sub-expressions can be improved, but the APL language cannot yet express the thought that I have. OK, the next expression is parenthesis nesting, in other words the depth of nesting by parentheses. So x is this character string representing an expression. You can see that when you see a left parenthesis, you increase the count by one — another one — and otherwise it remains the same. And when you see a right parenthesis, you decrease by one, and so forth: here it increases again, stays the same, decreases by one, decreases by one, and so on. These subsequent lines are intermediate results that show what's going on. You find the index of x in this two-element vector of the left and right parentheses, so you can have a 0, 1, or 2 result. Use that to index into a three-element numeric vector, 1 ¯1 0, and then do a plus scan on that, and you get exactly the result we calculated. Now, if you remember from the first slide, there's another way to compute this expression: x equal to left parenthesis gives a 0-1 result; minus x equal to right parenthesis, another Boolean vector; take the difference and you get the 1s and ¯1s; and then if you do the plus scan, you get the same answer. So you write another function, depth1 or something, and then you can compare the two programs, and they should be the same. But I prefer the form with the indexing, because it generalizes to when you have more than two possible choices: you can use index-of to get a result which can be greater than 2, and then index into a longer vector than just three elements.
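The indexing form of the nesting-depth calculation carries over to Python almost symbol for symbol (the function name `depth` is mine):

```python
from itertools import accumulate

def depth(expr):
    # '()' index-of each character gives 0, 1, or 2 (not found);
    # index that into (1, -1, 0), then take the running (plus-scan) sum
    idx = ['()'.index(c) if c in '()' else 2 for c in expr]
    step = [(1, -1, 0)[i] for i in idx]
    return list(accumulate(step))

print(depth('(a(b)c)'))  # [1, 1, 2, 2, 1, 1, 0]
```

As in the talk, the two-step table lookup generalizes readily to alphabets with more than two interesting characters.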
This is actually an illustration of computing a function, because this pattern — indexing into something and then using the indices to index into something else — is another way of computing a function. You're probably not old enough to remember this, but when the new math curriculum came in, they introduced functions as a set of ordered pairs, and there was a lot of criticism: what is this new math? But I remember that lesson, because I know that a function can also be considered as a set. And sometimes it's more efficient to do table lookup — which is what this is — than to actually compute the function in other ways. Quicksort. It was mentioned in one of this morning's presentations, so I quickly looked it up; Haskell also has a very short quicksort, and it would be interesting to compare the two. Anyway, here it is. This version of quicksort is what we call an APL operator, or higher-order function, because it takes a function argument to produce a second function, and then you apply that function to arguments. The function argument — in this case, minus — is the thing that tells me the ordering when I compare elements. What Q requires is that the result of the operand function be less than 0, equal to 0, or greater than 0. Conventional. So here I have a random integer vector; I specify minus as the comparison function, and that's it. Now, in APL you wouldn't actually sort this way, because there are more primitive means which are highly optimized and much faster than this. The purpose of this one-line function is, well, notation as a tool of thought: it helps you understand what this quicksort thing is. OK, carrying on with Q. Here's an operand function for comparing enclosed strings. If the two strings are equal, I return 0; otherwise, I compute something whose details I won't go into.
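The quicksort "operator" — a higher-order function taking a three-way comparison and returning a sort function — can be sketched in Python:

```python
import random

def quicksort(cmp):
    # cmp(a, b) must be negative, zero, or positive, as in the talk's Q
    def q(xs):
        if len(xs) <= 1:
            return xs
        pivot = random.choice(xs)          # a randomly chosen pivot
        return (q([x for x in xs if cmp(x, pivot) < 0])
                + [x for x in xs if cmp(x, pivot) == 0]
                + q([x for x in xs if cmp(x, pivot) > 0]))
    return q

sort_nums = quicksort(lambda a, b: a - b)  # minus as the comparison function
print(sort_nums([3, 1, 4, 1, 5]))          # [1, 1, 3, 4, 5]
```

Passing a different comparison, e.g. `lambda a, b: (a > b) - (a < b)` for strings, yields a sorter for that ordering, just as the talk's Q does with its string-comparison operand.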
But it will be negative if alpha is less than omega, and positive if alpha is greater than omega, and with it Q can sort enclosed character strings. Again, Q is a recursive function, so again there is a guard here: if the number of elements in omega is 1 or 0, return omega itself. Otherwise, recursively — that's the upside-down triangle, del, again — apply to the items of omega which compare less than 0, catenated with the items that compare equal to 0, catenated with the items that compare greater than 0. And the comparison is done against a randomly chosen item, what's known as a pivot in this algorithm. OK. Now an application of this, which is called order statistics: suppose you want to find the 47,000th element in sorted order. How do you do that? Well, one way is to sort the whole thing and then pick out the element you want. S1 is a function for computing that: grade the whole thing, to be more precise, then pick out the appropriate element of the grade and use that to index into the argument. So, for example, the zeroth order statistic is the minimum; if you use the index of the last element, it gives you the maximum; and if x has an odd number of elements and you specify the middle, it's the median. And here, S is an operator using the quicksort partitioning idea to compute the order statistic. Same answer. So how about timing? S1 is the one that sorts the whole thing and then picks out the right one; this one uses the quicksort idea; and this one hardwires the comparison, to make the timing fair, because the other is still calling an operand function to do the comparison. So x is a million random integers between 0 and 1 billion, and I'm finding the 4e5-th element of the sorted vector and timing it.
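The order-statistic idea — partition as in quicksort, but recurse only into the side holding index k — is the classic quickselect; a Python sketch (names mine):

```python
import random

def select(xs, k, cmp=lambda a, b: a - b):
    # k-th smallest (k from 0) under the three-way comparison cmp
    pivot = random.choice(xs)
    lo = [x for x in xs if cmp(x, pivot) < 0]
    eq = [x for x in xs if cmp(x, pivot) == 0]
    if k < len(lo):
        return select(lo, k, cmp)
    if k < len(lo) + len(eq):
        return pivot                      # k falls in the pivot's block
    gt = [x for x in xs if cmp(x, pivot) > 0]
    return select(gt, k - len(lo) - len(eq), cmp)

xs = [7, 2, 9, 4, 4, 1]
```

Only one of the two partitions is ever visited, which is why this beats sort-everything-then-index for a single order statistic.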
CMPX is a utility for doing timings as well as comparisons. It gives you the timings in the form of a bar chart, and as well, if one of the results doesn't match the first one, it puts an asterisk on it — so it's both a tool for comparing equality, or matching, and for timings. You see that SN, the one with the fixed comparison, beats the one that sorts everything, hands down. Golden ratio. We saw scan before, the plus scan; it's a higher-order function, an operator. If I give it the function plus-compose-reciprocal — which just means x plus the reciprocal of y — then of course that computes a continued fraction. If I apply it to a 16-element vector of all 1s, I get those numbers, and it happens to converge to the golden ratio, computed exactly as that. Of course, it's not exact, because the square root is not exact. I've set up the default in my session to display six figures, but you can get it to display all the figures it has available, which is 16 or 17 or 18, depending on the number and the computation — they're IEEE 64-bit floats. And then if I compute the LCM of these floating-point numbers and one — surprise, or maybe you're not surprised — what you get are the Fibonacci numbers. And the last one is the equivalent expression in J. J has rational numbers with infinite precision, so if you apply the plus-compose-reciprocal scan on these rational numbers, you get exact results, again with the Fibonacci numbers showing up. Thought-provoking? That's one of the goals of this workshop. Inner product. John mentioned it in his talk this morning, and other people have mentioned it too. Now, the classical way of doing inner product — the way they teach you in school, and you mentioned it in your workshop too, Morten — is that you take a row of the left argument and apply it to each column of the right argument, and so on.
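The continued-fraction scan can be sketched in Python: a right fold of x + 1/y over a vector of ones converges to the golden ratio, and with exact rationals the convergents are ratios of consecutive Fibonacci numbers (the function name `cf` is mine):

```python
from fractions import Fraction
from functools import reduce
from math import sqrt

def cf(xs):
    # right-to-left fold of (x plus reciprocal of y), like the last item
    # of the APL plus-compose-reciprocal scan
    return reduce(lambda acc, x: x + 1 / acc, reversed(xs))

approx = cf([1.0] * 16)           # floating point: close to the golden ratio
exact = cf([Fraction(1)] * 16)    # exact rationals, as in the J version
print(exact)                      # 1597/987: consecutive Fibonacci numbers
```

With n ones, the convergent is F(n+1)/F(n), which is where the Fibonacci numbers show up.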
So, every row against every column. But this expression also computes the inner product. Again, it's like comparing equivalent functions: in this case, for those arguments, they're the same. And this comparison is actually more powerful than application to any old argument, because it's applying them to random arguments: unless you're unlucky, or the algorithm is unlucky, you're not likely to get agreement on random arguments by accident. What it is, is using rank — this generalization of map again — applying to vectors on the left and matrices on the right. What it's doing is what I call row-at-a-time: it takes a row of the left argument and applies it to the entire matrix on the right, then picks the next row and applies it to the entire thing on the right. And the benefits are manifold. It has very beneficial cache effects. As you know, in modern computers, cache is king. In the row-by-column way of doing inner product, the matrix doesn't have to be very large before, in computing the result, you get a cache miss on every element of the result. Row-at-a-time doesn't have that undesirable effect. Also, you can take advantage of zeros in the left argument, because the cost of checking for zero is amortized over an entire row of the right argument. You can check for zero in the row-by-column way of doing things too, but there you're checking on every element of the result, so it's not really cost-effective — aside from other detrimental effects in the execution, because you're doing checking and then probably branching in the machine code. And lastly, if the left argument is Boolean — zeros and ones — or even just has a small number of possible values, then you can do the table-lookup thing I mentioned before, because the number of possible answers is small when the left argument has only a small number of possibilities. OK, so I'll illustrate the workings of this algorithm.
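The two schedules can be contrasted in a small Python sketch: row-by-column computes one scalar at a time, while row-at-a-time accumulates scaled rows of the right argument, which is where the cache behavior and the amortized zero check come from.

```python
def matmul_row_by_column(a, b):
    # the schoolbook schedule: one dot product per result element
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def matmul_row_at_a_time(a, b):
    # for each row of a, accumulate coefficient-times-row over the rows of b;
    # b is traversed row-wise (cache-friendly), and a zero coefficient
    # skips an entire row of b at the cost of one check
    result = []
    for row in a:
        acc = [0] * len(b[0])
        for coeff, brow in zip(row, b):
            if coeff:
                acc = [s + coeff * v for s, v in zip(acc, brow)]
        result.append(acc)
    return result

a = [[1, 0], [2, 3]]
b = [[4, 5], [6, 7]]
```

Both produce the same inner product; only the order of the scalar operations differs.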
First of all, row by column. Now dot, the operator used in plus dot times, is a proper operator in APL, so I can give it any old function, and quite often it's illustrative to use enclose of alpha-omega as the function. So it's taking every row in the left argument and applying it to every column in the right argument: 20 against 0, 21 against 5, 22 against 10; and then the next column, same row: 20 against 1, 21 against 6, 22 against 11, and so on. And here it's picking the second row of x: 23 and 0, 24 and 5, 25 and 10; then the next column: 23 and 1, 24 and 6, 25 and 11. In contrast, for row-at-a-time — here I'm using alpha with the shape of omega as the function, just to look at what rank 1 2 means — it takes the first row of x and applies it to the whole of y, then takes the second row of x and applies it to the whole of y. So: rank 1, rank 2; rank 1, rank 2. And then if you focus in on it, giving it a more complicated function — inside the function there's a rank negative 1 — this part has the same structure, but now I'm looking inside each rank 1 2 application to see what it does with a rank negative 1 function. In each row it's using each scalar, because rank negative 1 means the subarrays having rank one less than the rank of the array. So it's 20 against the first row of y, 21 against the second row of y, 22 against the third row of y, and so on. Morten — right, good point. The implementer, having understood what this expression is doing, then wrote his C accordingly. And then, of course, in QA you compare that expression against plus dot times; and if you had a row-by-column implementation, you'd also compare it against the row-by-column implementation. Because if you haven't understood that and you just start writing C, you get all mixed up in the details.
There's one final interesting story about this. We'd coded this row-at-a-time method for several years, and in the current release we switched to Visual Studio 2015, having used Visual Studio 2005 for all those years. And inner products sped up by a factor of 3.3 with no changes to the C code. When we looked at the machine code it generated, it's using vector instructions. For comparison, we had also coded the row-by-column implementation explicitly, and for that one the compiler is not able to use the vector instructions. But for this row-at-a-time implementation written in C — there's no way for us to tell the compiler, hey, we're doing row-at-a-time, but it recognized the situation and could use the vector registers efficiently. And we got a factor of 3.3 with no changes to the C source. Some days you earn your pay, right? That was one of those days. Cayley's theorem. I still have a lot to go, so I'll go through this quickly. T is all the non-singular Boolean matrices. I'm using them because I want to do groups, and I don't want a commutative group. If you form the group table — again, it's the function table again — using Boolean matrix multiplication, you get the group table. And if you take this group table and find the index of each element in the ravel of the table, you get this bunch of numbers. The point is that when I relabel, I'm not changing the group-theoretic properties. It's as if I'm using a different font to display each of these guys; and if I'm using a different font, surely it doesn't change the group-theoretic properties, right? So you look at this table. For a group table, necessarily, the rows are permutations — otherwise it's not a group. And if you look at each row, i.e. if you split the matrix into its rows, you get permutations.
And what are some natural things you do with permutations? You index. So you compute a table using indexing on these permutations, and you get another table. So R of G is this; you compute the table using indexing; and then you relabel it again. Lo and behold, they are the same. That is Cayley's theorem: every group — every finite group, in this case — is isomorphic to a subgroup of a group of permutations. How many characters? A very succinct way to say it. You can see yourself discovering Cayley's theorem if you have this available to play with on your computer. I discovered this myself in 1987, because I had APL to play with. I thought I'd discovered something, but actually it already had a name, Cayley's theorem, discovered or proven in the mid-1800s. Symmetries of the square. If you have a permutation, you can form a Boolean matrix from it and get the permutation back. And the significance of the Boolean matrix is that indexing by the permutation is the same as matrix multiplication by the corresponding Boolean matrix — so there's actually an isomorphism between permutations and Boolean matrices. And what are some things you can do with matrices? You can rotate them, flip them, take mirror images. In this chart, those are the eight possible versions of the square that you can get using rotations and reflections, and on the edge of each display is the operation I had to apply to the matrix in the upper right-hand corner to get that image. If you apply that operation to the Boolean matrix, getting another Boolean matrix, and then convert the result back into a permutation, that's the operation you can use to get the corresponding permutation. So this is a quick way of proving lots of identities on these functions on permutations. This is saying that if you look up the table at this row and this column, the claim is that this composed with that gives the identity.
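The Cayley's-theorem experiment can be sketched in Python with a small stand-in group — addition mod 4 here, instead of the talk's non-singular Boolean matrices, but the check is the same: the rows of a group table are permutations, and composing those permutations by indexing reproduces an isomorphic copy of the table.

```python
# Group table of addition mod 4 (entries are already the labels 0..3).
n = 4
table = [[(i + j) % n for j in range(n)] for i in range(n)]

# Each row of a group table is necessarily a permutation.
perms = table

def compose(p, q):
    # permutation composition by indexing: (p∘q)[k] = p[q[k]]
    return [p[k] for k in q]

# Build the group table of the row-permutations under composition,
# relabeled by each product's position among the rows.
ptable = [[perms.index(compose(p, q)) for q in perms] for p in perms]
```

Up to relabeling, `ptable` and `table` are the same table, which is the content of Cayley's theorem for this group.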
And if you apply grade down four times to a permutation, you also get the identity. And if you apply grade down three times, you get grade up composed with reverse. There's a quick APL expression, one line, that I can use to check what I just said. But in the interest of time, I won't execute it. It's in the exercises; you can try it in the privacy of your own home and see that it does work. Interval index. Now, we're thinking of adding this as a primitive in the next version of APL. And of course, we want to model it. That's the model, or a simplified version of the model, and we get our users to try it, to see whether they like it, to find out what applications they have. Typically, we like to work that way, because you can write down the specs in words till you're blue in the face and give it to people, and they can read it, whatever. But until they can try it, it's not very effective. So what interval index is: the left argument is a sorted list, and being sorted, it partitions the universe into contiguous intervals. And interval index finds, for each item of Y, the index of the interval that contains it. And it works for strings and enclosed character vectors and so forth, anything that's ordered. Now we get to this funny number. This past summer, I had the opportunity to have an intern. She's a high school senior, an aspiring mathematician. So what better book to recommend to her than Hardy's A Mathematician's Apology? I thought I would read the book again before recommending it to her, and I came across this statement: the number of primes less than 1 billion is that number. Now, that's exactly the kind of claim you can check. With ¯1 pco 1E9, I get that the number of primes less than 1 billion is 50,847,534. Those two numbers cannot both be correct: either one is right and the other is wrong, or they are both wrong. So I did a Google search. It turns out this number is right.
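Going back to interval index for a moment: here is a hedged Python analogue, using the standard bisect module (the exact semantics of the APL primitive, such as index origin and edge handling, may differ; the name `interval_index` is mine):

```python
# Interval index: the sorted left argument cuts the universe into
# contiguous intervals; for each item of y, report which interval it's in.
import bisect

def interval_index(x, y):
    """For each item of y, the index of the interval of sorted list x
    that contains it; -1 means before the first cut point."""
    return [bisect.bisect_right(x, item) - 1 for item in y]

# x cuts the line into (-inf,10) [10,20) [20,30) [30,inf)
print(interval_index([10, 20, 30], [5, 10, 19, 25, 100]))  # [-1, 0, 0, 1, 2]
# works on anything ordered, e.g. strings
print(interval_index(['b', 'd'], ['a', 'c', 'e']))         # [-1, 0, 1]
```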
And there is even a description of the wrong number. It has a name, even though it's wrong: it's called Bertelsen's number, which MathWorld says is an erroneous name erroneously given to the erroneous value. Perhaps he was being funny. So even though the internet says this one is right, how do you know that it's right? For me, there was the additional dilemma of having already given the standard spiel to the summer intern that you've got to prove everything. So how do you prove a number like that is right? Well, one thing you can do is to prove correct the program that computes it. So here's a very short program for computing it. I won't prove it here, but it's very easy to prove. Compute a bit vector having length n; you can use the bit vector to compute the actual primes, or you can add it up to get the number of primes less than n. But if you run it, it doesn't work: WS FULL, not enough space. And it's likely that it will remain WS FULL for a long time yet, because it generates a Boolean matrix of size 1E9 by 1E9. So not only do you get killed on space, but it would take a while to compute the answer. So here's something kind of funny: a program you can prove correct, but which can't compute the result. Isn't that funny? As for pco itself, I could try proving it correct, but that would be a daunting task: it has about 20 lines, and there are a couple of long tables in it. So I wrote, from scratch, an efficient program likely to be correct and likely to be amenable to proving. And it gives the same number. The optimizations it uses are these: instead of doing the marking for each number, it only does the marking for non-prime multiples of each prime less than the square root of n. And multiples of 2, 3, and 5 are initialized by a more efficient method, which is a worthwhile optimization because if you get rid of all the numbers that are multiples of 2, 3, and 5, you have already gotten rid of a high proportion of the numbers.
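The short bit-vector program just described can be rendered in Python like this (my sketch, not the speaker's APL; it includes the start-at-the-square optimization but not the 2-3-5 wheel):

```python
# Count primes below n: build a bit vector of length n, then sum it.
from math import isqrt

def prime_count(n):
    """Number of primes less than n, via the Sieve of Eratosthenes."""
    if n < 3:
        return 0
    sieve = bytearray([1]) * n
    sieve[0:2] = b'\x00\x00'               # 0 and 1 are not prime
    for p in range(2, isqrt(n - 1) + 1):
        if sieve[p]:
            # marking multiples of p can start at p*p
            sieve[p * p::p] = bytearray(len(range(p * p, n, p)))
    return sum(sieve)

print(prime_count(1000))   # 168
```

At this scale it is instantaneous; at 1E9 the same bytearray approach needs about a gigabyte of workspace, which is the sort of cost the talk's optimized version is working around.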
And then subsequently, you only need to do the odd multiples of primes greater than 5. And the marking for multiples of a prime can start at its square. Looking at this function, the hardest line to prove correct, I think, would be this line, the m line, because it's the number of times you have to mark each number. Why is it the hardest? Because in computer science, the two hardest problems are naming, garbage collection, and off-by-one errors. So, I've traveled all this way, and I'm only 200 miles from Kumbakonam. Is that how you pronounce it? Kumbakonam. So of course, I have to talk about Ramanujan. This little expression started when Jay, who was here last year, sent out an email to the development group saying, who is interested in going to see this movie? And Fiona said, ah, Mr. 1729, I'm in. And John was in, and so was I. Shortly thereafter, Nick Nikolov, another of our colleagues, found a blog entry by Wolfram saying that in the Wolfram Language, the language formerly known as Mathematica, you can compute the partition function p of 200 instantaneously. I'll explain what partitions are in a minute. But that sounds like a challenge. So, can I compute p of 200 instantaneously in APL? Here are the partitions of the integers from 0 to 6. Notice the 0 case; again, APL tries to handle edge conditions, so me too. These are all the partitions of the integers from 0 to 6. So for example, 4: how many ways can you write 4 as a sum of positive integers? There's 4 itself (a one-element vector, by the way, not a scalar); 3 1; 2 2; 2 1 1; and four 1s. And the order does not matter. So how many ways can you do it? Five. Now, I had solved this problem before using Euler's pentagonal number theorem, this thing, which you can find on the internet. And you notice it's recursive; but not only that, it's highly, multiply recursive, because p occurs in it, in terms of itself, several times.
So here, the guard is: if 1 is greater than omega, then the answer is 0 less than or equal to omega. Otherwise, you recurse, and REC computes the numbers that you need to recurse on. So Pn of 0 to 6 gives the number of partitions for 0, 1, 2, 3, up to 6. And Pn 30 is 5604. So I'll just go into my APL session. I already have Pn defined. So Pn is here, REC is here, and Pn of 30. Takes a while. Now, the trick with live presentations is that when something takes a while on your system, you keep talking to distract the audience, to make it seem not as long. But it's going to come back. Trust me. But the time it takes shows you that you shouldn't try this on 200, and I'll show later on why Pn of 200 is not a good idea to compute. I'm waiting. No, no, it's plugged in. It does. There it is. OK, so why is it not a good idea to apply it to 200? I'll show you why. It's going to use REC to compute the arguments that it then applies itself to recursively, getting those. And for each of these guys, it's going to recurse again. So on 199, it's going to recurse on those guys. And after it's done that, it's going to recurse on 195, and then 188, and so forth. So what to do? Well, the memo operator was mentioned yesterday. We call it an operator because it's a higher-order function. The key thing with the memo operator is that f memo is the same function as f; there's no compromise on that. But it keeps a record of arguments and results for reuse. It is commonly used for multiply recursive functions. Sounds like just the ticket. So here's an implementation of a memo operator in APL. I won't try to explain the details, but the key thing is that it takes your definition, in this case Pn, and inserts some code in it to do the memoization. OK? So what's next? Yes: Pn memo of 30, Pn memo of 200, let's do that. So that's the answer for Pn of 200. You can check it against the previous slide and see that it's the same answer. So, how long would Pn of 200 have taken?
Well, there's a way to estimate that, by counting the number of function calls. So Pnc is the associated number of function calls. It has the same recursive form as Pn itself, so I can apply memo to it and get the number of function calls. And Pnc with memo tells me the number of function calls for 30 and for 200. I timed Pn of 30 before; it was 35 seconds. That's not too far off: you saw it taking however long it took. And then extrapolate. With the extrapolation, it would have taken 1.31e42 seconds, or 4.15e34 years. In that number of years, not only would your computer have crumbled to dust, but the universe would be well on its way to entropy death. I'm so happy that I have time to do this. Oh yes, this is still part of it. J, a dialect of APL, has extended precision numbers, and it has the memo operator built in. So Pn memo of 1,000 takes 52 milliseconds, and it gives you the exact answer there. So the answer to the challenge is that p of 200 can be computed instantaneously in APL, for some suitable definition of instantaneous. And the memo operator, even the model that I have, works on anonymous functions as well as functions with non-scalar results. If you go into the exercises, there are a couple of exercises on that. Finally, I'm so glad I can do this expression. It's Euler's identity, in the opinion of many mathematicians the most beautiful equation in all of mathematics: 0 is equal to 1 plus the exponential, e to the pi times 0j1. So in one expression, it combines the fundamental quantities 0, 1, and 0j1 (or i, the square root of minus 1) and the operations plus, times, and exponentiation. So I'm going to prove this by first proving Euler's formula involving sine and cosine. Now, this is not yet executable in APL, because of this infinity. There's a more than 35-year-old proposal for adding infinity to APL, but it's not yet implemented.
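Returning to the partition function: here is a hedged Python rendition of Pn and the memo operator (my own code, with functools.lru_cache standing in for the APL memo operator; `rec` produces the smaller arguments n-1, n-2, n-5, n-7, n-12, ... from the generalized pentagonal numbers, with signs alternating in pairs):

```python
# Euler's pentagonal number recurrence for the partition function,
# with lru_cache playing the role of the memo operator.
from functools import lru_cache

def rec(n):
    """Arguments n - k(3k-1)/2 and n - k(3k+1)/2, k = 1, 2, 3, ..."""
    out, k = [], 1
    while n - k * (3 * k - 1) // 2 >= 0:
        out.append(n - k * (3 * k - 1) // 2)
        if n - k * (3 * k + 1) // 2 >= 0:
            out.append(n - k * (3 * k + 1) // 2)
        k += 1
    return out

@lru_cache(maxsize=None)                 # the "memo operator"
def pn(n):
    """Number of partitions of n."""
    if n < 1:
        return int(n == 0)               # guard: pn(0)=1, negatives 0
    # signs alternate in pairs: + + - - + + ...
    return sum(s * pn(m)
               for s, m in zip([1, 1, -1, -1] * len(rec(n)), rec(n)))

print(pn(30))    # 5604
print(pn(200))   # 3972999029388
```

Without the cache, the same recurrence would repeat the calls the speaker describes, recursing separately on 199, 198, 195, 188, and so forth; with it, each argument is computed once.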
For some strange reason, there's no commercial call for having infinity in APL. So what these are, are the power series coefficients of the three functions. For e to the 0j1 times omega, it's 1, 0j1, ¯1, 0j¯1, repeated, weighted by the reciprocal factorials of 0 to infinity. Similarly, cosine is 1, 0, ¯1, 0, 1, 0, ¯1, 0, and so on; and sine is 0, 1, 0, ¯1, and so on. In other words, if you get rid of the weight, which is the reciprocal factorials of 0 to infinity, you get those numbers. You sit there for a while, and it doesn't take too long: you'll notice that if you multiply s by 0j1 and then add it to c, you get identical vectors. And since those series are absolutely convergent, you can reorder the way you add them: instead of adding across, which is what the power series says, you add down before comparing. And you see that the exponential is cos plus 0j1 times sine. In other words, Euler's formula. So, Euler's formula again, at pi: cos of pi is minus 1, and sine of pi is almost 0. On the left-hand side, if you add it up, you get almost exactly minus 1. And on the right-hand side, you have Euler's identity: 1 plus the exponential of pi times 0j1. And here I'm showing off, because that's exactly 0. That's quite unusual, because often what you get is a very tiny number that's not exactly 0. The vicissitudes of floating-point numbers. So, if these 16 expressions are not enough for you, I'm working on a book called A History of APL in 50 Functions, to appear on November 27 at 3 o'clock in the afternoon, because that's when the first APL workspace was saved. And when it's the 100th anniversary of APL, I'll write A History of APL in 100 Functions. Any questions? Yes. Yeah, that is one of the things. But you notice that in this APL session, we have these characters across the top. So initially, you don't need to fool around with the keyboard; you can just click on the symbols at the top and get the character. Yes. And that's mnemonic? Yeah.
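Going back to the power-series argument for a moment, it can be checked numerically in a few lines of Python (my sketch, not the talk's APL; 60 terms is far more than the series needs to converge at omega equal to pi):

```python
# Sum the power series of e**(0j1×π) term by term and compare it
# with cos π + 0j1 × sin π (Euler's formula), then with Euler's identity.
import math

z = 1j * math.pi                 # 0j1 × π
total, term = 0, 1
for k in range(1, 60):
    total += term                # running sum of z**k ÷ !k
    term *= z / k                # next term of the series
# Euler's formula: e**(0j1×π) = cos π + 0j1 × sin π
assert abs(total - complex(math.cos(math.pi), math.sin(math.pi))) < 1e-12
# Euler's identity: 1 + e**(0j1×π) is 0, up to floating-point rounding
assert abs(1 + total) < 1e-12
print(abs(1 + total))
```

As in the talk, the series sum lands within floating-point rounding of minus 1, so 1 plus it is as close to 0 as the arithmetic allows.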
So it gives you a little aide-mémoire: the keystroke you need, and a brief explanation of what the symbol is, to jog your memory. And so this is matrix divide, and so forth. OK? Yes. No, we don't have dictionaries as a primitive type, but you can easily implement one by using index-of and then indexing. You would have an array of the words of interest, use index-of into the words, and then index, using subscripting, to get the value that you want. So that would be a dictionary. Yes. The question is: one of my examples uses indexing to subscript into a table to get a speed-up, and the question was, can I do that on, I guess, an expression somebody enters on the keyboard? Yeah. That's a relation. Yes. No, it doesn't do that on the fly, but it does do that in some primitives. So if you're formatting one-byte integers, for example, there are only 256 possible answers, so it's not going to laboriously format all those numbers; it's going to index into a table. Yes. What was the motivation for using J, a dialect of APL? Oh, at the time, many of the things I could do here were not possible in APL yet. And for some strange reason, after I joined Dyalog, these things started to show up in APL. Yes, in the back. Yes. The question is, is there a version of APL that replaces these funny characters with normal characters available on other keyboards? I guess you can use J, which is ASCII only, using digraphs, or sometimes even trigraphs, but the ideas are the same. Yeah. Yes. OK. Sorry.
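The dictionary idiom from that answer, index-of into the keys and then indexing into the values, looks like this in Python (my sketch; APL's index-of returns the length of the keys for items not found, which is mimicked here with a default entry):

```python
# Dictionary via index-of: find each word's position among the keys,
# then use that position to index into the values.
keys = ['apple', 'pear', 'plum']
vals = [3, 5, 8]

def lookup(words, default=None):
    # index-of: position in keys, or len(keys) if not found
    idx = [keys.index(w) if w in keys else len(keys) for w in words]
    # indexing: one extra slot catches the not-found case
    return [(vals + [default])[i] for i in idx]

print(lookup(['pear', 'kiwi', 'apple']))   # [5, None, 3]
```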