 Okay. My name is Dean Wampler from Object Mentor, that's Uncle Bob's company, and I've had too much caffeine this morning, so if I talk too fast, slow me down. What I'm going to talk about is functional programming and why that's interesting for Ruby programmers. Jim Weirich told us the other night that we need to start our presentations with a story, so here we go. Once upon a time in a kingdom far away, there was a law that ruled the land, and it was called Moore's Law, and everything was hunky-dory for a number of years. The processor speed, performance, et cetera of the processors just increased, basically doubled every 18 months, which is an exponential increase. But then there was a tremor on the force, if I can mix my metaphors, and a few years ago we started to see that turn over to go flat. And so in order to increase our continual gains in performance, people, basically the overlords that tell us what to do said, well, let's just add a lot more CPUs and add a lot more cores in our CPUs, and we'll just keep scaling that way. Unfortunately, the scribes in the kingdom said, so how the heck are we going to write robust concurrent software for all these CPUs? And it turns out the ivory tower scribes in the kingdom said, we've known how to do this all along. It's called functional programming, and if you guys had just listened to us all this time, you wouldn't have this problem. So the scribes ask what is functional programming, and that brings us to today, and hopefully the outcome of the story will be a good one. But what we're going to first do is talk about functional programming a little bit. So isn't it true that we already write functions in our Ruby code? I mean, we call them methods, but I mean, what's the big deal? Well, when we talk about functional programming, we're really talking about what functions mean in the mathematical sense. And so, for example, if I pick the value of x in sine of x, then I'm going to fix the value of y. 
In that sense, variables are immutable in functional programming. We can assign them once, but we can't keep reassigning the same variables. We're used to thinking of variables as variable, but once you've picked all but one in your equation, the rest are fixed. The other thing that's very interesting about mathematical functions is that they don't change any state. There are no state variables or object variables that they're changing. They do whatever they do, they return everything that was done in the return value, and that's it. We say these are side effect free functions, and the beauty of that is they're very easy to reason about. I don't have to think about the global system to understand what this function does. I can call it as many times as I want, and as long as I pass in the same value, I'll get the same result. I can run as many of these at the same time as I want, and I don't have to worry about threading issues. So they're really handy for the problems that we face in software today. Getting back to concurrency, if we don't have any mutable state, because our variables are immutable and we don't have side effects, it turns out there's nothing to synchronize anymore. And almost all of the problems in concurrent programming deal with synchronizing access to mutable state. It's important to know that it's mutable state we're worried about, because if it's immutable, we don't care how many people are reading it simultaneously. It turns out Jim Weirich, right after lunch, is giving a great talk on threading, and it would be really good to attend that talk if you want to learn more about the issues of concurrent access to mutable state. Another property of mathematical functions is that they are composable. I can build new mathematical functions by lumping together a bunch of other ones, like this definition of cosine x in terms of square roots and sine x.
And we say that these are first class citizens. So I'm passing these functions on the right-hand side as if they are values, like data. And one of the ideas behind Lisp is that code is basically data. What we like is having first class functions. We like to be able to assign our functions to variables, pass them around, manipulate them, and so forth. So let's talk a little bit about how to do this stuff in Ruby. Let's talk first about immutability. So here's a classic definition of a person object. And of course, one way to make it immutable is just to have attribute readers but no writers. Now, I'm sort of eliding the definition of the initialize method. You know what it's gonna do, so it's not really worth showing. So if I actually instantiate one of these persons, I can read the first name, and that'll return Dean. But if I try to set it to some new value like Bubba, that's not gonna work. I'll get an undefined method error. So I can't change the state of this object. Pretty trivial stuff. Shameless plug: it turns out this is that same definition in Scala. It's one line. Well, it's two lines because I had to wrap it. But it basically does all the magic for you that you have to do by hand in Ruby. So we think of Ruby as very succinct and elegant, but there are some languages that are maybe a little bit more succinct in some ways. I'm saying this because I'm actually writing a book on Scala. This is a shameless plug. There's another way we could handle immutability. Suppose that we don't want all person objects to be immutable. We want some immutable, some not. Well, we could always have our read-write accessors, and then we could just freeze the individual objects we care about. So before I call freeze, I can reassign any field on the object. But afterwards, once it's frozen, I can't modify it. However, it's important to remember that I can assign new objects to the same variable, the Dean variable. That's not disallowed.
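The two approaches just described, readers without writers and freezing individual objects, can be sketched roughly as follows. The slides are not part of this transcript, so the class names, the `last_name` field, and the constructor arguments are assumptions made for illustration.

```ruby
# Approach 1: expose readers only, so no setter methods exist at all.
class Person
  attr_reader :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name  = last_name
  end
end

dean = Person.new("Dean", "Wampler")
dean.first_name                 # reading works

setter_error = nil
begin
  dean.first_name = "Bubba"     # no writer was generated
rescue NoMethodError => e
  setter_error = e
end

# Approach 2: keep read-write accessors, but freeze selected instances.
# MutablePerson is a hypothetical name used to keep the two variants apart.
class MutablePerson
  attr_accessor :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name  = last_name
  end
end

other = MutablePerson.new("Dean", "Wampler")
other.first_name = "Bubba"      # allowed before freeze
other.freeze

frozen_error = nil
begin
  other.first_name = "Dean"     # rejected once the object is frozen
rescue RuntimeError => e        # FrozenError is a RuntimeError subclass on newer Rubies
  frozen_error = e
end

# freeze applies to the object, not the variable: rebinding is still allowed.
other = MutablePerson.new("Someone", "Else")
```

Note that the last line succeeds: the variable simply points at a new, unfrozen object.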
Even though I've frozen it, Ruby still lets me do that. What I froze was the object, not the variable that's pointing to it. So this is fine, and it's important to keep those kinds of things in mind. Now there is a drawback; nothing comes for free in this world. If, say, I'm merging two hashes together and creating a new one rather than doing an in-place merge, there's the risk I'm going to create a lot of little objects. There's a lot of overhead, garbage collection, all the usual stuff. So there are always performance issues to be wary of. Okay, let's talk about side effect free functions in Ruby. What does that mean for us? Which ones of these are side effect free? Well, certainly the attribute readers are side effect free. We just talked about that. The accessors, well, this is generating readers and writers, and of course the writers are not side effect free because they're changing things. Obviously the initialize method can't be side effect free because it's got to set up the state for the person object. But that's the sort of thing we're willing to live with. And hopefully we don't get objects referenced before the initializer finishes. I'm not actually sure what the Ruby VMs do about that. But in the Java VM, for example, you cannot access an object until the constructor finishes. Now this method, at least from the outside, since we don't know what's going on inside the select method, is also side effect free, because I'm not modifying the array. I'm just building a new one by selecting on the elements. However, it's important to think about what might happen if this array that was passed in to me is being modified by another thread while I'm iterating through it. What's going to happen then? So even though this method in isolation is side effect free, maybe in the context of a concurrent system I have to worry about what's happening to the array.
And which ones of these things are side effect free, and what's actually mutable? Here's a problem right here. I have to change the value of this loop index every time I go through the loop. So that's an issue for us. Now down here, it might look like this is side effect free. But in fact, when I'm printing to standard output, I'm changing global state, in effect. So methods like print are not actually side effect free. Now maybe we don't care about that; that might be okay. But it's still something to think about, that there's something going on, some state being changed in my global system. So then what do we have in terms of first class functions in Ruby? A little example here: let's define a predicate. All this is going to do is return true if the first character in a string is the A character. And then I'll have a filter method. Actually, we saw this before. I'll pass in an array, and I'll yield each element in the array to whatever block was passed in; the block parameter isn't shown. And then I'll just set up an array of arbitrary strings and I'll call filter. And notice what I had to do here: I had to wrap the predicate in a block. That block will actually call the predicate, asking whether this is a starts-with-an-A kind of string. And then it'll return these three strings out of this list of strings. So pretty straightforward Ruby. You could also do this trick where you call the method named method and get the predicate as a method object. And then I could use that by basically turning it into a proc. Personally, I think the first version is cleaner, clearer. You might like this one better because it's cooler. I don't know. But what I cannot do is just pass that predicate method by name directly as a block to this function. I can't really do that. I have to do at least something like this, or maybe some other trick that's like that. So we can really think of blocks as being first class citizens in the functional programming sense, but not methods.
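The predicate-and-filter example can be sketched as below. The method names (`starts_with_a?`, `filter`) and the sample strings are assumptions, since the slide itself is not in the transcript; only the overall shape follows the description.

```ruby
# Predicate: true when the string begins with a capital "A".
def starts_with_a?(string)
  string.start_with?("A")
end

# A hand-rolled filter that yields each element to the caller's block
# and builds a new array instead of modifying the one passed in.
def filter(array)
  result = []
  array.each { |item| result << item if yield(item) }
  result
end

words = %w[Apple banana Avocado cherry Apricot]

# Option 1: wrap the predicate in a block.
with_block = filter(words) { |s| starts_with_a?(s) }

# Option 2: fetch the method as a Method object and convert it with &.
with_method = filter(words, &method(:starts_with_a?))
```

Both calls return the same three strings. Writing `filter(words, starts_with_a?)` would not work, because that expression calls the method (with no arguments) rather than passing it.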
The methods really are not first class functions because we have to do those little tricks, but it's not so bad. We can live with this, I think, most of the time. Another thing you'll see a lot in functional languages is recursion. Because again, I can't change this loop counter. So somehow I have to do looping when I can't use each for some reason or whatever. So this would not be allowed in pure functional languages. And the way they do that is with recursion. So I might have this print array method, where I just want to output the contents of the array. It's basically the same thing I showed on the previous slide. I'll pass in the array and the index into it that I care about. I'll return if I happen to be at the end of the array and there's nothing left to print. Otherwise, I'll print that ith element of the array. And then I'll call the function again, this time passing the array unmodified, but passing i plus 1 as the index. So notice I'm not actually modifying any state in this method. These are all going to be new values on the stack. But the question you might ask yourself is, is this really better than a loop? I mean, it might be better in terms of thread safety, but it's maybe not as clear what's going on as that previous loop thing. So it's kind of nice in a language like Ruby that we can use imperative techniques when that's best, and functional techniques when they're best. So let's actually look at a slightly more interesting example. Does anybody know what this is? This spiral. What was that? Yeah, it's actually related to the golden ratio. This is actually the Fibonacci spiral. It turns out that if you look at these little blocks in here and ignore the spiral for a second, this little block has side 1, side 1, and there's two of them. This one has side 2, side 2. This is 3 by 3, 5 by 5, 8 by 8, et cetera. This is actually called the Fibonacci spiral. And it's basically based on the Fibonacci numbers.
Now, by the way, if you went to the Aristotle talk this morning, you saw a very similar spiral, just one of the pictures he showed in the talk. So this is actually something you see in nature a lot. The shell of a nautilus, for example, follows this spiral. And I think it's in the asymptotic limit that the ratio between successive Fibonacci numbers approaches the golden ratio, 1.6 whatever. Really, a cool example of how simple mathematics has profound implications in nature. But anyway, here's how you would define it in a math book, for example. If n is 0, just return 0. If it's 1, return 1. Otherwise, return Fibonacci n minus 1 plus Fibonacci n minus 2. Nice elegant little definition of this sequence. And if we write it in Ruby, it has sort of a similar elegant quality to it. I'll just use a case statement, so I think you can pretty well see what's going on. I think there's a bug in this, actually. It should be case n. Sorry about that. It's funny how you notice things right when you're doing the talk, but not when you've been reading the slides over and over and over again. What I think is interesting, though, about this definition is that I'm using case matching. And we'll get back to this as something else that's very profound in functional programming in a minute. But what I love about this definition, except for the bugs in it, is that it's very clean and concise. If n is 0 or 1, then we'll just return n. Otherwise, make the recursive calls to F of n minus 1 and F of n minus 2 and add them. Now, there is an issue you have to be aware of, though. And that is stack overflow. If I ask for the Fibonacci number of 1,000 or something, I'll probably blow out the stack. And it turns out that, while this is not a good example of it, there are tail recursion optimizations that can be done and are done in most functional languages. And that is also coming, I noticed yesterday, in Ruby 1.9.2. But you do have to be aware of this.
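A sketch of the case-based definition (with the `case n` fix mentioned above), next to a loop-based alternative. The method names `fib` and `fib_iter` are assumptions; the iterative version is one possible rewrite, not necessarily what the speaker had in mind.

```ruby
# Recursive definition, mirroring the math-book form.
def fib(n)
  case n
  when 0, 1 then n
  else fib(n - 1) + fib(n - 2)
  end
end

# Loop-based version: constant stack depth and no repeated work.
# The mutation is confined to two local variables inside the method,
# so callers still see a side effect free function.
def fib_iter(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

fib(10)        # => 55
fib_iter(10)   # => 55
```

The recursive form is fine for small `n`; for something like `n = 1000` the loop is the practical choice in Ruby.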
And you could certainly rewrite this as a loop to avoid stack overflow. So again, nothing comes for free. There are always drawbacks to any decision we make. There's a set of three classic operations that you see a lot in functional languages. And the list data type is the classic thing that everybody uses in functional languages like Lisp, which isn't pure functional, but we don't need to get into that. First of all, there's mapping a list from one set of elements to another, where I do some transformation on each element. I might want to filter the list, like what I showed earlier, where I just want to pull out a subset. And then there's this thing called fold or reduce. Say I have an array of integers and I want to add up the total of those elements; that would be a folding operation, where I'm basically reducing things. And of course, we have these in Ruby. So for any of the standard collections that we're used to, we already have a map function by that name. The filter is like find_all or grep. And there are obviously synonyms for a lot of these. And then we're used to using inject as our fold or reduce method. Now here's something that's important to remember. And that is, no matter how complex your object graph is, it always reduces at some level, in some sort of tree structure, down to primitives like numbers and strings and stuff like that, and collections of other objects and primitives. And if you keep that in mind, then sometimes it's actually better to have an external function that's not a method in the class and just apply that function to the collection in your class. And just to provide an example, let's go back to our person class. And let's say we're going to have a list of addresses. And that's just part of the stuff that's passed into the initializer. And notice that I want to freeze this array so that nobody can modify it. I want to keep my immutable object here. Come on, here we go.
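As a quick illustration of the three classic operations named above, here is how map, filter, and fold look with Ruby's built-in collection methods. The sample data is made up for the example.

```ruby
numbers = [1, 2, 3, 4, 5]

# Map: transform each element into a new array.
squares = numbers.map { |n| n * n }

# Filter: select (an alias of find_all) keeps elements matching the block.
evens = numbers.select { |n| n.even? }

# Fold/reduce: inject threads an accumulator through the collection.
total = numbers.inject(0) { |sum, n| sum + n }

# grep filters by pattern match (===) rather than by block.
words   = %w[apple banana avocado]
a_words = words.grep(/\Aa/)
```

None of these calls modifies `numbers` or `words`; each returns a new value.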
And then I'll just create a new person, pass in a bunch of addresses like the Apple address at the last line there. And it's fine for me to call each on those addresses. That's OK. But if I do in place sort, then that's going to blow up and fail because the object is frozen, the list of addresses. But the interesting thing, though, is if you think about what behaviors might I implement outside the type? Well, I might define a validate address method that'll make sure that any address in my list is actually in the United States Postal Service database. And you can actually buy these databases, for example. And then I'll just iterate through my addresses. I might even do this inside the initializer. And I'll make sure that this is a valid address you've given me, otherwise I'll raise some exception. But that's not necessarily something you want in the person class, right? Even in an address class, maybe you don't want to have this kind of context-specific logic embedded in it. So pull it out somewhere else, but use the fact that you have a collection that you can work with to do this validation, to get access to the data when you need it. Another quality you see in functional code a lot is it's very declarative. Remember the initial definition of the Fibonacci sequence that was this nice little curly brace? And then if n is 0, it's blah, blah, blah. And pretty much this buggy code kind of looks the same way, where I'm using the case statement to define the different clauses. Well, it turns out domain-specific languages often have this quality of being more declarative than imperative. And the classic example, of course, is from Active Record, where I describe just the relationship between person and other types in the system. But I don't tell you how to do it. I just say what I want and then let the system figure out how to do it. So it's nice and succinct. 
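The frozen address list and the externally defined validation can be sketched like this. Everything here is illustrative: `KNOWN_ZIPS` is a hypothetical stand-in for a real postal database, `valid_address?` is an assumed helper name, and addresses are plain strings for brevity.

```ruby
class Person
  attr_reader :name, :addresses

  def initialize(name, addresses)
    @name = name
    # Copy, then freeze, so the caller's array is untouched and ours
    # cannot be changed in place.
    @addresses = addresses.dup.freeze
  end
end

# Hypothetical stand-in for a postal-service database lookup.
KNOWN_ZIPS = %w[95014 60601].freeze

# Validation lives outside Person: it is context-specific logic
# applied to the collection, not part of the type itself.
def valid_address?(address)
  KNOWN_ZIPS.any? { |zip| address.end_with?(zip) }
end

dean = Person.new("Dean", [
  "1 Infinite Loop, Cupertino, CA 95014",
  "123 Nowhere Lane, Springfield"
])

dean.addresses.each { |a| a.length }   # read-only iteration is fine

sort_error = nil
begin
  dean.addresses.sort!                 # in-place sort on a frozen array
rescue RuntimeError => e               # FrozenError on newer Rubies
  sort_error = e
end

invalid = dean.addresses.reject { |a| valid_address?(a) }
```

In this sketch only the array is frozen; the strings inside it are still mutable, which matters if the goal is full immutability.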
It's obviously high productivity for the developer to just declaratively say what I want and let Rails figure out how to do it. And it also gives Rails the freedom to do it in a bunch of different ways if needed, to optimize and so forth. So when you can avoid being imperative and be more declarative, that usually has a lot of benefits. OK, so we talked a little bit about concurrency being interesting for us. And I hope that you also saw that there might be other qualities, other benefits of functional programming beyond concurrency. But I want to talk about a particular approach that you see in some functional languages. Erlang, for example, and Scala use the actor model of concurrency. And the idea here is that you have these autonomous agents, maybe running in a separate thread or a fiber or some mechanism like that. And the only way they communicate is through messaging, rather than trying to manipulate the same shared state. And I picked an example from one of the three actor libraries. This one happens to be the Omnibus concurrency gem that MenTaLguY wrote. There's going to be a talk tomorrow, I think it's tomorrow, on another one called Dramatis. And there's also one called Revactor that's aimed primarily at Ruby 1.9. But here's the idea. So I include all the gem stuff. I'm going to use two objects that will basically be the messages I send back and forth. The first one will be request a greeting. And I'm going to pass in the reply to, which will be like the mail address of the actor that's asking to be greeted. And this is sort of a common model, too: all these actors have a mailbox, and you're just sending messages to a known address to talk to them. And then the actual greeting will be the object that comes back, and that will be hello Bob or whatever. And I apologize for this slide. It's the most complex in the talk, but I couldn't really trim it down any further. So let's just go through it.
So I'm going to spawn a new actor that could be in a separate thread or a fiber or something like that. And here's my default greeting, which will just be hello. And then I'm going to loop forever and wait for messages. And when they come in, I'll receive the message. And then I'll do a comparison. And these are like Ruby case statements, actually. And in fact, there's a gem that's part of this package called the case gem that adds more behaviors for doing matching on types. So if you happen to send me a message that's a greeting object, then I'm going to save that as the default greeting. And notice I'm going to duplicate it rather than risk having access to shared state. So I'll just duplicate the value. So now I've reset the greeting to something else. But if you request a greeting, and I apologize, I had to abbreviate the word request, again for space reasons, then I'm going to actually reply to you. And this is how you send a message to an actor. You use the less than, less than operator, if you want to call it that. Remember that this request greeting had a field that was the reply to mail address. And so now I'm going to send you a new object that's basically the greeting. Again, these are structs, so that's why we have the array syntax here. I'll send you that greeting that was maybe set previously, or it was the default, whatever the case was. And then, obviously, I have to close all my blocks, if you will. OK, so I hope that's clear. But really, the key point, though: remember I mentioned pattern matching earlier when I was talking about the definition of the Fibonacci number? This is an extended version of essentially what a case statement might look like, where I'm now matching on types and fields. And then here's how I might actually use it, where I'll have some oracle that does the greeting. And I'll send him a message to change the greeting to howdy. And here's a way of getting the current actor in my current thread or whatever.
And then I'll send another message to the oracle to request the greeting and pass my mail address, if you will, so that I can get the reply back. And then I'll actually loop myself as an actor and wait for a greeting to come back. And then I'll just print that to the screen. And to actually show this, I think I didn't. Anyway, all right, I just said that already. But it'll come back in this case and say howdy, Dean. It turns out that in functional programming, they use pattern matching as sort of this fundamental concept of modularity, in kind of the same way we use polymorphism in object-oriented programming. Now, it turns out that pattern matching, like case and switch statements, has kind of a bad rap in object-oriented programming for reasons like the open-closed principle. If I add a new type, oh my god, I have to open all my case statements again and handle this new type. But it turns out that that's not necessarily the case. And I wish I had more time to go into this. But I think there's a great synergy between using case statements or pattern matching at the boundaries of things, like the boundary where messages come into an actor, and then using polymorphism for the internal behavior. And it's something that's worth investigating. Maybe that's homework, if you will. It turns out Scala has some really elegant ways of combining these two things. There, I said it again. I used that word Scala. How about that? All right, let's recap some of the messages learned from this. First, prefer immutability in your code. A nice thing besides concurrency is that if I have an immutable object, I don't care who I give it to, and I don't have to worry about them messing it up in some way. So if you ask me, give me your list of addresses, and that list is frozen, I don't have to worry about you doing something malicious like adding another address or deleting addresses.
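The greeter actor walked through earlier can be approximated with nothing but the standard library, using a `Queue` as each actor's mailbox and a `Thread` as the actor. This is a swapped-in technique for illustration only and deliberately does not reproduce the Omnibus concurrency gem's API; `spawn_greeter` and the `:stop` message are assumptions added so the sketch can shut down cleanly.

```ruby
require 'thread'   # no-op on modern Rubies, needed on very old ones

# Message types, as structs (hence the Greeting[...] syntax below).
Greeting        = Struct.new(:value)
RequestGreeting = Struct.new(:reply_to)

# Hypothetical helper: starts the actor and returns its mailbox and thread.
def spawn_greeter
  mailbox = Queue.new
  thread = Thread.new do
    greeting = "hello"                     # default greeting
    loop do
      case message = mailbox.pop           # blocks until a message arrives
      when Greeting
        greeting = message.value.dup       # copy instead of sharing state
      when RequestGreeting
        message.reply_to << Greeting[greeting.dup]
      when :stop
        break
      end
    end
  end
  [mailbox, thread]
end

oracle, worker = spawn_greeter
my_mailbox = Queue.new                     # plays the role of "my mail address"

oracle << Greeting["howdy"]                # change the greeting
oracle << RequestGreeting[my_mailbox]      # ask for it, saying where to reply
reply = my_mailbox.pop                     # wait for the answer

oracle << :stop
worker.join
```

The `case` dispatch on message type sits at the actor's boundary, which is the spot the talk suggests for pattern matching; the only state shared between threads is the mailboxes themselves.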
So this is really a general principle: whenever you can use immutable data in your code, it's usually going to be better code, more robust, easier to reason about, and all that good stuff. Also, prefer side effect free functions, methods, whatever term you like, or even blocks. The advantages are many. It's very easy to reason about the behavior of that method in isolation. It's very easy to just call it as many times as you want and not worry about what's going on. Hence, in something like a concurrent environment where you have multiple threads calling the same method, you just don't care. Now, it's important to mention, too, the distinction between what's visible on the outside and what may be going on on the inside. My Fibonacci method might be side effect free on the outside. If I call f of 10 1,000 times, I'll always get the same number. But internally, I might actually do some imperative stuff, like maybe cache previously requested values so I don't compute them over and over again. That means that if I make an implementation decision like that, I'm going to have to be careful that I don't break this quality of side effect free-ness, at least from the outside, and in particular that it remains thread safe if somebody actually uses it concurrently. So the nice thing about that, though, is that I give somebody something that behaves in a thread safe way and they can sleep well at night knowing it's not going to cause them problems. And I've basically isolated the problem of making sure that it's thread safe in a well-known context. Whereas if we don't think about thread safety at all, if we just have side-effecting functions everywhere and mutable data everywhere, then we've made the problem of making sure it's thread safe much bigger. So it is important to think about where the boundary is between where I want to use imperative, non-functional idioms and where I want to stay in a functional mode.
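One way to get the "side effect free on the outside, imperative on the inside" behavior described above is to hide the cache behind a lock. This is a minimal sketch under assumed names (`Fibonacci`, `call`); other designs, such as per-thread caches, would also work.

```ruby
class Fibonacci
  def initialize
    @cache = { 0 => 0, 1 => 1 }   # internal mutable state
    @lock  = Mutex.new            # guards every access to @cache
  end

  # Public entry point: same input, same output, safe to call from
  # several threads at once.
  def call(n)
    @lock.synchronize { compute(n) }
  end

  private

  # Runs only while the lock is held, so the cache is never updated
  # by two threads at the same time.
  def compute(n)
    @cache[n] ||= compute(n - 1) + compute(n - 2)
  end
end

fib = Fibonacci.new
results = Array.new(8) { Thread.new { fib.call(30) } }.map(&:value)
```

All the mutation is isolated in one small, well-known place, which is the boundary the talk recommends drawing.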
The other thing to ask yourself then is, if I have these collections and I can apply functions to them, maybe I don't need all the behavior possible inside the type. Maybe I can have some behaviors, like validating addresses, in an external place and just keep the essence of the type pristine. Sort of the platonic vision, if I can use the philosophy metaphor, since we've had a great talk about that already. What is the essence of a person, versus what's all the external stuff that a person might need to do, or I might need to do to a person, and so forth. So defining that boundary is useful as well, and thinking about functions and applications of functions and composability is a good way to frame that discussion. And be careful not to get too carried away with these functional idioms, because you're going to run into issues like what's the overhead of making all these little copies, and what's the overhead and the risk of blowing the stack if I do a lot of heavy duty recursion: performance, as well as the limits of the allocatable memory for the stack, and so forth. And this is your homework assignment. Think about how to combine pattern matching more effectively with polymorphism, especially at the boundaries of things like subsystems, actors, whatever. And I do encourage you to go to Jim's talk after lunch and the Dramatis talk tomorrow to learn about actors in more detail. And I actually finished a little earlier than I thought. Any questions? So when you were looking at functions and their side effects with collections, you were only concerned with the collection itself and not the elements in the collection. What's the reason for that? If you're changing elements within the collection, couldn't that cause problems? Right, so to repeat the question, if I'm only thinking about manipulating the collection but kind of ignoring what might be happening to the items in the collection, then that could be a problem, right?
Did I say that reasonably well? OK. Yeah, and that's one reason why I called freeze on that list of addresses. I wanted to make sure that you didn't modify the elements in that. Actually, does that work? You're freezing the array. See, you get into these issues when you start composing data structures of what am I actually freezing and what's actually not frozen. So for example, if I didn't freeze the addresses in the person, maybe I can't modify the address list directly, but I would have been able to manipulate the stuff inside it. So yeah, you do have to think about that. And I did mention that even though I'm iterating through an array, say to print it out, it may look like it's, well, it's not entirely side effect free, but the array looks like I'm not touching it. In fact, somebody else might be touching it. And functional languages tend to be a little better about giving you these primitives to lock down things. And in Ruby, we kind of have to apply what we have available to us. And also partly, be thinking in our minds, is there a way that I can preserve immutability throughout my code to make sure that I don't have to worry about somebody damaging the type or rather the object somewhere? In the back. I think we would like to pause and process the specification there, that there are arguments. If that variable is already in scope. So it's worth to be saying, X is side one. From say, array 1, 2, 3, we need to do X, what's X? When you come out of that, you're gonna have X as an X, and now we need to do 3. I put the scheme back on this, basically an X. For me, it makes Ruby pretty close to unusual as a functional programming language. Because every time I pass a lot of work out, I know I made these side effects. If I have an array, and this didn't even happen if I assigned a block to a numerical array. So I say, F is a property, whatever. So I may not even be visible from people in source. How do you know what's visible? 
Right, so if I can paraphrase the question. I think it would make correct me if I misunderstood it. But for example, a closure is basically freezing some states so that you can, it's available to the closure later. For example, if I reference a variable outside the closure that was in context, then that variable might be changing, even though I think I have a side effect free block. Did I say that right? So that's the case where Ruby doesn't let us preserve correctness in this sense. We just have to kind of be sure that we're doing the sensible thing, of not referencing stuff outside the block that's out of our control. If we can avoid it. Yes? Yeah, you know, right? If you grab a problem from somewhere, if they have a main variable, two bar, and you have a very two bar, you'll decide. Right, so an issue of what if I'm using someone else's proc that they give me? What if I've been very careful in my library to write side effect free code, but I have this proc that comes in that fails all the tests. And maybe, and you cited the example of maybe they actually use a variable name that collides with a variable name I have locally or something, then yeah, you could break things down. I think it would be, you know, the great thing about Ruby is because it's dynamic and has fantastic metaprogramming. I think there's maybe a little mini market here, if you will, for tooling that would actually kind of try to verify correctness in these ways, like look for cases where you're modifying data that maybe you're not aware of. I know, for example, the D programming language, which is intended as like a successor to C++, I believe they have a new keyword and I think it's called clean. You can apply that to a function and the compiler will check that you're not doing anything in a side effect failing way. Let's put it that way. So that would be kind of nice to have tooling that kind of verifies as best it can that your program is side effect free or whatever. Yes, sir? 
I think it would be one of the things that I'm going to do. So Ruby 1.9 block variables may be local now and I'm not sure myself, so that may be true. Anything else? Yes, sir? How do you deal with crazy backtraces and running recursive functions in Ruby? You said that again, I didn't quite get it. Don't backtraces suck to Ruby when you use recursive functions? Yeah, yeah, if you want me to. Recursive function when you get in a backtrace can be heinous, yeah. So workarounds that would make that a little bit easier. I mean, yeah, so it's another case in point of there's no free lunch. You're kind of paying a price if you're using recursion. Maybe clarity isn't so good, like the for loop I showed, and debugging can be hard. Besides the fact you might blow the stack. Anybody else? Okay, thanks a lot.