This talk is Functional PHP. If that's not what you're here for, then one of us is in the wrong room. And I hope it's not me, because that would be very embarrassing.

My name is Larry Garfield. You may know me as Crell online. Or those of you who were here early this morning may know my alter ego, Lord Over-Engineering. When I am not overacting horribly, I am a senior architect with Palantir.net. We're a web development shop based in Chicago doing mostly Drupal development. We work mostly with universities, publishing companies, museums, cultural institutions, health care. Large nonprofits are our primary target audience. For Drupal 8, I'm also the web services lead, so the WSCCI initiative, which has nothing to do with alcohol. I apologize. I'm also the Drupal representative to the Framework Interoperability Group. It's kind of the United Nations of the PHP world, with all of the good and bad implications that analogy has. I'm an advisor to the Drupal Association and general purpose lovable pedant. And my colleagues may dispute the first part of that, but not the second. But that's really the last of the Drupal we're going to talk about in this session.

I'll give you a heads up. This is going to be a fairly heavy talk. We're talking about computer science concepts here, not just general programming. So please bear with me. I will try to keep it as understandable as possible. But let's first go back a little ways, because if you want to understand something, you need to understand its beginnings. I don't remember where this line came from originally. It was on Twitter somewhere. If you look at the history of computers backwards, it starts with kids fleeing and smart men and ends with some smart men solving really hard problems. So let's go back and look at some of those really hard problems that people who are now dead were trying to solve a century ago.
Let's start back in the 20s and 30s, the 1920s and 1930s, when logicians and mathematicians were trying to figure out, what does it mean to compute something? This was back when a computer was a person, usually a woman, who sat all day and did math for her job. That was a computer. The electronic things came in later. And scientists were trying to think through, what does it mean to compute something? What does it mean to calculate? And there were two main camps that developed.

One was a paper published by Alonzo Church describing something he called lambda calculus. Calculus, because he's a mathematician and mathematicians like to call things calculus, because it sounds cool. And lambda, because he got really tired of writing the word function out in longhand. And so, as any good scientist would, he started using Greek letters instead. And so lambda is an alternate word for function. That's pretty much all you need to think about there. Fundamentally though, he was describing computation (what is this thing that computers do, what is this thing that logic does) as a relationship between mathematical functions. In this worldview, everything is a mathematical function. And you can do anything with that.

He was in the US. And within a few months of when he published his paper, a man named Alan Turing over in the UK published a different paper that took a different approach. And he coined something called a Turing machine, so-called because he had a bigger ego than Alonzo Church did, which is essentially an abstract state machine. It deals with state. It deals with the states of information and the transformations between them.

These are just some of the people we're going to talk about who are just way too smart for the fact that we don't know enough about them. People don't learn about some of these people, which is very depressing. But these two men, being very respectable mathematicians and scientists, looked at each other's work and said, wait a minute. This looks familiar.
And they went on to prove that these two ways of looking at the world are equivalent. There is no problem in computer science, in programming, in computers that you can solve one way but not the other. You can always approach it in both ways.

So let's fast forward a little bit to World War II and shortly thereafter. And another crazy, smart mathematician named John von Neumann, for whom mathematics was just kind of his side hobby. He also worked on set theory, quantum theory, nuclear physics. He was on the Manhattan Project, just way too smart for his own good, as was often the case. He was also a consultant on the ENIAC computer, one of the very first electronic computers, and on EDVAC, which is the other very first electronic computer. And both of these systems, under his guidance, essentially were implementations of the Turing machine. They took the stateful approach of defining, all right, if we want to build an electronic magic box that computes stuff, how are we going to do it? And they took the Turing machine approach of tracking state.

And this architecture became known as the von Neumann architecture. The von Neumann architecture is this idea of a stored program. A program is simply data. A program is instructions. And it's a series of steps. A series of steps followed in order, linearly, one step after another. And the job of a step is to alter the value in some place in memory in that magic box computer. And one of the things that you can alter is the place in that magic box that says what step to run next. And that, just right there, gives you enormous power. That is how all computers work today.

And this leads to a concept called imperative programming. This is imperative in the linguistic sense of the imperative mood. Like you're giving commands: go here, do this, add that, set that, print that. That's what imperative programming is. It's a series of commands, a series of instructions. Very precise commands. Remember, computers are stupid.
They only do what you tell them. And imperative programming has this concept called state. State is values. They are pieces of information that change over time. This value will change. And the point of a program is to run a series of steps that will change that state. The purpose of a program, in imperative programming, is to change state, to change the value of something. And this is how virtually every piece of modern hardware works. From the phone in your pocket, to the laptop on your lap, to the giant servers running Wall Street, they're all, at the hardware level, doing this exact same thing. They're just pushing state around.

Conceptually, you can think of imperative programming like following a recipe for a cake. You've got this big bowl that is your memory. And you add an ingredient to it, and you've changed its state. And you add another ingredient to it, and you've changed its state. And you mix it, and you've changed its state. And you put it in an oven, and bake it, and you've changed its state. In all of these, you are just manipulating state. In the case of a computer, instead of a bowl, you have RAM, or data on a hard drive, or the registers in the CPU. But fundamentally, you are just manipulating state.

Of course, this is hard to do. So this led to procedural programming, which is essentially just a step above imperative. It's not much of a change. And it has the concept of a procedure. A procedure, or subroutine, as it is often called, is simply a reusable set of commands. All it is is reusable bits of imperative programming. But that does give you this concept of structured programming, which gives you these sort of high level abstractions like if, and for loops, and while loops. So instead of directly manipulating registers, you can say, while this register is still five, or greater than five, or whatever, run these instructions over and over again. Which is still pretty low level, but it's less low level than dealing with the memory registers directly.
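The procedure-plus-loop idea just described can be sketched in PHP. This is only a rough illustration, not the slide from the talk (which is not part of this transcript and used a pseudo-language). The names `$list`, `$biggest`, and `find_biggest` are assumptions chosen for the sketch.

```php
<?php
// State that the procedure below will change as it runs.
$list = array(3, 9, 4, 7);
$biggest = 0;

// A procedure (subroutine): a reusable series of commands.
// It returns nothing; its whole job is to alter $biggest,
// which is passed by reference.
function find_biggest(array $list, &$biggest) {
    foreach ($list as $item) {      // structured loop
        if ($item > $biggest) {     // structured conditional
            $biggest = $item;       // state changes here
        }
    }
}

find_biggest($list, $biggest);      // call the procedure
echo $biggest, "\n";                // prints 9
```

Every step exists to change the value held in `$biggest`, which is the defining trait of the imperative and procedural style.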
A very, very simple procedural program can look like this. I can't actually see it from the side here. But we define our list, and we have this biggest variable. And then we define this set of commands. And then we call it, see? And then we print out the value at the end. That's a procedural program. I don't think this is an actual language, although it might be something close to some of the early languages. But notice we are manipulating state. We are changing values as we go. Essentially, it's like singing a song with a refrain where you can repeat the chorus over and over again. It's reusable. You can just reference it. But you're still singing one line after another after another after another.

Imperative programming is based on the idea of defining how a program should work. Who's familiar with this? If you've ever written code in pretty much any language, your hand should be up right now. OK. If you've written PHP ever or JavaScript ever, you've done exactly this. And it's just abstractions and abstractions and syntactic sugar on top of pushing state around on disk. Everything we do above moving registers around in the CPU is purely syntactic sugar. Everything we do in programming is syntactic sugar over pushing values around and changing values.

But let's take a look at a different way of approaching the problem. Declarative programming. Declarative programming does not say how a program should work. It just defines what it should accomplish. You simply define, this is how something should end up. And then it's the computer's job or a compiler's job to figure out what that means in practice. Who here has written SQL? Declarative programming. You're not saying, here's how you pull data out of this data structure in this file or whatever. You're just saying, these two values from these two tables should merge together and give me a result. That's declarative programming. Who's written CSS? Declarative programming.
All of the YAML files in Drupal 8, declarative programming. Spreadsheets. Who's written a spreadsheet? Declarative programming. You're simply stating what the value of a cell should be.

Which brings us to our next very smart, old mathematician, John Backus. John Backus was, among other things, one of the inventors of the FORTRAN language. It was one of the first, actually the first, third generation language that gave you something above assembly to actually work with. And as penance for writing FORTRAN (so you got the joke, OK) he went on to help develop Backus-Naur Form. Who's heard of this, BNF or EBNF? Yeah, that's that Backus. Which is a standardized way of describing a language, describing a programming language. He also co-developed ALGOL, which is a mechanism for defining algorithms. And in 1977, while receiving a Turing Award, ironically, he gave a paper called "Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs."

Functional programming, as a term, came about here. And when we say functional, we don't mean that it works. I mean, these programs were working well before 1977. But it's functional like mathematics. Like the math you remember from high school or from college where you had inputs and transformations and you got an output out of it. That kind of function, mathematical functions. Again, Backus did not invent this stuff. If anything, this is simply lambda calculus all over again. In fact, I think he even references that. This is simply saying, should we go back to that? Of course, as I said, he didn't invent it. Lisp people claim that they did invent it, because Lisp claims to have invented everything. But that's another story.

But in functional programming, what you are declaring or what you're defining is your algorithm. You're not declaring steps to accomplish a task. You're defining the algorithm for a task, the relationship that defines a task.
And then actually turning that into steps is not your problem. It's not your job. That is what a compiler is for.

Functional programming has three core concepts, all of which we're gonna go over. Pure functions, immutable variables, and higher order functions. Higher order functions and first class functions are technically different things to the mathematical purists, but they're close enough for our purposes that I'm gonna treat them as though they're the same thing. So anyone here who actually has a CS degree, please don't throw tomatoes at me.

But as we said before, you've probably done something like this. Who's written formulas in a spreadsheet at some point? Functional programming. You're not saying take this value and this value and add them and multiply by this value, in steps. You're saying no. The value of this cell is these two cells multiplied together. And the value in this cell is that value plus the value in this cell. What order that happens in when it gets calculated? Completely not your problem. You are defining a cell's value as being something in relation to other cells. How it actually happens, meh, not your job.

So if imperative programming is following a recipe, and procedural programming is singing, functional programming is filling out spreadsheets. So next time you start making fun of the project managers who do all kinds of crazy stuff in Excel or in Google Spreadsheets, they're doing functional programming. Respect.

And you're probably sitting out there thinking now, so what, we're in PHP, not in Excel. But here's the thing. Everything I just described, everything I just talked about, is what you should be doing anyway. Most of the principles of functional programming, while there are languages that will enforce them syntactically, are simply what in object-oriented languages or in procedural languages we call good coding practices. So let's take a look at pure functions first. A pure function is one that has no side effects.
That is, it does not affect the state of the world other than taking input and issuing output. It does not print anything. It does not set any variables other than its return value. It does not take user input. It does not change the state of anything other than computing something. It has very, very specific explicit input, usually through function parameters, and very specific output, its return value. That's it. That's all it does. That means it is stateless. Stateless meaning it does not track any values between runs. You call a function with parameter five, you get back 12. You call it with five again, you will get back 12 again. There is no state that has persisted from one run to the next. Clear input, clear output, that's it.

Here's an example of a pure function. This function takes an array of, we're going to assume, strings, and a string, and it builds up some other string and returns it. The only inputs to this function are items and type. That's it. The only output is a string. Nothing in the entirety of the universe changes as a result of this function other than returning a value. That's a pure function. That is a good thing.

What's the advantage of a pure function? For one, it is really easy to test, because if you want to unit test something, you need to be able to control its inputs and then check its output. If all of its inputs are sitting right there in front of you and its output is right here for you, you're done. It becomes really simple. You don't have to set up some environment for it. It's just right there waiting for you.

You don't get spooky action at a distance. Who's ever had a problem where, you know, you make a change over here and then some bug shows up way over here. Normally I have a wireless mic. Who's had that kind of problem? You're like, what? How are these things in any way related? That's spooky action at a distance, and that makes you sad, because that is how lives are lost debugging things.
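The pure function described above (an array of strings and a type string in, a new string out) might look like the following. The slide itself is not in the transcript, so this is a minimal sketch: the name `format_list` and the choice of building an HTML list are assumptions made for illustration.

```php
<?php
// Pure: the result depends only on $items and $type, and nothing
// outside the function is read or modified.
function format_list(array $items, $type) {
    $output = "<{$type}>";
    foreach ($items as $item) {
        $output .= "<li>{$item}</li>";
    }
    return $output . "</{$type}>";
}

// Testing needs no setup: supply the inputs, compare the output.
$html = format_list(array('a', 'b'), 'ul');
// $html is "<ul><li>a</li><li>b</li></ul>"
```

The local `$output` variable does change while the function runs, but that change is invisible to callers, so the function still behaves as a pure one from the outside.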
Please don't be sad. Pure functions are completely self-contained. That means they are really easy to understand. I don't need to know, oh, this value or this part of this function is gonna do this because this global variable is set over here by this other routine that I'm going to assume has already run, but if it hasn't then there's this other thing. No, no, no, no. Right there, the code right in front of you is the entirety of what you need to know to understand this function.

I said there's no I/O at all. So it's not printing anything. It's not taking any user input. It's doing nothing other than computing something. And that makes it idempotent, which is the fancy mathematical way of saying repeatable. If, as I said before, you call something with five and it returns 12, you are mathematically guaranteed, I can promise you with absolute certainty, that next time you call it with five you'll get back 12 again. And if I call it with five one time and six another time and I get back whatever the results are, it doesn't matter which order I call it in. Calling it with six will get me back 50, calling it with five will get me back 12, and I can order those however I want. They're completely independent of each other. This makes code easy to understand. This makes code easy to test, easy to follow, easy to grok.

That description should sound familiar to a lot of people. Who's been in any of the Symfony trainings so far today? All right, they talk about this a fair bit. Service objects, ever heard that term? Service objects are stateless, or they're supposed to be. They serve one purpose only. They're very self-contained. They are supposed to be idempotent. And they have no side effects unless it's a service that's bridging to some kind of I/O, like a database connection or an output stream or whatever. But aside from reading and writing to some external channel outside of your program, they have no side effects. A good service object is a pure function.
Service objects and pure functions are conceptually exactly the same thing. And all of the benefits of good service objects and good pure functions apply in both directions. Who's heard the term iceberg class before? Nobody? A few people? An iceberg class is a class that has only one public method. Everything else is a protected method or a private method or a call to some other object, but its public-facing API is one method. An object with one method on it that is a pure function is exactly the same thing as a pure function. It's just a syntactic difference. It is conceptually the exact same thing.

Immutable variables. Immutable variables are not variable. Hmm. They're a variable that, once set, cannot have its value changed. They are placeholders for a value rather than state that is going to change over time. Now, if you have the ability to enforce that, then you can make some very interesting memory optimizations, like not copying a value in memory because you know it's not gonna change, so you don't need to duplicate it just in case. It reduces that spooky action at a distance, because if you know you've got a value here and a value here, and the code over here cannot change that value, I know I can rely on this value not changing over here. That's useful, because if this value is gonna change out from under me, that's gonna make me very sad.

And what it really comes down to is, state is where bugs come from. Who has spent time with a real-time debugger or something like that, or maybe print statements, stepping through a program because you know at some point here this value is this string and down here it's this other string, which is wrong, and somewhere in these 50,000 lines of code it changed, and where did it change? Oh my god. Who has lost hours of their life to that? That's what I thought. More people than for the spreadsheets. All right. Yeah. If you can rely on the value not changing, then you can very easily break it down to the individual pieces rather than, oh, I have to trace this value's change over time.
You want to be an enemy of the state. Don't tell the NSA I said that. Because that gives you a much easier ability to focus on parts of the problem. And it also really forces you to have small functions or small classes that are not gonna be tracking state over time. They're going to do exactly one thing, because the code won't let you do more than one thing. And code that does one thing and one thing only is easier to understand, test, debug, and so forth, and reuse.

This should be familiar. Value objects. You've heard that term before, okay? A value object is an object in OO whose values cannot change once its constructor is done running. An example here: if you have a class that represents a point and has X and Y properties, you set those in the constructor, and you cannot then change them. If you want to get another point that is that point moved over, then you get a new point object that happens to be somewhere relative to that first one. Who's had to deal with a DateTime object in PHP, which changes itself as it goes, which means you can't actually say, all right, what's this value going to be, what's the date a month from now or a week from now? Oh crap, I've deleted my first value, now what do I do? That's because that's not a value object.

Value objects make testing a lot easier because, again, you know what's not going to happen. You know what can't happen, so you don't need to bother testing it. Again, less spooky action at a distance. They are very often easier to read because they lead to code that forces you to have a clear structure rather than just, oh, this value is going to change, blah blah blah blah blah. It means you can reuse these objects safely. If you have an object that represents more than just a point on a graph but some rich value, say a node or a user, we can reuse it and get the memory savings of reusing it without having to worry about, oh wait, it changed over here and now everything else is broken. Why is mutation a bad thing?
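As a sketch of the value object idea from this passage (and not the slide the speaker turns to next, which is missing from the transcript), here is a hypothetical `Point` class alongside the DateTime behaviour he complains about. The method name `movedBy` is an assumption. `DateTimeImmutable` is PHP's built-in immutable counterpart to `DateTime`, available from PHP 5.5; the talk itself does not mention it.

```php
<?php
// A value object: its state is fixed once the constructor has run.
final class Point {
    private $x;
    private $y;

    public function __construct($x, $y) {
        $this->x = $x;
        $this->y = $y;
    }

    public function x() { return $this->x; }
    public function y() { return $this->y; }

    // "Moving" a point returns a new Point; the original is untouched.
    public function movedBy($dx, $dy) {
        return new Point($this->x + $dx, $this->y + $dy);
    }
}

$origin = new Point(0, 0);
$moved  = $origin->movedBy(5, 2);      // $origin is still (0, 0)

// DateTime is not a value object: modify() changes the object in place
// and returns that same object.
$start = new DateTime('2013-01-15');
$later = $start->modify('+1 week');    // $start itself is now 2013-01-22

// DateTimeImmutable behaves like a value object instead.
$start2 = new DateTimeImmutable('2013-01-15');
$later2 = $start2->modify('+1 week');  // $start2 is still 2013-01-15
```

With the mutable `DateTime`, the original date is lost as soon as it is modified, which is the "I've deleted my first value" problem described above.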
Well, let's have a look at this code. We've got this function foo. It takes two parameters, and there's a bug there, that should be passing a second parameter in here. And we call get bar, get that value out, then we format a, then we make changes to a, and then we output the formatted string, so "and bar is value one," all right. Who knows what value is going to come out here? You think value one? Nope. Where did that come from? I have absolutely no idea. I wrote this code, and I have no idea where that value came from. Did get bar change the value? Did make changes affect the formatted string already? Did it have that kind of back change? I have no idea, I can't tell. That's a bad thing. This is a bad sign for your code if you can't tell what's going on.

And finally, first class functions. These are the fun ones. First class functions, higher order functions, these are functions that can be a variable, and therefore they can be a parameter to another function. They can even be returned from a function. Anything you can do with a variable that is an int or a string or an object, you can do with a variable that is a function. That's a first class function.

In PHP, we have this ability, added in 5.3, called anonymous functions, and the syntax for it looks like this. We're gonna define a function that takes one parameter and returns that parameter times five. And we're going to assign that function, that piece of executable logic, to a variable, and then we're going to invoke that variable with some parameter. Nice, simple approach, but it can have a lot of really powerful effects.

For instance, you've heard of the strategy pattern? With the strategy pattern, you have an object that does whatever it's gonna do but it's missing part of its algorithm. You pass another object into it. Why pass an object in? Why not just pass a function? Here, use this function. That is the rest of your algorithm. Done, nice and simple. You can lazy evaluate things. That function we just defined a moment ago doesn't actually run.
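The anonymous function syntax, the strategy-style use, and lazy evaluation mentioned above can be sketched as follows. The variable and function names (`$times_five`, `apply_to_all`, `$expensive`) are assumptions for illustration, since the slide is not part of the transcript.

```php
<?php
// An anonymous function is a value: it can be stored in a variable...
$times_five = function ($x) {
    return $x * 5;
};

// ...and invoked later through that variable.
$result = $times_five(3); // 15

// Strategy-style use: the caller supplies the missing piece of the
// algorithm as a function instead of as an object.
function apply_to_all(array $values, $strategy) {
    $out = array();
    foreach ($values as $value) {
        $out[] = $strategy($value);
    }
    return $out;
}

$scaled = apply_to_all(array(1, 2, 3), $times_five); // array(5, 10, 15)

// Lazy evaluation: defining the function costs nothing; the work is done
// only if and when the function is called.
$expensive = function () {
    return array_sum(range(1, 1000));
};
$need_it = true;
$total = $need_it ? $expensive() : null; // 500500
```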
An anonymous function doesn't do anything until we actually call it. So if we have a function that contains the logic to compute something we want, or something that we maybe want, we can define that and then later on call it, or not. If we call it, we get back the value that we wanted. If we don't call it, we never have to spend the time computing that value we don't need. Anonymous functions give us that ability. They give us an ability to do something called partial evaluation, or currying. We'll get to that in a moment. Or they're a really cool shorthand for objects, because really that's what PHP does under the hood.

Case in point, here's a nice simple anonymous function. It takes two parameters and returns true or false depending on if they are equal, by the definition of equal that we're defining. And then we can call it just like any other function. There we go. Find pair, and pass in two cards, and get back a Boolean. What PHP does under the hood is compile this to a class that looks like this. As of 5.3, any class can have an __invoke magic method on it. And if you call an object as if it were a function, then __invoke gets called. This is what happens behind the scenes in the PHP engine when you use anonymous functions. Could you just write all of your code this way instead? Yes. Would it take more work to do? Yes. This is a much shorter and easier way of writing this code. And you're not polluting your namespace with, I've got to name a whole bunch of extra classes that only get called once and they're gonna live over here in this other file. No, no, no. All of my code is right here in the same place.

You can also, when you define a function, say, you know what? I want to access some value from my context, from my lexical context, syntactically. In a lot of languages, when you have anonymous functions, that happens automatically. When you use the name of a variable, it will magically figure out if it's from the parent scope or not.
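A minimal sketch of the "find pair" example and its hand-written class equivalent follows. The card representation (an array with `value` and `suit` keys) and the class name `FindPair` are assumptions; the slides are not in the transcript. One detail worth noting: in the actual engine an anonymous function is an instance of the built-in `Closure` class, so the hand-written class is best read as a conceptual equivalent.

```php
<?php
// Hypothetical card representation: an array with 'value' and 'suit'.
$card1 = array('value' => 7, 'suit' => 'hearts');
$card2 = array('value' => 7, 'suit' => 'spades');
$card3 = array('value' => 9, 'suit' => 'spades');

// Anonymous function: two cards are a pair when their values match.
$find_pair = function ($card1, $card2) {
    return $card1['value'] === $card2['value'];
};

// Roughly equivalent hand-written class using the __invoke() magic method.
class FindPair {
    public function __invoke($card1, $card2) {
        return $card1['value'] === $card2['value'];
    }
}
$find_pair_object = new FindPair();

// Both are called exactly like a function.
$a = $find_pair($card1, $card2);         // true
$b = $find_pair_object($card1, $card3);  // false
```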
PHP forces you to specify explicitly which variables come from the parent scope. Which I think is a really, really good thing. Who's written JavaScript? Who's had an anonymous function in JavaScript where you reference a variable and you're like, oh wait, that's not the variable in my function. It's the variable outside my function. Or maybe it is the variable inside my function, right? Yeah, in PHP it's syntactically explicit, which I really like. A lot of purists think that is wrong, but this makes it a lot easier to avoid bugs. So in this case we're saying use wild: use that value. And then the value wild is available inside our function when we call it, whenever we end up calling it, much later.

This is conceptually the same as writing a class that takes something in its constructor. And again, this is exactly what happens under the hood in the PHP engine. Could you write this yourself? Just save the value from the constructor to a property and then use it when you call it later? Yes. Would it be twice as long? Yes. Do you want to write twice as much code to get the same effect? No. At least I don't, I don't know if you do.

You can also, because a function is just a value, use another function. So here we're gonna factor out that is wild logic to a separate function. It takes a card and uses that wild value. And then is pair is going to take that is wild function as a closure. So when you reference a variable like this, the fancy way of saying it is that a closure is created over that value. Don't worry about that too much. People like to use closure to mean anonymous function. PHP even does in the engine, so roll with it. But then when we call find pair, it will simply call is wild for us. I keep giving this talk and I didn't notice that typo before. Yes, it is. Yeah, in the second function, card one and card two are C1 and C2. This should look familiar. This concept, anyone heard of dependency injection?
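The `use` clause and the closure-over-a-closure idea described above can be sketched like this. The names `$wild`, `$is_wild`, `$is_pair`, and `IsWild`, and the rule that a wild card pairs with anything, are assumptions; the talk's slides are not available here.

```php
<?php
$wild = 2; // hypothetical: the card value currently treated as wild

// `use ($wild)` explicitly imports $wild from the enclosing scope.
// The value is copied when the closure is created, not when it is called.
$is_wild = function ($card) use ($wild) {
    return $card['value'] === $wild;
};

// A closure is just a value, so another closure can capture it.
$is_pair = function ($card1, $card2) use ($is_wild) {
    return $is_wild($card1)
        || $is_wild($card2)
        || $card1['value'] === $card2['value'];
};

// Roughly the same thing written by hand: the captured value goes in
// through the constructor, and the logic lives in __invoke().
class IsWild {
    private $wild;

    public function __construct($wild) {
        $this->wild = $wild;
    }

    public function __invoke($card) {
        return $card['value'] === $this->wild;
    }
}

$pair = $is_pair(array('value' => 2), array('value' => 9)); // true
```

Because `$wild` is captured by value, reassigning `$wild` after the closure has been created does not change what `$is_wild` checks.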
So let's write a class that uses this concept. We have PHP's magic __get and __set methods. What those do is, if you set a value that doesn't exist on the class or on an object of that class, then it calls this function instead. And if you try and access a property that doesn't exist, it calls this other function instead. So let's say when you try and set a value, you're going to set it with a function, with an anonymous function, with a closure. And we're just gonna save that function to run later. And then when we try to access it, then we're going to run whatever that function was.

And so we can now use it like this. Create our container object. And we're going to assign to this conn property this anonymous function, which has not run yet. And that function is going to access the info property of the container, of the object. And info we define, and we can define it later, that's fine, to return this array of configuration values. And so when we then ask for that conn property, __get is invoked. And it looks up that function we just assigned and calls that function and returns its value. That function is right here, which is going to look up info, which is going to call __get with the other function, which returns our configuration parameters, puts that into that object, instantiates it, and returns that. And we now have a nice, cleanly injected database connection. We just wrote a dependency injection container on one slide, which is pretty damn cool.

You want to see more of this technique? Have a look at Pimple. It's a PHP dependency injection container written by Fabien Potencier, the project lead of Symfony, built on exactly this model. It does more than this, quite a bit more than this actually, but it fits in like 150 lines of code. A complete dependency injection container, just using anonymous functions, in 150 lines of code. What else can we do? Just with this concept. Who's ever had to import data into a system?
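The container walked through above might be sketched as follows. This is an illustration under assumptions, not the slide's code: `FakeConnection` is a hypothetical stand-in for a real database class, the configuration keys are invented, and passing the container into each closure as an argument is one possible way to let `conn` reach `info` (the slide may have done it differently).

```php
<?php
class Container {
    // Closures that know how to build each service or value.
    private $factories = array();

    // Runs when assigning to a property that does not exist:
    // store the closure without running it.
    public function __set($name, $factory) {
        $this->factories[$name] = $factory;
    }

    // Runs when reading a property that does not exist:
    // call the stored closure, handing it the container.
    public function __get($name) {
        $factory = $this->factories[$name];
        return $factory($this);
    }
}

// Hypothetical stand-in for a real database connection class.
class FakeConnection {
    public $settings;

    public function __construct(array $settings) {
        $this->settings = $settings;
    }
}

$container = new Container();

// Nothing runs yet; the closure is only stored.
$container->conn = function ($c) {
    return new FakeConnection($c->info); // triggers __get('info')
};

// Defined after conn, which is fine because conn has not run yet.
$container->info = function () {
    return array('dsn' => 'sqlite::memory:');
};

$conn = $container->conn; // __get('conn') builds the connection now
```

This sketch builds a new object on every access. Sharing a single instance is one of the extra features a fuller container would need to add.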
Everyone's done everything, that's awesome. So let's try defining an importer that looks like this. We have an importer class, and we're going to take a mapping object. Then we're gonna get some source value coming in out of some third-party system or a CSV file or whatever data you're importing from. We're going to create one of our local objects, whatever it is: a node, some other entity, just records in a database table, whatever it is you're doing. And then we're going to iterate over the set of mapping instructions, and for each one of those we're going to invoke whatever that function is and set the result.

That mapping object looks like this. We have an array of callables, an array of functions. And so when we get to title, then we'll call the title callback, which is this function, and it has a reference to this other service, and it's going to do whatever it's gonna do to pull out the value we want. Then body, it's gonna do what it wants to do, and extra, we're just gonna hard code something in here. And we just return back that array of importing commands, that array of mapping commands, which means we can then import any values we want just by adding stuff to that array. And then we can call it very simply by taking some external object, passing it to map data, and we get our object back to save. We just wrote an entire import engine on one slide.

This isn't made up either. We actually did this on a project at Palantir last year. This is slightly simplified code from an actual production project. If you'll also notice, these objects are immutable. They are idempotent, so I can reuse them. I can import as many records as I want and the state of these objects never changes. I can run that over 10,000 import records from a third party system.
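A rough reconstruction of the importer pattern described above is shown here. It is a sketch under assumptions, not the production code the speaker mentions: the class names `ArticleMapping` and `Importer`, the `getMap` method, the source field names, and the use of `stdClass` as the local object are all hypothetical.

```php
<?php
// Hypothetical mapping: destination field => callable that extracts
// the value from one source record.
class ArticleMapping {
    public function getMap() {
        return array(
            'title' => function ($source) { return trim($source['headline']); },
            'body'  => function ($source) { return strip_tags($source['html']); },
            'extra' => function ($source) { return 'imported'; }, // hard-coded
        );
    }
}

class Importer {
    private $mapping;

    public function __construct(ArticleMapping $mapping) {
        $this->mapping = $mapping;
    }

    // Builds a new local object from one source record. The importer and
    // its mapping keep no per-record state, so they can be reused freely.
    public function mapData($source) {
        $destination = new stdClass();
        foreach ($this->mapping->getMap() as $field => $callback) {
            $destination->$field = $callback($source);
        }
        return $destination;
    }
}

$importer = new Importer(new ArticleMapping());
$node = $importer->mapData(array('headline' => ' Hello ', 'html' => '<p>World</p>'));
// $node->title is "Hello", $node->body is "World", $node->extra is "imported"
```

Adding a new imported field means adding one more entry to the array returned by `getMap`; the importer itself does not change.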
I could chop the data in half and run it in two separate processes with 5,000 each and it would work just as well. Because the services themselves are immutable, because they're idempotent, I can mathematically prove that there can be no bugs between the two from splitting the problem up that way. That's cool.

Functional programming has a lot of tools to it. Once you have these assumptions in place, there's a lot that you can do. We're gonna go through a number of these here. First one, who's heard this phrase map reduce before? All right. Map reduce is useful for embarrassingly parallel problems. That is, problems where you're doing something to a lot of values, but there is really nothing between one value and the next that is going to impact them. It consists of two parts. Map, which is, you take whatever your problem space is and split it up into a whole bunch of identical sub-problems. And then you distribute each of the sub-problems to separate workers. And then reduce, which is, all right, those workers have finished, collect all of those answers back together, and munch them back up into the result of your original problem.

This is really useful if you have really, really epically huge data sets, like, let's say, Google, or Twitter, or Amazon. If you want to analyze tweets, that's an easy map reduce problem because you're not gonna analyze 50,000 tweets, or 50 million tweets, you're gonna analyze one tweet 50 million times. And you can do that in parallel if you wanted to. And this could get very, very complicated. There are some very large, very expensive systems that do this on a large scale. Or in PHP, we have a function called array_map. Who's used array_map? That's map reduce. That's all it is. The second parameter, excuse me, the first parameter to array_map can be any callable.
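As a small illustration of "any callable," the sketch below passes each of the usual callable forms to `array_map`, plus the built-in two-array form. The function and class names (`double_it`, `Calculator`) are invented for the example.

```php
<?php
function double_it($x) { return $x * 2; }

class Calculator {
    public function triple($x) { return $x * 3; }
    public static function square($x) { return $x * $x; }
}

$values = array(1, 2, 3);

// The first argument may be any callable:
$a = array_map('double_it', $values);                         // function name
$b = array_map(function ($x) { return $x + 1; }, $values);    // anonymous function
$c = array_map(array(new Calculator(), 'triple'), $values);   // object method
$d = array_map(array('Calculator', 'square'), $values);       // static method

// With two arrays, the callable receives one element from each.
$sums = array_map(
    function ($x, $y) { return $x + $y; },
    $values,
    array(10, 20, 30)
);
// $a = [2, 4, 6], $b = [2, 3, 4], $c = [3, 6, 9], $d = [1, 4, 9]
// $sums = [11, 22, 33]
```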
That is, the name of a function, which you've probably used before, an anonymous function, or an array representing a method of an object or a static method of a class. You've probably done this before where you had some function you defined, a one-off function, and then you put its name in here. With an anonymous function, you don't need the one-off function somewhere else to keep track of. The logic is right here, you can inline it all at once. Because it's just a value. A function can just be a value. array_map also lets you do merging of arrays. So pass in two arrays and get back one result. So add together, or multiply together, parallel values from two arrays. And you can put in any callable. So in this case it's a method of the object. It's a very wonky syntax, but hey, it's PHP, what do you want? And this is really great. It's effectively the same thing as doing a foreach, but it separates your application logic from your loop. It means your looping code can be completely separate from your code that's doing the actual work. Now I can unit test the code that does the actual work without having to think about loops. Separate your concerns. Is it slower than doing a foreach? Yes, because you have extra function calls. Is it slower enough to matter? No. You eliminate one SQL query from your system and you will more than make up for the time spent on a couple of function calls. If you benchmark that and find something different, turn off Xdebug and try again. Now one problem with array_map is it does not work on iterator objects. It only works on actual arrays. Which is a problem, but let's see if we can work around that and do things declaratively in PHP. So if we wanted to do it the old-school way and import a whole bunch of records, then we have some array of the data coming in, an array of lines out of a CSV file or whatever, and we just iterate over them and run that importer on each record. All right, can we do better than that? 
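The kinds of callable that `array_map` accepts, as listed above, can be sketched like this. The `Scaler` class and the sample values are assumptions made up for the example.

```php
<?php
// 1. Name of an existing function, passed as a string.
$upper = array_map('strtoupper', ['a', 'b']);

// 2. Anonymous function, defined inline where it is used.
$doubled = array_map(function (int $n): int {
    return $n * 2;
}, [1, 2, 3]);

// 3. Two arrays in, one array out: the callback receives parallel values.
$sums = array_map(function (int $a, int $b): int {
    return $a + $b;
}, [1, 2, 3], [10, 20, 30]);

// 4. A method of an object, written as [object, 'methodName'].
//    Scaler is a hypothetical class used only for illustration.
final class Scaler
{
    private $factor;

    public function __construct(int $factor)
    {
        $this->factor = $factor;
    }

    public function scale(int $n): int
    {
        return $n * $this->factor;
    }
}

$scaled = array_map([new Scaler(3), 'scale'], [1, 2, 3]);
```

In each case the loop belongs to `array_map` and the work belongs to the callable, so the callable can be unit tested without any looping code around it.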
Can we just declaratively say I want you to operate on this whole array? Well, we could do it with an array, but if I'm importing a ton of data, I don't want to pull it all into memory at once. That would be terrible. I want to build an iterator that will load stuff lazily, a line at a time, out of my CSV file or web service or whatever. So it's really easy to write a function that does array_map on an iterator. It's very simple. We're really just wrapping the foreach, and now we can say, apply that mapping logic to everything in that iterator, and we will get back an array of our mapped, imported objects, our node objects or whatever. Now looking at that, could we go a step further and not end up having a gigantic array of all of our nodes or whatever our target objects are? Could we just save each one on the fly? Yes, we could. I'll leave that as an exercise to the reader. I've only got so much time here. Memoization is the fancy academic way of saying result caching. That's all it is. Remember, pure functions given the same input produce the same output. Every time, very predictably, mathematically provable. So if you get the same input, why bother computing the output if you can just remember it from last time? Right? You have probably done this before. I can guarantee you have seen code like this at some point in your career. Drupal used to be full of it. That came out wrong. Drupal used to use this technique a lot where you have some function, you maintain a static. If a value is not set, you go and compute it and cache it to the static and then just return it out of that cache. Next time you call it, great, it's already in the static cache, you just return it. And that's wonderful. It's simple. It's easy. But it means you're confusing your actual business logic, whatever that expensive operation is, with your caching. You're tightly coupling those. Also, what happens when you try to unit test this? 
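For reference, the static-cache pattern just described looks roughly like the sketch below. The function name, the `$reset` flag, and the call counter in `$GLOBALS` are assumptions for illustration; the counter exists only to make visible when the "expensive" work actually runs.

```php
<?php
// Counts how often the expensive branch runs (illustration only).
$GLOBALS['load_count'] = 0;

// Business logic and caching live in the same function, which is the
// coupling the talk warns about.
function load_setting(string $name, bool $reset = false): string
{
    static $cache = [];

    // The kind of escape hatch that gets bolted on so tests can start clean.
    if ($reset) {
        $cache = [];
    }

    if (!isset($cache[$name])) {
        $GLOBALS['load_count']++;
        $cache[$name] = strtoupper($name); // stand-in for the expensive work
    }

    return $cache[$name];
}

$first  = load_setting('theme');
$second = load_setting('theme');       // served from the static cache
$third  = load_setting('theme', true); // cache wiped, computed again
```

The hidden `$cache` survives between calls, so a test of this function is affected by whatever ran before it unless every caller remembers the reset flag.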
We had to build all kinds of hideous hacks into Drupal to get around the fact that we kept doing this and it was breaking all of our unit tests. Sometimes you have functions that have an extra reset parameter, which is nuts. Others, we've got this drupal_static function you've probably used. Those are all workarounds for the fact that this is a bad way to do it. Instead, let's take a look at this function. This is computing factorial, a mathematical concept. Five factorial is five times four times three times two times one. Four factorial is four times three times two times one. So five factorial is four factorial times five. Still with me? Am I giving you flashbacks to high school now? All right. So this is exactly how you would define that as an anonymous function in PHP. We're defining it recursively, and we're referencing our outer variable. So this anonymous function is calling itself. The ampersand lets you use a value by reference, so if you change it, it does get updated, which you need when you're self-referencing like this. So we call factorial three and then factorial four. And this is what we get. We call factorial three, which calls factorial two, which calls factorial one, return, return, return. We get a result of six. We then call factorial four, which calls four, then three, then two, then one, return, return, return, return, and we get a result. Why are we computing those over again when we just figured out what factorial three is? Fortunately, there's a way to cache anything externally, and that's memoization. This is a nice simple utility function, but wrapped up in here is a lot of the concept behind where anonymous functions get really, really powerful. So we're gonna take this factorial function and memoize it, which means we're going to, in this function up here, take this function in and return a new function. 
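A sketch of the recursive anonymous function described above. The `$calls` log is an addition for illustration so the repeated work is visible; it is not part of the original slide.

```php
<?php
$calls = [];

// The closure captures $factorial by reference (&) so that, by the time it
// runs, the variable holds the closure itself and it can call itself.
$factorial = function (int $n) use (&$factorial, &$calls): int {
    $calls[] = $n; // record every invocation
    return $n <= 1 ? 1 : $n * $factorial($n - 1);
};

$three = $factorial(3); // 3 * 2 * 1
$four  = $factorial(4); // 4 * 3 * 2 * 1, recomputing 3, 2 and 1 again
```

After both calls, `$calls` holds `[3, 2, 1, 4, 3, 2, 1]`: the second call repeats all the work of the first.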
We're gonna transform it into a new function that has that same static caching logic we just saw, keyed on whatever the parameters to the function are. And if the result is not already cached in our static variable, we'll go ahead and call the original function. So now we call our new factorial with three and then call it again with four. Calling it with three, nothing's cached, so it all goes through and computes everything. Then we call factorial four. Factorial at this point is this inner function. And so it's gonna look up in our static results array, do we have a value already for factorial three? And it's gonna say yes, here it is, and return that instead of calling the original factorial function. We can now take any function, wrap this around it, and it becomes cached. That's cool, which means I can now test that original function without worrying about caching. There is no caching as far as that function's concerned. It's a completely separate process, as it should be. Can you do this to objects? Absolutely, you should be doing it to your objects. If you have an interface defined, which you should, you can do it like this. You've got some interface, you've got some class that computes something expensive here, and then you just build another object that implements the same interface and wraps whatever that object is. Congratulations, you have now memoized that service. Whatever's happening inside that compute function is now trivially cached, but you can still test your fancy class without having to think about caching. This makes your unit tests really, really easy, and it makes your code really, really fast, and those are a good combination. Call compute the second time here with whatever the value's gonna be, and it does not run the computation the second time, it just pulls the value out of the cache. If you only have the one method, can you just use memoize for it? 
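A minimal sketch of the `memoize` wrapper described above, applied to the recursive factorial. One deliberate difference from the description: the cache here is an array captured by reference rather than a `static` variable, which keeps each memoized function's cache clearly separate. The `serialize()` cache key and the `$calls` counter are also assumptions for illustration.

```php
<?php
// Take any callable and return a new callable that remembers its results.
function memoize(callable $fn): callable
{
    $results = [];
    return function (...$args) use ($fn, &$results) {
        $key = serialize($args); // one cache entry per distinct argument list
        if (!array_key_exists($key, $results)) {
            $results[$key] = $fn(...$args);
        }
        return $results[$key];
    };
}

$calls = 0; // counts how often the original function body actually runs

$factorial = function (int $n) use (&$factorial, &$calls): int {
    $calls++;
    return $n <= 1 ? 1 : $n * $factorial($n - 1);
};

// Because the closure holds $factorial by reference, its recursive calls
// now go through the memoized version as well.
$factorial = memoize($factorial);

$three = $factorial(3); // computes 3, 2 and 1
$four  = $factorial(4); // computes only 4, then finds 3 in the cache
```

The original function body runs four times in total instead of seven, and it still contains no caching logic of its own, so it can be tested in isolation.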
Yes, because remember, any object and method can be represented in PHP with this weird array syntax. So here we're gonna memoize that callable. So now we call fcached, and it's going to call that function that memoize produced, which will in turn call the compute method of the fancy object f, save the result, and return it. We now have a tool that can cache the result of anything you can put parentheses after, and it took like what, six lines, eight lines? That's what functional programming gets you, as a concept. Who likes Indian food? This has absolutely nothing to do with that. Sorry. Currying is in fact named after Mr. Haskell Curry, another one of these crazy smart mathematicians, who among other things is the person after whom the Haskell programming language is named. Which is ironic, given that according to his wife, he hated his name, and so we named both a programming concept and a programming language after him, because computer science people are morons. Currying is the process of splitting a multi-parameter function into a series of single-parameter functions that do the same thing. In general, this concept is called partial application. You're going to halfway call a function. Take this function, add. It takes two values and we're gonna add them together, all right? We can create a partial add, where we can call it with one value and get back a new function that persists that value, and we can then call that second function with some other value, and we'll add the two together. Could you do this by passing a value into a constructor of an object, and saving it that way? That's exactly what PHP does under the hood, but isn't this a lot shorter? This is way less syntax to do the same thing. Where is this valuable? Well, that initial value you can just pull out of some configuration or whatever, and because the result is then a callable, we can now apply it to anything. 
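The partial add described above might look like the following sketch. The function names are assumptions, and the `5` stands in for a value that would come from configuration.

```php
<?php
// The ordinary two-parameter function.
function add(int $a, int $b): int
{
    return $a + $b;
}

// "Halfway call" add: fix the first value now, supply the second later.
function partial_add(int $a): callable
{
    return function (int $b) use ($a): int {
        return add($a, $b);
    };
}

$addFive = partial_add(5);              // 5 could come from configuration
$eight   = $addFive(3);                 // add(5, 3)
$bumped  = array_map($addFive, [1, 2]); // it is a callable like any other
```

`add()` itself stays a plain function of two arguments, so it can still be tested by passing in both values directly.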
So I can have pieces of my program that are set up and called halfway based on configuration, and then called the rest of the way on actual user data, and I can still test it as one function by passing in both pieces, both my configuration and my user data, which is pretty darn cool. So let's have a look at this practically. Can we abstract this? Yes, just like we had a memoize function, we can create a partial function that takes a function to call, or any callable, and whatever the initial arguments are going to be. So in the case of, say, the date function in PHP, it takes two parameters. We're gonna call it first with the first parameter, and we're going to get back a new function that keeps track of the initial function and that initial parameter we called it with. And then when we call that new function with the second parameter, it takes the first parameter it saved, the second parameter we just gave it, and calls that first function. So date never actually gets called until here, but we get back our nice result. This should also seem very familiar. This is dependency injection all over again. This is put stuff in the constructor and then call methods later that take advantage of the stuff from the constructor. The exact same concept, done in a much smaller, terser, tighter package. But what's important here is that we get to see what we're doing much more clearly. We can see we're defining relationships rather than moving values around. What's that? Not necessarily, because if the values themselves are immutable, then I can still see, all right, what's the value that's persisted in this object? It's five. That five, oh, I know five is wrong. And if I'm setting it up, it's much easier with this kind of approach to look at this code and just visually tell that this is not wrong. I can tell that this is right because I know something's not gonna change unless I know about it. 
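A sketch of the generic `partial` helper described above, applied to PHP's built-in `date($format, $timestamp)`. The helper's exact signature, the format string, and the fixed UTC timezone are assumptions chosen to keep the example deterministic.

```php
<?php
// Bind the leading arguments now; return a callable that takes the rest.
function partial(callable $fn, ...$bound): callable
{
    return function (...$rest) use ($fn, $bound) {
        // The wrapped function is only invoked here, once all the
        // arguments are available.
        return $fn(...array_merge($bound, $rest));
    };
}

date_default_timezone_set('UTC'); // fixed so the output does not vary by host

// date() is not called yet; we only remember the format.
$formatDate = partial('date', 'Y-m-d');

$epoch   = $formatDate(0);     // date('Y-m-d', 0)
$nextDay = $formatDate(86400); // date('Y-m-d', 86400)
```

Once `$formatDate` exists, the format it uses cannot be changed by any later code, which is the same guarantee a constructor-injected value gives, in less syntax.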
You can persist values through a system without having them change randomly. Once I've set that value, once I've set that format here, I know that when I call date format, I'm never gonna be using a format other than the one I specified originally. It is impossible for my program to suddenly start printing it out using a different date format. That's where it's not sad. Could you do this through a method too? Absolutely, it's a callable. So let's bring this all together a bit. Remember we saw this code before with our cards? The problem is it's kind of hard to test anonymous functions because, well, they're anonymous. I don't know what to name them. They're just defined inline in other code. So let's rewrite that to be testable. We have an isWild function that takes a wild card and a card object. And it says, all right, is that card wild? And findPair is going to take that isWild operation to use and then the two cards to compare. And I've got the same syntax error here, I apologize. Now we can test them. isWild and findPair we can very easily test. We can mock isWild if we want to, but it's really easy to unit test these pieces. But I don't wanna have to pass something every time. All right, partial it. We'll take isWild and halfway call it with our wild card. So this function will always get called with the five of hearts as the first parameter. And then we partial findPair with the five-of-hearts isWild function we just created. And then we can call that pair checker repeatedly just as we did before. Or we could make both of these into classes and objects and pass stuff into the constructor, and this slide would be four slides long. Or I can just put it right here all together. We can go a step further. We know that if two cards are a pair, they're gonna be a pair the next time as well. So memoize it. In this case, it's not a hard comparison, but pretend it's a really hard comparison to do. 
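Pulling those pieces together, a sketch of the card example might look like this. The card representation (an array with `value` and `suit`), the `card()` helper, and the exact bodies of `partial` and `memoize` are assumptions; only `isWild`, `findPair`, and the five of hearts come from the talk.

```php
<?php
// Generic helpers, repeated here so the example is self-contained.
function partial(callable $fn, ...$bound): callable
{
    return function (...$rest) use ($fn, $bound) {
        return $fn(...array_merge($bound, $rest));
    };
}

function memoize(callable $fn): callable
{
    $results = [];
    return function (...$args) use ($fn, &$results) {
        $key = serialize($args);
        if (!array_key_exists($key, $results)) {
            $results[$key] = $fn(...$args);
        }
        return $results[$key];
    };
}

// Hypothetical card representation.
function card(int $value, string $suit): array
{
    return ['value' => $value, 'suit' => $suit];
}

// Definition of wild: the card is the designated wild card.
function isWild(array $wild, array $card): bool
{
    return $card['value'] === $wild['value'] && $card['suit'] === $wild['suit'];
}

// Definition of a pair: same value, or either card is wild.
function findPair(callable $isWild, array $a, array $b): bool
{
    return $a['value'] === $b['value'] || $isWild($a) || $isWild($b);
}

// Configure once: the five of hearts is wild.
$isWildCard = partial('isWild', card(5, 'hearts'));
$isPair     = memoize(partial('findPair', $isWildCard));

$sameValue = $isPair(card(7, 'spades'), card(7, 'clubs'));
$withWild  = $isPair(card(5, 'hearts'), card(9, 'clubs'));
$noPair    = $isPair(card(2, 'spades'), card(9, 'clubs'));
```

`isWild` and `findPair` remain named, standalone functions, so each can be unit tested directly by passing in every argument, with no partials or cache involved.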
And now we can call this cached version for determining if something's a pair anytime we need. And it will be fast and testable and simple and straightforward to read, because those are the only functions you need to read to understand what's going on. You can keep on going to, say, three of a kind. What is three of a kind? Three of a kind is: are card one and card two a pair, and are card two and card three a pair? So we can define it exactly that way. And then we can create that partial that references the pair and memoize that too. And now we have a cached way of checking whether a set of cards is three of a kind, where the function itself is defined by what the definition of three of a kind is. Let's go back here. This is the definition of wild. We're not defining steps. Nowhere in here am I saying these are the steps you take to determine if something is wild. I'm saying this is the definition of what it means for it to be a wild card. Is it the same as this card we specified? Are these two cards a pair? The definition of them being a pair is, do they have the same value or is one of them wild? That is exactly what the code says. There are no steps here. There are simply definitions. We are simply building the atomic pieces of our problem space. Your job as a developer is not to solve problems. Your job is to define the problem in such a way that it becomes solved. That's a big shift if you're used to pushing data around in memory. But pushing data around in memory is how you get weird, weird, weird bugs. Instead, define the problem in such a way that the answer becomes obvious. And then the computer takes care of it from there. Pushing data around in memory is boring and hard and dumb. Boring, hard, dumb, that's what computers are for, to do that so that we don't have to, because we're better than that. So this is a lot to take in. There's an important observation to make here. 
Purely functional languages, which enforce all of this stuff rather than make it optional, have this advantage: all data flow is explicit. So you know exactly where everything is coming from at all times. They also have this disadvantage: sometimes it really sucks to have to specify everything explicitly. So the takeaways I want you to have from this talk are: make your data flow explicit most of the time. There are exceptions where it's just too much work, but for the most part, if you really think about it, you can make most things explicit and end up with code that is easier to test, easier to verify is correct, and you can just look at it and tell that it's right, because you are defining the problem space. Using pure functions lets you do all kinds of cool stuff, and when I say pure functions, I include service objects. When you're writing for Drupal 8, most of the business logic of your modules should be in stateless service objects. Controllers and forms and all this stuff are just glue code for that last 10% of weirdness, but your actual module business logic should live off in service objects, which are pure functions. If they're not, you're doing it wrong. Approach the problem in terms of what, not how. Don't think about how am I going to move this data from here to here, how am I going to compute this thing. Define what it is you are trying to do. Define what it means for something to be a pair. Define what it means for something to be a wild card. Define what it means for something to be three of a kind. And then actually using it just falls out naturally, because you've defined the relationships between different concepts in your program. Separate out the what from the how. You will at some point probably still need to push a few variables around somewhere. That's okay. Keep that nice and self-contained, separate from your definitions of the problem space. When you have to do I/O, when you have to do impure things, keep those separate. 
In a purely functional language, there are weird workarounds that play games with the definition of pure to make it work. Whatever, this is PHP, we're not pure. But if you have an object that's talking to the database, all it does is talk to the database. That's it. And everything else just communicates with it. If you have an object that does output to the screen, or prints to a response object, or whatever you're doing, that is all it does. Keep those impurities isolated so that the rest of your program is really, really easy to test. If you've heard of the single responsibility principle, that an object should do one thing, or only things that are gonna change together, it's that same concept taken to an extreme. Because when you do that, you end up with these small, tiny, reusable pieces of code that you can assemble very, very easily, and individually look at and tell whether they're right or not. It becomes really easy to debug visually. Keep in mind, state is where bugs come from. So the less state you have in your program, the less of your program is spent keeping track of variables here and all over the place, the fewer bugs you will have, because there are fewer places for bugs to be lurking. Keep in mind, when I say functional programming, it is not a specific language. I don't mean you must write in Haskell or you must write in Lisp or you must write in Erlang or any of these other things. No, functional programming is a concept. It's really just good coding practices. That's all it is. Thank you. Once again, my name's Larry Garfield, with Palantir.net. Do you wanna know what else we're doing or where else we'll be speaking? Follow us on Twitter, check out our newsletter, I swear it's very low volume. We also have a booth down in the exhibit hall, booth 215, if you wanna talk further. We are out of time for questions, but I can answer them between sessions if you'd like. Thank you. The slides will be posted very shortly, sometime later today. 
I will be tweeting a link from the @Palantir account and they'll eventually get linked from the DrupalCon website. So follow @Palantir if you want the links before that. Excellent.