Okay, so my name is Joe, and today I'll be talking about Ruleby, which is a rule engine for Ruby. And it's a pure Ruby rule engine, so everything I'm talking about here is written in Ruby, and we still get the performance gains that a rule engine gives us. So when I talk about Ruleby, the first question I usually get is, well, what's a rule engine? People have heard the term, but don't really understand it or have a clue what it is, so hopefully today we'll cover that and show you how you can use one in your applications. So to get started, I want to cover some programming language theory, and specifically the taxonomy of the different kinds of languages. Most of you have probably seen this kind of stuff in school, but we're going to approach it from a little bit different angle. We'll start with imperative programming. Imperative languages are languages like Ruby, Java, and C, and this is primarily what we're all used to: programming languages that use instructions to change program state. So our programs written in these languages have some kind of flow of control. And this is a great paradigm, and it solves most of our problems, but it does have some negative aspects. It can result in very complex code with numerous state transitions and interdependencies, and when we end up with a mess like this, we tend to call it spaghetti code. And while that's a vague term, I think we all know what I'm talking about. The result of this kind of code is that we end up needing a debugger to write our programs. And a debugger is a nice tool, but it's unfortunate when we require one just to write our programs. Actually, Dijkstra noted this many years ago. He wrote that the programmer should make sure that his growing product remains firmly within his intellectual grip. And when a program has left our intellectual grip, that's when we need a debugger.
So we can no longer understand the program just by looking at it and thinking about how it's going to execute; we have to actually watch it execute. An example of when this kind of problem occurs is when we have very complex conditional logic. This sometimes happens in things like parsing documents, where we have a lot of if-then statements, and these if-then statements become difficult to understand and a maintenance nightmare. So I have an example here: let's say we're writing an application for car insurance, and we want to generate a policy for a user. And we have all these business rules that define which parameters are going to increase the rate and which are going to decrease the rate. If we try to do this with just a mess of nested if statements, and we try to optimize it so that we don't re-evaluate the same values and that kind of thing, we just end up with a mess. And inevitably, when the requirements change, this nicely crafted mess of if statements is going to have to change too. So in this example we're just looking for a driver that's young and male with a red car, and we increase the premium. But if he's a good student, we decrease the premium. So we want to solve problems like this in a different way. And what we want to do is separate our business logic from our data; we want to separate these if-then statements from the objects that they act upon. We could do this with a functional language, because they're very good at that kind of thing, but that's not always an option. Well, this is a Ruby conference, right? We want to write in Ruby, and we want to keep the imperative paradigm, because it's a good paradigm. Instead, what we want to do is just add a declarative nature to our imperative programs. So when I say declarative, I'm not talking about functional. People tend to mix these two terms up and think the two are synonymous, but that's not actually correct.
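To make the problem concrete, here's a minimal sketch of that "mess of if statements" approach to the car insurance rules. All of the attribute names, rates, and amounts here are made up for illustration; the point is how the business logic and the control flow get tangled together.

```ruby
# A hypothetical nested-if version of the premium rules described above.
# Every new business rule means digging back into this method.
def premium_for(driver, car, base = 1000)
  premium = base
  if driver[:age] < 25
    if driver[:gender] == :male
      if car[:color] == :red
        premium += 200        # young male with a red car: riskier
      end
      if driver[:good_student]
        premium -= 100        # ...but a good student gets a discount
      end
    end
  end
  premium
end

puts premium_for({ age: 22, gender: :male, good_student: false }, { color: :red })
# 1200
```

Even in this tiny version, the conditions for separate business rules are interleaved, which is exactly the maintenance problem the talk is describing.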
It's understandable to make this mistake, because a functional language is declarative, but a declarative language is not always functional. So to define declarative programming: we're talking about languages that describe what something is like rather than how to create it. A good example is HTML or SQL. HTML is a language in which we're writing what a web page looks like, but we're leaving it to the browser to determine how to create it. SQL is the same kind of thing: we're describing what our query is, not how to execute it. So this paradigm of declarative programming is going to allow us to separate our business logic from our data by declaratively programming our business rules. And this introduces a new paradigm called rule-based programming, in which, as I said, we're declaratively programming business logic. In other words, we're going to describe what the business logic is and not how to execute it. Some examples of languages that use this paradigm are, of course, Ruleby, which we'll talk about today; CLIPS, which is a C rule engine; and Drools, which is a Java rule engine. All of these frameworks, or languages, or whatever you want to call them, use this rule-based paradigm. So if we put all this together, we find that our procedural languages fall into two categories. And now procedural, again, tends to get confused, kind of like declarative and functional, but with imperative: people tend to use imperative and procedural synonymously. But remember that functional languages are also procedural; they're just procedures that don't have side effects. So procedural languages like Ruby and Java are imperative, and things like Haskell are functional. And then we have non-procedural languages, which is a bit of a misnomer, because the languages I have down here do have a procedural nature; the emphasis is just on things that are not procedural. So in the case of Prolog, we're doing logic programming, which is declarative.
And in Ruleby, we're doing rule-based programming, which is also declarative. But of course, Ruleby is hosted in a language that's imperative, so it does have this procedural nature, too. And then non-declarative languages are way outside the scope of this talk; they're used in AI and academia and such. OK. So this rule-based paradigm is used in a rule engine. But we haven't yet defined what a rule engine is. And there's a lot of confusion about this, because, like I said, it's a bit of a buzzword. There are actually times where it's used, I guess you could say, incorrectly. There's one book in particular that I'm thinking of that I see all over the place. It was at the bookstore at JavaOne, and it has absolutely nothing to do with what we're talking about today, but it's about rule engines. So to define the scope of what we're talking about: the rule engine that I'm describing is a production rule engine. That means it's a mechanism for executing production rules, and I'll explain a little bit about what those are. It's made up of three parts. The first is a collection of rules, called a rule repository, that contains our business logic. The second part is a collection of facts, which makes up the state of our program, or the program's memory. And the third part is an inference engine, which does the job of matching the rules to the facts. So we'll start by talking about these production rules and how they're contained. Production rules really aren't anything complicated. They're just statements that contain business decisions, so they're similar to if-then statements. For example, in the car insurance application that I described earlier, we might have a production rule, or a business rule (I use the two terms kind of synonymously), that states: given a car and a driver, if all the following conditions are met (red car, male driver, less than 25), we're going to increase the insurance premium.
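Since a production rule is just a condition plus an action, it can be pictured as plain data. This is a hypothetical sketch, independent of any real rule engine API; the `Rule` struct and fact names are made up for illustration.

```ruby
# A production rule as a plain condition/action pair: "given a car and a
# driver, if red car, male driver, under 25, then increase the premium."
Rule = Struct.new(:name, :condition, :action)

increase_premium = Rule.new(
  :young_male_red_car,
  ->(facts) {
    d, c = facts[:driver], facts[:car]
    c[:color] == :red && d[:gender] == :male && d[:age] < 25
  },
  ->(facts) { facts[:policy][:premium] += 200 }
)

facts = { driver: { age: 22, gender: :male },
          car:    { color: :red },
          policy: { premium: 1000 } }

# Fire the rule only if its condition is satisfied by the facts.
increase_premium.action.call(facts) if increase_premium.condition.call(facts)
puts facts[:policy][:premium]   # 1200
```

The key idea is that the rule describes *what* the business decision is, separately from the objects it acts on.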
So our car insurance application will be made up of rules like this, and probably a lot of them. And we're familiar with this kind of thing; most of our applications are made up of rules like this, usually some subset of the requirements. So that's not new. What's different in a rule engine is how these are expressed: they're going to be expressed as productions, and again, I'll get more into that. OK, so the facts, the second part, are the state of the program. In the car insurance example, the facts would be the drivers and the cars. And this isn't just the drivers and cars that we're looking for in our production rules; it's all drivers and cars. So drivers that are 45 or whatever: these are what make up the memory of our program. I'll refer to this collection of facts as working memory, and essentially our program interacts with it by reading from it and writing to it. OK, and the third part, the inference engine, is really the heavy lifting of a rule engine. For now, we're going to think of it as a black box; later in the talk I'll get into what it's actually doing to be so efficient. But it's going to optimize our business logic so it can execute much more efficiently. So a rule engine introduces a few new characteristics to our programs. We get this ability to declaratively program, which allows us to separate logic and data. And when we separate logic and data, we get a centralization of knowledge. This is important because it makes our business logic, our business rules, easier to maintain and easier to understand. And in the field of expert systems, we find that we can actually bring in an expert from some particular field, like car insurance: someone who knows everything about car insurance and how to generate a quote, but knows nothing about programming.
So in expert systems, we bring these people into the development process, and a lot of times the rules are written in a DSL that's essentially pure English. So they can actually review and even write some of these business rules without touching other parts of the code. And finally, we get this characteristic of speed and scalability, which comes from the inference engine. And I'll go into that more later. OK, so that's the 40,000-foot view of a rule engine. I wouldn't expect you to understand precisely how to use one yet, but I'm going to use Ruleby to demonstrate how these work. It's important to understand that Ruleby is just a library. It's not going to change the architecture of our applications at all. If you compare it to aspect-oriented programming, where, in Java, if you want to use AspectJ, you have to use a whole new compiler and it really changes your program, we're not talking about anything that drastic here. We're just bringing another tool into our applications. So, being a library, it's made up of a bunch of classes, and every program that uses Ruleby starts by implementing the Rulebook class. This implementation is going to contain our business logic and our business rules. Once we've created this Rulebook class, we can use it to instantiate our engine. And this engine is the container for the Ruleby rule engine and all of its parts. And so, being a rule engine, it has three parts. It has the rules, which are written with a Ruby DSL. The facts are just Ruby objects, so our memory is typically made up of our model objects. And the inference engine uses a special algorithm written in pure Ruby. So we'll start talking about facts, because they're the simplest. As I mentioned, they're just Ruby objects, so if you have some model with a person and you instantiate it, p here in this case can be one of our facts. But it isn't a fact until we assert it to the engine.
So we interact with this engine by asserting and retracting facts. This is really the only way that we interact with it. Compared to object-oriented programming, where different components interact by sending messages via method calls, here we're interacting with the engine by asserting and retracting facts. And that's also how our rules communicate. OK, so these rules, as I mentioned, are written in an internal DSL, and we introduce this new construct. It's really just a method, but for the paradigm, it's the construct of a rule. And this rule has two parts. The first part is called the left-hand side, and the left-hand side is like the if part of an if-then statement. So what we have here is a pattern that we want to attempt to match to some of our facts. This is a trivial case where we're looking for some object of type Message. If we find this object, we're going to tag it with a variable name, m, with the symbol. It needs to have a method called status with a value equal to hello. So if we match this pattern to some fact, we're going to execute the right-hand side. This is the then portion of the if-then statement. And this right-hand side is just a block, and we can put any Ruby code that we want in here; there's no limitation. This block takes a single parameter, v, which stands for variables. These are the named variables from our left-hand side. So if we do find a Message object, we'll be able to access it in the right-hand side. If we were to write this same business logic as just a method, it might look something like this, where we're checking for a type of Message and a status of hello, and we're going to do something if we find that. And the point here isn't to demonstrate that Ruleby is more concise or anything like that. In fact, I think you'd find them pretty similar. The difference is really just that they're expressed in a different way.
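For comparison, here is roughly what that same hello-rule logic looks like as a plain imperative method. The `Message` class is a stand-in defined just for this sketch, not Ruleby's own.

```ruby
# The hello rule written as an ordinary method: check the type, check the
# status, and act only if both conditions hold.
Message = Struct.new(:status, :text)

def process(fact)
  return unless fact.is_a?(Message) && fact.status == :HELLO
  fact.text
end

puts process(Message.new(:HELLO, 'Hello World'))   # Hello World
```

Same logic, but expressed as control flow rather than as a pattern the engine can analyze.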
And of course, we're not supposed to care that it's a Message, so we can just check for status. Once we have this rule and any other rules that make up our application, we put them into our Rulebook class, specifically in this rules method. And using that class, we can write a full-fledged Ruby program. This is, again, very trivial, but it certainly would work. We instantiate a new engine, we instantiate a new fact, we assert that fact to the engine, and we execute match. And match is really just a very thin call that tells the rule engine to produce some output. It's kind of like when you have a method that does all its work in the body and then, at the end, returns some value: there's not really any work involved in the return itself. So if we were to watch this program execute, we'd find that as this engine is instantiated, we have our three parts here, and the rules contain a single rule, the one we described. As the program executes, we assert a new object to our facts, so we have this Message object. And when this fact is asserted, the inference engine picks it up and attempts to match it to the rules in our repository. In this case, there's only the one rule, which is satisfied by this fact, so the inference engine produces some output. That's a meaningless example, really; it just shows you the fundamentals. So we're going to add a little bit more to this rule, specifically to the right-hand side. In the right-hand side, we can actually interact with the engine, because, as I said, the only way these rules communicate is by passing facts; they don't invoke each other or anything like that. So in this right-hand side, we're going to retract the fact that we just found, this hello Message object, and we're going to assert a new fact, this time a Message with a status of goodbye. We're also going to add a second rule to this rulebook. And it's very similar to the first rule, except instead of looking for hello, we're looking for goodbye.
So, in fact, we're actually looking for the object that we asserted in the right-hand side of the previous rule. If we were to run that same program with our new rulebook, we'd start up with two rules in our repository. As that hello Message object is asserted, the inference engine picks it up and attempts to match it to both of our rules. It satisfies the first one, which produces the output. Then we retract that hello object and assert a new goodbye object. The inference engine picks that fact up and attempts to match it to our rules. It satisfies the second rule, so it produces some more output. What we see is that as we continue to assert new facts to our working memory, our program continues to execute. This is called the recognize-act cycle. The cycle starts by matching rules: new facts come in, and we attempt to match them to the rules in our repository. We then select the rules that are satisfied, and execute their right-hand sides. The right-hand side does whatever it does, which may involve asserting or retracting facts, and we go back to matching rules. And this cycle continues until there are no more new facts to process. This gives our rule engines a lot of power. We have the ability to do things like looping and recursion, and endless looping, too, so you have to be careful. These rules, like I said, are written declaratively, but the way we interact with the engine is still imperative. So we can really just add Ruleby into our existing programs without substantially changing the rest of the application, but we still get this benefit of declarative programming. Really, what we're doing is turning Ruby into a multi-paradigm programming language. And it really was already multi-paradigm, in that we have object-oriented and procedural, but we're adding this drastically new paradigm of rule-based programming.
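The recognize-act cycle can be sketched in a few lines of plain Ruby. This is an illustrative toy, not Ruleby's engine: match rules against facts, fire the satisfied ones (which may assert or retract facts), and repeat until nothing fires.

```ruby
# Minimal recognize-act cycle: the hello rule retracts :hello and asserts
# :goodbye, which the second rule then picks up on the next pass.
Rule = Struct.new(:condition, :action)

def run(rules, facts, log)
  loop do
    fired = false
    rules.each do |rule|
      match = facts.find { |f| rule.condition.call(f) }   # recognize
      next unless match
      rule.action.call(match, facts, log)                 # act
      fired = true
    end
    break unless fired   # no rule fired: no more new facts to process
  end
end

rules = [
  Rule.new(->(f) { f == :hello },
           ->(f, facts, log) { log << 'Hello World'
                               facts.delete(f)
                               facts << :goodbye }),
  Rule.new(->(f) { f == :goodbye },
           ->(f, facts, log) { log << 'Goodbye Cruel World'
                               facts.delete(f) })
]

log = []
run(rules, [:hello], log)
puts log
```

Note that if a rule's action neither retracted its fact nor changed anything, this loop would never terminate, which is the "endless looping" hazard mentioned above.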
And what this allows us to do is select the right tool for any given problem that we're trying to solve. So we're really following Dijkstra's advice, and as he wrote in that same paper, we need to apply efficient structuring to otherwise intellectually unmanageable complexity, where the efficient structuring is our declarative paradigm, and the unmanageable complexity is our business logic. OK, this is where it gets heavy. This is the inference engine that we're going to talk about, and like I said, it's doing the job of matching our facts to the patterns in our rules. So really, that's what it is: it's a pattern matcher. The left-hand sides of our rules contain these patterns, and we need to determine whether the facts in our working memory satisfy any of them. And to do this, it uses what's called the Rete algorithm, which is just a pattern-matching algorithm. It was developed, or first published, in 1979 by Charles Forgy; it was his PhD thesis at Carnegie Mellon. It's really one of the most interesting algorithms in the field, I think, and it's certainly one of the most overlooked. I assume very few of you have actually ever heard of it, but it's extremely profound, and overlooked; that's the word. OK, so before we go into describing what this algorithm does, I want to talk about a naive approach to solving this problem. If we're trying to implement a rule engine without this algorithm, our goal is to match all of our rules to all of our facts. We could do this by starting with the first rule, iterating over each fact to determine if it matches, moving on to the next rule, iterating over all the facts again, and so on, until we've computed this cross product of rules and facts. And this is going to be far too slow. It's also a problem because it doesn't scale well: as we get more facts and more rules, we end up making more combinations, and our programs won't execute efficiently.
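The naive cross-product approach looks like this in plain Ruby; counting the comparisons makes the scaling problem visible, since the count is always rules times facts.

```ruby
# The naive approach: test every rule against every fact, an O(rules x facts)
# cross product, re-done from scratch whenever anything changes.
def naive_match(rules, facts)
  comparisons = 0
  matches = []
  rules.each do |name, condition|
    facts.each do |fact|
      comparisons += 1
      matches << [name, fact] if condition.call(fact)
    end
  end
  [matches, comparisons]
end

rules = { even: ->(n) { n.even? }, big: ->(n) { n > 100 } }
facts = (1..50).to_a
matches, comparisons = naive_match(rules, facts)
puts comparisons   # 2 rules x 50 facts = 100 comparisons
```

Double the rules and double the facts, and the work quadruples, which is exactly what Rete is designed to avoid.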
So what Rete does to solve this problem is create a network of nodes, where each node corresponds to a pattern or a part of a pattern. And actually, rete comes from the Latin for net, so it's a fitting name. If we take the example of the hello world rule that I described earlier, it would be compiled into this fairly simple network of nodes. At the top of the network, we have a node that corresponds to the check for an object of type Message. That node has a child that corresponds to the check for a status of hello. And finally, we have a terminal node that signifies we've reached the end of a rule and have, in fact, satisfied it, and we will execute the right-hand side. So if we were to watch our program execute with this network, we'd see this hello Message object enter the system. It would be pushed down the network, starting at the top, so it would be evaluated by this first node to determine if its type is Message. And since it is, it's passed further down to the children, where it's evaluated for a status of hello, which it satisfies, and it's passed finally to the terminal node, where the right-hand side is executed. This method of pushing objects down a network is called forward chaining, and it's really characterized by reasoning from facts to conclusions. This differs from backward chaining, where we start with some hypothesis and go out looking for facts that satisfy it. So really, that was our naive approach: we started with our rules, and we iterated over the facts to find something that satisfies our pattern. But in forward chaining, we're starting with the data and pushing it down this network to determine what it satisfies. So if we look at these two side by side, we see forward chaining is data-driven, bottom-up, and breadth-first; backward chaining is goal-driven, top-down, and depth-first.
That can be a lot to swallow, but if you're familiar with compilers, a good way to compare this is: shift-reduce parsing is like forward chaining, and recursive descent parsing is like backward chaining. And I'm sure some expert in compilers can tell me why that's not quite right, but I think it's helpful. And if you're not familiar with compilers, recursive descent parsing is at the beginning of the book, and shift-reduce parsing is at the end of the book. It's a lot more to swallow, but it gives us a lot more power, and that's what we're trying to get. OK, so that last example illustrated how this is going to work, but I want to demonstrate some of the advantages. If we look at the example with our second rule, it's compiled into this network of nodes where we still have a single node at the top, but now it's shared between the two rules, because they're both looking for the same thing. And it has two children: the original child looking for hello, and a new child looking for a status of goodbye. And it has a new terminal node, because terminal nodes correspond one-to-one with rules, regardless of what the right-hand side does. So if we watch the program execute with our hello object, we see it's pushed down the network again, starting at the top, and it's evaluated only once, for both of our rules, for a type of Message. It's passed on to both children, where it's evaluated for a status of hello. On one branch it succeeds, and on one branch it fails. Where it fails, it just dies on the vine; where it succeeds, it's sent down to the terminal node, and an action happens. So the advantages of this algorithm are that we're able to reduce the redundant evaluations in our rules by sharing these nodes. We're also able to prevent the re-evaluation of data by storing partial matches: each of these nodes has its own little memory.
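Node sharing can be sketched with a toy network in a few lines. This is an illustration of the idea, not Ruleby's actual implementation: one shared type-check node feeds two status-check children, so the shared test runs once per fact no matter how many rules need it.

```ruby
# A toy Rete-like network: the Message type check is shared by both rules.
class Node
  attr_reader :evaluations

  def initialize(&test)
    @test = test
    @children = []
    @evaluations = 0
  end

  def add_child(node)
    @children << node
  end

  # Push a fact down the network: evaluate it here, and only if it passes,
  # hand it to every child node.
  def activate(fact)
    @evaluations += 1
    return unless @test.call(fact)
    @children.each { |c| c.activate(fact) }
  end
end

Message = Struct.new(:status)
fired = []

type_node  = Node.new { |f| f.is_a?(Message) }        # shared between rules
hello_node = Node.new { |f| f.status == :HELLO }
bye_node   = Node.new { |f| f.status == :GOODBYE }
hello_term = Node.new { |f| fired << :hello_rule }    # terminal nodes stand
bye_term   = Node.new { |f| fired << :goodbye_rule }  # in for the RHS

type_node.add_child(hello_node)
type_node.add_child(bye_node)
hello_node.add_child(hello_term)
bye_node.add_child(bye_term)

type_node.activate(Message.new(:HELLO))
puts type_node.evaluations   # the shared type check ran once, not twice
```

In the naive approach, that type check would run once per rule; here it runs once per fact, and the goodbye branch simply dies on the vine.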
And if a rule is partially satisfied by a fact, the node will store this partial match in its memory. And when another fact comes along that satisfies the rule completely, we won't have to re-evaluate anything. Think back to our naive approach: if a new fact enters the system, we have to iterate over every rule and re-evaluate everything for it again. The same mechanism is used for the efficient removal of facts, by indexing matches. So let's think about the naive approach again. If we want to remove some fact from our memory, in order to determine whether any of our rules need to be rolled back, or unexecuted, or whatever, we have to iterate over each of them again to determine if it matches. But in this algorithm, we index these matches based on the fact ID, which is really just the object ID, and we can simply pluck it out. So a removal takes no comparisons. But there is a downside to this algorithm, and that is that it sacrifices memory for the sake of increased speed. We're creating this network with all these nodes and all their memories, and we are going to consume more memory. But that's the trade-off. So this hello world rule example is handy for describing how this mechanism works, but it doesn't really provide us any insight into how to use a rule engine. So I want to talk about this car insurance example a little bit more. Imagine this application as a web app that's got a series of screens for collecting information from a customer about their driving history, what kind of cars they have, that kind of thing. Once we have all this information, we'll prepare a quote and display it, then finally ask the user to either accept it now or sometime in the future. So the bulk of this application is really well suited to Ruby and Rails, or whatever web framework you use. We can create our model with drivers, cars, and policies, using ActiveRecord to handle the persistence. And this is great.
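The indexed-removal idea is simple to sketch: store matches in a hash keyed by each fact's object ID, so retraction is a hash delete rather than a re-scan of every rule. The class name here is made up for illustration.

```ruby
# Matches indexed by fact object_id: retraction needs no comparisons.
class MatchIndex
  def initialize
    @matches = {}   # fact object_id => names of rules the fact satisfied
  end

  def record(fact, rule_names)
    @matches[fact.object_id] = rule_names
  end

  # Removal just plucks the entry out by its key, instead of iterating
  # over every rule to re-check what the fact matched.
  def retract(fact)
    @matches.delete(fact.object_id)
  end

  def matched_rules
    @matches.values.flatten
  end
end

index = MatchIndex.new
hello = Object.new
index.record(hello, [:hello_rule])
puts index.matched_rules.inspect   # [:hello_rule]
index.retract(hello)
puts index.matched_rules.inspect   # []
```

The hash itself is the memory cost being traded for speed: every stored match consumes space until its fact is retracted.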
This is what we do with Ruby. But the process of preparing the quote is complicated. We've got changing data and changing business rules, and this is troublesome for a procedural language. Take the example I showed at the beginning, where we have this mess of if statements: what if April is coupon month and all the rules change? Well, we're going to have to go in and mess up our nicely crafted code, and we don't want to have to do that. So if we have this kind of standard Rails architecture and everything's doing its job, the question is: where does the business logic go? Well, if we introduce Ruleby, we can have it contain our business logic and allow all the other parts to continue doing the jobs they're good at. So if we go back to the original rule that I described earlier, looking for a young driver with a red car, we would write it something like this in our application, where we have a composite pattern made up of three different patterns: the first looking for a driver under 25 that's male, the second for a car that's red, and the third for an existing policy. And if all these conditions are met, we're going to execute the right-hand side. Each of these patterns is logically ANDed together. So the application will be made up of this rule and lots of others like it, things like student discounts and accident history. But really, what we're getting here is the ability to separate those from the rest of our application. So as we go on and write more of these rules, we want to keep a few guidelines in mind. The first is that we want to create small, fine-grained, simple rules. We don't want to try to accomplish many different business requirements within the same rule; we want to isolate each rule to the specific reason it exists. And we can do that partially because of the algorithm: if there's any redundancy between two rules, that's OK, because it will be factored out by the algorithm.
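The composite left-hand side can be sketched in plain Ruby: three patterns, logically ANDed, with the right-hand side firing only when all three match. The struct and attribute names here are assumptions for illustration, not Ruleby's actual DSL.

```ruby
# Three patterns ANDed together: a young male driver, a red car, and an
# existing policy must all be present among the facts.
Driver = Struct.new(:age, :gender)
Car    = Struct.new(:color)
Policy = Struct.new(:premium)

def young_male_red_car_rule(facts)
  driver = facts.find { |f| f.is_a?(Driver) && f.age < 25 && f.gender == :male }
  car    = facts.find { |f| f.is_a?(Car) && f.color == :red }
  policy = facts.find { |f| f.is_a?(Policy) }
  if driver && car && policy     # all three patterns matched
    policy.premium += 200        # right-hand side
  end
end

facts = [Driver.new(22, :male), Car.new(:red), Policy.new(1000)]
young_male_red_car_rule(facts)
puts facts.last.premium   # 1200
```

In a real rule engine, the engine finds these combinations for you across all asserted facts, rather than you writing the `find` calls by hand.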
We want to keep these rules readable and maintainable. We also want to supply these rules with simple, small, fine-grained facts. And really this is just good programming practice in general, but we want to break our model down into as many parts as we can. So we may know that a driver's only going to have one policy, and so we're tempted to mix all the policy-number-type stuff into the driver object. But we don't really want to do that, because when we make these big, complex objects, our Rete network grows deeper, which means it's going to take longer for our facts to get pushed down the network. If instead we have lots of small objects, with everything isolated in its own container, our network grows wider, and we can traverse it much more quickly. Another thing to keep in mind in writing these rules is that we want to keep conditional logic out of the right-hand side. A lot of novice rule programmers will do something like this, where you have a very simple pattern, but on the right-hand side you do some conditional logic, in this case checking accident history and adjusting the premium accordingly. But what happens when we put conditional logic in the right-hand side is that we can no longer optimize it with this algorithm. So this particular example should probably be written as two rules that look something like this, where the first rule looks for a driver with no accidents, and we decrease the premium; and the second rule looks for drivers with many accidents, and we increase the premium. So what we've done here is isolate two particular cases, which makes them easier to understand and to maintain. And we don't have to worry about the fact that we're checking for the same thing twice, the redundancy in our rules, for example checking the driver and the policy, because that will all be taken care of by the algorithm.
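The split can be sketched as two fine-grained rules whose conditions carry the logic, instead of one rule with an if/else in its action. The accident thresholds and amounts here are made up for illustration.

```ruby
# Two fine-grained rules: the conditional logic lives in the conditions
# (which the engine can optimize), not in the right-hand sides.
Driver = Struct.new(:accidents)
Policy = Struct.new(:premium)

rules = [
  { condition: ->(d) { d.accidents == 0 }, action: ->(p) { p.premium -= 100 } },
  { condition: ->(d) { d.accidents >= 3 }, action: ->(p) { p.premium += 300 } }
]

def apply_rules(rules, driver, policy)
  rules.each { |r| r[:action].call(policy) if r[:condition].call(driver) }
end

policy = Policy.new(1000)
apply_rules(rules, Driver.new(0), policy)
puts policy.premium   # 900
```

Each rule now exists for exactly one business reason, and any overlap between their conditions is the engine's problem to factor out, not ours.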
So once we have all of our rules written and our application set up, we would take advantage of it probably through a single entry point, maybe a get_policy method that takes drivers and cars, which are our facts. We assert these to our engine, we execute match, and we get back the policy that's created. So this demonstrates how the rule engine really encapsulates that business logic and provides it to our application through a single entry point. OK, so I could go on with lots of other examples of where a rule engine is really good, like maybe a web doctor application where you take in some symptoms and use rules to produce a diagnosis. Those kinds of things are very good for rule engines. But I think in order to really understand the best cases for it, it's also important to understand when not to use it. We have to keep in mind that it is just another tool in our toolbox, and we're not seeking to replace anything here. We're not trying to replace artificial intelligence, or decision tables, or general-purpose programming languages. We're not going to write a web framework in a rule engine; a rule engine is just a supplement to these other tools. We also have to keep in mind that there's a certain overhead in using this algorithm. So if our task is to run every rule against every fact, in order, then a rule engine may slow us down. Think back to our naive approach, which was executing every rule against every fact in order: if we're going to be doing that anyway, we might not get any advantage from this algorithm. An example of this would be scheduling classes for students, where the goal is to create a schedule for a student that allows them to graduate but still maximizes the number of classes they get out of the ones they originally wanted.
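The single-entry-point shape can be sketched like this. The `QuoteEngine` here is a hypothetical stand-in object, not Ruleby's actual engine API: the point is only that the application sees one method, asserts its facts, and gets a policy back.

```ruby
# A single entry point: the rest of the application only ever calls
# get_policy; all the business rules live behind the engine.
Policy = Struct.new(:premium)

class QuoteEngine
  def initialize
    @facts = []
  end

  def assert(fact)
    @facts << fact
  end

  # Stand-in for the engine's recognize-act cycle: in a real rule engine,
  # the asserted facts would be matched against the whole rule repository.
  def match
    premium = 1000
    premium += 200 if @facts.include?(:young_male_red_car)
    Policy.new(premium)
  end
end

def get_policy(facts)
  engine = QuoteEngine.new
  facts.each { |f| engine.assert(f) }
  engine.match
end

puts get_policy([:young_male_red_car]).premium   # 1200
```

This keeps the Rails side of the application free of insurance logic: controllers collect facts, and the engine owns the decisions.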
So in order to do this best, we're going to have to score every combination of possible class schedules in order to get the best result, so we wouldn't get the advantage from the rule engine. So if you're not scheduling classes for students, I'd encourage you to check out our project. You can install the gem like anything else. Our source code is hosted on GitHub, and the hello world example I described here is in the code, so you can check that out and run it. And we're located at ruleby.org. So that's it, and we'll open it up for questions. Yeah, Troy? [Audience question about other rule engines in Ruby.] I think there are a couple of other rule engines in Ruby, but none of them use the Rete algorithm. What's it called? Rools, R-O-O-L-S. That's probably the most mature other rule engine, but it's not using a Rete algorithm, so it's more for just, I guess, I'm not really sure what you'd use it for at that point. You can use spreadsheets with decision tables in them and things like that, so you get that kind of declarative programming nature, but you don't get any of the speed and scalability. I haven't tried it, but there are people who have considered using Drools with JRuby, that kind of thing. [Audience question about plans for an external DSL.] Yes, yes, we do. Matt Smith. Raise your hand, Matt. Hi, yeah. He's working on that, and we almost have to go that route. The internal DSL works, and it functions, but there are some really quirky things about Ruby that make it difficult. So let me see. Geez, it's a long talk. OK. Wait, I'm losing everything. OK, so what we're doing in this DSL is we've actually overridden the double-equals method to return an object that's whatever we want; it contains a representation of the condition, and it's not actually returning true or false. So when we pass in this array, it's just going to represent our model. What we found is a great example where double equals is a method, but not-equals isn't a method; it's just an operator.
And I can't understand that, and I will pay you a dollar if you can tell me why. But if you return some object from double equals that's not true or false, you're always going to get false for not-equals. Another example: we tried solving it by using single equals, but there are some problems with that. Well, it only takes one parameter, but that's OK; I can see why there would be a syntactical need for that. But if you have a = x, where the assignment method is overridden to return an object like this was, that expression is always going to evaluate to the right-hand side. There is no return value for an assignment method, so it's not really a method. So yeah, we are working on an external DSL. If you're curious about that, talk to Matt, because he's been behind it. We're doing some parsing and things like that. Yeah? [Audience question: did you say you override double equals on an object?] No, just on this m here; it actually stands for method. We're signifying that we want a method called status, so m is going to return one of our objects. Yeah, so for m-dot-whatever, we're using method_missing to take whatever and return some object that overrides the double-equals method. All right, well, I guess that's it. Thank you.