So I'm going to be talking about verification techniques in Ruby and asking the question: could a machine ever write tests for our code? To start, I want to ask you guys a question. Looking at this picture, can you tell if this is a sunrise or a sunset? Think about that to yourself; we're going to come back to it later on in the talk. My name is Loren Segal. I've been working on master's research in formal verification at Concordia University in Montreal, Canada, which is pretty much what we're going to be talking about today. I also wrote a documentation tool called YARD — I don't know if you guys have heard of it. Thank you very much. We will be using it in the talk; the tool actually uses it, so we'll be looking at that. I'm also on GitHub and Twitter, so if you want to talk to me after the talk, check out my code — it's all there. So I developed a toolset called RubyCorrect, and I should preface this entire talk with the fact that this is a proof of concept — it's not really production ready. The toolset has two tools on the subject of program verification in Ruby. The first tool is called Ruby ESC, which does extended static checking, and the second one is called Ruby CaseGen, which does symbolic execution. We'll talk about both of those. To understand what extended static checking and symbolic execution are, we need to understand how formal verification works, since both fall under it. The problem here is that formal verification is boring. There are a lot of details, so we will have to skip most of them — but not all of them. So with that, what is formal verification? Formal verification is basically a set of methodologies using logic and theorem provers to verify program correctness. In other words, we use math to formally prove that the program does what you wanted it to do.
Some of the methodologies include extended static checking, which we'll be looking at, and symbolic execution, which we'll also be looking at. There are others like model checking, runtime checking, and so on. To give a brief landscape of what the spectrum looks like: we have static verification and we have runtime verification. The two we will be looking at today are ESC and symbolic execution. ESC is very much a static verification kind of concept. Symbolic execution actually sits between static and runtime checking, and we'll talk about how ESC and symbolic execution interplay with each other, and what the similarities and differences are between the two. So let's start with ESC. ESC is extended static checking. As I just pointed out, it's static verification for code. What we do is translate a method into a single logical expression — it's basically Boolean algebra. We can confirm that the method is correct, assuming a given pre-condition, by checking that our post-condition matches the logical steps in that method. Or in other words: given a set of logical pre-conditions, when I execute a method, the result should be equivalent to my post-condition. So, pre-conditions and post-conditions. As I mentioned, these are pretty much logical expressions in the form of Boolean algebra — p implies q and r, stuff like that. This is how we typically use pre- and post-conditions: we assume that the pre-condition is correct, and given that the pre-condition is valid, we assert that the post-condition will be equivalent to what we said our method does. For an example of that, take the Fibonacci sequence. We can write it as pre- and post-conditions: we can have a pre-condition for fib saying that n is greater than or equal to 0. We can also have two post-conditions.
The first post-condition says that our result is going to be n if n is less than 2, and the second post-condition says our result will be fib(n-1) + fib(n-2) if n is greater than or equal to 2. Those are our pre- and post-conditions. In other words, what we're really doing is design by contract. I talk about contracts a lot when I talk about YARD. The reason I like contracts is that contracts are effectively specifications, and specifications are effectively documentation — and I like documentation a lot. This is typically how you would mark up a specification in YARD for our Ruby ESC tool. You can see that pre- and post-conditions are specified in pretty much the same syntax over there, as well as the type annotations to tell it that we're operating on numbers. The good thing about writing this out in documentation — and I talk about this a lot, auditing code, in a bunch of YARD talks I've given in the past — is that I always wanted a tool that could read documentation and verify its correctness, and with Ruby ESC I actually got to build that tool. So I'm just going to jump into a little demo here. I'm going to pop up this Fibonacci code right here; as you can see, it's pretty much the same code we just saw on the slide. We can actually execute this through our Ruby ESC tool, so we'll run it through, and it's going to tell us that it's right. But we can go in, make a change, break it, and see what happens. So we'll do fib(n - 3) instead of fib(n - 2) — that should break it. Let's see what happens. There we go. It's running through, and — I'll press enter — there we go. All right, so this time we had four errors. One of them is that we violated a pre-condition. This is something I didn't actually notice when I first made this change, but when you call fib(n - 3), you're actually violating a pre-condition when n equals 2.
When n equals 2, you're actually passing fib(-1), which violates the pre-condition. So we violated a pre-condition. We've also violated a post-condition, but that's a little more obvious: we're not implementing fib(n-2) as specified. So that's Ruby ESC in general — that's what it is and roughly how it behaves. But how does it work? That's the question. What we do for Ruby ESC is take Ruby code and translate it into a language called Boogie, and then Boogie takes that code and hands it to a theorem prover to do the real heavy lifting. So let's go through these steps one by one and figure out what each of these is. First, we have Ruby. Well, we all know what Ruby is. Boogie: Boogie is an intermediate verification language. It's also a tool by the same name, so you can run Boogie code using the Boogie tool. It's a Microsoft Research project — probably one of the only Microsoft projects I like. You can actually try it out right now: if you are bored in the talk, you can go to rise4fun.com, type in code, and play around with the syntax — pre-conditions, post-conditions, and so on. It's an open source project, which is good — like a real open source project. The syntax is actually very C-like, so it's a fairly high-level language; you can see here it pretty much looks like C. The only difference is that we have pre- and post-conditions specified on the method itself. So that's kind of what Boogie looks like. And now, theorem provers. There are a bunch of theorem provers out there — Simplify, CVC, Isabelle — some of which you might have met if you've done verification before. Boogie uses Z3, and so we use Z3. It's also a Microsoft Research project, one that was actually open sourced quite recently — I think last month. So how does a theorem prover work? Well, Z3 specifically uses a Lisp-like syntax — in fact, it's pretty much Lisp —
to express logical statements. So it's going to convert that high-level C-like code into s-expressions and then assert those statements. We basically assert a bunch of statements, and then some magic happens in the theorem prover: it verifies that all the expressions we've asserted do not contradict one another and are consistent. If it can prove that, it tells us our formula is satisfiable; if it can't, it tells us it's not. And there's a case where it can't decide either way, in which case it answers unknown. So that's a theorem prover. Now, I skipped over a dirty little secret before, and that is that we use type annotations. Type annotations are required in Ruby ESC right now; it's kind of a necessity. The good news is that most of this is an implementation detail — we could get rid of a lot of the typing with type inference. We don't actually support it yet, but it's possible. You can't get rid of all type annotations, though: this is static analysis, so we do need to tell our tooling what we're operating on. Speaking of annotations, I just want to plug YARD one more time. You can grab YARD at yardoc.org. We use YARD to annotate all the types and contracts, as you saw up there. The good thing about this is that if you have method documentation you've already marked up with types, you are good to go except for the contract part — I'll grant you that part still needs writing, but you're halfway there. And that ends my plug. One more sunrise-or-sunset question for you. Think about whether that one's a sunrise or a sunset; it's a little easier, and I'll give you a little hint: it's all about context. Moving on: Ruby to Boogie translation. I mentioned we're translating Ruby into Boogie, and then Boogie takes over and does its own thing for a bit. Let's talk about how we translate Ruby into Boogie. The first thing we do is translate method control flow, and that's basically translating Ruby syntax into Boogie syntax.
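To make that mapping concrete, here is a tiny example — Ruby on top, with a rough sketch of the corresponding Boogie procedure in the comments. The Boogie shown is illustrative only; the real translator's output may differ, and as discussed below it also threads `self` through as an argument:

```ruby
# Ruby source: a tiny method with one branch.
def clamp_to_zero(n)
  if n < 0
    0
  else
    n
  end
end

# A rough sketch of the Boogie procedure it might translate to
# (illustrative only):
#
#   procedure clamp_to_zero(self: VALUE, n: int) returns (r: int)
#   {
#     if (n < 0) { r := 0; } else { r := n; }
#   }

clamp_to_zero(-5) # => 0
clamp_to_zero(7)  # => 7
```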
And that's mostly fairly simple. We know Ruby is a high-level language, but Boogie also has a fairly high-level syntax, so we can map the two fairly equivalently. The only differences here are that we have a procedure declaration instead of a def, and that we have to pass self as an argument, because there is no self or this keyword in Boogie — it has no concept of object orientation. Beyond that, it's really just the way we return values that differs. The second thing we do is map the object system. Boogie, as I just mentioned, has a different kind of object system than Ruby does, so we define a reference type called VALUE that represents all objects. We use the name VALUE because that's what MRI, the C Ruby implementation, calls all object references — if you've ever looked at the C code, that's the name. The nice thing here is that everything's an object in Ruby, so pretty much all variables can be passed around as VALUEs. The problem is that Boogie actually requires native types at some point — at some point we're operating on some kind of scalar type. Typically everything in computing ends up narrowing down to some kind of integer mathematics, so the int type is pretty much the most important one. So we make a special exception for integers and alias int and VALUE as the same type: an integer is a VALUE and a VALUE is an integer. We can get away with this because, luckily, it's the same trick C Ruby uses to implement Fixnums and object references — it partitions the VALUE space into immediate integers and object references. So: VALUEs, integers, same thing. The next thing we do is map method calls to procedures. This is the hard step; this is where we have to perform static analysis. We have to resolve the method at translation time, which means we need to know the type of the receiver that we're calling the method on.
Once we can figure out the type of the receiver — which, if we have annotations, is pretty easy — we have to perform Ruby's method lookup and dispatch, the kind of thing it does in the interpreter: walking the inheritance chain as well as the mix-ins. So that is one of the harder steps. We then have to handle lambdas and blocks. The way we do this is to convert lambdas and blocks into anonymous Boogie procedures, and we call them as if they were anonymous procedures. The only problem here is that we have to infer extra contracts. Every method in Boogie has to have its own contracts, so we have to infer some contracts for the block itself when we pull it out into a separate procedure. The next thing we do is handle loops, and this is where it gets crazy, because Boogie needs invariants defined for all loop structures, and that means we, as programmers, have to define them — it's a fairly manual process. An invariant, if you haven't heard the term before, is basically an expression that holds true for every single iteration of a loop. If x is always going to be 2 on every iteration of a loop, that's an invariant of the loop. There's actually a lot of research going on right now on invariant inference for loops in extended static checking, specifically because it's very difficult to write your own invariant for a loop — it's not trivial. And that, in short, is how we do loops. The last step we have to handle is what I call the preamble, which is where we define specifications for built-in methods. By default, Ruby does not come with specifications for Fixnum#+ and all the other operators we would need, so we have to define those specifications manually. There's good precedent here — we can use RubySpec, and we can use tests and such to build those — but it's still a manual process, so it takes quite a bit of time.
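The method-lookup step described above — walking superclasses plus mix-ins, exactly as the interpreter does — can be mimicked with Ruby's own reflection. This is a sketch of what the translator has to replicate statically once annotations have pinned down the receiver's class; the `resolve_method` helper is hypothetical, not part of the tool:

```ruby
# Given a receiver class and a method name, find which module or class
# in the ancestor chain (superclasses plus mix-ins) actually defines it.
def resolve_method(klass, name)
  klass.ancestors.find do |mod|
    mod.instance_methods(false).include?(name) ||
      mod.private_instance_methods(false).include?(name)
  end
end

module Greetable
  def greet
    "hello, " + name
  end
end

class Person
  include Greetable
  def name
    "world"
  end
end

resolve_method(Person, :greet) # => Greetable (found via the mix-in)
resolve_method(Person, :name)  # => Person
```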
So that's mostly an overview of how we do Ruby to Boogie translation. To summarize: we map control flow; we map the object type system; we map method dispatch to procedure calls and do static analysis for that; we handle lambdas by turning them into anonymous methods, as well as loops; and then we create the preamble, a base set of specifications for the built-ins. So what are some issues and limitations of Ruby ESC? Working backwards: the standard library is not fully covered. As I mentioned, it's a manual process; it takes a lot of time and effort to implement specifications for all of the methods inside of Ruby. We currently focus on integers and arrays, mostly because that's the kind of stuff you do in a proof of concept. So we don't really cover strings that well, but we could theoretically cover them. Next: methods with multiple types. You can have a method that takes an array sometimes and strings other times. In our current implementation of Ruby ESC, we don't actually support that. There are ways to work around it, though. For one, this is really just a typing problem — with better type inference we could actually solve it quite easily. The other way to handle it is with separate overloads: if you document your method with multiple overloads and have a specification for each one, that's another way to do it. Then there's eval. Eval is really dynamic stuff happening at runtime that we cannot figure out statically. There's pretty much no way around this except for annotations, so we're pretty much stuck on that front. It's obviously not ideal, but if you have some dynamic code using eval, we have to have some kind of annotation that specifies what that eval does. Fortunately, most of us tend to keep eval usage light, so it's easy to find those hot spots and annotate them if you really want to use ESC.
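The "methods with multiple types" limitation from a moment ago comes straight from duck typing. A toy example (not from the talk) of a perfectly ordinary Ruby method that defeats single-receiver-type resolution:

```ruby
# One method, two element types: a static checker that has to resolve
# `item * 2` to a single procedure can't, because Integer#* (arithmetic)
# and String#* (repetition) are different methods with different
# specifications.
def double_all(items)
  items.map { |item| item * 2 }
end

double_all([1, 2, 3])    # => [2, 4, 6]          (Integer#*)
double_all(["ab", "cd"]) # => ["abab", "cdcd"]    (String#*)
```

Documenting two separate overloads — one specified over arrays of integers, one over arrays of strings — is exactly the workaround mentioned above.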
We can kind of get around eval, but I think the bigger problem here is that your entire code base needs contracts for this to work, and that's a real big deal, because contracts are not that easy to write. We saw the Fibonacci sequence — that was easy, but that was also an easy method. For a ten-line method that does a bunch of different logical steps and integrates with different systems, it becomes way more difficult. And this is generally an ESC problem, not a problem of our tool necessarily — it's mostly an extended static checking problem, the fact that you really get all-or-nothing behavior. If you're missing one contract in your entire program, your entire program will not be able to be verified. So it's not perfect — Ruby ESC is not perfect — but fortunately we have symbolic execution. Symbolic execution is not perfect either, but it does give us some benefits that Ruby ESC doesn't. The biggest one is that it does not require contracts at all. It can use them, but it does not require them. That means that if you have a program and you're missing one contract somewhere, you will still be able to test your program; you don't have the all-or-nothing problem that Ruby ESC does. So let's run through an example of what symbolic execution might actually look like if we were to execute our code. To summarize: symbolic execution is really just what it sounds like — we execute a block of Ruby code symbolically. Instead of concrete scalar values, we substitute everything with a symbolic value, and we resolve those values after we execute our code. So if we were to run through this method, we would start with y = 5 * x: we assign 5 times x to y, so y becomes the symbolic value 5x. We then branch. When symbolic execution hits a branch, it takes both branches automatically and resolves a separate symbolic value for each branch as a separate state. So we split off one state for our first branch, and there we now have 5x + 2.
We then split off our second branch into a second state, and there we have 5x / x (I missed the assignment on the left side there). So we end up with two states, and at the end we return both of those states; we now have two output states. The end result of symbolic execution is basically a set of states, one per code path — for each branch and for each loop we get a different state. Each state has a set of logical formulas, like y = 5x or y = 5x / x. We can then run these logical formulas through a theorem prover to discover when they are unsatisfiable. Our theorem prover would be able to figure out, for instance, that for 5x / x, when x is zero, the formula is unsatisfiable — and it is able to tell us that. So how is this really different from ESC? We're converting both into logical formulas and passing them into a theorem prover, which doesn't really seem all that different. The real difference is that ESC requires us to know up front what our result should be: it basically verifies that our assertions about our code are correct. Symbolic execution, by contrast, allows us to discover what our result will be without having known the specification. So how is this related to automated testing? And more importantly, how is symbolic execution different from unit testing — why wouldn't I just use unit testing? Symbolic execution and unit testing are a little different. The difference is that unit testing really only confirms what we've asked it to check. If we say, you know, check Fibonacci when n is 2 and you should get whatever the result is, you're only checking that one value. Symbolic execution doesn't use any concrete values on the first pass, so it can really run through and check all the values. It can resolve a bunch of values and figure out — either using heuristics or just logic — that there are certain values this function will not work on.
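The branching walk-through above can be sketched as a toy path enumerator: each branch forks the state, and every state carries a symbolic expression for y plus the path condition under which it is reached. This is a simplified illustration of the idea only, not how Kiasan works internally:

```ruby
# Each state pairs a symbolic expression for y with the path condition
# under which that state is reached.
State = Struct.new(:expr, :path_condition)

# Symbolically "run":  y = 5 * x; if cond then y += 2 else y /= x end
# x and cond stay symbolic, so the branch forks into two states.
def symbolic_run
  y = "5*x"   # y := 5 * x, with x left symbolic
  [
    State.new("(#{y}) + 2", "cond"),  # then-branch state
    State.new("(#{y}) / x", "!cond")  # else-branch state
  ]
end

states = symbolic_run
states.map(&:expr) # => ["(5*x) + 2", "(5*x) / x"]
# A theorem prover handed the second state's formulas can then discover
# that x == 0 makes the division undefined: a failing input, found
# without anyone writing a test for it.
```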
In other words, it's automatically generating test cases for us: it's able to figure out scenarios where our program will succeed or fail. So now it's time for another demo. This time I'm going to run CaseGen — I'll drop into the terminal here — and we will run the Fibonacci sequence through the case generation tool and see what it does. It's going to take a bit of time, but when it comes back, it produces a set of tests. So it generated a bunch of tests for us, starting from n = 0 and n = 2. You'll notice that it skipped n = 1, because it was able to detect that n = 0 and n = 1 exercise the same case. It then went on to 3 and 4. It would have kept going to 5, 6, 7, 8, but the loop bound in the symbolic execution engine was exhausted. The engine would otherwise keep trying until it runs out of stack space, so you can define how many loop iterations it will check, or how deep it will recurse; in this case we have a recursion bound of about 3 or 4, so it stops at 4 after generating these test cases for us. We can actually run those tests, and you can see that they pass. So that's Ruby CaseGen. How does it work? Well, we translate type-annotated Ruby into Mirah — some of you may have heard of Mirah before, but I'll talk about it in a bit. We then use Kiasan as our symbolic execution engine — I'll also talk about Kiasan in just a second. Our symbolic execution, as I just described, resolves all of our symbolic values into concrete values, depending on heuristics as well as on whether the theorem prover's logic says there's a case that will fail, or might pass in a different way than the others.
We end up with test cases generated as data, and the only thing left to do is convert them into executable Ruby test cases. So, Mirah. Mirah is a statically typed Ruby-like language created by Charlie Nutter — the JRuby guy, as everybody here knows him. It basically compiles Ruby-like code straight into Java code. Really, the only difference between Ruby and Mirah is the type annotations on method signatures; everything else is done with type inference. So we actually do get type inference in Ruby CaseGen. That's Mirah. Kiasan is part of the Sireum framework, which is a framework for verification built in Java. It's created by the SAnToS lab at Kansas State University, and it's open source; that's the symbolic execution engine we're using. So you might be thinking now: wait a minute, we're testing Java code? The answer is: only kind of. We're sort of testing Java code, but at the end of the day, we're still writing Ruby code at the top. I mean, for Ruby ESC we convert Ruby into Boogie code, so there we're not literally testing Ruby code either. In Ruby CaseGen we use the Java code to generate our test cases. So how does Kiasan know which values to use? That's another question worth asking. Here's another sunrise-or-sunset question — this time you get a better hint: that's east. So now it should be fairly obvious what it is. As I mentioned, it's all about context, right? If we provide type annotations to Kiasan, it knows what types it's operating on — that's step one. We can then provide optional contracts to tell Kiasan exactly what bounds we want to test in. That gives Kiasan extra context to figure out, you know, should I even test negative one on a Fibonacci sequence? Does that make sense? So we can specify optional constraints.
For instance, if we had a progress bar that really only made sense between 0 and 100, we could specify those pre-conditions and tell Kiasan: don't test this below zero or above 100; it's not going to make sense. But I still haven't answered the real question that you all came for: could a machine ever write tests for our code? The answer is kind of simple — we just saw it. Yes. We just saw Ruby CaseGen generating test cases for our code. The only problem is that it's the wrong question, right? The better question is: will automatic test case generation always work for us? Should we just adopt this technique and start using symbolic execution all the time on our tests and our Rails code? The answer to that is: not always. There are some scenarios we just cannot handle — certain features of Ruby that are way too dynamic for us. And that brings us to limitations. One limitation of Ruby CaseGen is that we convert Ruby into Java using Mirah, so we can only support what Mirah supports, and Mirah does not have full coverage of Ruby — it's a Ruby-like language, not Ruby. Java also doesn't work exactly like Ruby: integer values don't behave the same way, and there are other minor differences. Fortunately for us, most of the differences are fairly minor; it's actually surprising how closely they match. The other, bigger issue is that we still don't properly handle eval'd code, and we still have to rely on annotations for that kind of stuff. Annotations are fine, but it's not really how we want to have to write Ruby code — and at the end of the day, there are things we will just need to annotate because they're way too dynamic for a static analysis tool to handle. So how could we improve Ruby CaseGen, if we were to keep going? There are a couple of things we could do to make it more effective.
The first thing we could do is teach Kiasan to symbolically execute Ruby code. Currently, Kiasan really only understands Java, and that's sort of why we took the shortcut through Mirah. If we had access to the VM instructions in Ruby — currently, the Ruby 1.9 interpreter doesn't really give you access to the VM's bytecode at a very low level — it would be much easier to write a symbolic executor, because really, what a symbolic execution engine does is effectively interpret the bytecode itself. So if we had that kind of access, it would definitely make it easier to write a Kiasan backend for Ruby. The other thing we could do is implement Ruby abstractions in Mirah: instead of making Kiasan understand Ruby, make Mirah understand Ruby. Currently, as I mentioned, Mirah really just compiles your Ruby-like code down to Java code, so if you're using a string in Mirah, you're going to get a java.lang.String, the native string type. We could build abstractions on top of that and pull them into Mirah, to have it compile a program that uses Ruby abstractions and Ruby's standard library; at that point we would match Ruby's features much better. We could also forget about Kiasan entirely and write our own symbolic execution engine. There has been some work done on static analysis for Ruby in terms of intermediate representation languages — Laser is a good example of that. But writing a symbolic execution engine is pretty tough, so that's a hard path to go down. Finally, there's LLVM and KLEE. KLEE is another symbolic execution engine that runs on top of LLVM bitcode. And we do have an LLVM implementation of Ruby: it's called Rubinius. Rubinius would actually be a very good place to try this kind of thing out. Those are the ideas I came up with as future steps for the project, and I'm open to more.
I'm sure I didn't think of everything, so if you just thought of something now, or if you've been thinking about this for a while, please come up to me after the talk — I'd be interested to hear how we could do things better. If you want more details — exactly how we translate Ruby into Boogie, and all the real technical issues we deal with in terms of lambdas and all those crazy things — I have two papers published on the subject; you can grab those on Google Scholar. And with that, I want to thank you guys. Any questions?

[Audience question about how web developers would integrate this into their workflow.] Ideally, it would really be no different from a non-web-developer workflow. The only difference is that there are a lot more high-level abstractions to deal with, and your symbolic execution depends on being able to handle them. It's more of an implementation issue of how long it will take us to get to the point where we can actually symbolically execute something like Rails. At the end of the day, it's really the same kind of workflow as for regular Ruby code, but it will probably be a while before you see symbolic execution working on something like Rails — that's a pretty heavy step.

[Audience question: would you ever want to run symbolic execution across your full app directly, rather than translating the results into Ruby tests?] Yeah, so one of the things we explicitly did was translate those test cases — which are really just awkward XML output — into Ruby code. We could have converted them into some other readable data format instead.
But we chose Ruby because it's easier to read tests as Ruby, and there's less test-framework plumbing you need to build. Instead of having a full test framework plus a tool that generates data for that framework, we just generate the tests directly. But that's really a side effect of the fact that this was a proof of concept.

[Audience question about whether I've tried MRD.] I heard about it last year, but I haven't had a chance to take a look at it. I'd be interested — that would actually be a good thing to look at. So the answer is no, I haven't had a chance to look at it yet, but that is one of the possibilities.

[Audience question: why did you choose to go with Mirah rather than Rubinius first?] Right. So I was actually working with Sireum for other research, so we were already using Sireum in general for the project — that was sort of the guiding factor. I knew more about the Sireum framework than I did about LLVM and KLEE, and I'm not sure exactly how KLEE stands up to Kiasan. Kiasan has specific optimizations that make symbolic execution way more feasible, and I don't know if KLEE has those things — that's something we'd have to investigate. That was really the bigger issue; also, we mostly had a Java stack behind us, so Mirah fit. Well, thank you very much, guys.