Hi, my name is Joseph Wilk, and I'm going to be talking about testing ideas, tools and languages outside of the Ruby world. So here I am at a Ruby conference telling you that I'm pretty much not going to talk about Ruby at all. Why? Well, I think communities sometimes have a tendency to be quite insular with their ideas. The Java community, for example, has a very rich testing history, and I want to try to bring out some of the ideas from other languages and bring them in. The Ruby community feels like a very receptive audience for these ideas, so I'm hoping to show you some things that may inspire you, or may be useful to you outside the Ruby world. There really are no Ruby tools in this talk, and as long as we both have our expectations set, I hope I won't disappoint you too much.

There's a pattern from the book, I think it's "97 Things Every Programmer Should Know", and I wasn't studious enough to write down the name of the person who said it. He suggests there's some evidence that the number of programming languages you know, and by "know" I mean you understand the paradigms of the language and can write in it naturally and fluidly, corresponds with your programming skill. You pick up these different paradigms and you start being able to apply them in different contexts and different languages. It's a hard thing to state concretely, but I think there could also be something about passion: people who want to be polyglots are the sort of people who want to be as good as they can at their craft. So as well as showing you some tools, I'm hoping the same applies to testing: if you have more paradigms in your head about the ways you can test things, I don't see why you can't be a better tester, or a better programmer; I see them as the same thing.

When I first sat down to write this presentation, the testing world left me fairly depressed. It was JUnit, RSpec, PHPSpec, JSpec, CircumSpec, something-Spec, something, something. And those tools just aren't that interesting to me. Apart from Cucumber, of course, that's still quite interesting. It's useful for these languages to have some kind of BDD testing tool that works for people, but to be honest I can't be bothered to talk about them, so I'm not going to; I'm going to move straight on to what I think is the more interesting stuff. There's also the whole question of static typing and how it affects mocking, and again, why tell you about something that's a lot more verbose? We've already won there, so nothing on typing and mocking either; that's just boring.

What I do want to focus on is property testing and model testing, which have tended to be fairly academic but are quite commonly used in the functional world; what we can learn about test feedback and utilising metrics from tests, graphically; the fact that written tests sometimes aren't good enough and we need some sort of visual representation; dealing with permutations; and dealing with asynchronous stuff.
Out of all of this, the asynchronous stuff is probably what you could most directly take away and use, in something like JavaScript. The rest may be a little out there, so please bear with me.

The first language I'm going to jump into is Haskell, a pure functional language, and I'm not going to try to condense the entire language into your head in thirty seconds. What I really want to show you is some of the testing tools that have evolved around it. Specifically, I like this quote from a man whose name I can never pronounce, so I'll give it a try: Dijkstra. Thank you, almost. He said that program testing can be used to show the presence of bugs, but never their absence.

The way I think about this: a single example-based test with some static data is a very good way of expressing the behaviour of a system. But if we move our minds into a more statistical world, that one test doesn't give us sufficient evidence to conclude that the function is bug-free. Going brutally down that statistician's path: if we increased the number of tests, if we ran a million tests against this function, trying lots of different test data and lots of different logical properties, then Dijkstra is still right that we couldn't show the absence of bugs, but we could build up sufficient evidence to suggest there aren't any. I think that's an interesting idea, and it brings us to the first tool I want to bring up: QuickCheck.

There are implementations, some of them rough, propagated across a whole bunch of languages, Ruby and Perl among them. There is stuff you could use in Ruby, but I'd break my own rule if I told you about it. QuickCheck uses a logic format; I found a lot of these tools after studying computing at university, not using any of it for six years, and then coming back and going: oh yeah, that was actually useful. We can define things about a function. For example: for all values of s, the length of the string returned by this five-random-characters function is five. It's a fairly trivial logical property of a function, and you can define it in the formal logical sense if you really want to.

What QuickCheck does is take this logical definition and generate random tests based on some data distribution. You can customise exactly what type of data it uses: it will try strings, numbers, whatever it's been given as the range of things it can throw at this function. Then it goes and runs thousands and thousands of tests against your function, and hopefully your function bursts into flames and melts, and you get spat out some counterexamples: examples where the function failed that logical property. We said that all possible values of s taken by this function should have that property, so it can find counterexamples for us, which is kind of cool.

This kind of thing is already possible in tools like RSpec, and Elisabeth Hendrickson has a couple of really interesting blog posts about how she tried to make RSpec do more exhaustive data-set testing.
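To make the shape of that concrete, here's a minimal hand-rolled sketch of the idea in Ruby, roughly along the lines she describes; `five_random_chars` is a hypothetical function under test, and the input distribution is just random printable-ASCII strings.

```ruby
# A minimal, hand-rolled sketch of the QuickCheck idea in Ruby.
# `five_random_chars` is a hypothetical function under test.
def five_random_chars(seed_string)
  # stand-in implementation: any function claiming this property
  ('a'..'z').to_a.sample(5).join
end

PRINTABLE = (' '..'~').to_a  # a crude "data distribution"

counterexamples = []
1_000.times do
  s = Array.new(rand(0..20)) { PRINTABLE.sample }.join
  # The property: for all strings s, the result has length five.
  counterexamples << s unless five_random_chars(s).length == 5
end

if counterexamples.empty?
  puts "OK: passed 1000 random tests"
else
  puts "Property failed for: #{counterexamples.take(3).inspect}"
end
```

A real QuickCheck does much more, shrinking counterexamples and controlling the distribution, but the core loop really is this small.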
Her example is, very trivially, testing that Ruby's reverse works, which you would never actually do, and then doing something quite similar to QuickCheck's idea: go run a thousand tests, randomly picking strings of a couple of characters, random numbers mapped down to strings. It's a very naive implementation of how you could do something like QuickCheck in Ruby.

I want to show you the same thing in Haskell. The top box represents the distribution of the data that's fed into this reverse function. Again, this is where a lot of QuickCheck's power comes from, and you can make it as simple or as complex as you want. In this example, the arbitrary line specifies that it uses displayable characters, and coarbitrary is about how values get consumed when generating random data. This one is trivial; I'm still trying to get my head around the page-long examples of how people generate their data sets. But obviously, when you're throwing loads and loads of test data at something, you want a tight handle on what kind of random distribution it has and where the values are biased.

The bottom line is a very simple property. The prop underscore prefix is just a naming convention, nothing interesting. It states that where xs is this character string, the reverse of the reverse of it should equal itself. A very simple property. To run QuickCheck we just hand it that function, and it goes off and runs a hundred tests, which is kind of cool considering we only wrote one. But these properties can be quite hard to formalise, and I think that's one of the barriers with these tools: you could spend quite a long time formalising the logic. So there's a balancing line over where you want to do that.

I said I wasn't going to talk too much about tools, but imagine something like Cucumber, where you have a table representing the inputs to a test and some output state. There's no reason why you couldn't take something like QuickCheck and start mutating the inputs to that Cucumber scenario, and examine what happens to the system when it gets all these weird and wondrous data sets. Just a brief idea; a rough sketch of what I mean follows.
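In Ruby it might look something like this; `run_scenario` is a hypothetical hook into however you actually drive the scenario.

```ruby
# A rough sketch of mutating Cucumber-style table inputs: take the
# rows written by hand and generate variants of every field, then
# run the same scenario against each variant.
rows = [
  { name: "Clara", age: "30", email: "clara@example.com" },
]

mutators = [
  ->(v) { "" },                   # empty value
  ->(v) { v * 50 },               # absurdly long value
  ->(v) { v.reverse },            # scrambled value
  ->(v) { v.chars.shuffle.join }, # randomised ordering
]

variants = rows.flat_map do |row|
  row.keys.flat_map do |field|
    mutators.map { |m| row.merge(field => m.call(row[field])) }
  end
end

variants.each do |row|
  # run_scenario(row)  # hypothetical: re-run the scenario with this row
  puts "would run scenario with #{row.inspect}"
end
```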
Jumping on to the next one: Erlang, which is based on the actor model and is very much focused on messaging and concurrency. It has a rather oddly named tool called McErlang, and yeah, I seriously question that name.

In Erlang you can imagine having lots of processes communicating messages between each other, and it's fairly easy to produce quite a complex set of protocols: what are the different states the system could be in, what happens if this message arrives before that message? You can build a fairly complex parallel system very quickly. What McErlang does, since Erlang is compiled, is take out the communication, concurrency and distribution aspects of Erlang and run your program under a completely new runtime system which simulates all process activity. The idea is that from that it can generate finite state machines and examine the possible states the system could be in.

Take a really simple example: a trivial message server, where clients log into a service, one client tries to send a message to another, and that client logs in and gets the message back. You can already see, very simply, how you could extrapolate the various states this system could be in: the first person logged in, the second person logged in, then the message; or they logged in after the message; and so on.

This is where we get to do some scary stuff. Well, not really that scary. In model testing, the idea is that we define some property of the system, and the tool goes off and explores every possible state that the finite state machine could be in, asserting that the logical property holds. This one simply states that when someone sends a message, the other person should receive it, with a precondition: user one does not send a message until someone has logged in; then the action, they send that message; and then the effect, eventually that person receives the message. The bottom box is what we actually write in McErlang's language: a Büchi monitor, expressing linear temporal logic. Really crazy names, which I'm probably pronouncing wrong, but it lets us represent that property of the system, a logical statement of something we always want to be true for the model.

Erlang, I'm afraid, really doesn't make good slide code; it's just a mess. But briefly, in the middle part you'll see the formula I just described, not P until Q, implies that eventually P implies eventually R, which maps down to the predicates below it about logging in, sending messages and receiving messages. Fairly ugly, but if you've ever had to debug this level of concurrent processes and protocols, the ability to do this is powerful: you quickly discover that this test in fact fails, because Clara can log in, Fred can log in, Clara can send the message, and then Fred logs out before he ever receives it. So already that identifies a problem with this protocol.
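To give a feel for what the model checker is doing, here's a toy, hand-rolled Ruby sketch, nothing like McErlang's real machinery: it brute-forces the interleavings of that little protocol and checks the "sent implies eventually received" property, and it finds exactly the Fred-logs-out counterexample.

```ruby
# Toy model exploration: try every interleaving of a tiny
# login/send/deliver/logout protocol and record the traces where a
# sent message is never received.
ACTIONS = [
  [:login,   :clara],
  [:login,   :fred],
  [:send,    :clara],   # clara -> fred, only once both are logged in
  [:deliver, :server],  # server attempts delivery of a pending message
  [:logout,  :fred],
].freeze

def explore(state, remaining, trace, failures)
  if remaining.empty?
    failures << trace if state[:pending] && !state[:received]
    return
  end
  remaining.each_index do |i|
    action, actor = remaining[i]
    s = state.dup
    s[:online] = state[:online].dup
    case action
    when :login  then s[:online] << actor
    when :logout then s[:online].delete(actor)
    when :send
      next unless s[:online].include?(:clara) && s[:online].include?(:fred)
      s[:pending] = true
    when :deliver
      next unless s[:pending]
      if s[:online].include?(:fred)
        s[:pending] = false
        s[:received] = true
      end                 # otherwise the delivery attempt is lost
    end
    rest = remaining[0...i] + remaining[(i + 1)..-1]
    explore(s, rest, trace + [[action, actor]], failures)
  end
end

failures = []
explore({ online: [], pending: false, received: false }, ACTIONS, [], failures)
puts "#{failures.size} failing interleavings, e.g.:"
p failures.first  # ... send, fred logs out, delivery fails
```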
So here's the good news and the bad news, I guess: something called the pesticide paradox, which comes from a software testing book. It states that all these ideas are really good ways of preventing or finding bugs, but unfortunately what you're left with are even more subtle, even harder bugs, and these tools ultimately become ineffective at capturing those, since they've already missed them. So these tools can help, but they don't rule everything out; I still can't see how you avoid some level of QA, some manual, human inspection of systems. It just gives us another tool in our tool set.

Clojure I'm going to dive into really briefly, mainly because this one isn't about a tool as such; I just find the language interesting. Brian Marick wrote a BDD-esque mocking and testing framework called Midje. At the top there I've got the Clojure version, at the bottom the Ruby version. One of the things Brian states is that in Clojure everything is, well, pretty much immutable, so it doesn't make any sense to use the word "should", or to talk about how it could be this or might not be that. He wants to build up facts about a function, and a system should be defined by a number of facts. The dot-dot-dot syntax is a bit like a mock saying: it's a thing, and we don't really care which.

And since we're talking about functions here, rather than the classes and mocks of the bottom example, I find something very elegant about the top one: it's almost not a test, it's much more of a specification. This is from Conway's Game of Life: the cell is alive in the next generation, provided that alive? for the cell is currently false and the neighbour count of the cell is three. These are just functions defined in a module; there's no object to invoke against. When I look at the bottom example, it suddenly feels quite cluttered with the responsibilities of the object: I'm stubbing, I'm not really defining a fact, I'm just dealing with the precondition stuff I need in order to get to the assertion. So, just an interesting syntax in his testing framework.

Ioke is an idea I want us to steal, and someone may already have stolen it and I've missed it. Ioke was written by Ola Bini, and he's written a similar RSpec-y style language for its tests. What's really interesting is the documentation for Ioke. We often say the specifications are the documentation, but I've worked on a lot of open source projects where you say, okay, let's expose the specs to users so they can go and read them, and there's a sudden intake of breath: oh, maybe that one's not quite right, maybe that one's a bit technical. What Ola has done, in his RDoc-style documentation of Ioke, is expose all the tests: so where he's documenting dictionary equality, it says things like "should return true for the two empty dictionaries where one has a new cell". I really like this idea of exposing the tests through the documentation, because it stops them rotting, and it starts to make you think more about the language and care more about how you phrase things, which is always a great thing in documentation.
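As a sketch of stealing that idea in Ruby, you could imagine scraping the describe/it strings out of your spec files and publishing them as documentation; the paths and regexes here are just illustrative.

```ruby
# A toy sketch of the Ioke idea applied to Ruby: pull the `describe`
# and `it` strings straight out of the spec files and publish them
# as documentation, so the docs and the tests can't drift apart.
docs = Hash.new { |h, k| h[k] = [] }

Dir.glob("spec/**/*_spec.rb").each do |file|
  subject = nil
  File.foreach(file) do |line|
    if line =~ /^\s*describe\s+(.+?)\s+do\s*$/
      subject = $1.delete(%q{"'})
    elsif subject && line =~ /^\s*it\s+["'](.+?)["']/
      docs[subject] << $1
    end
  end
end

docs.each do |subject, behaviours|
  puts subject
  behaviours.each { |b| puts "  - #{b}" }
end
```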
This next one I'll give a bit more detail, because I wanted to leave you some more practical-ish stuff, and JavaScript is fun to mess around with. JavaScript is in an interesting situation: it's often contained in this black box, the browser, and this embedded system is slow and painful, and the APIs for interacting with it, while improving, are still fairly clunky and messy to me. There are lots of tools that let you run headless browsers and get away from this bolted-together, clunky, slow system, things like HtmlUnit and V8-based tools. The reason I brought up Zombie.js is that I quite like the fact that it's a DOM implementation written in JavaScript, for JavaScript. It accepts that if you really want that interactivity, a nice API and speed, why not keep it all pure JavaScript? It has a very similar syntax to things you'll have seen in Capybara or Webrat or Selenium: visiting URLs, filling in forms, clicking buttons.

I felt it deserved a mention because I think it's an interesting solution, bringing the thing natively to the language. As far as I know, in Ruby we're still mainly calling out to things like V8, or to SpiderMonkey through Johnson, dropping down to C, which is going to be fast; but I like the purity of this, where I can jump into the JavaScript and actually see how it works, rather than parsing lines and lines of C code.

Vows is the other framework I wanted to mention. It's a Node.js testing framework, and it demonstrates a couple of interesting things around asynchronous code. One thing it does is create a separation between the data under test and the tests: you have a topic at the top, which is a function or value that just returns something. The topic is executed once, and then each of the assertions is run against it. That doesn't sound that interesting on its own, but you start to see how powerful it is when dealing with asynchronous code and with running things in parallel.

In my past hacking, asynchronous calls have always been the hellish problem of sleeping and waiting on conditions for things to become true. With Vows this comes built in: you can take the fs.stat function, which runs asynchronously, and say that the tests should only run, the way we do everything in JavaScript, via callbacks: only once the function has finished does it fire the this.callback line, which fires off all my assertions. It's a very clear, clean way to deal with asynchronous functions. There's a lot of interesting stuff in the source of Vows, so if you get a chance I'd recommend having a dig around.

There's a fairly similar idea in promises, or futures, where you define a proxy object that defers the execution of something asynchronous whose data you don't have yet. It's a similar way of avoiding blocking and waiting: your methods get called when the callback fires, and you can guarantee it's going to emit either an error or a success. This pattern is often used where you want to decouple the caller and the callee. It does much the same thing as the previous example, but some people prefer this style.

The most fun thing about Vows, and what makes me wish RSpec and some other tools had it built in from the start, is being able to fire these things off in parallel. You have a closure, right: you have a topic, and then a bunch of assertions, and there is nothing stopping all three of those assertions firing at the same time. You could argue there might be shared resources or some global state that gets problematic, but to me that's a smell in your tests, not an issue you should accept; you want that independence so you can run the whole batch in parallel. It's so much more elegant than firing up a couple of processes that each grind through a list of specs, to be able to just fire them all off at once. I think it's a really powerful part of Vows, and probably the thing I like most.
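Here's a toy rendering of the Vows pattern in Ruby, just to show the shape: the topic is computed once, asynchronously, and the assertions, sharing no state, all fire in parallel.

```ruby
# Vows-style shape in Ruby: one topic, many independent assertions.
def topic
  q = Queue.new
  Thread.new do
    sleep 0.1        # stand-in for an asynchronous call
    q << [1, 2, 3]   # the value under test, handed to the "callback"
  end
  q.pop              # block only until the async work completes
end

value = topic

assertions = {
  "has three elements" => ->(v) { v.size == 3 },
  "is sorted"          => ->(v) { v == v.sort },
  "contains 2"         => ->(v) { v.include?(2) },
}

# Nothing shares state, so all three can run at once.
threads = assertions.map do |name, check|
  Thread.new { puts "#{name}: #{check.call(value) ? 'ok' : 'FAIL'}" }
end
threads.each(&:join)
```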
This next one is fairly old hat, but I thought I'd mention it because it deals with the interesting problem of the permutation explosion. You may have heard previous talks mention tools like pairwise testing, where you try to reduce the number of combinations you need to examine, to cut down your test data set. The jQuery world's angle on this, for browser-oriented testing, is that it's just not good enough to say "a little bit of this might work on Opera 9-point-something, but we don't really know". So it fires the tests off against loads of different browsers, and the way it does that is by crowdsourcing the problem: anyone can open a browser window and donate that page for a test process to run tests across. If you caught any of the lightning talks, James showed a similar idea with Terminus, where you open a tab which can then be appropriated by a test process and have stuff thrown across it. That's pretty cool for the whole JavaScript world.

If you've ever come across a tool called Globus, which is used a lot in grid computing research, it has a similar idea: if a computer is idle for a certain amount of time, it starts contributing to distributed computations, usually the fairly heavy physics stuff they tend to use it for. You can imagine a lab full of computers: a student goes away for ten minutes, and immediately that machine kicks in and starts contributing resources. I'd really like to see something like that applied to other domains, not just public crowdsourcing but private sourcing, because we have lots of nodes in our workplaces that I'd really like to see better utilised for the way we run our tests.
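As a sketch of that private-sourcing idea in plain Ruby, a tiny DRb queue server could hand spec files out to whatever idle workers connect; the URI and layout here are hypothetical.

```ruby
# A sketch of putting idle machines to work as test runners: a DRb
# queue server hands spec files out to whichever workers connect.
# Run with the argument "server" on one machine; run with no
# arguments on each worker.
require "drb/drb"

SERVER_URI = "druby://localhost:8787"

if ARGV.first == "server"
  queue = Queue.new
  Dir.glob("spec/**/*_spec.rb").each { |f| queue << f }
  DRb.start_service(SERVER_URI, queue)
  DRb.thread.join
else
  queue = DRbObject.new_with_uri(SERVER_URI)
  loop do
    file = begin
      queue.pop(true)       # non-blocking pop; raises when empty
    rescue StandardError
      break                 # queue drained, this worker is done
    end
    puts "running #{file}"
    system("rspec", file)   # assumes RSpec is installed on the worker
  end
end
```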
Now this is fun, I get to talk about Java at a Ruby conference: my first commercial programming language. Java has a lot of really exciting ideas, and I'm already trying to steal as many as possible. JUnit Max is a tool that Kent Beck wrote, and the whole principle behind it is faster test feedback. He did a lot of metrics research on testing and large test suites to see what patterns he could find, and he came out with two main observations: failures are not randomly distributed, it tends to be the same problematic tests which fail while lots of tests never fail; and in terms of runtime, suites tend to be distributed as a lot of really short tests and a couple of very long ones.

His idea may be a little familiar from things like autotest with Growl notifications. It's a plugin for Eclipse: when you save your code, it immediately gives you a notification, on the left there, and in the bar down the bottom, telling you there's an error in your code. What it's doing is using metrics about your test suite, which tests are the most likely to fail, based on probability distributions, so rather than running the entire test suite, you get very rapid feedback. We already have some of these ideas in Ruby with some of the tools, but having it integrated into an IDE gives you that snappy feedback: as you finish your line and hit save, bang, you immediately get a notification that a test failed. You can then go and see how many times that test has failed; there's a little widget at the top which tells you the overall state of the test suite; and, it's a little small, but there's a kind of timeline, red, green, red, green, where you can see how long you were in the green and when you moved into the red. It's a really cool tool, and something that would be really nice to see in Ruby editors like Redcar.

Industrial Logic is a company that does training programmes, in Java and I think .NET as well, and their angle is that they are totally obsessed with metrics. What I've got here is someone attempting to solve some sort of kata, using something like Eclipse or IntelliJ. For a start, the events on this graph: you've got red and green, so how long were you in the red, how long did you spend with failing tests, and how long did it take to get back to green? That's immediately interesting: if I was in the red for ten minutes, then green for five seconds, then immediately back into red, maybe I'm doing something wrong. And since we're using an IDE, we can capture a lot of refactoring patterns, because with something like IntelliJ the number of refactorings people apply through the editor is amazing. So this graph actually captures when you did a refactoring: we've literally got the red-green-refactor cycle as a metric we can assess. What they do is look at this and try to help people see where they can improve their practices, very much at a micro level: what happened over ten minutes, not what happened over a week or two. I think this is a really interesting tool, and I've been working on a side project, Limited Red, to try to take this idea and Kent Beck's idea and do the same sort of thing. Some of the kata websites now will also show you a little of this information, about how long you were in the red and how long you were in the green.
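Going back to Kent Beck's scheduling trick, the core of it is simple enough to sketch in a few lines of Ruby: keep a pass/fail history per test and order the run by estimated failure probability. The history here is made up for illustration.

```ruby
# A toy sketch of the JUnit Max principle: run the tests most likely
# to fail first, based on recorded pass/fail history.
history = {
  "user_login_spec" => [true, false, true, false, false],
  "checkout_spec"   => [true, true, true, true, true],
  "search_spec"     => [true, true, false, true, true],
}

ordered = history.sort_by do |_name, results|
  failure_rate = results.count(false).to_f / results.size
  -failure_rate  # most failure-prone first; ties could fall to runtime
end

ordered.each { |name, _| puts "would run #{name}" }
```

A real implementation would also weigh in test duration, so the short, failure-prone tests deliver feedback first.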
Next, a collection of fairly random stuff which didn't really fit anywhere but is interesting, and some of these ideas aren't actually that new, but I never knew about them, so I'm hoping you didn't either. The fundamental problem for me, and I mentioned this a little at the beginning with tools like Cucumber, is that plain text just isn't good enough. We get the conversations, but I need images. Most of the people I've talked to who involve non-technical people in writing Cucumber features are sketching on whiteboards, and often the Cucumber feature only ends up capturing the Gherkin, while the cards and diagrams that were scribbled back and forth get lost as a side effect. So here are a couple of tools from people, specifically in the group with the really painful acronym, the Agile Alliance Functional Testing Tools group, where a lot of smart people are coming up with ideas about pictorial representations of how we look at tests.

First, Ward Cunningham. I think this example is from around 2007, so it's not exactly bleeding edge, but I was fascinated to see it, and I'm going to show you a brief demo so you can get a feel for the tool. It's a list of tests, and what it's all about is running the tests while providing a better visualisation of what happened during each one. It takes a little while, because it's actually running things. It has this swim-lane idea of the different roles involved in running the test: the developer developed two candidates, and you can see the various actions they took; and it's time-sensitive, so you get timelines of when those actions occurred. The really neat thing is that as you roll over, you get little snapshots of parts of the UI, so you can identify where the button was, highlighted yellow there: they clicked nominate, they clicked search. It's a really nice way of getting a visual representation of what's going on inside a test. We can look at the script behind this, which doesn't look that exotic or interesting: clicking things, logging in, a fairly standard Webrat-esque API. What's quite nice is that when we execute it, we also get the exact layout of what happened, so if I click run now: yes, you see the script steps and also the UI elements. And what's really nice, and something I'm very keen on, is that you can still inspect those elements and see which was a radio button and which was a select box; you're getting the elements in place from during the test run, which is a really powerful idea. I would love to see something like this for the higher-level acceptance testing tools; it's an area where we're really lacking in IDEs and tools that help people write these sorts of acceptance tests.

Another tool you may have come across, still fairly prototype-y: Brian Marick did a similar thing, I think with OmniGraffle, using FitNesse as the back end, which, if you don't know it, is like Cucumber but more table-centric. In OmniGraffle he created a kind of markup, and then he parses the diagram, which is just XML in OmniGraffle, and converts it into a test which then gets executed. His idea was that he could actually edit the document, change some of the text, and it would change the test. So again, this feeling that we sometimes need the visual in order to communicate this stuff. If you're working with UX people, it's really beneficial to be able to show them screens, and not just "given I did this and then I did that". I think there's actually a Ruby gem he's been playing around with, so if you go to Brian Marick's blog I'm sure you'll be able to dig it out and have a play for yourself; I don't have time to demo it.
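You can get a crude version of that visual-timeline idea with today's Ruby tools; here's a hedged sketch using Capybara, assuming a driver that can take screenshots (Selenium, for instance) and a hypothetical app at example.com.

```ruby
# Wrap each step of a Capybara script so it saves a screenshot plus
# the page HTML, giving a rough visual timeline of the test run.
require "capybara"
require "capybara/dsl"
require "fileutils"

Capybara.default_driver = :selenium
include Capybara::DSL

def snapshot_step(name)
  yield
  FileUtils.mkdir_p("timeline")
  stamp = Time.now.strftime("%H%M%S%L")
  page.save_screenshot("timeline/#{stamp}-#{name}.png")
  File.write("timeline/#{stamp}-#{name}.html", page.html)
end

snapshot_step("visit-home") { visit "http://example.com/" }
snapshot_step("fill-name")  { fill_in "name", with: "Clara" }
snapshot_step("search")     { click_button "Search" }
```

Keeping the raw HTML alongside each screenshot is what lets you inspect the actual elements afterwards, the part of Ward's tool I found most powerful.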
An issue I have with testing that I wanted to briefly touch on: when it comes to what we know about failure, there's often quite a divide between what a developer knows and what a tester knows. I've got a link there, which I'm having a bit of trouble with, but Elisabeth Hendrickson, who I think won the Gordon Pask Award, has written a lot of amazing stuff about QA failure heuristics: things you should look for when testing. She has a great cheat sheet covering things like testing dates, testing times, all these different types of data sets. She's done the hard work, read lots and lots of books, and distilled the things you should look for in failure. And I find there's often quite a disconnect, with developers not really understanding what sorts of things are likely to fail. From a developer's perspective, if I know the more likely failure modes, I'm more likely to produce higher-quality software, because I'll program for those cases; I'll be aware of them. I often see teams where a developer throws work over to some testing process, gets a bug back, and fixes the bug with a little test in case it happens again, and there's no real acceptance that developers should know those heuristics themselves.

What's interesting, aside from the manual heuristics, is that there are testing tools which use metric information. I think there's a tool called Agitar, which I didn't put on the slides because it looks like an evil enterprise solution and its web page doesn't really sum up what it achieves, which automatically generates JUnit tests based on metrics from analysing your test suite and other people's test suites. That's kind of evil in some ways: their argument is that writing unit tests is slow, so why not automate it, which is a fairly scary idea. But there's definitely something quite nice about the suggestive element of it: suggesting heuristics you might want to investigate about various parts of the code base. So I think there's a lot of value we can take from manually learning the heuristics of failure, and also from looking at tools that can be suggestive about areas of the code base, areas where tests are failing all the time, metric information that could guide us towards heuristics.

I think Ben did a really good talk yesterday about stealing ideas and the things we've stolen before, and it made me feel fairly guilty, because I guess my takeaway message would be: please go steal these ideas and write the tools for me, so I can use them. There are a lot of interesting ideas here; I've seen a Python team use the QuickCheck stuff to dig out some really interesting bugs in their system. So I hope there's some interesting stuff here, hopefully things you can take away and start playing with in your own languages, and that I've satisfied a little of your curiosity about some of the tools outside the Ruby testing world. Thanks very much.