that material and we're going to do it in an hour, and our part of the bargain is that we're not going to go off on tangents and rant as much as we normally would. And in return we ask that you save your questions to the end, and hopefully we won't go over time and we'll actually have time for questions. So, I'm Jez Humble, I work for ThoughtWorks Studios, I was co-author of this book, and my job is to rant at people. My name is Badri Janakiraman and I work at ThoughtWorks Studios as well. I'm a developer. Recently I've been playing more of a product owner role, but that's just the past few months. I've been with ThoughtWorks for about 11 years now, and for most of that I've been writing code; I was a Java developer when I started, and I've worked with some pretty big, hairy automated test suites as part of that. So that's me. Come and sit at the front, and then we don't have to talk so loud. So... [inaudible]. Yeah. And we will talk today about how to create high quality acceptance tests, but more importantly how to make your acceptance test suites maintainable, because it's one thing to create acceptance test suites; keeping them maintainable over time at reasonable cost is the hard bit, it turns out. Ultimately the practices and principles to achieve this are well known, but they are hard. [inaudible] On the bottom left are the kinds of tests that we need to support development; this stuff should all be automated. On the top right are the things that the human beings should be doing: showcases, demonstrations, getting feedback from your users and your customers, usability testing, exploratory testing. This is what testers should be doing. That, and helping to create and maintain acceptance tests. This can't be automated. On the top left is what we call functional acceptance tests. [inaudible]
So we're not going to be talking about the rest of this stuff very much. And then at the bottom right is what's laughingly called non-functional acceptance tests, so things like performance and scalability, availability, security, all these other kinds of 'ility'-type concerns. And a lot of this can be automated. It's insufficiently done, and it's insufficiently tested from the beginning. This is how we validate our system architecture, and we need to be doing this from the beginning of development, because that's when it's actually cheap to change the architecture. So we're going to ignore the right-hand side completely from now on and focus on the left-hand side.

And in terms of these types of tests, there's something called the test pyramid that Mike Cohn talks about. What he says is: in your test suites there should be a very large number of unit tests, which test a single class or function in isolation; they run very, very fast and you have a lot of them. In the middle you should have a smaller number of service-level tests, and these are end-to-end functional acceptance tests, but they go through the service layer. And then at the top you have an even smaller number of UI tests, which again are end-to-end functional acceptance tests, but they run through the UI. And if you design your tests right, you can use the same test suites and run them either through the service layer or the UI layer; we'll be talking about that later. That's the takeaway from this. And if you look at this in terms of the quadrant diagram I just showed you, you can kind of collapse these: what you really want is a smaller number of end-to-end business-facing tests and a large number of localised technology-facing tests. The important thing about the pyramid is that there are more of the latter than there are of the former. What you want to avoid is the inverted pyramid, where you have a ton of acceptance tests and not very many, or no, unit tests. And that's a common failure mode that we see: people use some horrible tool like QTP to create record-and-playback acceptance tests and they don't have any unit tests. And then what you find is that these recorded acceptance tests fail all the time, they're flaky, and they're very expensive to maintain, but they're your only protection against bugs, and so that becomes inordinately painful and horrible.

So we're going to present six principles. Principle zero, because we're all zero-indexed here of course: writing good acceptance tests is hard. We all know this, that's why we're here. The important thing to consider is, what do we mean by good? And I think there are two elements to that. Number one, when the tests are green, we are confident that the software is working. So with a good test suite, when it passes you have some confidence that you could actually release that to your users. Conversely, if the test suite is red, you have some confidence that there's actually a bug, not that the tests are flaky, or the environment isn't set up right, or some other problem like that. So you want to know: test suite goes green, great, I can release; test suite goes red, oh, there's a problem I need to investigate. That's the sign of a good suite. Ultimately this turns out to be the thing that distinguishes a good test suite from ones that are hard to maintain.
At the end of the day it doesn't matter whether you've got a pyramid or a pear or an apple shaped test suite. The pyramid is a sort of platonic ideal, but it is not what you're really working towards. You're not working towards making your test suite go into a pyramidal shape. What you're working towards is this level of confidence, and then you can optimise the test suite that gives you this level of confidence into that sort of platonic ideal of a pyramid, in order to get the benefits that come when you do that.

So to give you an example: yes, it's hard to write acceptance test suites, but it's also false to state that large-scale acceptance test suites cannot be written, maintained and evolved over a period of time with an application that's currently in production. One of the products we have at ThoughtWorks Studios is called Mingle; it's a web-based project management tool, and it's the application I work on. We started it in 2006. It was a really tiny team, we were practically in startup mode. We started with about 20 tests on day one, 500 lines of code in the test suite, and the whole suite ran in about two minutes. Fast forward to 2012 (we probably need to update these numbers for 2013 soon): we are at about 3,000 acceptance tests, they are about 50k lines of code, they take about 12 hours if you run them end to end, and these tests have run every single day, from the day we started writing them in September of 2006 till today. They have lasted us through an office move from Sydney to Beijing and from Beijing to San Francisco, and three office moves within San Francisco. So through all this, this suite has actually stayed alive and running on different machines. The actual running time for this 12-hour suite is about 55 minutes, which we achieved by parallelising our test suite. They run about 7 to 10 times a day; we obviously check in a lot more frequently than that, but the check-ins get batched up and they run about 10 times a day. They've been running for about six years across multiple offices and they are still running. So if somebody tells you that you can't actually grow a test suite and keep it running over multiple years, they're lying. You can do it, it just takes a lot of work.

I want to add, there's a guy called Gary Gruver who worked at HP on the LaserJet firmware team. He's written a book called A Practical Approach to Large-Scale Agile Development, published by Addison-Wesley. So what they did, this is for the HP LaserJet firmware, is they created a suite of several thousand acceptance tests that run on logic boards that simulate the printer. They actually deploy the firmware to the logic boards, start the logic boards, and run the acceptance tests against the logic boards. They have thousands of tests that they run, and if the tests fail, the change that caused the tests to fail is reverted automatically in version control. So if anyone says you can't do end-to-end acceptance testing on, say, embedded systems, show them this book. I actually bought this book so I could spank people who say you can't do acceptance testing on embedded systems, or that their system can't be acceptance tested. Any system can be acceptance tested if you design it in such a way that it can be acceptance tested. So don't believe that it's not possible, it's absolutely possible.

So what do people mean when they typically say that my application can't be tested, or that the suite is flaky? They mean that their acceptance test suite has decayed.
It has undergone a sort of diffusion: the purpose of the test suite, and what it actually tests, is not quite clear anymore. Why do these suites decay? Well, they decay for the exact same reasons that your production code does. If there was nobody actually taking care of your production code, it would decay in exactly the same way that your test suites do. You don't pay enough attention to expressing the intent in the test suites. If you pay more attention to making sure that a link with a particular type and class is clicked every time you land on a page, that does not express the intent of what the user is trying to accomplish; that is somebody trying to assert the mechanics of how the user goes about doing this action. So if tomorrow you decide, or a UX person comes in and decides, that your company needs a new coat of paint, and says that all these links are going to become buttons, and if you think, oh my God, we can't do that because we've got 3,000 tests that are now going to fail, you've done a bad job. You should not be in a position where that becomes a problem. Another reason is that only testers care about maintaining tests. If test suites and the maintenance of test suites are relegated to a department that you hopefully never see, that probably works on a different floor and, God forbid, works on a different continent in a different time zone, and you just write the code and throw it over to them and expect them to validate it, and they are the ones who are responsible for maintaining the test suite, forget it, it's never going to happen. I think the important point to bear in mind is... we'll come to this later.

So now we start on the real principles, because we're starting from principle number one at this point. And this brings us to principle number one, which is that tests are actually first-class citizens of your project. What do we mean by this? We mean that tests deserve the same kind of care, curation and dedication that you would give to your production code. You would not create poorly designed objects, throw random strings willy-nilly about the place, or leave repeated pieces of code unconsolidated. If you wouldn't do that in your production code, because it leads to diffusion of intent in your production application, the exact same problems can happen over here as well, and you need to prevent that from happening in your test code base. So, suppose the automated tests are written some time later than the software? We'll come to that; actually, let's take questions at the end, and if it's not answered by the end then please ask again.
So as I was mentioning, you need to treat test code as production code. You need to refactor relentlessly. Refactoring is not something you do just for your production code, and it's not something that developers do because they've got access to IntelliJ or ReSharper; refactoring is a way of working where you consciously make very small, behaviour-preserving transformations to keep the system capable of accepting change. You need to refactor your test code as you refactor your production code. Do not repeat yourself: duplication is the death knell of any code base. If there is duplicated intent anywhere in your system, one of the copies is going to decay, and that one will start failing mysteriously and you will not know how to handle it. This is one of the reasons we don't like Fit, because in Fit it's very hard to extract out repetition. That's our little trick. [Laughter] We didn't think we were going to get many opportunities for joking about test suites, so we just threw that one in there.

Do not use record-and-playback tools to build your entire test suite. This does not mean that you can't use record-and-playback tools; there is value in them. Sometimes you want to know how to actually go about a flow, you want to know what a particular selector is, you want to know how to actually click that button, what event is actually triggered. By all means run it once, see what it does, copy the code, and then put that one little snippet into the place where it belongs, scaffolded with the right level of abstraction that you need. So use it to record little snippets, and then put a scaffold around those snippets, as either a method or a class or a page object or something of that sort (and we'll get to all of these patterns later), in order to build the rest of your suite. What you'll find if you create suites entirely using record and playback is that they're really, really brittle: you change a UI element, or you move a UI element, and suddenly all your acceptance tests fail and you need to re-record them, and that's just death. And that's why QTP... anyone using QTP? Hey, stop doing it. We're not trying to bash what you do on a day-to-day basis, we're just letting you know that we feel your pain; we're trying to empathise, maybe not doing it very well.

Preventing decay of intention. The previous points I mentioned were about how you could prevent decay or diffusion in the mechanics of what the test action does. How do you prevent decay of the actual intention, what your test suite is actually talking about? Before I talk about this, a quick step back: let me talk about what is intention and what's mechanics. Intention is what your user is trying to accomplish; it is a statement of what the user is trying to achieve by going through your system. Mechanics is how they actually go about doing it: clicking on the login link, making sure that they can see a particular tab, clicking on that tab, getting to the right sidebar and clicking the fourth link down there, filling out some form fields. Those are the mechanics. What the user is trying to do at that point is make a purchase, so making a purchase on your system is the intention; the mechanics of how you go about it are the actual manual steps that you follow. So we just talked about how you can prevent decay in the mechanics aspect of it, and now let's talk about preventing decay in the intention of your test suite. Given/When/Then is insufficient. It's not that it's a bad idea to start your tests out with a Given/When/Then format, but it's insufficient, because then you end up
trying to shoehorn every single requirement into that format, regardless of whether it fits or not, without really paying attention to whether it actually makes sense in this particular context. So use it as a guide, but don't try to force yourself to write every single test in that form. That's true of all the agile things, like the story format; they're starting points. The 'As a..., I want..., so that...' format: if you try to write every story like that, you'll probably never write a performance story, or a story that says my system should run for three months without crashing with an out-of-memory error, because which customer is going to come and tell you, 'I don't want it to crash on April 31st with an out-of-memory error'? So there are some things that you should not do, and slavishly following any pattern can be bad as well.

Separate intention from mechanics. So, the things that I was talking about: the mechanics of how somebody accomplishes something versus what they are trying to accomplish. In all likelihood, even if you move from a web-based client to a rich client, or the other way around, which is more likely, you'll probably still have a system in which your customers can place orders. So the intention, that the customer is going to place orders in your system, is still likely to be valid. The mechanics of how they go about it can change with a UI coat of paint, can change with whether you're using a rich client or not, and all of these things can switch. You want to be able to take these things that have different rates of change and express them separately; you want to separate your intention from your mechanics. People could be using a web browser, they could be using a service layer to do it, or an API, they could be using a RESTful API, or they could be using an iPhone, exactly. And finally, you need to express your tests as the steps of a user's journey: a sequence of actions that they do that represents a meaningful outcome from a customer's perspective. This is not from a story perspective, saying 'I need this link to be present here so that I can do these few things in the future'; it's not about that sort of granular level. It's about: when a user enters the system, when they stay in your system logged in for about 15 to 20 minutes, what are the operations they're doing, what are the few operations they perform? And you need to express your intention in that format. Oh, fancy animations. Alright, so here's the solution that we propose. Use a natural language to express intentions, because that's the best language for expressing user intentions: 'my customer places an order' is better said that way than by inserting arbitrary periods in there to make some object-oriented language feel happy. Use a general-purpose programming language to express test mechanics: these have proven their value, we know what Java can do for us, so use a general-purpose language to express the test mechanics, filling out a form field. And use a tool that allows you to operate in either domain seamlessly: you need a tool which allows you to author your tests as intentions and express your mechanics at the lower level. To elaborate on that point: we said you have to treat your test code like production code, and what that means is you have to refactor it, you have to be able to remove duplication, you need to be able to do encapsulation, separation of concerns. You need a programming language to do that, you can't do that in a natural language or a made-up language, and you need an IDE which is designed to allow you to do that.
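To make that separation concrete, here's a rough sketch (not from the talk) of what the two layers might look like in Java. The @Step annotation, class names and wording are made up for illustration; tools like Twist or Cucumber provide their own binding from a natural-language line to a method like this, and the point is that the method only speaks in terms of intention and delegates the mechanics elsewhere.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical stand-in for whatever binding your tool actually provides.
@Retention(RetentionPolicy.RUNTIME)
@interface Step { String value(); }

// Intention layer: reads like the business language of the test.
// (Each type would live in its own file in a real project.)
class OrderSteps {
    private final OrderingActions actions;   // mechanics live behind this

    OrderSteps(OrderingActions actions) {
        this.actions = actions;
    }

    @Step("Customer places an order for <quantity> copies of <title>")
    public void customerPlacesAnOrder(int quantity, String title) {
        actions.searchCatalogueFor(title);
        actions.addToCart(title, quantity);
        actions.checkout();
    }
}

// Mechanics layer: the only place that knows about links, buttons and forms.
interface OrderingActions {
    void searchCatalogueFor(String title);
    void addToCart(String title, int quantity);
    void checkout();
}
```

The natural-language step can survive a UI redesign untouched; only the code behind OrderingActions changes.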
So if you're going to treat your test code the same way you treat your production code, you need to be able to manage it the same way you would manage production code. The benefits of the tooling that you get for writing production code should also be available for expressing your test mechanics, and since they are treated identically, you want to use a single tool chain to do that, and you want a tool chain that allows you to switch between those two: when you're talking about intention you see natural-language expressions, and when you move over into mechanics you can actually see those as well. If only such a tool existed, Badri... oh wow, what do you know, we have one. I don't want this to become a product demo, but this is just one of the tools out there; this is a tool that we've developed at ThoughtWorks Studios called Twist, and I'm showing you a Twist test suite. I don't know if people at the back can see, but this is basically a test suite that's written in natural language, and it's testing a real application that we have internally. You see the intentions are expressed in pure English, whereas the implementation obviously navigates through a web browser. You don't see a browser.click anywhere over here, but all of that is happening behind the scenes, and that's because each of these steps is actually implemented as a series of browser automation steps using a browser automation framework. But that is not the way we encode the intention of what the test is trying to say: if we want to go to a listing of all user members, perform some searching and verify that some users are there, we do that using the intention language, not using the language of mechanics. So what's happening here is that each of these steps is actually calling a method with the same name as the step, but all kind of concatenated, and what the tool does is it keeps the steps in sync with the code: if I rename the methods it renames the steps, if I rename the steps it renames the methods, it keeps everything in sync. I can extract stuff out and remove duplication, and most importantly I can press an execute button here and that will actually run through these methods in order. This is based on Eclipse. There are tools out there that let you do similar things, like Cucumber; Cucumber lets you do very similar stuff. Each of these tools has its own pluses and minuses, but they come at the problem from the same place: they want you to talk about what your intention is in natural language and then separate that out from the implementation. That's the end of the product demo.

Another thing that you can do is the page object pattern. How many people are familiar with this pattern? The page object pattern is a way to encapsulate each page in your application, so that the operations that can be performed on that particular page are available as abstractions on the object that represents the page. For example, if you've got a login page, and you can log in with a set of credentials, you'd create a page object that represents the login page, and you'd have methods on it such as loginAs, and inside that method you'd do the mechanics of filling in the form fields and hitting the submit button. You do not let that series of steps leak out into your application code or into your test suite code. This way, when you need to understand how to log in, there is one place in the code base that tells you
explicitly how logging in to the system happens, and if you ever change it, there's only one place to change it, so there isn't that sort of diffusion where it starts spreading. And this is also an example of how you can handle expected errors and things of that sort. The other interesting thing to keep in mind is that typically page objects return other page objects. So if upon successful completion you navigate away from your login page to a home page, then you'll see that the loginAs method actually returns a home page object, and you would therefore sequence your user journeys as a sequence of actions on these page objects, each of which returns the new location in your application. So the top layer of your test code doesn't know how to interact with the system under test; it calls methods on the page object. If you had code which logged in as a user, that would call the loginAs method on the login page and supply the username and password, and this is what actually interacts with the browser. All the information, this has the Selenium objects in it, all the information on how to actually interact with the system, is encapsulated in the page object. So, as Badri says, if I change a UI element, I only need to tell the test suite about it in this one place, where that encapsulation of that UI element happens.

Now, the nice thing about this is that it allows us to use the same test suite to interact with, say, a graphical user interface or a service interface or even an iPhone client, because what I can do is turn this class into an interface, and then I can have different implementations of that interface. I can have a set of implementations that interact with the GUI, I can have a set of implementations that interact with the service API, I can have a third set of implementations that interact with the iPhone client. And what that means is I can switch out at runtime; I can use the same test suite and say, OK, now I'm going to run it against the iPhone version of the app, just by changing which actual implementations I inject at runtime for the test suite. So it's very, very powerful as a way of reusing your test suite. But only this class needs to be changed? Yes, well, all of the page objects: this particular class knows how to interact with the login page, and you need one of these classes for each of the pages; all the abstractions above this, you don't have to change. This is the bottom level, the lowest level of abstraction that you would typically find, because below this you're calling framework code, something like Selenium. But it calls other...? No, this only calls the driver, whatever your automation framework driver is. So if you've got a Frankenstein driver, or if you've got a Selenium driver or a WebDriver, this just makes calls onto the driver; for example, you see 'browser' over here, that's the driver, in order to automate whatever UI you're using, or a rich client, and then it finishes that operation and returns the new page object to you. So all you're doing here is interacting with the driver to drive the system under test. The advantage is that this stays stable, and your level of intentions does not change every time your mechanics change. So if instead of a link you made it a button, or if it now happens over Ajax, all of that would affect only this one place, not the place that was actually trying to log in as a precursor to finishing the rest of your user journey.
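As a rough sketch of what this might look like in Java with Selenium WebDriver (this is illustrative, not the code from the talk; the interfaces, locators and class names are made up, and each type would live in its own file):

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

// The rest of the test suite only ever talks to these interfaces, so the same
// journeys can run against a Selenium implementation, an API client, or an
// iPhone driver, chosen when the suite is wired up.
interface LoginPage {
    HomePage loginAs(String username, String password);
}

interface HomePage {
    boolean showsWelcomeMessageFor(String username);
}

// Selenium-backed implementation: the only place that knows about the
// selectors and the submit button on the login screen.
class SeleniumLoginPage implements LoginPage {
    private final WebDriver driver;

    SeleniumLoginPage(WebDriver driver) {
        this.driver = driver;
    }

    @Override
    public HomePage loginAs(String username, String password) {
        driver.findElement(By.id("username")).sendKeys(username);
        driver.findElement(By.id("password")).sendKeys(password);
        driver.findElement(By.id("login-button")).click();
        // On success we land on the home page, so return its page object.
        return new SeleniumHomePage(driver);
    }
}

class SeleniumHomePage implements HomePage {
    private final WebDriver driver;

    SeleniumHomePage(WebDriver driver) {
        this.driver = driver;
    }

    @Override
    public boolean showsWelcomeMessageFor(String username) {
        return driver.findElement(By.id("welcome")).getText().contains(username);
    }
}
```

A journey then reads as a chain of page-object calls, and swapping SeleniumLoginPage for, say, an API-backed implementation changes nothing above this layer.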
So you're saying you inject the right driver based on what you want to do? That's one way to do it; if you wanted to swap drivers out, you could do it that way. But just to make it concrete: instead of the Selenium session you use something else, so you have an interface for the login page, and then you have a class which would be a Selenium login page, and then you have another implementation which would be an iPhone login page, and an API login page, and it's the same thing for each of the interfaces, for each of the pages. It's actually a strategy for each device? Yes, yes, essentially. And it's the sort of thing that you get out of the box if you're using Capybara plus Cucumber: you can just swap out the driver at runtime with an annotation and it would run with a different driver. So this is the pattern: there are multiple implementations, and you can use dependency injection; you could use something like PicoContainer or NanoContainer to inject them. And that brings us... so that was about how you can actually structure the mechanics of your tests and separate intention from mechanics, and Jez is going to talk about how to actually go about creating these suites.

So I'm going to talk now about the team dynamics in terms of how we create and maintain acceptance test suites, and hopefully address your question. The idea is this: in the life cycle of a story, first you work out how to solve a customer problem, and you need to work out acceptance criteria for that story: how will we know when we are done with this story? And that involves the customer and the tester, or the analyst and the tester, working together to work out what the acceptance criteria are. So that's decided by the customer and the tester working together, and then you actually play the story. Now, we believe that you should write unit tests first as you implement code, so TDD, but for the acceptance tests we're not religious: you can either write the acceptance tests before you write the code, or you can write the acceptance tests after you write the code, it's fine. And with Twist and Cucumber, what you can do is write the natural-language expression in the tool, but those methods don't actually do anything, there's nothing behind them, so they just all pass automatically. So you can write those beforehand, the natural-language acceptance criteria in the tool, before you play the story, and then after you play the story, typically, you'll write the implementations. The implementations need to be written by developers and testers pairing, and we actually think that developers and testers should physically pair to write that, because testers may not be technical, and that's fine; we don't say that all testers should be experts in test automation, and that's why they would pair with a developer who knows how to actually write the code. But the developer may not understand very much about things like exploratory testing and how the test suite is organised, so actually have them pair together to write the test implementation. And the dynamics of this will change with the evolution of the system: when you're working on a new system, you're going to have to do a lot of writing of test implementations; with an older system, what you'll find is that a lot of the page objects and the mechanics are already there, so you don't need to write a lot of new code to implement a new test, you'll generally just be reusing existing stuff (unless you have a reuse problem). So the implementations are done by developers and testers, typically after the story is played. So that's kind of the life cycle of how you actually write the acceptance tests.
So I just want to talk a little bit about the role of the tester, because the tester is a bit of a misunderstood role, and it is a role, it's not a person. Famously, Google doesn't have people who are called testers; the engineers are responsible for writing the tests for their code, so they play the tester role. So you don't necessarily need to have someone who's called a tester, you can have people who are part-time testers; the important thing is to have that role and that skill on your team. And the other terrible mistake that people make is to consider that testers are somehow failed developers: if you're not smart enough to become a developer, somehow you will be smart enough to become a tester, right, because that's really easy. Our strong belief is that testers have a complementary set of skills which are just as hard, if not harder. I don't find very many good testers; in fact, good testers are a rarity, they're even harder to find than good developers. Good developers, you can read every single one of Don Knuth's books and become a really good developer if you're smart enough; to find a good tester you need a person who's naturally really, really smart and gets the internals of how systems work and where the breaking points are, and those people are really hard to find. There's a famous quote that if you write the cleverest code you possibly can, then by definition you're not smart enough to debug it. It's the same thing with tests, actually: when you do TDD, what you find is that the hard part is writing the test; once you've written a good unit test, writing the implementation is the easy bit. It's actually the thinking and the design of the test suite that turns out to be the hard bit in writing good systems. So the job of the tester is not to be the person who decides the quality of the system; the role of the tester is to be the user advocate, the person who represents the user and how the user will interact with the system, and to make the quality of the system transparent so the team can make decisions about the quality of the system. If your testers are primarily working on manual regression testing, you have a problem. Neal Ford, who's speaking here today, has a joke that when human beings do the things that computers could be doing instead, all the computers get together late at night and laugh at us, and nowhere is that more true than if you have testers doing manual regression testing. This is 2013; we should not have human beings wasting their time slavishly and repetitively reading pages of scripts and pressing buttons on UIs. I mean, really. They should be focused on exploratory testing and helping to create and maintain suites of automated acceptance tests; this is what we believe testers should be doing.

So, the last thing on principle one: passing acceptance tests are a necessary condition for being done. Developers cannot say that they are dev-complete with a story until they have passing acceptance tests that prove that that story really works on a production-like system. Dev-complete is meaningless if you don't have automated acceptance tests, and that's the really important thing, and that prevents this nasty failure mode where project managers deprioritise the tests in favour of functionality; that shouldn't be possible, because you can't say you're done until you have the acceptance tests. It's really important to use encapsulation to encapsulate the interaction with the system under test away from the rest
of the test suite, because that's what allows you to get all these nice things, like being able to reuse the same test suite for different clients, and also in terms of maintainability and making sure the test suites aren't flaky: if I change a UI element and the tests break, well, that's fine, I can just update the reference to that UI element in one place. And finally, the acceptance tests are everybody's responsibility: the team owns the acceptance test suite, not the testers.

Principle two is that the test suite should always interact with the system under test in the same way that a user would. So, more on this point: what do we really mean by this? If you take back doors through the system, or there are ways to access the system that are only available in test mode, and you use those extensively and primarily to test the functionality of your application, you're never going to have anything that tells you whether, when you put this in front of a user, it's going to work the same way or not. And if you don't have that guarantee, it just fails the very first criterion for what we said a good acceptance test suite is, which is: when it passes, you know that you can release it to your customers. And if you don't have that confidence, there's no point spending time building those suites.

So people have this notion that browser-based tests are unreliable. What do they mean when they say that? In the first place, the test fails in CI, but when I run the app, like when I'm actually going through it, it seems fine. So why is it that I can press this button and the page gets submitted, but the test fails in CI, or the other way around? It's usually an indication that the test mechanics and the user interaction patterns actually differ. So how could these possibly differ? We'll be looking into that a little further. A common pattern is Ajax: quite often you tend to click a button and you don't realise whether it's a full form submit or whether it's an Ajax submit, and whether the page is going to get reloaded or only a small portion of the page is going to come back, and these things tend to cause a lot of flakiness, and we'll be talking about how to handle this. And in a JavaScript-heavy application, the actual UI processing time may be non-zero: you typically expect the browser to have instantaneously arrived at a complete state, whereas that's not true. With more and more stuff actually becoming the browser's responsibility, including local storage, painting, 3D rendering, and JavaScript engines working at the blazing speeds that they're expected to, there's a lot of work that browsers have to do, and there's quite often a significant amount of time that your application spends in the browser. That's a non-zero time, and you need to account for it while writing your test suites, or else they can become flaky. And this is why you need to have tests that run against the UI: if a significant portion of your business logic and your domain logic is actually on the client side, you'd better have tests that run through the UI, and this is getting more and more true as we write rich JavaScript applications, because JavaScript is powerful enough to merit those kinds of applications being written in it at this point.

So, some solutions to these problems. Test your application the way a user might use it, so don't take shortcuts. What does this actually mean? Understand when behaviour is asynchronous and account for it explicitly. If you've got a JavaScript-based drop-down and you're going to click a
drop-link, and the drop-down is going to show up, the user can't instantaneously click on a drop-down value: they're going to wait until that drop-down actually shows up on the page before they navigate to the third item of the list and click it. So if your user is going to do that, think about that process explicitly and encode for it: you click the drop-link, wait until the drop-down is present and the third element is visible, and then go and click it. That's just one very simple example, but you can use this pattern over and over again, any time you click on an element and that causes, for example, a server-side call that you have to wait for. And as a counterexample, countering that: don't use bare sleeps. So don't just say, I'm going to click on this drop-link and wait for somewhere around three to six seconds. That is just doomed to fail, because at some point in time your application is going to get slow, there's going to be network congestion, and some time or other it's going to take seven seconds and your test is going to fail. And what happens is, people try to get clever and extract that out into a constant, and then the test suite fails and you're like, well, I'm going to increase the constant that I wait for, and then your tests suddenly take much longer to run, and they take longer and longer and longer to run as you increase the constant more and more and more. The maximum time starts at ten seconds and slowly creeps up to 120 seconds by the time you're in the fourth year of your test suite. And why? Because some test needed that time. So the solution to that is to poll: what you do is you have a loop, and the loop has a delay of, like, 100 or 10 milliseconds or something, and then after ten repeats it will time out. So always poll to wait for things to happen, rather than just waiting. Does this kind of test work with a real-time system? Yes; if you've got streaming data or sockets coming through, or some sort of Comet-like interaction, these patterns get even more important, because you have to wait for that sort of stuff. Secondly, if it's hard to write the test for some reason, that's not a reason to not write the test or to take shortcuts; it's actually a reason to have a conversation with the dev team.
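As an illustration of 'wait for it explicitly instead of sleeping', here's roughly what that drop-down interaction might look like with Selenium WebDriver's explicit waits (Selenium 4 style; illustrative only, the locators are made up). WebDriverWait is exactly the poll-with-timeout loop just described: it re-checks the condition periodically until it holds or the timeout expires.

```java
import java.time.Duration;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class DropDownExample {

    // Click the drop-link, then poll until the menu item is actually visible
    // before clicking it, rather than sleeping for a fixed time and hoping.
    public void chooseThirdItem(WebDriver driver) {
        driver.findElement(By.id("drop-link")).click();

        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
        wait.until(ExpectedConditions.visibilityOfElementLocated(
                By.cssSelector("#drop-down li:nth-child(3)")));

        driver.findElement(By.cssSelector("#drop-down li:nth-child(3)")).click();
    }
}
```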
Your application should not be untestable from the UI level, because a lack of testability at the UI level means that some visible cue that the user is going to need in order to use your application is not available. If it was available, you'd use it to automate your suite, and if it isn't available, then most likely your user is also going to miss out on those cues. Part of what we were talking about, one of the roles of the tester, is to be an advocate for the user in your system, and this is one way in which they can do it: if they can't automate the suite, your users are probably not going to be able to use your application. This is one of the reasons it's important for testers and developers to be in the same room: if your testers are in a completely different room, you can't have these conversations. Writing tests impacts your architecture and it impacts the way you design the system; creating tests for a system tends to change the design, because it's a pressure on the design, and writing tests forces you to do good design. So if the testers and the developers can't have a conversation, the developers never feel that pressure to create a system which is actually testable, and that just creates a horrible feedback loop, which creates horrible unmaintainable code, which is then hard to test, which then becomes low quality, and so forth.

So, some solutions to these problems of asynchrony, one of which I've just alluded to, instead of using bare waits: there's this library called WaitUtils, and it's quite literally the world's most useful stupid piece of code you'll ever find. It is nothing but a bare loop which takes a predicate object and checks for that condition: it sleeps for 100 milliseconds, wakes up to see if the condition is true, goes to sleep for another 100 milliseconds, and it's literally two methods. This library is two methods, and it will change the quality of your test suite. Are there any particular tools it works with? No, it's a Java library, you can throw it in with anything, as long as you're writing your tests in Java. You can recreate this thing in any language that has the ability to sleep and wait for a condition; it's really small, you can do it in Ruby, you can do it in Python. Any language that accepts functions, and any language that has got sleep, you can pretty much rewrite this code in. If it's Turing complete, you can probably do it.
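The library itself isn't shown in the talk, but the shape described is tiny; a minimal sketch of that kind of two-method helper in Java might look like this (the names are made up):

```java
import java.util.function.BooleanSupplier;

// A bare polling loop: sleep briefly, re-check the condition, give up after a timeout.
public final class WaitUtils {

    public static void waitUntil(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError("Condition not met within " + timeoutMillis + " ms");
            }
            sleep(100);
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }
}
```

A test then says something like `WaitUtils.waitUntil(() -> resultsPanelIsVisible(), 10_000)` (a hypothetical predicate) instead of a bare sleep.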
For Ajax-based tests, if your JavaScript framework provides pre- and post-call hooks, reuse those: intercept them to count the number of active calls you're making. What do I mean by this? If you're using, I don't know, jQuery, or if you're using Prototype, there's $.ajax. Wrap that function, so that every time you initiate a new Ajax request it increments a counter by one, and when the request comes back, either successfully or failed, it decrements the counter. Any time that counter is zero, you know that there are no pending outstanding Ajax requests. So that's a very easy way to detect whether all Ajax activity is complete, and that's one way you can track Ajax operations and know when they finish, even when there might not be any visible changes to the page, and that's another way of preventing random bare sleeps in your code. Here's an example of how you would do it; I think this one is for Prototype, but you could do this sort of thing with pretty much any framework. Don't worry if you can't read this: we've got the gist up there, it's up on GitHub, so you can always go and copy it from GitHub, and these slides are on the conference website, so you can download the slides and get the links from there. We just want to make sure you see that there is a pattern; we're not going to talk you through the code.

So, to recap: make time to go back and refactor your tests; use layers and encapsulation; separate high-level intent from low-level mechanics; and use page objects to interact with the system under test, taking into account things that incur delays and things that you might think cannot be tested for, such as waiting for stuff to appear or disappear on the page, and Ajax activity. And don't use bare sleeps.
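The gist itself isn't reproduced here, but as a rough sketch of the same counting idea driven from the test side (assuming a jQuery application and Selenium WebDriver; the __pendingAjax counter name and class are made up): install a wrapper around jQuery.ajax that counts in-flight requests, then poll until the counter is back to zero.

```java
import java.time.Duration;

import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.WebDriverWait;

public class AjaxWaiter {

    // Run once after each page load: wraps jQuery.ajax so every call
    // increments a counter and decrements it when the request completes.
    public void installCounter(WebDriver driver) {
        ((JavascriptExecutor) driver).executeScript(
            "window.__pendingAjax = 0;" +
            "var original = jQuery.ajax;" +
            "jQuery.ajax = function() {" +
            "  window.__pendingAjax++;" +
            "  return original.apply(this, arguments)" +
            "    .always(function() { window.__pendingAjax--; });" +
            "};");
    }

    // Poll until there are no outstanding Ajax requests.
    public void waitForAjaxToFinish(WebDriver driver) {
        new WebDriverWait(driver, Duration.ofSeconds(10)).until(d ->
            (Long) ((JavascriptExecutor) d)
                .executeScript("return window.__pendingAjax;") == 0L);
    }
}
```

jQuery also keeps its own count of active requests (jQuery.active), which some people poll directly instead of wrapping $.ajax.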
Which brings us to principle number three: how to consciously curate the structure of your test suites. When you first start with a new application, your very first story is 'as a... I want... so that...', and you'll have acceptance criteria for that thing, which is: given an existing state of the system, when I perform these actions, I expect this finishing state. So you write your very first test suite, which tests this very first requirement, and you feel happy because you've got acceptance tests that pass and are automated, yay. But then over the course of the next several weeks you have to do that a lot, for all the new stories, and what you can end up with very easily is a test suite for every story. And that's a terribly bad thing to do, because what happens is your test suite tells the story of the evolution of your system, not the story of how users interact with that system, which is a different thing. So what we propose instead is what are called journey tests. Remember, the test is supposed to interact with the system the same way that a user would, so what are the main business flows through your application? Imagine you're writing tests for Flipkart and you buy a product: first you search the product catalogue, you add the product to your cart, you check out, you create a new account, provide your address details, your credit card details or cash on delivery for Flipkart, you complete your order, verify that it's created, verify that an acknowledgement email is sent. That would be a user's path through the system, so you'd have a test suite for 'buy products' and test suites for other journeys through the system. Now I have a new requirement, or a new story: as a customer I want a gift wrap option so that I don't have to wrap gifts and post them myself. So what do we do? Do we create a new test suite for this requirement?

No. What we do is we look at the existing journeys and we see if instead we can modify an existing journey. So we would take this journey, modify it, add something, and verify that the order that was created has the acknowledgement email saying 'we will gift wrap your order and not include the invoice', for example. So always try to modify existing journeys instead of creating new ones. And what that means is that the testers have to work with the customer or the analyst to understand the journeys through the system, so the testers become really good at actually understanding the business value delivered by the system, and testers will often have feedback on that for the analysts and the developers: well, this actually won't work, because the user would be doing this. So this is yet another reason why having testers outsourced, or in a different room, is a bad idea, because you can't have these conversations, and they're really important. So you need to identify the main user journeys through your system; that's part of the analysis of creating maintainable test suites. And a journey is the path that a persona takes through the application to solve a problem for that user. Most applications don't have a lot of personas, and in fact, usually as part of an analysis effort to create a new system, you'll identify one key persona and then maybe some alternative ones. So if you've got tons of personas in your stories, that's a sign that something is wrong. At the beginning of development of a new system you'll be creating a bunch of test suites, and then as the system evolves you'll find you're creating fewer and fewer new test suites and instead changing existing test suites; that's part of the evolution of the system. A good rough guide is: if your test suite increases in execution time by an order of magnitude, it means that your user should be able to do an order of magnitude more things in your system. So it's okay that your suite starts at two minutes and, when you finish release two or three, it's at twenty minutes, because your customer can genuinely do ten times more things at release three than they could on the day you started. But if at year ten your suite is at three thousand minutes and your customer can really only do two new things at that point, then obviously there is a discrepancy between the number of things a user can do and the number of things your tests are testing, and that should be a key indicator of when you need to reconsider whether you're doing journeys properly.

This is a concrete example, another concrete example; we're running through this very quickly and it is a bit hand-wavy because it's a bit contrived. We wanted to take a very obvious example of something like Amazon or Flipkart, but if you want a slightly more concrete example, I had to do this exercise for the product I'm working on right now just yesterday, so contact me after the talk and we can talk about that as well. For example, how do you weave features and journeys in with each other? In an application that lists products, you may have searching as a feature, you may have pagination as a feature, each with their own requirements. So searching may have things like searching for a word, searching for a phrase, searching for a quoted phrase bringing back results with exactly those words; paginating results by page size, what happens before the first page and after the last page. All of these things can be acceptance criteria at the individual story level,
but if you decide to automate each and every single one of these in your acceptance test suite, you may run into a problem. So that's what your story tests would look like for each of those things, and there is a point where you might start with this, because you don't have a full journey in your system, but very soon you should be turning these into your journey tests, so let's take a quick look at how we can do that. A journey of buying a book, however, would look something like this: log in as user Bob, search for the quoted phrase 'my friends dead', make sure that three pages of results show up, verify that 'All My Friends Are Dead' by Avery Monsen is on the first page, add two copies of the book to your shopping cart, gift wrap one of them, and proceed to checkout. This is a really complete flow that actually tests what a user might do: it exercises search functionality, it exercises the quoting part of the search functionality, it exercises that multiple pages of results show up and therefore tests the pagination functionality, and when you put all of these together it genuinely verifies that the benefit of searching, the benefit of pagination and the benefit of being able to gift wrap that book are all available to your customers, rather than necessarily testing what happens if the user hacks the URL and types in page number 3008, because that's not particularly useful to find out at this level. And this is why having a testing role and the testing skill is really important, because developers often don't understand how to do this; knowing how to do this is a skill, and it's something that testers are really good at.

So, solutions, just to recap. Extract journeys from your acceptance tests: if you start out with story-level tests, there is a time and place to curate your story-level tests into something larger, your journey tests. Make them fast and make them run first: put your journey tests before the huge bulk of your negative tests and edge cases. What happens when the user hacks the URL, what happens when somebody inserts weird SQL-like characters into my form fields, do I have SQL injection attacks, do I have JavaScript injection attacks: all those negative tests can go into a separate suite that runs after your journey tests. The journey tests deliver quick value and tell you very quickly whether any key user functionality is broken, and you want to get that feedback fast. And the other thing I wanted to say is that exploratory testing will often change journey tests: testers will change these tests in response to the exploratory testing they do, and the journey tests that exist will be reused as part of exploratory testing as well, so there is an interaction between the exploratory testing activity and the way the user journeys work. You may want to automate 90% of your actual journey and then stop at that point, test two or three alternate paths that are less likely but have proven to be fragile in the recent past, finish that up manually, and then continue your journey test from that point on, as an example of what you might want to do. Test the most likely paths in the journey tests; do not test every possible path through the system, because then you'll just have a test suite that runs forever. And extract negative tests and edge cases into a regression suite after your journey tests: have them, run them, pay attention to them, but do not put yourself into the grindhouse of thinking that every single test you could possibly write has got to run right at the start and run with every check-in. Make sure that you can pull the most valuable tests forward and get feedback from them first before proceeding.
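To show how a journey like that reads when it's built out of the page objects described earlier, here's an illustrative sketch (not from the talk): the page-object types, the Pages factory and the assertions are all made up, and the other page objects are assumed to follow the same pattern as the LoginPage/HomePage sketch above.

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

// One journey: log in, search, check pagination, buy two copies, gift wrap one.
// Every step is an intention-level call; the mechanics live in the page objects.
public class BuyABookJourneyTest {

    @Test
    public void customerBuysAGiftWrappedBook() {
        // Hypothetical factory: the driver injected here decides GUI / API / iPhone.
        LoginPage login = Pages.loginPage();
        HomePage home = login.loginAs("bob", "secret");

        SearchResultsPage results = home.searchFor("\"my friends dead\"");
        assertTrue(results.numberOfPages() >= 3);
        assertTrue(results.firstPageContains("All My Friends Are Dead"));

        CartPage cart = results.addToCart("All My Friends Are Dead", 2);
        cart.giftWrapCopy(1);

        OrderConfirmationPage confirmation = cart.checkout();
        assertTrue(confirmation.acknowledgementEmailMentionsGiftWrap());
    }
}
```

Negative and edge-case tests (URL hacking, injection attacks) would live in a separate suite that runs after journeys like this one.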
So, my favourite quote about quality is from this guy W. Edwards Deming, who is very famous, one of the people who helped create the lean movement in Japan after the Second World War. He was working in manufacturing, but he has this to say about quality: cease dependence on mass inspection to achieve quality; improve the process and build quality into the product in the first place. This is my favourite quote about quality, and it has two important implications for testing. Number one, testing is not something we do after dev-complete; testing is something we do all the time. The quickest and cheapest way to fix a bug is to make sure it never enters version control in the first place; that's why we have unit tests that we run before we check in. The second implication is that testers are not responsible for quality; everyone is responsible for quality. And the reason for that is that quality is a business decision. You may decide that it's okay to sacrifice quality for time to market and get a product to your users quickly that is of lesser quality; that may be a valid business decision, I mean, for example, Microsoft Office would be an example of such a strategy. Or you may decide that quality is very important to your customers and you want to sacrifice time to market in order to have high quality. It's a business decision, and that's why it's a decision that the whole team has to make. Testers are not responsible for quality; everyone is.

Which leads to our fourth principle, which is that everyone owns the acceptance tests. If the acceptance tests break, it's not the testers' problem to triage; when the acceptance tests break, a developer and a tester should triage that problem together and then work out what to do. So what do we do when an acceptance test breaks? First you need to find out why. There are four possible root causes. One, you could have a flaky test environment: the application may not be deployed properly, there might be a configuration setting wrong. Two, there could be a bug in the actual test, so there might be something that you have to fix in the test. Three, maybe your assumptions changed: maybe the system evolved, we changed the requirements, and because we changed the system in response to that requirement change, the assumptions of the test are no longer correct, and so we need to change the test. Or four, maybe the test actually caught a bug; that would be fabulous. So the first thing you have to do, once you triage the problem, is fix it, and then, crucially, you need a guard to prevent that problem from happening again. If the problem is environmental, that means you have a configuration management problem: you need to make it so that you can provision the testing environment automatically, using information in version control, so that test environment provisioning is a push-button, reliable, repeatable process. If it's a bug with the test, then you need to... well, I guess the main thing with all of these is this: if your acceptance tests are failing all the time, that means your unit tests are not good enough. So there's a feedback loop: if you've got good unit test suites, you should find that your acceptance tests fail rarely, and when you have an acceptance test failure, what that means is there's a missing unit test. So the first thing you do, once you fix your acceptance test, is write a unit test to make sure that the acceptance test won't break again in
So what we're doing is optimising our test suite to detect failures fast, which means running the unit tests first and parallelising the automated tests, while also optimising our process for time to fix, which means that when we find a problem we put the guard in as soon as possible so that the problem doesn't occur again.

There's one more condition: what if your acceptance test passes when, in fact, it should have failed? Which brings us on to the next thing, intermittent failure. (Thank you for that five-minute set-up.) One of the biggest problems with acceptance tests is this: the test passes, you run it again, the test fails, and you go, "oh well, it's a flaky test, I'll run the suite again... oh, it passes now." With the test passing, you don't have any confidence that the system is actually reliable; maybe it should have failed. People like to say flaky tests are useless. Flaky tests are worse than useless, because they cause you to lose confidence in the test suite. You don't trust the suite any more, which means people stop paying attention to it, which means it rots. So what do we do when somebody becomes infected with a disease? We put them in quarantine, and we should do the same thing with flaky tests: have a separate test suite, or a separate run of tests, just for the flaky ones. If you have a deployment pipeline, you have a separate stage that runs all the flaky tests. What that means is that when the non-flaky tests fail, I pay attention, because I think there's a real bug; if the quarantine suite fails, I don't care so much. The crucial thing is to monitor the proportion of tests in the quarantine suite: if all your tests end up in quarantine, that's bad. I'm not going to go into detail here; there are more causes of intermittency. Martin Fowler has an article on non-determinism in tests, which can be caused by dependencies between tests, by things like memory problems, and so on; in some cases the flakiness is pointing at a real defect.

We'll briefly talk about external systems, which are another cause of test flakiness. If your system has to interact with another system, that can cause flaky tests, and a well-designed test suite should not have every test calling the external system. What you can do instead is put in a proxy. There's no problem in computer science that can't be solved with an additional layer of abstraction, so between your system under test and any external dependency you put an abstraction layer. That abstraction layer can either pass the calls through to the real external system, or call a mock or a stub that simulates it, or it can be a recording proxy, and we've used that pattern successfully before: you run integration tests against the real external system, but you record the calls and the responses and save them, in a flat text file because that's the simplest thing, or maybe in a database if you want to get clever. Then you run those integration smoke tests before you run the acceptance test suite. (Actually, maybe you'd better finish this one.) Right, that's exactly the point. For example, if you've got a tax system that returns tax codes, it's not as if the tax codes for, I don't know, pick Alabama, are going to change every day. They've got a really weird tax code system in the US, but the fact is, if you recorded that request once and kept a copy of it, there's absolutely no reason why you can't reuse it in your test suite until such time as you know for a fact that the tax code system is going to return different data. So a recording proxy in places like this really helps. Periodically expire the cache against the real system, once in a while, to make sure you have the latest copy of the data; that stops you making the third-party call every single time. And only run the acceptance test suite once the integration suite has passed: you have a small number of integration tests that actually make the real call to the external system, you run those first to make sure everything is working fine over there, and if those run green, then you run the rest of your acceptance test suite.
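Here is a minimal sketch of that recording proxy idea in code. TaxCodeService and the one-pair-per-line flat-file format are assumptions made up for this illustration; the real gateway, whatever wraps the actual third-party call, would implement the same interface. The shape of the pattern is what matters: replay recorded responses, fall back to the real system when nothing is recorded, and let the integration smoke tests expire the cache periodically.

// Sketch of an abstraction layer with a recording proxy over an external system.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

interface TaxCodeService {
    String taxCodeFor(String region);
}

class RecordingTaxCodeProxy implements TaxCodeService {
    private final TaxCodeService realService;          // only called when nothing is recorded
    private final Path cacheFile;                      // flat file: one "region=code" per line
    private final Map<String, String> recorded = new HashMap<>();

    RecordingTaxCodeProxy(TaxCodeService realService, Path cacheFile) throws IOException {
        this.realService = realService;
        this.cacheFile = cacheFile;
        if (Files.exists(cacheFile)) {
            for (String line : Files.readAllLines(cacheFile)) {
                String[] parts = line.split("=", 2);
                recorded.put(parts[0], parts[1]);
            }
        }
    }

    @Override
    public String taxCodeFor(String region) {
        String cached = recorded.get(region);
        if (cached != null) {
            return cached;                              // replay the saved response
        }
        String response = realService.taxCodeFor(region); // record against the real system
        recorded.put(region, response);
        persist();
        return response;
    }

    // Called once in a while (for example by the integration smoke tests) to refresh the data.
    void expireCache() throws IOException {
        recorded.clear();
        Files.deleteIfExists(cacheFile);
    }

    private void persist() {
        try {
            String content = recorded.entrySet().stream()
                    .map(e -> e.getKey() + "=" + e.getValue())
                    .collect(Collectors.joining(System.lineSeparator()));
            Files.writeString(cacheFile, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

The acceptance suite only ever sees the proxy, so a slow or unavailable third party cannot make the acceptance tests flaky; the small integration suite is the only place that talks to the real system.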
Which brings us to principle number five: acceptance test suites are responsible for managing their own data. The reason this segues from the previous one is that a lot of the time data does come from external systems, and for tests to be reliable that data needs to have certain qualities.

There are three different kinds of test data. Test-specific data is data whose state we verify at the end of the test: if you're testing buying a product, you would verify that the person's account has been debited. Test reference data is data you need for the test to run but whose state you don't validate at the end: you don't check the user's address at the end of the test to find out whether they bought the product successfully, but you do need to set up the user with an address in order to run the test. And finally there's application reference data, which is data required for the system to start up, like country codes or tax codes, and which isn't set up as part of the test suite at all.

It's really important to ensure that our tests can run independently of each other, for two reasons. Firstly, we want to be able to run our test suites in parallel so that they run faster. Secondly, if your tests are coupled to each other, so that test one sets up the data for test two, which sets up the data for test three, what happens if I change test one because I have a new requirement or because something changes? Maybe test three breaks, because the data test three was expecting is no longer there. Dependencies between tests cause flakiness and intermittent failures, and they also stop you running tests in parallel, so they're really bad. The way to solve that problem is to have each test set up its own data, and if you're writing good journey tests you should find that this happens automatically: in the journey test shown earlier, I create the user bob, I log in as that user, and I'm actually creating the state as I go along. Each test should create its own state; it should have no dependency on data that's already in the system, whether that came from setup or from another test. In particular, it's really vital that you don't use production data dumps as the basis for running tests. We should not be loading data directly into the database; acceptance tests should set up their own data, because those production data dumps become really hard to manage and they increase the time it takes to run the tests.
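As a sketch of what "tests manage their own data" can look like in practice, assuming the same hypothetical BookStoreDsl as before: the user and their address are test reference data created inside the test with a unique name, so the test can run in parallel with anything else, and only the test-specific data, the account balance, is verified at the end.

// A test that owns its data: nothing is loaded from a production dump and
// nothing is left behind by, or depended on from, another test.
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import java.math.BigDecimal;
import java.util.UUID;
import static org.junit.jupiter.api.Assertions.assertEquals;

@Tag("journey")
class PurchaseDebitsAccountTest {

    private final BookStoreDsl store = BookStoreDsl.forEnvironment("acceptance");

    @Test
    void buyingABookDebitsTheCustomersAccount() {
        // Test reference data, created here with a unique name.
        String username = "bob-" + UUID.randomUUID();
        store.createUser(username)
             .withAddress("1 Test Street, Testville")
             .withAccountBalance(new BigDecimal("50.00"));

        store.logInAs(username);
        store.addToCart("All My Friends Are Dead", 1);   // assumed price 7.99 in this sketch
        store.proceedToCheckout().confirmOrder();

        // Test-specific data: the state we actually verify. 50.00 - 7.99 = 42.01
        assertEquals(new BigDecimal("42.01"), store.accountBalanceOf(username));
    }
}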
So, in conclusion, treat your acceptance tests like production code. These are the principles: always interact with the system under test the way a user would, continuously curate your user journeys, acceptance tests are owned by everyone, and tests should manage their own data. And the takeaways: quality is everyone's responsibility and a business decision, not just the testers' job. We create maintainable test suites by continuously curating our journeys, with testers, developers and users working together on that curation. We need to treat our test code with the same care as production code, using encapsulation, refactoring, source control and all the other things we use to manage production code. And finally, piles of individual story tests are not a good basis for curating maintainable acceptance tests; you need to think about user journeys and how users interact with the system. If you want to talk to us at the conference about the journeys I created, or any other patterns, feel free to come and find us for questions, but we'll let you go now. This is the end of the talk.