Welcome, everybody. The first thing I should say is that this is billed as a kind of workshop, and if I'm honest it's more of a presentation. I'm going to talk through some ideas, but let's try and make it as interactive as we can. So if at any point I say something that you think is unclear, if you disagree with it, if you just think I'm talking rubbish and you want to point out a better way of doing things, all of that is absolutely fine. Just interrupt me and I'll try to make sure we can have that conversation. But I'm not intending to do classic workshop activities with people interacting with each other today, I'm sorry.

So, we're talking about acceptance testing for continuous delivery, and this is the schematic that I tend to use for a deployment pipeline; it shows some of the common stages. I think it's reasonable to say that anything after the commit stage you can think of as representing some form of acceptance testing, but what I'm talking about specifically are these tests, and I'm going to talk a little about what I mean and how we define those sorts of tests. Fundamentally, what we're trying to do between those two different stages, the commit stage and the later stages of the pipeline, is to separate the technically focused, developer-centred testing from the user-focused testing with which we'd like to evaluate our software. These bits are saying: does the code do what I, as a developer, think it should do? And these bits are really saying: is the code useful? As far as we can tell in a test, is it going to do what the users need it to do? That's the kind of stuff I'm really talking about.

The first thing we can say about what acceptance testing is: it asserts that the code does what the users want it to do. There's a nice way of integrating this into your development process, which I've used on a few teams I've worked with, which is to build it into an automated definition of done. If you have a user story of some kind, as it gets close to the point at which you want to play that story, you identify a series of examples, or acceptance criteria, that describe: if I deliver this piece of value, I would expect this example to work like this. The definition-of-done bit is that the team agree they're going to write a minimum of one automated acceptance test for every one of those acceptance criteria, for every story. If you adopt that, you get very quickly to the point of having fairly good behavioural test coverage of the system from the point of view of the user. It means you can create these specifications, work through them, and iterate down to a successful outcome.

Acceptance tests also assert that the code works in a production-like environment. We're trying to evaluate not just our source code changes but our configuration changes, our deployment changes, the specification of our infrastructure, all of those things, so that any change destined for production flows through our deployment pipeline and can be evaluated, at least in part, in the context of one of these sorts of tests. And, as I've just said, we want to test the configuration and the deployment of our software as well as its behaviour. We also want feedback in a timely manner. We'd like to find out just in time that what we've just done is a crap idea, rather than just too late.
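To make that definition-of-done idea concrete, here's a minimal sketch of the shape of the thing, assuming JUnit 5 on the classpath; the story, the test names and the BookStoreDsl class are all invented for illustration, not taken from any real system:

    import org.junit.jupiter.api.Test;

    // Story: "As a reader I want to buy a book so that I can read it."
    // Definition of done: at least one automated acceptance test per
    // acceptance criterion, phrased as WHAT the user wants, never HOW.
    class OrderBookAcceptanceTest {

        private final BookStoreDsl store = new BookStoreDsl();

        @Test // Criterion 1: a reader can buy a book that is in stock
        void shouldPlaceAnOrderForABookInStock() {
            store.givenABookInStock("Continuous Delivery");
            store.buyBook("Continuous Delivery");
            store.assertOrderPlaced("Continuous Delivery");
        }

        @Test // Criterion 2: a reader cannot buy a book that is out of stock
        void shouldRejectAnOrderForAnOutOfStockBook() {
            store.givenAnOutOfStockBook("Continuous Delivery");
            store.buyBook("Continuous Delivery");
            store.assertOrderRejected("Continuous Delivery");
        }
    }

    // A stand-in DSL layer so the sketch is self-contained; a real one would
    // drive the deployed system through its public interfaces.
    class BookStoreDsl {
        private final java.util.Set<String> stock = new java.util.HashSet<>();
        private final java.util.Set<String> orders = new java.util.HashSet<>();

        void givenABookInStock(String title) { stock.add(title); }
        void givenAnOutOfStockBook(String title) { stock.remove(title); }
        void buyBook(String title) { if (stock.contains(title)) orders.add(title); }
        void assertOrderPlaced(String title) {
            if (!orders.contains(title)) throw new AssertionError("No order placed for " + title);
        }
        void assertOrderRejected(String title) {
            if (orders.contains(title)) throw new AssertionError("Order was placed for " + title);
        }
    }

The point is the mapping: one story, its acceptance criteria enumerated, and one test method per criterion, each readable as a statement of intent.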
So we want these tests to be efficient in terms of our time to find out what's going on. There's quite a lot of terminology in this space, and for the purposes of this conversation today you can probably treat all of these things as similar. I could give you an exposition on the subtle differences between each of them, but I'd bore you to death and it doesn't really matter: acceptance testing, acceptance-test-driven development, BDD, specification by example, executable specifications. These days they're all used, at some level, as synonyms for what we're talking about. The terminology that I tend to use is that we are building acceptance tests that are created as executable specifications for the behaviour of the system. That's the terminology I use personally, but all of these terms are in common use. I have a personal annoyance with the use of "BDD", because BDD was originally designed to improve how we taught test-driven development, but now it tends to be talked about in the context of building high-level functional tests. Which are good, but they're not enough.

So, a good acceptance test is an executable specification for the behaviour of the system. We're trying to capture behaviours that are desirable in our system in a way that lets us work against those specifications to create a desirable outcome of some kind.

Here's another diagram that I use in the context of continuous delivery quite a lot. This is a feedback diagram. There are several foundational ideas in continuous delivery, but one of the seriously foundational ones is that we're trying to get high-quality feedback, fast, on everything that we do. This is the feedback-loop model in terms of testing strategy. At the outside of this feedback loop we have an idea; we want to get that idea out into the hands of our users, and then we want to figure out what our users make of it. At the inside of the feedback loop we do test-driven development, these very fine-grained, developer-centred tests. In between are the things we're talking about today: executable specifications that evaluate the software from the perspective of users. The reason that's important is that, however great our TDD practices are, however good our unit testing is, there's a subtle difference between the behaviour of the system from that perspective and the behaviour of the system from the user's perspective.

A silly war story. I once worked on a fairly large project; we were building a point-of-sale system, and there were about 200 people working on this project. We were doing full-on TDD, extreme programming, all that kind of stuff, and at one point somebody deployed the software into a test environment and found that it didn't work. Then they went back, version by version, and it hadn't worked for the last three weeks. All of the tests were passing, but the software wasn't working. So you need to do something beyond that; you need this kind of user-centred testing as well.

So what's the problem? Why is this an issue? I think there's often a disconnect between what users want of the system and what the software delivers, and what we're trying to do is bridge that disconnect. The problem of defining what the software should do is a difficult problem.
The problem of defining how the software should do it is a different difficult problem, and all too often, in our requirements process and in our testing, we conflate those two ideas. We want to tease them apart and treat them as separate activities through our development process and our testing strategy, to reinforce that kind of thinking and separation. So let's not try to solve both of these really hard problems at the same time; let's pull them apart and treat them separately.

In lots of organisations, regression testing is done like this. There are tools for it. This is one where you have a manual test case: you capture a series of steps, and then some poor person has to sit and follow those steps. If you've ever done anything like that, and you're anything like me, you get to step 15 or so and it's different; it just doesn't work very well. These things are messy and not very high quality. This one says: open a browser, the browser opens, go to blah, blah, blah, expected results, and so on and so on. Somebody spent real, valuable time on this, time they could have spent having fun, maybe drinking beer or something. This is slow, low quality, expensive, unreliable, error prone, fragile, and it results in hard-to-understand test cases.

The next step is this. We say, okay, let's not do the manual testing thing; let's automate our functional tests. That's a good idea. The first time you do it, though, you tend to end up with tests that look a bit like this, which is what I call clickety-clickety tests. This is just recording every interaction; it's completely about how the system works rather than what the system does. So it's slow to develop, low quality, expensive, unreliable, error prone and fragile.

Yes. Yes. Yes. Kind of, but I think, more than that, what those tests are doing is confusing what the system should do with how the system achieves that end, and what I'm describing is a process where you can completely separate those two ideas. That's what we're going to try and work through and get to. Any more questions before we move on? Everybody okay? Good.

So there's this disconnect, and we've got to build a bridge between these two sides of the picture somehow. We've got to figure out how we're going to do this. Wouldn't it be wonderful if we could establish some sort of shared language of need that would capture the intent of our software without saying anything about how it worked?

So, technique: always capture any requirement from the perspective of an external user of the system. This is the classic agile user story kind of thing, and I don't really care whether you do "as an X, I want Y, so that Z", or given-when-then, or any of those sorts of things; those training-wheel templates don't really matter very much. The point is: nothing about how the system works. If you've got user stories that say things like "add a new column to the database", it's wrong. It's not a user story. The users don't care whether there's a new column in the database. You want to capture the behaviour, you want to capture the intent. Avoid any reference in your executable specifications to how the system works. You want to think about it from the point of view of the driver, not the point of view of the mechanic. Focus on what the user wants of the system and try to capture that.
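To show you the difference in code, here's a sketch of the same scenario written both ways, assuming Selenium WebDriver for the clickety-clickety version; the URL, the element ids and the DSL names are all invented for illustration:

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;

    class WhatVersusHow {

        // The "clickety-clickety" style: every line is about HOW today's UI
        // happens to work. Rename one element id and the test breaks.
        void clicketyClicketyStyle(WebDriver driver) {
            driver.get("https://store.example.com");
            driver.findElement(By.id("searchBox")).sendKeys("Continuous Delivery");
            driver.findElement(By.id("searchButton")).click();
            driver.findElement(By.id("result-0-addToCart")).click();
            driver.findElement(By.id("checkout")).click();
            driver.findElement(By.id("payNow")).click();
        }

        // The intent style: says only WHAT the user wants. The knowledge of
        // element ids lives in one place, inside a (hypothetical) DSL layer.
        void intentStyle(ShoppingDsl shopping) {
            shopping.searchForBook("Continuous Delivery");
            shopping.addSelectedBookToCart();
            shopping.checkOutAndPay();
            shopping.assertBookOwned("Continuous Delivery");
        }

        interface ShoppingDsl {
            void searchForBook(String title);
            void addSelectedBookToCart();
            void checkOutAndPay();
            void assertBookOwned(String title);
        }
    }

The first version is from the mechanic's point of view; the second is from the driver's.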
Link the executable specifications to user stories; make the step between them a tiny one. I have several clients that don't bother with user stories anymore: they just write the executable specifications, because it's essentially the same thing. So try to think in those sorts of terms. What we're trying to do is capture the intent of the software.

As I said, there are a variety of user story templates that you can apply. This is a classic one: as a user, I want something or other, so that I can achieve some benefit, and then we've got these acceptance criteria, and these are the things that we're going to write a test case for, each one of them. What I usually do with a team is agree that, by default, you're going to have at least one automated acceptance test for every acceptance criterion. In exceptional cases somebody can argue that we don't need this one or that one, and then the rest of the team will usually argue them out of it, because that's stupid. So next: make the definition of done a minimum of one acceptance test per criterion. This plugs it into the development process. It means that you're driving the development from testing, initially from these high-level behavioural specifications for the system, written from the perspective of its external user.

A good sanity check when you're writing requirements and these test cases is to imagine the least technical person you can think of who understands the problem domain; if they were to read the test case, they should be able to understand what it means. That doesn't necessarily mean it has to be written in a human language, but it has to be a language simple enough to parse even if you're a non-technical person, and I'm going to show you some examples as we go through. It's not a requirement that end users can write the specifications; I don't personally see that as a necessary goal, though it can be a handy side effect. But they do need to be able to read them.

Avoid technical stories; always try to get to the fundamental user need. Make each story, each specification, as small as possible. Break the development process up into many, many small steps rather than fewer large steps; that makes the specification easier, the testing easier, the development easier, the release and deployment easier as you go into production, all of these good things.

So what's so hard? People have been trying to do this kind of functional testing for decades, and usually what tends to happen is this first bullet: the tests break when the system changes, particularly the user interface, but really any aspect of the system that the test is talking to. The tests tend to be fragile. This is a problem of coupling between the test case and the system under test, so we can start addressing it by separating concerns and reducing that coupling. History is littered with poor implementations of this. I particularly despise UI record-and-playback systems; they're so fragile that you record once, and then the next time you change something it breaks, because some box is a pixel to the right instead of a pixel to the left or something like that. I hate record-and-playback of production data, because it's big and heavyweight and it's not specific enough to evaluate what we're trying to do.
I don't like dumps of production data at all, really. There are some specific scenarios, some cases for testing, where there's an argument for production data, but if all you're doing is replaying production data through your system as your testing strategy, you don't really have a testing strategy; you have an "I hope that kind of works out" sort of strategy. And nasty automated testing projects: there have been a lot of those over the years, projects that try to force-fit an approach, and there are still a lot of those things advertised and used these days. I don't think any of them solve the problem. What you need is a broader strategy; you need to think about how you're going to use the tools rather than the tools themselves.

The next thing to consider. One of the projects I worked on that was foundational, for my part, in the thinking behind continuous delivery was the point-of-sale system that I mentioned before, and one of the insights that started me down the road to thinking differently was this. It was a very large project; there were about 200 people working on the software, and there was a team of, I think, four or five QA people who were dedicated and focused on automated testing. They'd got a suite of, if I remember correctly, about 12 automated test cases, and presumably at some point in ancient history those tests had run once, but all the time I was there they never worked. What happened is that you'd got this team of four or five people trying to keep up with a team of 200 developers who were going mad changing the system. It just didn't work. And the insight that we had on that project was: close the feedback loop. It's developers that are going to commit a change that breaks a test, therefore it must be developers that are responsible for the tests. As soon as we did that, one, we started getting the tests more stable and passing more regularly, and two, we started building more tests.

So developers are the people that will break the tests, and this is particularly true if we can get to these executable specifications, because if it's a specification that says what the system does but not how the system works, then if that test fails it means the software is not fulfilling its specification, and we modify the software to make it work. So this last one, a separate testing or QA team owning the automated tests, is a toxic antipattern that I have never, ever seen work. Not once. Ever. And nearly everybody tries it, and it just doesn't work. Anybody can write the tests: whoever's got the best picture of what the specification should be. It can be a developer, a tester, a product owner, a business analyst; I don't care, it doesn't matter. But as soon as the tests start running, the developers own the responsibility for the tests, and it's their job to keep them running. Specifically, the developer that makes a commit is responsible for fixing any defects that arise out of that commit. Developers own the tests, and that closes the feedback loop.

I think that in this kind of context, traditional QA is one of the roles that changes most significantly. It doesn't go away. My preference is to have professional testing people as part of the team, at least on the sorts of software that I was usually involved in writing. I prefer that because professional testers think about testing differently.
They have different kinds of insights into the sorts of things that we want to do. So my suggestion, my advice, is that you automate all regression testing: if you have a test of any kind that you need to run twice, you automate it. But there's a role for exploratory testing that's very valuable, because human beings do different kinds of things than machines. What we're trying to do is get a repeatable, reliable process for releasing software into production, and if it's repeatable and reliable, that kind of rules us out as a species, because we're not repeatable and reliable. So we don't want human beings doing the repeatable, reliable stuff; we want human beings doing the stuff that we are wonderful at, which is exploring, trying out wacky ideas, trying to break the thing in different, interesting ways, all that kind of thing. I think that becomes the role of the tester.

The other part of this is that I think it's an antipattern to have a separate QA team at all. People in those sorts of roles should be sitting with, and working on the software alongside, the development activity. And the other aspect of this is that you don't want to build mini-waterfalls inside your iterations or sprints. You don't want to wait until the software is finished and then get the testers on the team to look at it. You want to organise yourselves as a team so that you're all making progress and you all come to things at the same time, which means the testers are evaluating the software while it's in the process of being built. And again, that's another one of these things that calls out the need for continuous integration: if we're able to build these changes incrementally, step by step by step, it means you can more deeply integrate the testers into the process. Okay?
Yes... yes. Sometimes developers do feel that, but they're wrong. Jez will tell you that the data says that if you apply these and the other techniques of continuous delivery, teams will spend 44% more time on valuable work. This is an investment. There's a great story of the HP LaserJet team: they did a before-and-after breakdown of their development process, and in the "after" picture they had eight times as great a proportion of their development effort going into creating new software, even though a massive amount of effort was now spent on automated testing, continuous integration and all those sorts of things.

It's unfortunate that, as a species, one of the things we're not very good at is the kind of problem where we have to be disciplined now to get a reward later. It's like dieting or exercise, which you can see I'm an expert at. I know that I shouldn't drink glasses of wine, it makes me too fat, but I quite like glasses of red wine, and if I see one I tend to drink it, and that's not a good idea, but I don't think about it at the time. Similarly, development teams ought to know by now, because the data is in, that automated testing allows them to go more quickly, to move forward with more surety. Yes, you spend more time on some kinds of problems, but that time is paid back, and you spend very little time analysing failures in production, trying to work through logs of what was going on, because those sorts of failures don't happen as much.

There was a survey of production failures in software systems done a couple of years ago; I think they looked at about 6,000 different projects and tried to identify common patterns in the nature of failures at the point at which a production system falls over. The first finding, which is amusing but not completely relevant, was that the most common line of code at the point at which a production system fails is a comment that says "should add exception handling here". But the other finding was that 72% of production defects are caused by the kinds of errors that all programmers put into all software in any language: off-by-one errors, scoping errors, getting a conditional statement the wrong way around, all of those common mistakes. They didn't say specifically unit testing or automated testing, but if you have a disciplined approach to testing, you can eliminate almost all of that 72% of defects in production. The subjective picture that I have from people practising continuous delivery is that you can do a lot better than that: defects in production are probably reduced by something like two orders of magnitude. That's a huge, huge saving for a development team, at the cost of having to do a bit more work on the testing. So I think it's a good investment. The most efficient team that I ever saw was also the most focused, the most diligent, on automated testing.

I'm going to get there, yes, I'm sorry. So the question was: I said that I don't like using production data, so how do you get the data in for the tests? I'm going to describe that, and some strategies for it, but the quick synopsis, in answer to your question, is that I think the vast majority of data in tests should be synthetic, and I'll explain why as we go through. Okay, any more questions? I'm going to talk about all of these things; the rest of this talk is pretty much going into each of them in more detail.
Yes? So the question was: what if the "what" keeps changing, and you want to separate what from how? If that's really happening, what it's saying is that you're exploring and you don't understand the problem you're solving yet, which is fine; that's kind of part of software development. But what it means is that you've come up with some behaviour of the system that you no longer want, and you now want it to be different, so you throw away the old test and you write a new one. That's going to be true however you organise your testing, if the requirements are changing. Any more?

How do you come up with the acceptance criteria? Sorry, say again... Yes. Really, the best way to do it, if you have a user story, is just to think of an example that would demonstrate that user story. Let's say we were going to write a test for buying books on Amazon. I would come up with an example that was a bit more specific: I'm going to search for a book on continuous delivery, I'm going to select the book, I'm going to put it in a shopping basket, I'm going to go to the checkout, I'm going to pay for the book, and I'm going to confirm that I now own the book. And pretty much the language that I've just used is the language that I would use in my test case. The specifics, I'm going to order this book, I'm going to pay this much money and so on, are the example that I'm talking about. That's the acceptance criterion that we're describing. Okay?

Yes, the question was about developers and how overloaded they'll be. I would disagree with that. The developers are working differently, but they're trading off one set of work for another. Maybe it means that the old distribution of people in a team is no longer a good one; that's possible. But as I said, I think this changes the QA role probably more than any other. All of the roles are affected when you adopt continuous delivery, but QA in particular changes, because QA is classically seen as the gatekeeper of quality, and what we know is that that doesn't really work. You've got to build quality into the system from the outset; you've got to figure out how you can design quality in. The most effective way that we know how to do that is to aim to make your systems testable, and you do that by writing the tests first and using them to drive the development process. It's what the "driven" bit means in test-driven development, really. So yes, the developers are going to spend time writing tests, but it's an odd kind of idea that they wouldn't anyway. If you're writing software and you don't know whether it works, what does that really mean? It means you're not doing a very good job, probably.

I think that might be true. If I think back to the teams that I worked on, there was usually a team of four or six developers, and usually one or two QA people on that team in addition to the developers, which isn't very different from the kind of ratios that you're talking about. So we're not talking about sacking loads and loads of people or anything like that; that's not what I'm trying to suggest at all. But I think you've got to establish and close this feedback loop, you've got to get this effective process, and you've got to drive the development from the tests, which means the responsibility must be taken on by the development teams, and the testers are then supplementing that effort.
They're bringing expertise and insight, and the kind of manual, exploratory testing that builds these pictures inside your head so you can figure out how to extrapolate and use the software in another context, all that kind of stuff; that's what they're really doing. Some of these ideas are based on a project I worked on, a start-up, when we were in the middle of writing the Continuous Delivery book; we were building a financial exchange. When we were doing that, we very highly valued our testing people, and the testers liked it more; they enjoyed their work more than they had before, following some stupid manual script.

Yes... no, I think those things largely become unnecessary if you take this broader strategy of developer-centred testing plus the kind of acceptance testing that I'm talking about. The acceptance tests, at one level, are a kind of super integration test; they're evaluating all of those things. Let me just skip back to the picture that we started out with. This is my default model for a deployment pipeline, and it pretty much defines, loosely anyway, the testing strategy that I recommend. You have the developer-centred testing, aimed at the technical quality of the software we're building, and that's best performed through test-driven development. You have the acceptance-test-driven development, the stuff that we're talking about, from the perspective of users, and that covers all kinds of behaviours: non-functional requirements as well as functional requirements, security issues, scalability, resilience, performance, all of those sorts of things. And if you've tested all of that... occasionally, very rarely, there might be an ad hoc need, there might be some utility in a specific integration test for a specific piece, where you can discover a common kind of failure more quickly. But mostly what we found is that this really is system testing as a whole, and integration testing is covered by it, because we're testing the system in a life-like deployed environment, a production-like test environment. So it doesn't add very much; I don't know what we'd learn by doing separate system testing or integration testing, on the whole.

Sorry, say again? It's not user acceptance testing, because the users aren't doing it. It is acceptance testing, yes, from the perspective of an external user of the system; I'm playing with words slightly. The goals of this are very similar to the goals of user acceptance testing, but this is a dramatically more detailed level of testing than you get with human beings, and it's a lot faster, a lot more efficient, and a lot higher quality as a process.

Yes. So I can give you another little war story. There are two aspects to this: there's the technical assurance that you can get from your testing strategy and your deployment pipeline and that kind of stuff, and there's the cultural assurance that you might need in an organisation. When we were building our exchange, for a while the business didn't trust that we tested it enough, and we had the conversation. We said: we're running something like 50,000 or 60,000 test cases on every commit, and we fail the build if any one test fails; you're not going to find any bugs. But they said, nevertheless... They decided that they wanted to test before we turned it on live in production. So we said, okay, if that's really what you want to do.
We did that for a little while, and after about three months they came back to us and said: you're right, we don't find any bugs, and this is a waste of our time; do you mind if we stop doing it? So you've got to build up the trust at some level, organisationally. But I believe that what I'm describing is a significantly higher-quality approach and outcome than any army of manual testers. The one thing that remains true is that you're always going to end up with problems in production that you didn't anticipate, but they're going to be less common, and they're not going to be the sorts of things that you find by just randomly banging on the system; they're going to be the edge cases, the corner cases, the things that you don't think about testing. Everything else you'll have tested. When we were building our exchange we got completely test-obsessed; we tested everything that we could think of about our system, and if a defect did happen in production, which didn't happen very often, then we'd add a new test to catch that kind of failure in future. We were in production for 13 months and 5 days before the first bug was raised by a user. So you trade away all of that stuff around bug triage and so on by spending some more effort on the automated testing.

Yes? Sorry, I can't hear... hello? So, the question, roughly: moving towards deploying code into prod within 11.9 seconds while leveraging the CI/CD pipeline, isn't it better to have a production test suite level of execution rather than calling it acceptance testing? Shouldn't we consider functional regression, internal integration, external integration and API tests as a production test suite, execute it, and then release the code into prod, so that we push the build with quality, with zero defects?

So I'll give you the consultant's answer: it depends. It depends on the problem domain; it depends on the seriousness of the software. Mostly, the kinds of systems that I ended up working on, through the part of my career where I learnt these lessons, were what you might think of as high-consequence software. There was lots and lots of money flowing through those things; these days, people's lives depend on some of these systems, and all that kind of stuff. The level of rigour, the level of assurance, that you want in those circumstances is probably slightly different than if I was writing software for my mum's cake shop. Nevertheless, there are multiple values here. One of the properties of this test-driven approach to development that I value very highly is the impact it has on the quality of the designs that I produce in my software. It allows me to build more modular, more loosely coupled software with better separation of concerns, and you get that by trying to make the software testable. I've applied this strategy with clients and on projects in all sorts of different problem domains, in all sorts of different technologies, and I think it's just better. I think this is just a better way of working, if I'm honest; I think it gives you a better outcome.
One of the things that lots of people talk about is the test pyramid. I'm not a big fan of the test pyramid, because you don't get enough acceptance testing; and maybe it changes, it depends on the nature of the software. The picture that I draw of the deployment pipeline: that's a mechanism, a process, that's optimised to get fast feedback. If I could get the answer to "there's no more work to do on this piece of software before I release it into production" back as a red squiggly line in my IDE, I would take that, but I don't know how to do that for the sort of software that I tended to work on. So a deployment pipeline is a compromise between immediate feedback and a level of assurance. What I usually say to customers is that from the commit stage I'm looking for feedback in under five minutes, with a roughly 80% level of confidence that if all those tests pass, everything else is going to be okay. And then from the acceptance test phase, and the performance tests and everything else, I'm trying to get a definitive answer, is there any reason why I can't release this into production, in under an hour. If you can answer all of those questions and get an answer back in nine seconds, cool, that's lovely, but it means your software is dead simple, which is nice, that's a good thing, but not all software is like that. What I'm talking about is the strategy. The fundamental point is to optimise for fast, high-quality feedback. To get a high-quality outcome you've got to test from the technical perspective, the TDD stuff, and from the user's perspective, and for me that's the starting point. If I was writing software for my mum's cake shop I would still have acceptance tests, I would still do TDD, and I'd get answers back from all of that in seconds or minutes anyway. Yep. Okay, let's move on.

So, we want to separate what the system needs to do from how it does it. Here's a picture of a system, and we've got some groups of users interacting with it through different channels of communication. Typically, if we were going to write automated tests for something like that, we would effectively replace the users with a whole bunch of test cases, like this. If one of these APIs were to change in a way that broke a bunch of those test cases, the only way we can manage that situation is to go to each individual test case and correct it. That's not very good. But we know how to solve that kind of problem: we can raise the level of abstraction, increase the level of indirection, and isolate those things a little from one another. We pull the test cases up so that they're focused on what the system needs to do, not how it does it, and we replace these bits here with, essentially, plumbing that translates what the system needs to do into interactions with the system. Now, if this interface changes and breaks a bunch of test cases, we only have to fix it in one place, and that fix is shared between all of the test cases. That's the first step in separating the what from the how. As you go down this route, what you tend to find is that the test infrastructure has a number of properties, a number of behaviours, that are useful to have in there, and we'll explore some of those as we go through.
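Here's a sketch of that layering in Java: the test case speaks pure intent, a DSL layer translates intent into steps, and a protocol driver, the plumbing, is the only code that knows how to talk to the system. All of the names are invented for illustration, and the driver body is an in-memory placeholder so the sketch is self-contained:

    // The plumbing layer: the only code that knows HOW to talk to the system.
    interface StoreDriver {
        void searchFor(String title);
        void addFirstResultToCart();
        void payForCart();
        boolean ownsBook(String title);
    }

    // The shared language-of-intent layer: translates WHAT into driver steps.
    class StoreDsl {
        private final StoreDriver driver;
        StoreDsl(StoreDriver driver) { this.driver = driver; }

        void buyBook(String title) {
            driver.searchFor(title);
            driver.addFirstResultToCart();
            driver.payForCart();
        }
        void assertBookOwned(String title) {
            if (!driver.ownsBook(title)) throw new AssertionError("Not owned: " + title);
        }
    }

    // If the API changes, only this class changes; every test case that says
    // buyBook("...") is untouched. A real driver would use an HTTP client or
    // a UI driver; this placeholder just makes the sketch compile.
    class RestStoreDriver implements StoreDriver {
        private final java.util.Set<String> cart = new java.util.HashSet<>();
        private final java.util.Set<String> owned = new java.util.HashSet<>();
        private String lastSearch;

        public void searchFor(String title) { lastSearch = title; }
        public void addFirstResultToCart() { cart.add(lastSearch); }
        public void payForCart() { owned.addAll(cart); cart.clear(); }
        public boolean ownsBook(String title) { return owned.contains(title); }
    }

The same StoreDsl could equally be wired to a web UI driver; the test cases wouldn't know the difference.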
The first idea is to separate deployment from testing. In test-driven development, every test usually controls its own start conditions; we don't want dependencies being tweaked between tests and all that kind of stuff. But in this case that doesn't work very well, because the cost of starting the system up might be quite high. If you've got a big application and you're starting it up for every single test case, your tests are going to be slow. So wouldn't it be nice if, instead of doing that, we could run a whole bunch of test cases against one running version of the system, and share the cost of starting the system up between all the test cases? This also means that acceptance testing becomes a rehearsal of a production release, because now we're going to automate the deployment of the software into our acceptance test environment, automatically deploy it, get it up and running, and then start running our acceptance tests against it. Yes, I'm just about to talk about that.

This gives us the opportunity to run tests in parallel in a shared environment and lower the start-up overhead, but there's a problem here, and it's a problem of isolation. Any form of testing is really about evaluating something in controlled circumstances, and that means figuring out how you're going to isolate test cases from one another. We'd like to be able to isolate the system under test: draw a boundary around the stuff that we're responsible for, because that's what we want to test. We'd like to isolate test cases from one another, so we can run two or more test cases in parallel in the system we're working on without them interacting in unpleasant ways. And finally, we'd like to isolate a test case from itself, so we can run the same test case over and over again and get the same result every time, which means we've got to think about the state that we're managing. Let's drive through those in a little more detail. Isolation in general, I think, is vital in informing your testing strategy.

Let's start by thinking about the system under test. Imagine that we're working in a big organisation on a system like this: we're working on system B, which is downstream of system A and upstream of system C. What most organisations will say when they've got something like that is: we must do end-to-end testing. And that's a problem, because if you want to do end-to-end testing here, it means you don't really have control over these interfaces; there's a whole system in the way of them. For example, if I wanted to simulate what happens if this system was sending me garbage, or what happens if the communication channel was broken or down, we can't do that, because there are real systems in the way and it's too complicated; I can't really get in and test my part of the system. But worse than that, we're not going to be experts on all of the other systems, so our system isn't really in a precise, deterministic state, and that severely limits the kinds of testing we can do. This end-to-end testing is really an antipattern. What we'd like to do is something more like this: we'd like to write test cases right at the boundaries of our system, talking to our system through its natural interfaces.
If our system has a REST API, our test is going to talk to it through the REST API; if it talks sockets, whatever it is, but the tests are going to be right up against the edges of our system. Now we can fake all of the weird circumstances, fake strange connection defects and all that kind of stuff, as well as the normal running of the business.

Now, the problem is that when people look at these kinds of systems and say, yes, but you've got to do end-to-end testing, what they're worrying about are these bits. They're worrying that these interfaces between the two parts of the system are going to break in some way: you're going to make a change here which is going to screw up the downstream behaviour. So we've got to find another way of solving that problem if we want this test isolation. What we'd really like is something like this: a bunch of tests focused on each of these interfaces. From our perspective, from team B's perspective, all we're interested in about system A is: does it talk to us in the way that we think it talks to us? So we can write some test cases for that, for these two systems, and the number of test cases that we're interested in there is really quite small. We can run all of our detailed tests to evaluate our own system, and then just have a sort of verification: has the interface to us, or between us, changed? There's one step further that you can take this, and that's to use contract-based testing. We write tests that define our assumptions about how system A talks to us; we hand those tests over to the system A team, and the system A team run them as part of their continuous delivery pipeline. If one of those tests fails, they now know that they've broken their contract with us, and then we can figure out together what to do.
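Just to sketch what one of those contract tests might look like: this is my illustration, not code from any real project; tools like Pact automate this pattern, but the shape fits in plain JUnit (assuming JUnit 5), and the event format here is invented:

    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertTrue;

    // Team B's assumptions about what system A sends us, handed over to
    // team A to run in THEIR pipeline. If it fails, they broke the contract.
    class SystemAContractTest {

        // Placeholder so the sketch is self-contained; in team A's pipeline
        // this would capture a real event emitted by system A.
        private String latestOrderEventFromSystemA() {
            return "{\"orderId\":\"42\",\"amount\":\"10.50\",\"currency\":\"GBP\"}";
        }

        @Test // Team B depends on exactly these fields, nothing more.
        void orderEventCarriesTheFieldsTeamBDependsOn() {
            String event = latestOrderEventFromSystemA();
            assertTrue(event.contains("\"orderId\""), "contract broken: orderId missing");
            assertTrue(event.contains("\"amount\""), "contract broken: amount missing");
            assertTrue(event.contains("\"currency\""), "contract broken: currency missing");
        }
    }

The point of the design is that the contract pins down only what team B actually depends on, so system A stays free to change everything else.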
Yes. So I think that gets into a different part of the problem, which is: what's the scope of a deployment pipeline? The way that I usually describe it is that a deployment pipeline is scoped to an independently deployable unit of software. That can be a monolith, a whole system, or it can be a microservice, and part of the microservices game is that you don't get to test them together with everything else, kind of by definition, which is hard. You have to think very, very seriously about the level of abstraction of the communications between those pieces, and use techniques like messaging that give you a little more insulation, because there's a little more wiggle room between the services, and all those sorts of techniques. The step beyond that, if you go down that route, in the microservices kind of world, is that the organisational strategy is to divide up your teams in a way that minimises the coupling, so that mostly you don't have those cross-service interactions, and if you do, it's usually calling out a problem in the design. You can't do it perfectly, but if it's commonly happening, I think it's a problem. This is one of those things that I think is just hard. Most people, I think including you when you were talking about this stuff this morning, talk about continuous delivery as almost having a prerequisite of a modular, loosely coupled architecture. I'm not quite sure that I think that's right. I think you can absolutely make it work with a monolith, but there are different constraints: you get huge value in some things, and there are other things that you don't get. There's a trade-off; there's no silver bullet. The big downside with a monolith is being able to do live deployments, and the cost of engineering and infrastructure to get fast enough feedback. The big downside of a microservices architecture is that you step squarely into one of the hardest parts of computer science, which is coupling and dependency management, and you've got to take those things seriously; I think you solve those problems with more sophisticated design rather than anything else. I don't think there's a silver-bullet testing strategy, at least not one that I know. Yep.

Sorry, the question was: when we're talking about the verifiable outputs here, does that cover all of the different kinds of interactions with system A? Remember that we're driving this behaviourally: we're driving the testing by specifying the behaviour of our system from the perspective of an external user. That should pretty much define all of the behaviours that you want of the system, so if there are any interactions with the downstream system, it's going to cover pretty much all of them.

Yes. The sequence that I recommend is the one that I've shown, the one I'm describing. Start off by creating the acceptance test, the executable specification for the behaviour of the system, for this particular story. Write that first, before you've written any code, so that when that test passes, when that specification is met, you're done. Then you use TDD, TDD, TDD, TDD to meet that specification, and then you're done. That's the primary flow, and then you might want to think about other, specialist kinds of tests, performance or resilience or scalability or security and those sorts of things, though a lot of those you can just treat as behaviours of the system from the point of view of its user. In terms of the deployment pipeline, yes, you want the fast cycle first, the commit stage, the unit tests, and then you run the slower tests afterwards, so that they're operating in parallel with you moving on and doing useful work. Yes. I'm sorry, I'm going to need to press on or we're not going to get through the rest of the slides, if that's okay.

So, test isolation. The next thing is that we'd like to be able to run a test case in parallel with other test cases, and if you want to do that, you've got to make sure the test cases don't leak data between them. There's a simple way of doing this. I've seen people do complicated things, like trying to hold transactions open in a database so they can roll them back later, or using fake databases, and I don't want to do any of that, because I want to test my software so that, as far as it's concerned, it's deployed in production; it can't tell the difference. So it's going to use the same database, the same schema, the same technology, all of those sorts of things. But I want to be able to run these things in isolation, and there's a simple little trick: in every system where I've ever looked, you can use the natural functional isolation in the application to isolate test cases from one another. If we were testing Amazon, every test would begin by creating a new account and a new book, a new product, in the store to work with, every single test case.
That means the tests don't bump into each other, because you're creating new instances of those things for every test case that runs. If you're testing eBay, you create a new account, a new auction, and so on. That gives you the ability to run these things in parallel with each other without any tidy-up costs or anything like that, so you can move forward quite quickly.

The next step is that we want repeatable results: we want to be able to run the same test over and over again, ideally in the same environment, so if I run my test case twice it should work each time. Here's a trivial example. I'm going to have a test, "should place an order for a book". I'm going to have a store, and I'm going to create a book called Continuous Delivery; Jez would recommend it to you, I'm sure. We're going to place an order for the book, and then we're going to assert that the order is placed. The first time we run that, that's fine. The bit that's interesting in this example is this: somewhere in our system there's going to be a data store of some kind, and when we run this for the first time, we're going to create the book Continuous Delivery. When we run it for the second time, we're not in the same state anymore, because the book already exists; maybe other tests ran and changed its price, made it go up because it's such a good book, something like that. Either way, it's going to invalidate the scenario.

So we can cheat. I'm going to anthropomorphise my infrastructure. When you say "create a book called Continuous Delivery", my test infrastructure is going to say: I know you don't really care what it's called, so I'm going to name it for you. I'm going to call it "Continuous Delivery1234", and whenever you see it, you can refer to it as "Continuous Delivery"; I'm going to map that backwards and forwards in the context of this test. And the next time you run this test, I'm going to create a different one, "Continuous Delivery6789", and so now we've got isolation between runs of the same test case. So I'd advise you to alias your functional-isolation entities and name-mangle them: whenever you generate one of these things, create a unique name for it, and make that name visible only in the context of that particular run of that particular test case, so there's no data shared between test cases. After you've run all of the acceptance tests, it sometimes means that your system is in a slightly weird state in terms of the data, but that doesn't really matter, because you're just evaluating the behaviours of the system. Yes, I'm sure there are scenarios where that could matter, but mostly, no, it works really well. So within the context of a test you're creating all of the test data that that test needs, all of it. Yes, but those are quite unusual, because usually they're contextual, by account or product or something like that, those sorts of lists; that's what I'm saying, you isolate those. I did some work with Siemens Healthcare, and every test that they wrote started off with "create a hospital", because that was the level of granularity that started off the whole thing, and then the test was isolated, because everything happened within the context of one hospital. I'm sure there are some problem domains where this might not work, but it works for 99.9% of the cases that I've seen.
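Here's roughly what that aliasing trick looks like in code, a minimal sketch; the class names and method names are invented for illustration:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.atomic.AtomicLong;

    // One registry per test run: the test says "Continuous Delivery", the
    // infrastructure mints "Continuous Delivery1234" and keeps the mapping
    // for the lifetime of this one run of this one test case.
    class AliasRegistry {
        private static final AtomicLong COUNTER = new AtomicLong();
        private final Map<String, String> aliases = new HashMap<>();

        String realNameFor(String alias) {
            return aliases.computeIfAbsent(alias, a -> a + COUNTER.incrementAndGet());
        }
    }

    // Hypothetical DSL fragment showing the registry in use.
    class StoreTestDsl {
        private final AliasRegistry aliases = new AliasRegistry(); // fresh per test run

        void createBook(String alias) {
            String uniqueName = aliases.realNameFor(alias);
            // ...create the book through the system's normal interfaces,
            // using uniqueName, so no other test run ever sees it...
        }

        void placeOrderFor(String alias) {
            String uniqueName = aliases.realNameFor(alias); // same mapping, same test
            // ...order the uniquely named book through the public API...
        }
    }

Within one run of one test, every reference to "Continuous Delivery" resolves to the same mangled name; the next run mints a new one, so the test is isolated from its own history.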
Yes... because it's slow and it's not really scalable: you'd have to do that between each test, and if you were running tests in parallel you can't really do that, you can't really share the system. So this is a more efficient way of doing it. There's a level of testing that you must do in production, but I don't think that's enough for some kinds of systems. Yes, but if production is going to kill somebody if you introduce a bug, I don't think it's due diligence not to test before you get to production. In terms of environment management there's a difference, but in terms of testing, for software that's serious, that's going to hurt people or lose people's money or something like that, I think you need to be sure that it does what you think it does before you release it into production. Just putting things out into production on their own is okay in some circumstances, particularly if you don't mind your software breaking in production. But for the sort of software that I work on, on the whole, I don't want to do that; I want to be more diligent.

Next, we want our tests to be repeatable. Here's our system; here's an external system; here's our system under test, the bit that we're interested in; here's the kind of isolation layer in our code that separates it from the rest of the system; and here's the interface, the sockets layer or the REST API or whatever it is. We want to fake those things; we've already said we want to eliminate these external systems. My mental model is that I'm trying to put the system under test into some kind of test rig, some harness, that allows us to evaluate that piece of software effectively. And we can do this through configuration at deployment time: in production, it's going to talk to the real external system; here, it's going to talk to some kind of stub that fakes the behaviours of the real system.

I'm sorry, I'm not quite sure that I understood the case... No. My preferred route is that you populate the system, to get it into the state that you want it to be in for your test, through the interfaces that you would normally use to populate the system. You register the users through the user registration process; you add the scenario data the same way. We were building an exchange, so nearly all of our tests started by creating an account to trade with, creating a trader to trade against, creating a market in which to trade, starting the market, putting some prices into the market. One of the outcomes of this strategy is that you end up optimising your software in weird places, to make the tests more efficient. We once had one of the banks connected to our beta site to use our stuff, and they said: we've just tried to create a thousand users and your software has broken. And we said: why is that? Because it came back in two seconds. And we said: yeah, that's about right. But that's a good thing; it means that the quality of the whole system tends to go up. Yes... yes it does. What this does is make your APIs really, really good, and that's a good thing; it's a great example of tests exerting a force on the design of the system.

Keynote crashed on me. I'm not seeing... Keynote crashed; let me find that... acceptance testing... we've got half an hour. Right, so, back to here. We can use these test doubles to isolate our system, so we can create this test rig that we can plug our system into to evaluate it.
Through configuration at deployment time, if we're running in a test environment we plug the fakes in, and if we're running in a real environment we connect it up to the real things. We can take that a little bit further. Here's our system; here's our system under test; here's our test infrastructure; here are our test cases running on that infrastructure. And we're going to create a back channel of communication from our test infrastructure, so we can inject data into our system as though it came from an external system, or collect outputs from our system where it's talking to another system, so we can make assertions on them. These things don't have to be massively complicated simulations of the external system; all they have to do is inject the data that we want or assert the outcomes that we're looking for. The stubs that we use can be relatively straightforward.

I've been leading you down a bit of a track, and I think a simple domain-specific language helps us address a lot of these problems. It means the test cases are easy to create in the first place. This is a bit subjective, but the first couple of test cases that you write using the approach I'm about to describe will probably be a little slow to build; once you start establishing the DSL, though, it takes about as long to write one of these test cases as it does to execute the test manually. So the second time you run it, you're winning already. This is a really powerful strategy, because these test cases are very abstract, they're easy to read, easy to maintain, and they very cleanly separate what from how.

I'll go back to the example that I mentioned before. Let's imagine we're writing this book-buying test case. Here's my test case; this is what the script would look like: I'm going to go to the store; I'm going to search for a book called Continuous Delivery; I'm going to select the book and put it in my shopping cart; I'm going to go to the checkout; I'm going to buy the book for however much it costs; and then I'm going to confirm that I now own the book. Okay? That's the language I would write my test case in. Now think about what that means: I've just described that to you and I've said nothing at all about how Amazon works. I could have some code that interprets all of that, so that when I say "go to the store" it goes to the web page, and when I say "search for the book" it types the title into the search field, and all that kind of thing. But equally, if I was designing a robot that did book buying at my local bookstore, I could drive the same robot from the same specification: I'm going to go to the store, I'm going to search for a book, and so on. The abstraction is such that we don't care how the system achieves the outcome. This is really, really quite powerful.

It also means that we can abstract these complex setups. I mentioned the setup that we commonly used when we were building our exchange: creating a market to trade in, starting the market, putting prices into it, creating traders to trade against, creating accounts to trade with. Because we did that in lots of our test cases, we had one line that did it all; it was about 12 or 15 different interactions with the back-end system, including registering users and accounts and putting some money into the accounts and all that kind of stuff, but we could do it in one line. So you can go really quickly with these kinds of things.
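As a sketch of what that one line might look like, with all the names invented rather than taken from our real exchange code, the point is that the test case says one thing and the DSL fans it out into the dozen or so back-end interactions:

    // Hypothetical DSL fragment: one line in a test case expands into the
    // many interactions described above.
    class TradingDsl {
        private final ExchangeDriver exchange;
        TradingDsl(ExchangeDriver exchange) { this.exchange = exchange; }

        /** The single line our test cases call. */
        void createMarketReadyForTrading(String marketAlias) {
            exchange.registerUser("issuer");
            exchange.registerUser("counterparty");
            exchange.createAccount("issuer");
            exchange.createAccount("counterparty");
            exchange.depositFunds("counterparty", 1_000_000);
            exchange.createMarket(marketAlias);
            exchange.openMarket(marketAlias);
            exchange.placeOrder(marketAlias, "counterparty", Side.BID, 100, 10.0);
            exchange.placeOrder(marketAlias, "counterparty", Side.ASK, 100, 10.5);
        }

        enum Side { BID, ASK }

        // The protocol-driver boundary; in a test environment some of these
        // calls could go to the stubs on the back channel described above.
        interface ExchangeDriver {
            void registerUser(String name);
            void createAccount(String owner);
            void depositFunds(String owner, long amount);
            void createMarket(String marketAlias);
            void openMarket(String marketAlias);
            void placeOrder(String marketAlias, String trader, Side side, int qty, double price);
        }
    }

A test case then reads one intent per line: createMarketReadyForTrading("myMarket"), place an order, check the feedback, and so on.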
Some examples. The BDD tools are useful in this context; I think many of the people who were involved in building those tools had this kind of approach in mind. I don't think all of the tools enforce it, though, so you see some really horrible Cucumber and SpecFlow scenarios, the clickety-clickety kind: go to this URL, type in this value, and all that kind of stuff. You've got to get the abstraction right, but when you do, these tools are really effective. My own preference is actually to use an internal DSL rather than an external one, because then the toolset I'm writing the DSL in is the language that I'm already using, and that gives me the power of a development environment and all those sorts of things. It requires a little bit of discipline to keep the separation of concerns and to design the DSL, but it's really effective. There is an open source thing called EZB that does this kind of thing, but mostly the way that I do it is to just build the infrastructure myself; there's a Python version of my stuff and a Java version.

This is a real test case from our exchange. It starts off saying "should support placing valid buy and sell limit orders"; if you were a trader, you'd understand that context and what those words meant. It says we're doing trading, we're going to select a particular marketplace, we're going to place an order in the market, we're going to check the feedback, we're going to place another order and check the feedback, and so on. Here's another one: this is going through the FIX API, placing a matching order and confirming that the match happened. Although these are written in Java, and they've got dots in and camel case, if you were a trader you could read them and understand what they meant. And this is the next layer down, the implementation of the DSL.

The next thing that's useful, because of this language that we're building, is that we can do some helpful things with it. One of them is that we can fill out a load of optional parameters. If I'm trying to create a test scenario where I don't really care about the detail, I just want a bunch of orders in some state, I can just say place order, place order, place order, and my test infrastructure will make up an order that works for me. Or I can be completely precise and specify every parameter of the order if I'm being particular. So I can move forward quickly or with precision, whichever I want. This is the equivalent of that for one of the other APIs, the FIX API.

We ran our tests like this for a long time, and then after a while we started realising that these test cases were still saying quite a lot about how the system works, so we refactored, and we ended up with test cases that looked more like this. We have an annotation here which says this test runs through the FIX API, which is a programmatic API for trading; through the user interface, a web-based user interface; and through a public API that we provided, which allowed traders to write their own bots that would trade against our exchange. We're doing trading, we're going to place an order, and at run time our test infrastructure says: okay, this has got these three channels, so I'll run it three times; the first time I'll plug in the translator for the FIX API, the next time the translator for the user interface, and so on.
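A minimal sketch of the optional-parameters idea, assuming a hypothetical "name: value" convention for the arguments; the defaults and parameter names are invented for the example, but the mechanism is the point: the DSL fills in anything the test case doesn't pin down.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a DSL method with optional parameters: sensible defaults are
// filled in for anything the test case doesn't care about.
class TradingDsl {
    void placeOrder(String instrument, String... args) {
        Map<String, String> params = withDefaults(args);
        // ...hand the fully populated order on to the protocol driver here...
        System.out.println("place order: " + instrument + " " + params);
    }

    private Map<String, String> withDefaults(String... args) {
        Map<String, String> params = new HashMap<>();
        params.put("type", "limit");    // default values, invented for the sketch
        params.put("side", "buy");
        params.put("quantity", "10");
        params.put("price", "100");
        for (String arg : args) {       // overrides supplied by the test case,
            String[] pair = arg.split(":", 2); // assumed to be "name: value"
            params.put(pair[0].trim(), pair[1].trim());
        }
        return params;
    }
}

// Usage from a test case, quickly or precisely, whichever you need:
//   trading.placeOrder("bond1");
//   trading.placeOrder("bond1", "type: market", "quantity: 250");
```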
You might not have this exact problem, but it demonstrates the level of abstraction: these test cases care nothing about how the system works; they only care about what the system is supposed to do, and that's a really, really powerful tool. I had a client a couple of years ago that was re-architecting their system. They built ERP systems, and they had a load of old unit tests that defined the behaviour of the ERP system. I showed this to them and they took it on: the channel was the old system or the new system. They wrote some tests against the old system, and then they could cross-verify behaviours and use that to develop the new system. A very, very powerful technique.

Evolving a DSL sounds like a horribly complicated thing, and it sounds like an awful lot of effort, and there is a little bit of work the first time. The first time that we did this on a project, it took us a week or so to build the first test case. But the last time I did it, because I'm now used to the patterns, it took me about a day, in a system that I hadn't seen before, to build the first two test cases. So this is not horribly complicated stuff. There are some patterns at play that you need to think about, but it's not terribly difficult. The strategy that I would recommend is to start by picking two reasonably general cases, writing the specifications for those, and then doing the plumbing, the DSL layers and the translator layers, to make those couple of simple test cases work. Then just let the team at it, and if the DSL doesn't yet have the language somebody needs to express a requirement, let them make up the language. That strategy worked really effectively for us. As the developers take that on and write the plumbing code to make it all work, you might refine the language a little to keep the DSL abstract, but it's quite a scalable strategy and a relatively simple approach to moving forward.

What I'm describing is a four-layer architecture for testing. At the top we've got these executable specifications; they're written essentially in the language of the problem domain, and they express ideas from the perspective of an external user of the system. The next level down we've got the domain-specific language; it does the name translations and all of that kind of thing we were talking about, and it's shared between all of the test cases. This is one of the mistakes that people sometimes make using SpecFlow or Cucumber: they end up writing a feature file for every individual test case, with no shared code, and that's not a very scalable approach. But if you think of this as a domain-specific language, a common language shared between the test cases, it gets scalable very quickly and it works effectively. The next layer down are the protocol drivers; these are the things that translate from the language of the problem domain to the real system, the things that use Selenium to go and talk to Amazon, navigate around the website, and all that kind of stuff. And at the bottom you have the system under test. I work as a consultant these days, and many of my clients have picked this up and started operating this way in a whole variety of different problem domains, and it seems like a very effective, widely applicable strategy.
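The lower two layers might be sketched like this. The interface and class names are mine, invented to illustrate the pattern, and the bodies just print where real drivers would speak FIX or drive a browser; what matters is that every channel honours the same contract, so the DSL above, and the test cases above that, know nothing about how the system is being driven.

```java
import java.util.Map;

// Sketch of the protocol-driver layer: one driver per channel, all honouring
// the same contract.
interface TradingDriver {
    void placeOrder(String instrument, Map<String, String> params);
}

// In a real rig this would speak the FIX protocol to the system under test.
class FixApiDriver implements TradingDriver {
    public void placeOrder(String instrument, Map<String, String> params) {
        System.out.println("FIX: new order " + instrument + " " + params);
    }
}

// ...and this one would drive the web UI, for example via Selenium.
class WebUiDriver implements TradingDriver {
    public void placeOrder(String instrument, Map<String, String> params) {
        System.out.println("UI: click through the order ticket for " + instrument);
    }
}
```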
Next, we want to be able to test any change, so test cases should be deterministic. One of the things that often makes this difficult is time, and there are two strategies for dealing with it: we can either ignore time, if time is not very important in your system, or we can take control of it. So here's some thinking about those. Ignoring time: we just filter out any time-based fields when we're comparing values; we just ignore timestamps and things like that. The advantage is that it's dead simple; the downside is that it can miss problems if there are bugs in the time handling, and it doesn't allow you much control over time-based scenarios. For systems where time is more important, I think this is the better strategy: you take control of time. The best way that I know of doing this, and this is also true at the TDD level, the low-level test level, is to treat time as an external dependency and fake it in the context of the test. This is incredibly flexible. It's slightly more complex in terms of infrastructure, but it meant, for example, that when we were testing our exchange we could run long-running scenarios. One of the things that we used to do was have tests that would fast-forward to the next daylight-savings change, to make sure that our software worked through daylight-savings changes and that kind of thing.

So here's a little example. This one is made up; the others that I've shown you so far were real tests. Here we're going to go to a library; we're going to borrow a book called "Continuous Delivery" (I'm not trying to sell it to you); we're going to assert that the book's not yet overdue; we're going to time-travel forwards one week and assert that it's still not overdue; and then we're going to time-travel forwards four weeks and assert that it is now overdue. This isn't a great test case, there are too many assertions in it, but it's just trying to demonstrate the tool. The bits that are interesting are the time-travel bits.

Here's our system under test, and here's our test infrastructure. If time is a factor in our system, somewhere or other, whatever our language, there's going to be a line of code that looks vaguely like system.getTime. If we're doing that, the only way we can control time is to change the system's time, and you don't want to do that, because it gets really messy really fast. So we can cheat: we can put in a level of indirection. Instead of asking the system for the time, we ask a piece of our own code for the time; let's call it a clock. Now we can get in the way; we can make the clock report whatever time we want it to. So maybe we have a clock like this: by default it's just the system clock, but we can also set the time to a new value and tell the clock what time to report. Now we have our back channel of communication from our test infrastructure again, and when we say "time travel", our DSL says: okay, this many milliseconds forward from the epoch, and tells the clock to report that time. And you can do these long-running, time-based scenarios.
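Here's one possible shape for such a clock, sketched in Java on top of java.time.Clock; the class and method names are hypothetical, not from any real system.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

// A settable clock: by default it reports the system time, but the test
// infrastructure's back channel can tell it what time to report, which is
// what makes the "time travel" steps possible.
class SettableClock {
    private volatile Clock delegate = Clock.systemUTC();

    // Production code asks this, never System.currentTimeMillis() directly.
    Instant now() {
        return delegate.instant();
    }

    // Back channel: fix the reported time to a specific instant...
    void setTime(Instant instant) {
        delegate = Clock.fixed(instant, ZoneOffset.UTC);
    }

    // ...or jump forward by some offset.
    void timeTravel(Duration offset) {
        setTime(now().plus(offset));
    }
}
```

When the DSL says something like timeTravelWeeks(4), its implementation would just call clock.timeTravel(Duration.ofDays(28)) through the back channel.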
If you're doing that kind of thing, then one of the things you probably don't want to do is run time-travel tests at the same time as other kinds of tests, because your system is going to go crazy. So you can tag your tests: this is a time-travel test, so it needs special treatment; this is a destructive test, I'm going to break bits of the system, so that needs special treatment; this test requires a specific piece of hardware, and that requires special treatment. And now you can start to imagine a piece of software that, at run time, interprets those tags and allocates those tests to different parts of your test infrastructure to evaluate them. This is a little animation of our version of that piece of software. Over here, this is the normal case: we have one test environment and a bunch of test hosts sharing that one environment. Here we've got time-travel tests, where each test has its own instance of the application, so these are quite expensive tests, but you don't have many of those. And here are some destructive ones. We built that visualisation while we were building the software to do the test distribution.

Finally, we want our tests to be efficient; we want to be able to test our software quickly and cheaply. One take on efficiency is this: if our production infrastructure looks something like this, but most interactions look like this, then maybe our test infrastructure can look like that. We don't need it to be a complete clone; we just need it to be representative of the interactions in the system. If occasionally there's a different route through the infrastructure, maybe some requests go through different servers, then maybe our test infrastructure looks like that instead, but we can still cut it down. Continuous delivery is expensive in infrastructure; we're going to throw hardware, or the cloud, at solving problems, so you need quite a lot of compute resource, but you can manage that sensibly.

Modern systems are often distributed, often complex, often asynchronous, and that starts to add complexity. We don't want to be confusing ourselves with asynchrony and race conditions at the level of abstraction of this kind of DSL, in our test cases. So we want to build our DSL so that it's synchronous: as each instruction in the DSL operates, we step over it, and until that instruction is complete we don't execute the next line. A good way of doing that in an asynchronous system is to issue the instruction to do something and then look for a concluding event that signals it has finished. Here's a trivial example of this, another made-up one: I'm going to send an asynchronous place-order message, then wait for order-confirmed, or fail on the time-out. I build that into my DSL-level language infrastructure, so I can reuse those behaviours over and over again. Next best to that: if you don't have a concluding event in this kind of scenario, my first suggestion is to ask why not, because your system would probably be better designed if you did. But if you really don't, you can implement, as a second best, a poll-and-time-out mechanism, where from the test infrastructure you just keep asking: is the result ready for me yet? No. Is the result ready for me yet? No. Is the result ready for me yet? Yes, I'll now move on. That's less efficient, but it can work.
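A small sketch of both synchronisation styles, with invented names and time-outs: the first blocks on a concluding event, the second is the poll-and-time-out fallback. Both live in the DSL implementation, so the test cases above never see them.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Two ways a DSL can make asynchronous steps look synchronous to the test
// case; names and time-outs are invented for the example.
class Synchronisation {

    // Preferred: issue the instruction, then block until the concluding
    // event arrives, failing the test on a time-out.
    static void waitForConcludingEvent(CountDownLatch eventReceived) throws InterruptedException {
        if (!eventReceived.await(5, TimeUnit.SECONDS)) {
            throw new AssertionError("no order-confirmed event within 5s");
        }
    }

    // Second best, when there is no concluding event: poll-and-time-out.
    // Note this is not a blind sleep for the result; we re-check the
    // condition frequently and move on as soon as it holds.
    static void waitUntil(BooleanSupplier condition, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError("condition not met within " + timeoutMillis + "ms");
            }
            Thread.sleep(20); // short poll interval between checks
        }
    }
}
```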
What you should never, ever, ever do is put waits and sleeps in your tests. I reckon I could make a decent living just going around fixing people's testing for that one thing. This is a horrible anti-pattern which everybody does. The best case is that it just makes your testing slow and inefficient: if I put a 10-second wait in to make sure there are no race conditions, and the interaction finishes in a millisecond, I've just wasted 9.999 seconds where I could have been doing more testing. The worst case, and what's really happening, is that it's not actually solving the problem; all you're doing is shifting the race condition around. It might work sometimes, but it's not really a solution. So you want to do one of those two things rather than put waits into tests.

If you've done all of these things, what tends to happen next is that as you get more and more tests, the feedback starts to get slower, and then you need to think about scaling out. If you've followed my advice, scaling out is relatively easy, because you can run all of these tests in parallel. So now you can do the thing that I was talking about before: here's a release candidate that we've identified; our acceptance test environment becomes free, pulls the release candidate down, deploys it to a test environment, spawns a bunch of test hosts, runs the tests against that shared environment, collects the results, feeds them back, and tags the release candidate with the results.

One last thing: we've talked a little bit about data, mostly synthetic data as I said, but I think there are three different data scenarios that are worth exploring, and there are different approaches to each of them. We're interested in transactional data, the kind of data that builds up in your system as you use it; we're interested in reference data, look-up data, post codes, symbol tables, mostly read-only data; and we're interested in configuration data, data that defines the behaviour of our system in some manner. And then we've got three options: we could use production data, we could generate the data in the scope of the test, or we could use versioned test data. By versioned test data I mean that somewhere or other we have data stored, and when we start the system up we say: ah, we're starting up in an acceptance test environment, I'll pull this version of that data down for you.

So what should you do for transactional data? I think you should never, ever use production data; it's too big, too heavyweight, too inaccurate. Generate it in the scope of the test, using the sorts of techniques that we've talked about. Don't use versioned test data either; generating it in the scope of the test is by far the best strategy, in my opinion. Next, reference data. You don't really want to use production data for reference data either, because often it's too big; you don't want to be loading millions of post codes into your system just for testing, it's going to be slow and unwieldy. So you probably do want to generate that mostly in the scope of your test, but you could decide, under some circumstances, to just have a restricted subset: instead of having 15 million post codes you have 50, and hard-code those, because it's rarely going to matter very much if that sort of read-only data leaks between tests. And then finally, configuration data. Ideally we'd like the configuration when we're testing to be as close to the production configuration as we can get it; we'd like to test that configuration. So by default we'd like the configuration of our system in a test environment to be the production configuration, and then we just override the bits that we don't want: we don't want to access the real production database, and we don't want the real passwords for external communications and that kind of thing. There are some weird kinds of tests, where you're testing the scope of the configurability of the system, where you might want to generate the configuration in the scope of the test; and you want versioned test data to override those secrets and things like that in your acceptance test environment.
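As a sketch of that default-plus-overrides idea, assuming hypothetical property names; the point is that everything not explicitly overridden is the genuine production configuration, so the configuration itself gets tested too.

```java
import java.util.Properties;

// Sketch of the configuration-data strategy: start from the production
// configuration and override only what must differ in a test environment,
// such as endpoints and secrets.
class TestEnvironmentConfig {
    static Properties acceptanceTestConfig(Properties productionConfig) {
        Properties config = new Properties();
        config.putAll(productionConfig); // production values by default

        // Hypothetical overrides, invented for the example:
        config.setProperty("database.url", "jdbc:h2:mem:acceptance");
        config.setProperty("marketdata.endpoint", "stub:marketdata");
        config.setProperty("payments.password", "versioned-test-secret");
        return config;
    }
}
```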
So, in summary. Don't use UI record-and-playback systems. Don't record and play back production data; don't pull production data into your test systems. Don't use nasty automated testing products, and don't assume that they're going to tell you what you need; think about your testing strategy and take it seriously. Don't have separate testing or QA teams: by all means have professional testers, but have them working and collaborating as part of the development process, not as a separate stage. Don't let every test start and initialise the application; you gain more flexibility by separating those two steps. Don't include systems outside the scope of your own system: think carefully about what you consider the boundaries of your system to be, and test to those boundaries. And don't put wait instructions in your bloody tests. Sorry, do people want to take a picture of that? Yes, I can see the cameras; let me just finish. How do you automate without the wait? You do what I said: you look for the concluding event, or you do a poll-and-time-out. You don't do waits. I tell you what, you can hire me and I'll come and show you.

Tricks for success. Do ensure that developers own the tests; close that feedback loop. Do focus your tests on what the system needs to do, not on how the system works; think of your tests as executable specifications and use them in that capacity. Make acceptance testing part of your automated definition of done; build it into your development process. Keep tests isolated from one another, using the functional-isolation and temporal-isolation techniques. Keep tests repeatable. Use the language of the problem domain; that's the really, really powerful tool here to get you the right level of abstraction. Do stub external systems: you don't want to be taking on the responsibility for testing the world, so test your system, not everybody else's. Test in production-like environments: evaluate your software deployed and configured as close to production as you can achieve. Make instructions appear synchronous at the level of the test case. Test for any change: whatever the nature of the change, you should be able to test for it. And finally, keep your tests efficient: you want to be able to run thousands, tens of thousands, of these test cases and not feel a barrier to adding a new one.

Some extra information if you want to read up on this. Jez will be pleased to see that I'm recommending our book; there's a chapter on acceptance testing which is completely in line with the philosophy I've described here. This talk goes into a bit more technical detail here and there than the chapter, I think, but it's good. There's a great book called Specification by Example, which talks about the general idea of specifications and how to come up with those examples, those acceptance criteria, in your stories and that kind of stuff. And for the data management side there's Refactoring Databases; it's not directly related to what I've talked about today, but it's a good book to reference. There are a few other links in here as well, and you can get the slides. I'm surprised we finished on time, but we have. Thank you very much.

So let's say, within the shopping cart example, you said we want to delete the user. As per the overall approach that you have been taking, do you first start with registration, do a mock purchase of a particular product, and then do the deletion, and ensure that everything, the recommendations, the purchase history, gets removed?
So, what I'm trying to achieve in terms of test isolation is that I don't want there to be any required test ordering. I don't want to have to run this test before that other test to set up a particular scenario. That forces on me the requirement that each test is completely responsible for getting the system into the state it needs to be in to do its evaluation. So if I was deleting a user, yes, first I would create the user, and then I would delete it, in the scope of the same test.

Does the infrastructure also play a role as far as acceptance testing is concerned? Let's say a system under test integrates with a legacy system, and there are certain certificates that exist between those two; is that in scope? I think that gets back to the stuff I was talking about before, with system A, system B, system C. Primarily I want to focus my testing on my system, the bit that I am responsible for developing software for. Anything external to my system, I don't want to test it, and I don't want it to be there, because if it is there it means I can't properly test my part of the system, so I want to get rid of it. There are some circumstances where some external piece might be considered part of your system, in which case bring it inside and test it as part of your system; but if they are genuinely external systems, then mostly I want to stub them out. All of this, to my mind, is the scope of acceptance testing; I see no particular role for separate integration testing.

I have a question as well. You mentioned the waits; there is another problem when it comes to implementing automation, which is assertions. You suggested that there should be just one assertion in an acceptance test, but you are really going through a particular scenario of sorts: search for a book, buy it, verify that it is purchased. What type of assertion strategy should there be for such cases? So, I like to have simple, explicit assertions that tell you what the test is doing, so at the end of the test it asserts that this one thing happened. Along the way, in order to get the synchronous behaviour, there are probably implicit assertions built into the DSL. Let's say I'm placing an order to get the system into a particular state: inside, the DSL is going to fail my test if the order doesn't get placed, so there is a built-in assertion, but I'm not going to surface that at the level of the DSL language.

Sorry, I couldn't hear you. If your system is asynchronous, you have got to cope with the asynchrony? No, I'm sorry, that wasn't what I was saying. What I was saying was that if your system is asynchronous, you've got to deal with that in your DSL; you've got to make each test case synchronous, and you do that in the implementation of your DSL. Sure, but my point is that at the level of abstraction of these acceptance test cases, you don't want to be dealing with the hard problems of concurrency and race conditions and things like that, so you must build that into the language. Otherwise you're way up here at one level of abstraction and way down in the weeds in terms of concurrency. So you want to build the DSL that your test cases operate in so that, with respect to the test case, the steps are synchronous, even though your system is asynchronous underneath.
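Pulling those answers together, here's a hedged sketch of what such a self-isolating test might look like; everything in it is invented for illustration. The test creates the user it is about to delete, so no test ordering is required, and it ends with one explicit assertion, while each DSL step along the way carries its own implicit assertion (it fails the test if the step doesn't complete).

```java
import org.junit.jupiter.api.Test;

class DeleteUserAcceptanceTest {

    // Minimal stand-in DSL so the sketch compiles; a real one would delegate
    // to protocol drivers as described earlier, and each step would fail the
    // test if it didn't complete.
    static class UserDsl {
        void registerUser(String name)               { step("register " + name); }
        void purchaseBook(String user, String title) { step(user + " buys " + title); }
        void deleteUser(String name)                 { step("delete " + name); }
        void confirmNoHistoryFor(String name)        { step("assert no history for " + name); }
        private void step(String s)                  { System.out.println(s); }
    }

    private final UserDsl users = new UserDsl();

    @Test
    void shouldRemoveAllTracesOfADeletedUser() {
        users.registerUser("dave");                        // state set up inside the test itself
        users.purchaseBook("dave", "Continuous Delivery");
        users.deleteUser("dave");
        users.confirmNoHistoryFor("dave");                 // explicit, final assertion
    }
}
```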
Thank you very much for your time today. Thank you.