All right, so thanks everyone for joining in. This is the second keynote of the conference. I see people already pouring in, so that's great. We have none other than Simon, who is the project lead for the Selenium project and someone I've known for many years, probably even longer than that. We happened to work at ThoughtWorks together. We reached out to him asking if he would do a keynote and he graciously agreed. Last time he did a keynote, at the Selenium conference we ran last year, which was also a virtual conference, he set a new trend of not presenting live: he played a video recording while he commented live in the chat room. This time he's apparently decided, last minute, to do a live talk and hopefully set a new trend. All right, so without much delay, I want to hand over to Simon. This is a topic that I'm really looking forward to: towards joyful development, and it's all about feedback loops, as I think of it. So over to you, Simon.

Thank you very much for that kind introduction. Just give me a second as I figure out how to share all these bits and pieces. So yes, thank you for that kind introduction. And can I say what a pleasure it is to be here and joining you at Appium Conf. I think I'm allowed to be here. Although I'm the leader of the Selenium project, I also created WebDriver and developed not only the original JSON wire protocol used by Appium, but also co-edited the W3C WebDriver spec, which is what that wire protocol has evolved into. But weirdly, I'm not going to talk to you about the WebDriver protocol. Actually, I'm here to talk to you about neurochemistry and software development.

Let's start at the beginning. Your brain is sitting in a soup of chemicals, and those all have an impact on how you behave and react to the stimuli in the world around you. Too low in one chemical and you feel awful. Too much of another and you become addicted to something. Get everything just right and you feel great. Or, in these pandemic times, normal at least. Now, some of the chemicals your brain is steeping in are ones you may have heard of. Two examples are serotonin, which regulates mood, sleep, appetite, and a pile of other things, and oxytocin, which regulates social cognition and is otherwise known as the love drug. Neither of these features in the rest of this talk. Eat a banana and hug your loved ones if you'd like a little bit more of either of them, though. That's a great way to get both.

The neurochemical I'm most interested in talking to you about today is dopamine. Dopamine has a host of functions, but the one we're going to look at today is its role in pleasure and reward. This is my dog Matilda, by the way. Dopamine is a funny substance. It sometimes gets a bad rap; think of phrases like "a dopamine rush". You can't become addicted to dopamine itself, but your brain releases it in response to pleasurable experiences. It's a reward for doing something fun and enjoyable. Every time a response to a stimulus results in a reward, the association becomes stronger through a process called long-term potentiation. And our world is filled with pleasurable experiences. Like what? Well, positive social stimuli will result in a release of dopamine. And what's a positive social stimulus? Think of someone laughing at your joke, or being recognized for your good work, or a message from a loved one. Just thinking about it feels good, right?
So eating food, creating art, spending time with friends: all of these cause dopamine to be released. But things on your computer or the internet can do the same. Someone likes a tweet? Dopamine. Someone sends a hug on Facebook? Dopamine. The positive reinforcement from other people's reactions, both online and in the real world, is why we love these things so much.

What has all this talk about social stimuli and dopamine got to do with software development, and with heading towards joyful software development in particular? I don't know about you, but lots of software development can feel like a real slog. It's not particularly enjoyable to wait for a compiler, wait for a browser to do its thing and click through a site, wait for a phone to do whatever it needs to do, wait for results to be formatted, wait for Jenkins to finish running the pipeline, wait for the GitHub Actions to finish, wait for your colleague to actually finish reviewing your code. I mean, why do they even need to do that? You don't write garbage, right? Hardly a recipe for your brain being flooded with dopamine. No, you're more likely to look forward to your next coffee break, right?

And yet, and yet, throughout the day there are these little fleeting moments of joy and happiness. That moment when you finally understand why that tricky bug is happening. That brief flash when you realize you've found a completely new way of breaking your software. Or when that test you've been struggling to fix for so long finally works, or when a build finally goes green. We want more of those moments, right? We all want our day to be scattered with little fragments of joy. Rumi once said that the hurt you embrace becomes joy. Well, I think we can do better than just embracing the endless grind of our CI servers. I'll put this out there: what we're seeking is a way to bring joy back into our lives as software developers and testers. I want to bring joy, dopamine, into the art of software development and testing.

The literature about dopamine tells us that the shorter the gap between an action and a reward, the stronger the connection your brain forms about the act being a good thing. At some level, this feels like it makes sense, right? Waiting for something pleasurable to happen can be awful. For example, waiting to ship a major release of a popular open source project. But actually doing the thing and getting the feedback for it, that's amazing. In unrelated news, Selenium 4 will ship really soon now, folks.

So one of the things we should be looking to do is to tighten our feedback loops as much as possible, ideally to instant. We should be striving towards a world where we can make a change and know as quickly as possible that it works. That's the way we get these little boosts of dopamine all the way through our days. Grace and Gorman tweeted about another reason why you might want fast feedback loops: not only are they good for your dopamine levels, they also make good business sense. And that's got to be a good thing, right? The commercial advantages, on top of the positive effects on developer morale, are all good reasons why we should be moving towards joyful software development. And we do that by setting the goal of getting to fast. Getting to fast. It's an important idea. It's got a slide all to itself.

Now, computers are amazingly fast things. I mean, just look at these numbers from 2012, almost a decade ago.
I mean, a level one cache reference in the CPU: 0.5 nanoseconds. I can't even imagine how tiny that is. Going out to main memory takes 100 nanoseconds. Maybe a round trip within a data center takes like 500 microseconds. I mean, that's super fast. And of course, over the nearly a decade since these numbers were put together, we've made our own improvements. Oh. Oh, no. That's not good at all, right? That's distinctly suboptimal. Worse than that, we've really blown this idea of joyful development out of the water, haven't we? What can we do?

If this was some sort of storybook, our hero would likely grab their bag, ask their trusty companion to join them on a perilous quest, and go and consult the wizards of yore for some sage advice. And lo, there are wizards out there that we can consult. There are two in particular that I like to consult when I think of fast feedback loops and joyful development. The first is Michael Feathers and the second is Kent Beck. I mean, one of the interesting things is that we've had the ideas and tools available to us for a really long time. Kent Beck's XP Explained was first published in 1999. That was last millennium. There are folks who have graduated from university who are younger than this book. And one of the many things I like about XP Explained is the idea of the shared metaphor, which is used to unify an architecture and provide naming conventions. You can see the idea poking through and expanded upon in Eric Evans' Domain-Driven Design, which is also a great book, by the way. The shared metaphor I'd like you to take away from this talk is that going fast leads to a more joyful development experience. I'll repeat that: going fast leads to a more joyful development experience.

There's another book that's stood the test of time, and that's Michael Feathers' Working Effectively with Legacy Code, which was originally published in 2004. Unusually for such an old computer science book, it's still loaded with great ideas and strategies. One nugget that I particularly like is the idea that legacy code is just code without tests. Why do I like that? Firstly, because it's non-judgmental about the quality of the code, and that in and of itself is a good thing. But secondly, he suggests throwing tests around legacy code when making changes and then driving down to ever smaller tests. In other words, he's suggesting tightening feedback loops as we make changes. After all, the slowest feedback loop your code has is the one from adding a line of code to some rando finding a bug and filing an issue on GitHub. By adding tests to your software, particularly as you fix bugs, you can tighten that feedback loop.

But there's a problem. Writing correct software is fiendishly hard. Even the simplest software is a fantastically complicated beast. There's the code you wrote. Now, we know that our code contains bugs because, well, all code contains bugs, right? That code runs on some language runtime, and that runtime was created by other developers, so, you know, it's got bugs too. And that runtime runs on top of an operating system. That's bound to be faultless, right? No. All software engineers know that all code cannot be trusted. Well, we can console ourselves with the fact that the code is running on hardware, and that's always good, right? Right? No. There have been many instances of bugs in CPUs and firmware. Who remembers such classics as Rowhammer, or the venerable bug in the floating-point unit of the Pentium? "Division is futile.
You will be approximated," as some Trekkie wag said at the time. And don't get me started on networks. Frankly, the fact that anything works at all is something of a daily miracle. Bugs get everywhere.

There's an entire branch of computer science to do with provably correct systems, typically dealing with proofs and concepts such as formal methods. That's all about having software that's shown to meet some strict definition of correctness. If the assumptions of the software's model are correct, then the software can be formally proved, using logic and maths and other things I don't understand, to also be correct. It takes a huge amount of effort, and I doubt anyone here, anyone listening to this talk, practices software development that way or has the time to do so. Also, there were bugs found in the first provably correct kernel, so I'm not sure it even works. It follows that no code base is correct.

Now, we have several guardrails that try to prevent bugs making it into the world, and each of those has a different feedback loop that takes a different amount of time to operate. One of those guardrails is the work done by manual testers. They are awesome, but for the sake of this talk I'm not going to focus on them. Their feedback loops are slower and less urgent than those provided by our automated tests. Sadly, no code base has the right amount of tests written for it. Often, there are too few. Very rarely, there are too many. If we follow the advice from Working Effectively with Legacy Code, then every time we want to make a change to a part of a system that lacks tests, we should write at least one to verify that the software continues to function. Going a step further, what I like to do is write the smallest possible test that demonstrates a particular bug is present and, ultimately, that it's been fixed. Sometimes that smallest test ends up being an enormous thing that ends up, I don't know, spinning up an entire Kubernetes cluster with a test environment and then running the test there. But frequently, and more importantly for tight feedback loops, I can write a test around a single function, method, or class.

And so now we're in the fraught world of how to actually name our tests. Unit, integration, end-to-end, smoke, fast, slow: the list is endless and everyone has their own definitions. Back in the day, Google used to talk about test sizes: small, medium, and large, and later enormous. I know this because I wrote the blog post talking about this on their testing blog in 2010. I, too, am one of the ancients. I prefer these terms because they're really clearly defined. Whereas a unit test might mean anything, depending on who you're talking to, a small test is very clearly defined: it runs locally, without access to the network or the disk at all. Go read the blog post if you want to know more. If you can't read the URL on this slide, don't worry, just search for "test sizes Google testing blog". You'll find it.

How can you use test sizes to help create fast feedback loops and bring some more joy to writing code? We might want to write a large test to allow us to repeatedly run a particular scenario, narrowing down where the problem is until we can write a medium test. We can hopefully repeat the process until we write one or more small tests. Each time we shrink the size of the test, we make the feedback loop faster and more effective. Once I've written the smallest possible test, the thing I like to do is to delete the larger tests, leaving just the smaller ones in place. Why do I do this?
Because those larger tests were scaffolding, used to help me track down why the bug was occurring. Just as builders put scaffolding up around a building as they work and remove it once the work is complete, so should we. I mean, imagine our cities if we never removed scaffolding as buildings were created. Now go and look at your test suites. Of course, it's entirely possible that the smallest test is actually a large test, and that's fine, but often it's not.

If we follow this advice, something weird tends to happen. We end up having a mix of tests that leads us to something like the test pyramid. Now, I know it's not the world's most sophisticated model, and yes, I know there are refinements and rebuttals to this model, but you know what? Just because it's old and simple doesn't necessarily mean it's wrong. There's a lot of value to be had from the conceptual simplicity of this model. Look, I can make it more sophisticated. There we go. The key thing to take away from this model of the world is that we get faster feedback loops if we have lots of small tests and far fewer large tests, just because of how these things are defined.

There's another advantage, and before we go digging into that, we need to go back further in time and remember what an ideal automated test is. An ideal automated test is one that's reliable, repeatable, and hermetic, sometimes called isolated, no matter how large it happens to be. Let's take each of these in turn.

A test needs to be reliable. If it's not reliable, we can't trust the test, and a test we can't trust is worthless. It only means something if a failure of that test means there's a problem, and a success of that test means that the problem it's designed to identify isn't there. We don't want to lengthen our feedback loops by having to track down whether a test failure was expected or unexpected. No! I want to get on with writing code.

A test needs to be repeatable. When you're trying to track down a problem, or when we want the same level of confidence that the software is functioning as intended on both the developer machine and the CI machine, it's really helpful to be able to run the same test over and over and over again. We don't want to lengthen our feedback loops with tedious and error-prone pre-test steps that, I don't know, load data into databases, flick toggles, and press various buttons in various UIs just to allow us to run the test again. No! I want to get on with writing code. This tends to imply that test data setup is done as close to the test as possible, ideally within the test itself.

And finally, a test needs to be hermetic. It needs to be isolated from the other tests that happen to be running at the same time, be it on the same machine or on some other machine. Things like sharing mutable data between tests are a nightmare for this kind of thing. Sadly, creating hermetic tests can be demanding, particularly as test sizes increase. But the payoff is that we can run tests in parallel. Being able to run tests in parallel allows the long pole of test execution time to potentially be the time it takes to run your longest single test. Parallelism is one of the key weapons we have in achieving fast feedback loops. One weird thing that happens, by the way, as you parallelize, is that although the wall time of your tests goes down, the total CPU time tends to increase. I mean, thinking about that blows my mind, but hey, it's just computers, right?
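To make that a little more concrete, here's a minimal sketch of what a small, hermetic test might look like in JUnit 5. Every class in it is invented for the example, and the size tag is just one convention you might adopt; the important part is that the test creates all of its own data and touches nothing shared, which is exactly what makes it safe to run in parallel.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.List;
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;

// A "small" test in the Google sense: single process, no network, no disk,
// no shared mutable state. PriceCalculator is a stand-in for your own
// production code; the point is the shape of the test, not the class.
@Tag("small")
class PriceCalculatorTest {

  // Hypothetical production code, inlined here only to keep the sketch self-contained.
  static class PriceCalculator {
    int totalInPence(List<Integer> pricesInPence) {
      return pricesInPence.stream().mapToInt(Integer::intValue).sum();
    }
  }

  @Test
  void totalIsTheSumOfTheIndividualPrices() {
    // All the data the test needs is created right here, so the test is
    // hermetic: it can run alongside any other test, on any machine.
    PriceCalculator calculator = new PriceCalculator();

    assertEquals(55, calculator.totalInPence(List.of(30, 25)));
  }
}

// To run tests like this in parallel, JUnit 5 only needs a couple of lines in
// src/test/resources/junit-platform.properties:
//   junit.jupiter.execution.parallel.enabled=true
//   junit.jupiter.execution.parallel.mode.default=concurrent
```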
There's another reason for wanting to write more small tests than large tests, and that is the law of averages. A small test tends to be very isolated, focusing on only a single thing at any one time. As the test size creeps up, however, we need to start interacting with more and more systems. That's even more true if you're using microservices. Now, let's imagine that each service you want to interact with has a 0.5% chance of not responding correctly. When there's only one service in the test, that means 0.5% of all your tests will be flaky, probably more than that because each test will have more than one interaction. But what if we have two services, each with a 0.5% chance of failing? Or three, or four? As the number of potentially flaky interactions goes up, our test reliability plummets: with ten such interactions, the chance that everything responds correctly is 0.995 to the power of ten, roughly 95%, so about one run in twenty fails for reasons that have nothing to do with your code. The compound risk of failure shoots ever upwards as we add more and more moving parts.

This is one of the reasons why end-to-end tests get a bad rap as being unstable or flaky. They simply require more moving pieces than any other test, and mixing together a browser or a phone, the network, as well as your app is a recipe for instability. This is also why I like to suggest to teams that they minimize the number of Selenium tests they write, or, for this conference, Appium tests. It's nothing to do with how good I think the software is. It's that I want developers to have fast, stable, reliable feedback loops. Now, we can attempt to mitigate the problem of flakiness by rerunning failing tests, but by doing so we lengthen our feedback loops, and that's not good for our dopamine levels or, let's be honest, how joyful we can be. Worse, when something genuinely does fail, it takes even longer before we find out about it. Less speed, less dopamine, less joy.

It looks like we should be focusing our efforts on small tests where we can, right? But sure, we've all been there: written some code, found a problem, and then jammed in the tests after the fact. Writing tests like that can be incredibly dull and often incredibly tedious and demanding at the same time. It's a perfect storm of awfulness and, frankly, I hate it. The worst thing is when you find the code is so tangled you can't write a small test. The only thing you can realistically do is write a medium or a large test. El horror.

Now, the wisdom of the ancients tells us there's a way out. In 1999, the first version of XP Explained by Kent Beck was released. From XP came the concept of writing code test first. This principle led to both TDD and, later, BDD. Something wonderful happens when you write code test first. As a developer, you get the chance to write the APIs you wish your code had. Sometimes there's some head scratching and some redos happen, but ultimately you end up writing code that just feels nice to other developers. Another thing that happens is that weird practices like dependency injection start to make more sense: you start to see the seams in your code. This leads you to accidentally writing code that's less tightly coupled to other parts of the system. That makes it easier to write the tests, but the loose coupling also makes system maintenance easier. Because of all this, I like to think of TDD as test-driven design rather than test-driven development.

The way that the best TDDers write their code is fascinating. They write a small test. Red bar. Then they write the simplest, not the stupidest, the simplest code that could work. They run the test again. Green bar. Then another small increment, added via a test. Red bar.
Green bar. They do this quickly. The feedback loop is tight, and each time they run the test and get it green again, they get a tiny hit of dopamine, and they do this time and time and time again. Writing tests becomes rewarding in and of itself, as well as bringing the benefit of improving your own designs. Done well, TDD is a joyous experience.

This tight feedback loop is great when you're writing a single test case, but in most projects you need to actually run a build in order to run the test. You change the production code you're testing, and now everything that depends upon that code needs to be recompiled, right? How do we get our dopamine from a slow build? It's hard. But if the build is fast enough, the feedback loop stays tight. Voila. Dopamine. Joy. Dan Bodart popularized the idea of the ten-second build. It's not just something to aim for, it's something to reach. When I'm writing unit tests in Selenium, I often get incremental build times of under 10 seconds. With the right setup, it's possible to do an entire clean build within seconds. For me, it's a careful balancing act of build tooling, code organization, and programming language. Give me an hour and I will happily extol the virtues of monorepos and the Bazel build tool, but we don't have that hour right now. So let's talk about architecture instead for a while.

One pattern to consider is the so-called hexagonal architecture. I have almost no idea why it's called that, but perhaps it has something to do with this diagram, which is from the paper written by Alistair Cockburn that introduced the concept. I prefer the alternative name, ports and adapters. Aslak Hellesøy spoke about this at SeleniumConf in 2018, so I won't repeat everything here. Go watch the video: Aslak Hellesøy, SeleniumConf 2018. A link is here on the screen, by the way. The executive summary is that the main aim of this architecture is to decouple the application's core logic from the services it uses. For us, and our quest for joy and dopamine through tight feedback loops, this means we can plug in fast local versions of dependencies during smaller tests and then swap in the real versions for our larger tests. That helps keep our build times quick, our test times rapid, and our feedback loops tight. Of course, there are many ways to architect an application to support fast feedback loops, and this is just one of them. I'm sure you can think of many, many, many more ways to do the same thing.

With that out of the way, let's think about how we put all this stuff into practice. Remember, the fastest feedback loop we can have is one where the code doesn't even compile or package if there's a problem. That's one of the reasons I like to have the production code and the tests that verify it works in the same repository. Anytime I make a change in the production code, any of the small or medium tests that need to be modified instantly fail to compile. That's a good thing. Conversely, if I had the tests in an entirely different repo, I'd need to do a build of the production code, package it up, push it somewhere, then pull it into my test repo, and only then run the tests. I'd fall asleep, man. If you need to argue the case and speed isn't enough of an argument, then remember that the tests and the production code are inextricably linked at runtime. They belong together.

Now, starting up an Appium session can take a really long time. You can amortize that cost by reusing a session where possible, but then you do nasty things to the isolation of your tests.
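As a rough sketch of that trade-off, assuming the Appium Java client and JUnit 5, and with the server URL, capabilities, and app path all made up for the example, sharing one session across a class of tests might look something like this:

```java
import io.appium.java_client.android.AndroidDriver;
import java.net.URL;
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.openqa.selenium.remote.DesiredCapabilities;

// One way to amortize session start-up: create the Appium session once per
// test class rather than once per test. Faster feedback, but the tests now
// share state on the device, so they are no longer hermetic.
class SharedSessionTest {

  private static AndroidDriver driver;

  @BeforeAll
  static void startSession() throws Exception {
    // Capabilities, automation name, and app path are assumptions for the sketch.
    DesiredCapabilities caps = new DesiredCapabilities();
    caps.setCapability("platformName", "Android");
    caps.setCapability("appium:automationName", "UiAutomator2");
    caps.setCapability("appium:app", "/path/to/app-under-test.apk");

    // Paid for once, then reused by every test in this class.
    driver = new AndroidDriver(new URL("http://127.0.0.1:4723/wd/hub"), caps);
  }

  @AfterAll
  static void endSession() {
    if (driver != null) {
      driver.quit();
    }
  }

  // Tests go here. Swapping @BeforeAll/@AfterAll for @BeforeEach/@AfterEach
  // restores isolation, at the cost of paying the start-up time per test.
}
```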
There's a pragmatic choice you have to make in order to keep the feedback loop fast: how often should you reuse the same Appium session? With a cloud provider, you might be able to go as low as once. On your desktop, perhaps all the time is the answer.

But there's a more sensible reason to avoid Appium-based tests, and large tests in general. Large tests cover a lot of ground, so they give you great coverage, but when they fail, they lack precision. They just tell you something has gone wrong, but very seldom are they able to tell you where that something was. So not only are they slower, but when they fail, it takes longer for you to figure out why they failed. Smaller tests give you greater precision. Am I saying don't use Appium at all? Absolutely not. Instead, think of your Appium tests as a very rich dessert: a little is wonderful, and too much makes you feel unwell.

It's not just Appium. Any time you drag the network into it, you've introduced latency and slowed your tests down. And if the network isn't just the loopback device, there's not only latency, but also a source of potential flakiness. As an example, a long, long time ago in this galaxy, I was the admin of a Linux server. It ran fine most of the time, but occasionally it would crash, seemingly for no reason at all. However, when the building it was in was quiet, the uptime was dramatically longer. After much investigation, I was left with the Sherlock Holmes adage: once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth. Somehow, people were causing the machine to crash. It turned out the network cable we were using for the machine ran under the carpet. Don't ask why. It's a long story. I was young and I was foolish. As people walked across the carpet, they occasionally crushed the cable, and over time this led to very rare shorts. The kernel, seeing garbage coming across the wire, assumed the world was on fire and panicked. See? Networks are evil things.

Most apps have a web-based back end, using either JSON or gRPC to communicate with the app on the device. Patterns such as Model View Presenter, and writing code that follows a functional model (hey look, if you're web-based, take a look at HDTV4K), allow us to isolate our code from the network, allowing us to test in isolation. If we're following Alistair Cockburn's ports and adapters style of architecture, we'll end up writing code this way anyway. The rule of thumb that I follow is that I like adapters at the boundaries of my system that convert from something I don't control into something I do control. Those adapters can be tested thoroughly to ensure that they work, and then I have deterministic behavior within my own code and can control the data coming in to the rest of the application.

When writing code, the first test you can write is the one that describes the bug you're fixing or the feature you're implementing. Sometimes I don't know where to start on the system. I call this test "bootstrap", and I write code that tests a system that doesn't exist yet. This test will be at a high level by necessity because, like I said, I'm testing something that doesn't exist yet, and quite often this is going to be a large test. But I don't write more than one. Just a single test. Maybe I'll leave test names in place for later if I'm worried about forgetting things, but I just write one test. That first test will lead you to write more tests, probably medium tests, to implement the system.
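Purely as an illustration, a bootstrap test for a feature that doesn't exist yet might look something like the sketch below. Every type and method in it is hypothetical, and the whole point is that none of it compiles until the system starts to take shape:

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Disabled;
import org.junit.jupiter.api.Test;

// A single "bootstrap" test, written before the feature exists. Everything it
// calls (OrderService, Order) is imaginary: the test describes the API I wish
// I had, and it won't even compile until I sketch those types out.
class PlaceOrderBootstrapTest {

  @Test
  void aCustomerCanPlaceAnOrderAndSeeItConfirmed() {
    OrderService service = OrderService.createForTesting();

    Order order = service.placeOrder("customer-42", "large-coffee");

    assertTrue(service.isConfirmed(order.id()));
  }

  // A name left in place so I don't forget it, but deliberately not written yet.
  @Test
  @Disabled("write me once the bootstrap test passes")
  void anOrderForAnUnknownItemIsRejected() {}
}
```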
There'll be more than one of those medium tests, eventually, but for now, just write one. Nested within this medium test will probably be one or more small tests. For now, you guessed it: just write one. Think of this as like a depth-first search of the space of automated tests. We write one at a time because we want to get to a green bar as quickly as possible. That's where the dopamine comes from, where the joy comes from: as the bar changes from red to green. And when we look back, we see a test nested inside a test, nested inside a test. I describe this as testing from the outside in, but you could also call it the Matryoshka model of tests. Once you get a green bar, pop back up to the test that caused you to write this one. Do you need to add another smaller case, or are you done? Do you need to consider another aspect of the feature or the story you're working on? If so, add just one test to help you explore the problem area. Get it to pass using small, incremental changes. Run the tests frequently. That movement from red bar to green bar is what you're looking for, and every time it happens, a tiny hit of dopamine. It goes without saying, but I'll say it anyway: this implies your tests run damn fast. Eventually, that very first test you wrote, the largest of them all, will go green. More dopamine. And this is the point to stop and review the myriad of tests you've written. I know they all took time to write, but as we're maintaining the code moving forward, will you still need them all? Is this medium test covered by these small tests? Can you sensibly merge some of them? Be ruthless. It takes only a few minutes to write a test that will exist and need to be maintained for years to come. Each of them had better count. As discussed earlier, this way of working, writing a test and then writing the code to make it pass, is something the ancients used to call TDD.

There's another important triangle to keep in mind when you're thinking about tests, and it's this one. This triangle is about how we balance priorities in software development. We have to find a balance between the three: speed, safety, and cost. Skew too far towards, say, speed and safety, and we might see cost spiral out of control. Similarly, if we focus on cost and safety, then we can wave goodbye to being able to move fast. Slow and steady would be the way to go.

Tests are interesting in this context. As you start writing them, there's definitely a hit to speed. You'd be able to write more production code if you weren't wasting time on tests. They're expensive things to write and maintain, as we all know. But, and this is important, they're one of the few tools we have to increase the safety of each change we make. A funny thing happens as you add more tests: you start being able to move faster too. They help catch bugs before they're actually bugs. You make a change and some other test starts failing unexpectedly, and lo! They've helped us reduce cost too, because you can now fix the issue before it's even a problem. In other words, tests are like the ballast in the ship of development. They help us stay level and focused, letting us leap across the waves of user stories with joy and speed. Hey, joy!

So we start adding more and more tests. But as we add more tests, particularly if they're large tests, we start to drag our build times down and down and down. Worse, we end up having to maintain those tests. Boy, some of those tests. We've all seen them, right? Tests which appear to use so many mocks that you can't change anything anywhere without breaking some other test.
Tests which are so long and cover so much ground that the slightest perturbation in the cosmic background radiation of the universe causes them to fail. In other words, tests also cause drag. They can slow us down and prevent us scooting forward at the kind of speed we'd like to be moving at. Speed, safety, cost. A balance as old as software development itself.

So we need to review, and perhaps delete, some tests periodically. How do you decide which tests to delete? Well, there are some easy ones to pick. If they've never passed, delete them. If they've never run, delete them. After all, they're pointless. After that, things get interesting. If a test is flaky and you've not managed to find the root cause and stabilize it, then removing it is better than keeping the intermittent signal. This is what they do at Facebook, by the way. If a test is large but the thing it's testing is well handled by smaller tests, favor speed. Always be thinking about that balance of cost, speed, and safety.

I'm sure that many people watching this talk already know this, but small tests aren't just for back-end code. Too often I see people write great small tests for their back-end code and then avoid writing small tests for the UI or front-end code. Part of this is because, traditionally, the back end was written in one language while, with things like React Native, front ends are written in JavaScript. That's changing now as more developers become familiar with JavaScript. But there are still problems to solve. One of the patterns we used to talk about a lot in the dim and distant days of ten years ago was the MVP pattern: Model View Presenter. It makes it simple to decouple logic from the updates on the screen, which in turn makes writing small tests easier.

When you're testing an app which uses a remote back end, one really neat trick is to stub out the network using tools that allow you to replay network traffic as necessary. One of those is Mountebank and another is Servirtium. By stubbing out the network, you can write medium tests in place of large tests. They'll run more quickly and they'll be more stable. If this isn't a technique you've used before, then I'd recommend looking into it. Covering it properly is an entire talk, or maybe a book, in itself.

I've mentioned this before, so I'm going to just fly by this slide. But try practicing TDD. When you do, also try to make the changes you make to the tests as incremental as possible. The goal here is to get from the red bar to the green bar as quickly as possible. Remember: write a tiny failing test, show it fails, make it pass. Get a little hit of dopamine every time you get a passing test. By the way, that "show it fails" bit is super important. The number of times I've written a test thinking, oh, that'll fail, and actually it passed. That meant I didn't understand the system, which meant that when I got the test to the green bar, things weren't actually working the way I thought they were. Awful. Show the test fails.

There's code in the Selenium project that was written 14 years ago. Even if that code took hours to write, it's been with us many, many times longer than that. To make matters worse, over time you learn new things, you practice different techniques, you forget things you no longer need to remember, and unpracticed skills go rusty. I'm sure that some of you have equally ancient, if not older, code that you need to maintain.
The age of the code means that when you go back to it many moons later, it can look as if someone else entirely wrote it. To be honest, quite often someone else did write that code. Working on stale code that's hard to read and reason about is a great way to make your life miserable. No dopamine to be found there. So it's best to avoid this kind of complexity. There are things we can do to help make our lives easier. The first is to favor simplicity. That's why in Selenium, and in Appium too, we like to encourage people to use IDs to find elements. Hopefully those are stable and meaningful. There are similar strategies you can apply in mobile apps too. Elsewhere in code, it might mean, I don't know, keeping modules focused on doing one thing and doing it well. For me, it always means avoiding using more than one mock object per test, if I use any at all; I try not to. Martin Fowler has a blog post discussing state-based and interaction-based testing. I'm a huge fan of state-based testing. But, you know, there are people out there who like interaction-based testing. One adage I like to bear in mind is that you need to be twice as clever to debug something as to write it. So if you write code as clever as you can be, you can't debug it. Some super genius has to.

We need to get the software out and into production. Here are some more numbers that you might find helpful. Although the timings might be a little different for you, there's an important cutoff where you lose focus and switch to another task. For me, that's about 30 seconds. Sadly, although our development loop should aim to stay under 30 seconds, the time that actually matters is how long we have to wait until we can merge our code and ship it to production. There are typically two gating factors for our throughput. The first is how fast a human being can reliably review a PR. The second is how fast our CI pipelines are. To maximize our speed and keep the feedback loops as tight as possible, we should be aiming for the constraining factor to be people, not machines.

How do we do this? We should start by setting a goal time for our CI runs. I like to use the time of a fast PR review, which allows the results to be included in the review tool before someone has to ask for changes or accept the PR. Do the research for how long this is in your organization, but my finger-in-the-air estimate is that about 10 minutes is a good time to be aiming for. To get the length of time for a code review down, there's something simple we can do: we should be aiming to minimize the size of our PRs. I'm sure you've all been there. A tiny change takes five seconds to review, and you know this, so you review it almost instantly. A moderately sized PR takes a few minutes to review. You know this, so you wait until you have the time to actually focus on it and then you chew through a few of them in one go. You've lengthened the feedback loop, but that's it. The massive PR, however, either gets blindly accepted or the review is delayed for an incredible amount of time, as the reviewer knows that this PR is going to take a mega-decade to review properly. Just as with our tests, we should be aiming for small, complete changes that incrementally improve the product with each PR, rather than massive changes that land the entire feature in one go. The agile folks talk about minimum viable products. Maybe when we're thinking about code reviews, we should be thinking in terms of the minimum viable change.
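Coming back to state-based testing for a moment, here's a minimal sketch of the idea, with every type in it invented for the example. Rather than reaching for a mock framework and verifying a scripted conversation, a tiny hand-rolled fake records what happened and the test asserts on that state:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;

// State-based testing with a hand-rolled fake instead of a mock framework.
class GreeterTest {

  interface MessageSender {
    void send(String message);
  }

  // The fake simply records what it was asked to send; the test then asserts
  // on that state rather than verifying a choreography of calls.
  static class RecordingSender implements MessageSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String message) {
      sent.add(message);
    }
  }

  // The seam: Greeter takes its collaborator through the constructor,
  // so a test can slide in the fake where production code uses the real thing.
  static class Greeter {
    private final MessageSender sender;

    Greeter(MessageSender sender) {
      this.sender = sender;
    }

    void greet(String name) {
      sender.send("Hello, " + name + "!");
    }
  }

  @Test
  void greetingIsSentToTheUser() {
    RecordingSender sender = new RecordingSender();
    new Greeter(sender).greet("Matilda");

    assertEquals(List.of("Hello, Matilda!"), sender.sent);
  }
}
```

Because the assertion is on the resulting state rather than on a scripted sequence of interactions, the test tends to keep passing when the internals of Greeter are refactored.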
There's no denying the laws of physics. At some point, we need to parallelize in order to meet our 10-minute target. So start running things in parallel early, and do so as much as possible. Test isolation comes into play here, as does how we set up and use data in our tests. By starting to parallelize tests as early in a project's lifecycle as possible, you can avoid problems with accidental test coupling or a lack of test isolation. If you can, make extensive use of build grids and distributed build caches, but only if you can be sure that the build results are correct. If you can't, then choosing a build tool where you can avoid a clean step at the start of the build is vital, as it allows you to reuse artifacts from previous builds and makes the feedback loops faster. More dopamine, more joy. Those cleans really knock back your build time and prevent your build tool from being able to do just an incremental build. Having said this, you should always favor a correct build over a fast one. In the Java world, there are only a couple of tools I've seen that do this well. I prefer Bazel at the moment. Parallelizing large tests takes significant resources. If you have a smooth, well-rehearsed, and fast deployment pipeline, you can take advantage of cloud providers to quickly spin up a test environment sufficiently large to deal with the surge in traffic that hundreds of parallel test cases can cause. You can use products such as Sauce Labs to help parallelize slow, long-running tests that use mobile phones.

Despite our best efforts, bugs still make it into production. This is one of the reasons why releasing often is a good thing. Not only is the path to production well understood and hopefully automated, but it means the delta between releases is smaller when trying to track down a problem. It's so much easier to find a cause if only one thing has changed rather than a hundred. Intuitively, most organizations know that batching up changes is painful. That's why releases need to be signed off by so many people and go through extensive test runs and checks. It's why we try and do those large releases at quiet times, which is not good for our teams. It's why we avoid releasing on a Friday. On a podcast I was listening to, Charity Majors said that Honeycomb's continuous release process allowed them to keep the size of their development team small. Her argument, which makes a lot of sense, was that as releases get bigger, they get riskier and need more people to work on them. That means that not only are large releases riskier and slower, they're also more expensive, because they need a larger team to support them. And there's nothing like the reward of seeing a new feature or bug fix you worked on make it to production.

Another great technique is to make use of feature flags and staged rollouts. This allows you to get a feature out into production and then monitor what happens as more and more people use it. If there's a problem, you have a feature flag to quickly disable the new code path. If everything looks good, keep going. Feature flags were a key tool behind allowing Facebook to iterate quickly. If you're interested, go and look for talks by Chuck Rossi or Girish Patangay on Gatekeeper.

Finally, although it would be fun to do everything to make your software development entirely dopamine driven and about joy, the best piece of advice I can give you is to stop and think before following any of the suggestions in this talk. The general principle of having fast feedback loops is a great one to follow.
Nesting feedback loops one inside the other, that's great. But maybe in your team or organizational circumstances there are good and practical reasons why you can't do that right now. That's fine. It means you have a goal to reach for and a path to discover of how to get there. Keep in mind how important joy is to your daily experience of work, and find ways to make your development process one that rewards you little and often. It's a worthy goal to keep in mind and to strive for. Even small incremental improvements, applied over the course of many, many months, can get you there. On the way you'll need to talk to people and help them see why you're doing the things you're doing.

So, to summarize: fast feedback loops for the win. The test pyramid may be ancient, but it's still useful. Reliable, repeatable, hermetic tests run really well in parallel, so you should parallelize all the things. As finger-in-the-air estimates, aim for a 10-second local build and 10-minute CI runs. And finally, parallelize all the things. Thank you very much for your time and attention. I really appreciate it. I think we've got a couple of minutes left for questions. Have a fantastic Appium Conf.

All right, awesome. Thanks, Simon. We do have a couple of minutes for questions, but we don't have any questions. Everyone is so enamored by the talk that they didn't have time to ask you questions. I'm going to give it a minute to see if someone has a question they want to ask. Of course, Simon is going to be available at the hangout table if you want to have a face-to-face discussion with him and challenge some of the things he said. He said some very controversial things. "Oh, no, none of them are controversial. Delete all that code. It's just..." All right, awesome. Thanks a lot, Simon, again, and thanks everyone for joining in. Thank you very much, everyone.