Hi everyone, I'm Srihari. I work at a place called Nilenso. We're a hackers' collective based here in Bangalore, structured as a cooperative. We tend to write a lot of functional programming in general, and a fair amount of Clojure. I'm going to share my experience writing generative tests, and the process by which I've gone about automating the test software as and when I've written it. I've had the opportunity to work on a few systems that are substantially large, and the test software for those systems is also substantially large compared to what you'd generally see. I'm just going to share that.

I'm breaking the talk into these parts. The bulk of it will be the patterns, but there's a lot of ground to set up before that. I'll talk about why I'm giving this talk and what it's about. I'll give you a practical system that we can hold on to for the rest of the talk, so we can look at all the examples with respect to it. I'll talk about the benefits and problems of generative tests in general, establish some ground terminology, and then split the patterns themselves into four parts, one for each of the four phases of testing, and delve into those.

So, quickly, why am I giving this talk? Tests are good. I'm hoping people in this room generally agree. There's a large part of the software world that doesn't agree, or doesn't write any tests; I'm hoping that's not us. We should be writing a lot of tests, but we don't. That's the truth. So what do we do when we don't write enough tests? We automate it. The flip side of "a QA engineer walks into a bar, orders one, orders minus one, orders infinity", et cetera, is that machines are better at doing this. John Hughes made the case for this a long time ago. QuickCheck is in thirty-odd languages now. I'm sure we're all familiar with it; people here have written some QuickCheck before, or are at least familiar with the idea.

Yet people don't write generative tests often enough. When I go around asking people, "okay, you've done some QuickCheck, what have you written?", the common answer is: "I've had some unit tests and I've generatively tested those unit-level, function-type things. I've had some data structures and algorithms, and I've generatively tested those." But often people get stuck thinking about what the right properties are to test beyond that. How do I think about the right properties for my system? It's easy to think about circular buffers and sorting algorithms. But what about someone logging in, or authentication? Those are the real systems you build on a day-to-day basis: systems with side effects, talking to a database, a third party. This is the kind of software most people write day to day. And how does generative testing apply here, if at all? That's what this talk is about; that's why I'm giving it.

The premise of the talk is going to be automation. We don't write tests because we're lazy. We don't write generative tests because we're lazy, or maybe because we don't know how to, but probably because we're lazy. So what do we do about it? We automate. The interesting thing here is that, unlike regular software, which automates domains and processes that are tangible, what we need to automate here is us. We are the people writing the tests.
And if we want to automate that, we need to automate ourselves in the process of writing tests, or our thought processes while we write tests. So that's interesting. In that same vein, what I think the takeaway should be is the process by which the automation happens here, given that the subject is humans writing tests.

In terms of setting a common base, so that we're all thinking of the same system when we look at the examples, here's one. Consider a very simple, generic e-commerce system. I'm sure everyone here has used Amazon, Flipkart, or something of that sort, and this is that. You have operations related to the user, the cart, and payment. A user can sign up, or be created. They can authenticate, or log in. They can add a payment instrument. You can fetch information about the user. In the cart, you can add things, update things, delete things, and get the cart. You can pay for your cart, ask for a refund, apply an offer, and get a receipt. To visualize this a little more, think of a simple flow: you sign up, then you log in, you add a card as a payment instrument, a credit card or something, then you add something to your cart, update the quantity, and pay for it. A simple flow. That's the sort of thing I'd like to test. How do you go about doing this?

Before that, again in terms of setting the right platform: what are the benefits and problems of generative tests? Let's try to understand that before we solve it, or try to solve it. One benefit is that it's supposedly better than humans at input generation. This is largely true. To the extent that it's not true, it's because it isn't as domain-aware as it's supposed to be. Humans are better at it because they have context; they can talk to people and figure out what should and shouldn't be the case. Unless we tell the machine that, it doesn't know, and it's mostly just going to be a spew of random input. The other benefit is volume: you might write maybe a dozen example-based tests by hand, but with generative tests you're thinking hundreds or thousands. More tests generally give you a feeling that you have more coverage, which gives you a feeling that your software is somewhat better tested, or makes you more confident putting it in front of users. And ultimately that's the benefit of all tests: finding bugs in your software before your users get hit by them.

So those are the benefits, but again, the premise is: why don't people write generative tests often enough? One of the first problems is inertia. A lot of other people have done a really good job speaking about this. John Hughes, of course, has given a lot of talks. Specifically with respect to Clojure, there's Gary Fredericks, who maintains the test.check library, and he's done a fantastic job of giving ideas about how to build custom generators and such. The other problem with generative tests is thinking in properties. Not only does it take quite a bit of practice to think about your system in properties rather than examples, it's hard even after you get there. Like I said earlier: circular buffers, sorting algorithms, you've got it, all the textbooks have those, you know how to do that. But how do you write a property for login, or authenticate, or adding something to the database? These things are stateful, and most software systems are CRUD.
How do you test CRUD generatively and effectively? The other problem is, and I've done this a bunch of times, I've put generative tests in CI, and you know that the moment you put them in CI, the build goes from 30 seconds to two minutes. Then the moment you add one more, just one more generative test with a thousand cases, it's five minutes. Before you know it, it's 10, 20, 50 minutes; it grows very quickly. And I've seen people dial down the number of tests they run because it's slow on CI. The sad part is that, at that point, I don't actually care about 80% of the tests that run on CI; I only care about 20% of them. The rest aren't relevant. For example, if I'm adding a thousand items to a cart and purchasing it, sure, I understand it's an edge case, and it's probably something my system won't handle; my system is more equipped to handle 10 or 12. Sure, I'll have some overflow cases when it gets to a thousand, but I don't care about that as much, and I don't need to fix that in production today. So the time spent on CI, even if it's a lot, is not spent in a useful manner.

And the notion that more tests give more coverage is flawed, because you can write a thousand tests that all hit the same happy path, and you don't know whether you've covered all the paths. To do that, you need tests that are aware enough to hit all those paths. Which brings us to the next problem: in order to write that logic into your tests, you need to be good at it. Writing real code is hard enough; writing code that tests that code is even harder. In practice, what I've seen is that this ends up needing one or two senior engineers sitting full-time on your generative test platform, your test engine, keeping it running all the time, especially if it runs on CI. It's brittle, it fails, and when something fails it's very hard to debug, all that jazz. And your software changes, as it often does, and your tests need to change with it. Every time something changes there, you need to go change something in your tests to keep up, and that takes time and effort.

Another big issue that people don't talk about as much is what happens after you get a failure. With an example test, you know exactly what your input to a unit test was, and it gets more and more complex as you look at higher levels of testing, an integration test or a system-level test. There are more inputs and more complexity to handle, and at some point, if there's an error or an exception thrown somewhere, you have to know all the context to get to it, and that is difficult. Especially with generative tests, because if you're doing something like simulation testing, which is just generative testing at a system level, you'll end up trawling through the logs of multiple services to understand what the issue was. That's hard. And at that level, where you're writing tests against stateful systems, your tests are not going to be deterministic, and you might even find it hard to reproduce a failure. If it fails on CI, how do I get that same failing case on my machine, run it, reproduce it, and then debug it? That's hard. And you need tools to debug; getting to the root cause takes time without the right tools.
So these are the problems I've seen and faced, and each time I've come across them, I've done one thing or another to make my life a little easier.

Now, terminology. I think about tests along three axes, and these are not mutually exclusive: every test is in a particular phase, sits at a given level, and has a certain focus. For phase, you might be familiar with arrange-act-assert, or assume-arrange-act-assert, but I think what's more applicable for generative testing is: generation of input, execution of the test, assertion, and then diagnosis. People often leave out diagnosis. I do want to consider it part of the process to automate, considering it's humans that do the diagnosis; we need to do something there.

The second axis is the level. This is the classic test pyramid, famous or infamous. The understanding is that the higher up you go in the pyramid, writing system-level tests, the more value a test has, because you're testing from the user's point of view; but lower down tests are easier to write, and higher up they're harder to write and probably more brittle. Lower down you have things like regression tests that also run very fast, while system-level tests run somewhat slowly. In the spirit of questioning this as our testing paradigm changes from manual tests to generative tests: why not write more system tests? What's stopping us? If system tests weren't so brittle, and if they weren't so hard to write, why wouldn't you write more of them? Extending that to a bit of a fantasy: why not write only system-level tests? If you could write system-level tests and had the right tools to diagnose the root cause at all times, you would always test from the user's point of view, because that's the one you care about. Your system can change over time, over years, with all the abstractions in between changing, but that doesn't matter as long as your test suite passes, and that should be from the user's point of view. This is a little bit of fantasy, we're not there right now, but I think we should strive to get there, and I don't think it's out of reach; it's automation.

The other thing I've gotten wrong with generative tests in the past is focus. People think: oh, there are a lot of tests, now I can use them for performance testing. So there's load testing, stress testing, endurance testing, and whatnot. There's correctness, which is a range of inputs that hit your system in a particular way to find out whether it's doing the right thing. And there's regression, which is: I hope I haven't changed anything in my software that makes it stop working for a case it worked for before. These are three different notions, three different focuses, and often with generative tests we try to do more than one of them at the same time. I have found that to be a mistake. Generative tests can be all three of these things, but not at the same time. So you can have a CI test that runs a few flows as a smoke test; but the same thing, in another instance, can run continuously against your staging environment for days together, and then you have your endurance tests, your performance tests, your stress tests.

A few more abstractions. "System under test" is a commonly used acronym, but by the system under test I might mean the entire system, or a few microservices, or a few namespaces, or a few modules, or a single function.
All of these things I want to think of as a single unit, which is a function. If you can think of your entire system, or a group of your systems, as a single stateful function, then you have an input, you have an output, and you have some state. For a single function, your input is an argument and your output is a return value. At the HTTP level, for your system, you have an HTTP request and a response. And then you have state: here I'm mushing third parties, your database, and all the IO you do into this one notion of state. These are the three things we need to understand and automate in order to automate testing this system. And since I'm conflating multiple notions into this one function, I'm going to call it an action. This is what I want to test: the action. The archetypes I want to automate are these two. One is ourselves, the people who write tests, generative tests. The other persona I want to automate is a QA engineer: think of the domain knowledge they have when they perform a certain test or give certain inputs.

With that, patterns. And the customary slide, which I'm sure the non-Clojure audience here won't get. What I'm doing here, essentially, is taking all the notions of a software pattern and throwing them out the door, because all of that comes with a lot of baggage. By patterns, what I mean is: I've done this a few times before and I've seen these things emerge, and I think it's worth calling them out because, well, that's a lot of what humans do: recognize patterns. Maybe together a bunch of these patterns can mean something bigger, which is usually the case if you're getting to something more simple and profound at the end.

So, patterns. I'm going to split them into the four phases from before: generation, execution, assertion, and diagnosis. Delving in, first is generation. There are five patterns; I'll just dig into them. The first is: derive a parameter specification. Your test should know the arguments and the return type of your function. To all the people here who write a typed language: good job, you've got this under control. For the people who don't, if you're writing Clojure or Erlang or Elixir or something, this is something you can do. Clojure has specification libraries, clojure.spec or Prismatic Schema, that you can use to type or specify the arguments and the return value of your function, and you can write it as metadata. Here's an example of how I do it in Clojure: these things are non-empty strings, this might be nil, I need a valid email here, et cetera. The interesting thing, even for people who write typed languages, is that this lets me strengthen or tighten my types inside the tests, which is useful for generating input.

That's the flip side of this: now that I know what I need to pass into my function, I can generate it. This is what test.check gives us, and we can leverage it. We have a Prismatic Schema or clojure.spec that gives us this. Sometimes we might need to write some generators; how does the language know how to create a valid email? You tell it. But chances are, for a large system, you're going to have to write a dozen of these and that's about it. You don't have to write a lot of them, and once you've written them, they probably won't change much either.
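To make that concrete, here's a rough sketch of what such a parameter specification and a hand-written generator could look like with clojure.spec and test.check; the shapes, the keyword names, and the email generator are illustrative assumptions, not the speaker's actual code.

```clojure
(ns example.user-spec
  (:require [clojure.spec.alpha :as s]
            [clojure.test.check.generators :as gen]))

;; A hand-written generator for valid emails; one of the dozen or so
;; custom generators a large system might need.
(def email-gen
  (gen/fmap (fn [[local domain]] (str local "@" domain ".com"))
            (gen/tuple (gen/not-empty gen/string-alphanumeric)
                       (gen/not-empty gen/string-alphanumeric))))

(s/def ::name (s/and string? not-empty))
(s/def ::email (s/with-gen
                 (s/and string? #(re-matches #".+@.+\..+" %))
                 (constantly email-gen)))
(s/def ::referrer (s/nilable string?))

;; Parameter specification for the user-create action.
(s/def ::create-params (s/keys :req-un [::name ::email] :opt-un [::referrer]))

;; Generating inputs from the specification:
(comment
  (gen/sample (s/gen ::create-params) 5))
```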
And I want to get to this point, where I'm saying, by the way, that I'm writing all of this for the action that is user create. I'm using a namespaced keyword here, :user/create. What I want to be able to do is this: I want to say, assert that user create has no errors, and I want the rest of it to be automated. It's not much of an ask, right? Because I know how to generate the parameters for this, and I know how to run it. And how do I know it's not an error? Well, depending on the function: say there's no exception, or it returns only a success response, or there's no timeout. That's fairly simple, and it's a very simple example of what you can do when you know your parameter specification.

This is where it gets a little more interesting. How do you know what to run before you run the function under test, the action under test? That's the arrange portion of your test, if you're familiar with the arrange-act-assert acronym. You need to run a few things in order to run your test; how do you know what they are? Chances are each of them is another action. So my proposition is that we express those dependencies using a DAG, and it's as simple as this. Let me take a few examples. User create has no dependencies. But if I have to add an item to the cart, which is cart add-item, then the user needs to be authenticated, so that's a dependency: add-item depends on authenticate. Another example: payment pay depends on add card and add item. (I'll take questions towards the end; I'm probably going to run over 40, 45 minutes anyway.) So payment pay depends on add card and add item, and add card in turn depends on other things as well. It's transitive: given an action, I know all the other actions I need to execute before it. So now what I can do is make a list of all those actions and ensure that there are no errors in any of them, because I know everything that needs to be done in order to run an action. This helps automate the arrange phase.

So now that I've automated some of what comes before running the action under test, what about what comes after? If you remember the original flow I wrote down, where someone signs up, logs in, adds something to a cart, and makes the payment: I want to model that, so I can test that entire thing. We can model it using a probability matrix, and it's as simple as this. For user create, what I'm saying in the first line is that the chance that authenticate happens after user create is 90%. So 90% of the time after user create, pick authenticate: someone logs in after they sign up, and that happens 90% of the time. 10% of the time it's any other action, as long as it follows the dependencies we saw before. Another one is add item: 30% of the time, after adding one item, the user adds another item; 10% of the time they update it; 20% of the time they remove it; 20% of the time they pay; and the rest of the time, whatever. The interesting thing here is that 0 and 100 are special cases. 0 means this should not happen: the probability of going from this action to that one should be zero. And 100 means it must happen: say, for example, in a system where you combine the notions of sign-up and log-in, so that as soon as you sign up you are logged in, then that probability should be 100%. You can model that.
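Here's a minimal sketch of how the dependency DAG and the probability matrix might be written down as plain data; the action names, weights, and helper functions are illustrative assumptions rather than the talk's actual configuration.

```clojure
(ns example.flow-model)

;; Dependency DAG: each action lists the actions that must succeed before it.
(def dependencies
  {:user/create        []
   :user/authenticate  [:user/create]
   :user/add-card      [:user/authenticate]
   :cart/add-item      [:user/authenticate]
   :payment/pay        [:user/add-card :cart/add-item]})

(defn arrange-actions
  "All transitive dependencies of `action`, in an order that satisfies the DAG."
  [action]
  (let [deps (dependencies action)]
    (distinct (concat (mapcat arrange-actions deps) deps))))

;; Probability matrix: given the action just executed, weights for the next one.
;; 0 means "never"; a single entry of 100 would mean "always".
(def transitions
  {:user/create   {:user/authenticate 90, :other 10}
   :cart/add-item {:cart/add-item 30, :cart/update-item 10
                   :cart/remove-item 20, :payment/pay 20, :other 20}})

(defn pick-next
  "Weighted random choice of the next action after `action`."
  [action]
  (let [weights (transitions action {:other 100}) ;; default when unlisted
        total   (reduce + (vals weights))
        roll    (rand-int total)]
    (loop [[[a w] & more] (seq weights), acc 0]
      (if (< roll (+ acc w)) a (recur more (+ acc w))))))

(comment
  (arrange-actions :payment/pay)
  ;; => (:user/create :user/authenticate :user/add-card :cart/add-item)
  (pick-next :cart/add-item))
```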
And this is really interesting, because it lets you model user behavior. Consider infrequent users: the chance that an infrequent user's session times out is high, and you know you've had issues like that in production. You want to model that, and you can do that here, by specifying the amount of time a user spends away and then modelling the probabilities of their moving through the actions. Another example might be modelling some of your user archetypes. Say you have a rich-user flow: 50% of your users are rich users, and they never ask for a refund, or they always have enough money on their card to make a payment. You can model that; you can say that refund is a 0, and so on. And this is often the case: in the systems I've worked on, there are two or three customer archetypes. You categorize your customers and say this is this kind of customer and this is the other kind, and there are maybe three types of them. You can model those three types and make your tests more relevant, so the 40 minutes you spend on CI will be worth it.

And this is an example of what this looks like. It generated a list of actions to run. In the first one, the user signs up, they authenticate, add items, whatever, apply an offer, and then they cancel. This is generated, but you can see that, if you were writing example tests, what are the chances you'd write all of this and then have them cancel at the end? In the second example, as soon as they add an item, they apply an offer, and then towards the end they do an update item. That's also interesting. It's normal to see all these things in generated flows; that's what I'm calling out. And note that this is not just a list of actions; I've made it brief here, but it's actually a list of the actions plus the parameters we need in order to make each call. Together, this makes a specification for testing my action. Note that all of this is data.

Another thing you can do here, to make your tests more relevant, is make some of it deterministic. A big part of diagnosing issues with generative tests or simulation tests is that a lot of values are just random; you don't know what's going on. Deterministic seed data gives you certain ground to work with. So I can say: create this user, log them in, and add a card. That much I know has to work, and these are the values for all three of those things. Test the flows from there. That gives me a lot of base: I know, for example, that this person has a thousand dollars in their account no matter what the rest of the tests do, so I'm good. It gives you a part of your tests that won't fail, so you can focus on the parts that do; a part of your flow that will work, so you can concentrate on the other part. So this also gives you determinism, and it simplifies debugging.

One other thing that's non-obvious, and it's one of those things you understand when you start writing tests outside of your system, is that you'll have to model a domain that isn't modeled in your actual software. For example, we had to model the supplier's inventory. Actually, a better example is money on a user's card. I want to say that half the users only have $50, and my average purchase is 75 or something, and I can model that.
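Pulling those threads together, here's roughly what a generated flow, deterministic seed data, and that kind of test-side domain model might look like as data; the map shapes and values are my assumptions, not the talk's actual format.

```clojure
;; A generated flow: the ordered actions plus the generated parameters
;; needed to make each call. All plain data, so it can be stored and replayed.
(def generated-flow
  [{:action :user/create         :params {:name "gen-user-17" :email "a@b.com"}}
   {:action :user/authenticate   :params {:name "gen-user-17"}}
   {:action :cart/add-item       :params {:sku "SKU-42" :quantity 2}}
   {:action :payment/apply-offer :params {:code "FESTIVE10"}}
   {:action :payment/pay         :params {}}])

;; Deterministic seed data: a user created up front with known values,
;; so part of every flow has fixed, debuggable ground to stand on.
(def seed-data
  {:user {:name "seed-user" :email "seed@example.com"}
   :card {:number "4111-1111-1111-1111" :balance 1000}})

;; Domain that isn't modeled in the real software, modeled on the test side:
;; user archetypes and how much money is on their card.
(def user-archetypes
  {:rich   {:fraction 0.5 :card-balance 1000 :refund-probability 0}
   :budget {:fraction 0.5 :card-balance 50}})
```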
I could actually write tests that generate that money and put it on the card, but I'd rather model it, because what I'm doing on this side of the software, when I'm testing it, is being the user; so I have to model the user, and the user is never modeled in the real software. Another example could be a supplier. If I'm Amazon, I don't need to know how much inventory the supplier has, but if I'm writing logic that tests some of that and I want to emulate the supplier, then I need to model the supplier on this side, and that really helps. So that's generation. And note that all of these patterns are going to be forward- and backward-referencing; it's not going to be entirely cohesive. These are patterns that might have code in one place but an effect somewhere else.

So, execution. We spoke about this action earlier: there's an input and an output, but there's a little more really going on; each action has a life cycle. And speaking of forward references, I'd like you all to take a leap here and just hear me out for a couple of minutes before we get to why I'm saying this: store everything. Your tests should store the request, the response, and the state at all points in time, immutably. This gives your tests a lot of power, because you can then introspect a test while running it, and you know the state your system and your test were in at any point in time, even afterwards. This helps you report your tests better and diagnose them better.

Let me give you an example. Before going into why or how it helps, here's how I implement it. I use DataScript. Are people familiar with Datomic or DataScript? You've heard of them? DataScript is an in-memory database, and it gives me a Datalog query engine. What I'm showing here, well, this is not the query part, this is the storage part: I'm storing some metadata, which is the flow; the action, which is this user doing this thing; and the request, the response, and the state of all the entities related to this action, immutably. And this is the kind of thing I can do with it. I can say, and this is a Datalog query, I'll explain it: give me the items, the SKUs really, that were involved in this flow. For this given flow ID, give me all the SKUs; that's what this query says. Another one: how do I know the parameters I'm going to use in the next step, or rather, what were the parameters I used to make this request? I can look that up by saying: for this flow ID, given that it was a request, give me everything. And another thing I can do, since I'm doing all this from the test engine's point of view and I know everything I have done, is give you a timeline of all the things that happened in the flow, which is super useful. In a test you're often looking at logs that tell you what's happening, except here it's data, and you can dig into each point, tell what the request was, what the response was, when the error happened, and walk back and forth. So: given this flow ID and this user ID, give me everything, sorted by time.
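Here's a minimal sketch of that storage and querying with DataScript; the attribute names (:flow/id, :event/kind, and so on) and the transaction shapes are my assumptions, not the talk's real schema.

```clojure
(ns example.store
  (:require [datascript.core :as d]))

(def conn (d/create-conn {}))

;; Store every request, response, and state snapshot immutably.
(d/transact! conn
  [{:flow/id    "flow-1"
    :event/kind :request
    :action     :cart/add-item
    :user/id    "gen-user-17"
    :sku        "SKU-42"
    :timestamp  1}
   {:flow/id    "flow-1"
    :event/kind :response
    :action     :cart/add-item
    :user/id    "gen-user-17"
    :status     200
    :timestamp  2}])

;; "Give me all the SKUs involved in this flow."
(d/q '[:find [?sku ...]
       :in $ ?flow
       :where [?e :flow/id ?flow] [?e :sku ?sku]]
     @conn "flow-1")

;; "Give me everything for this flow and user, sorted by time": a timeline.
(->> (d/q '[:find ?e ?t
            :in $ ?flow ?user
            :where [?e :flow/id ?flow] [?e :user/id ?user] [?e :timestamp ?t]]
          @conn "flow-1" "gen-user-17")
     (sort-by second)
     (map (comp (partial d/pull @conn '[*]) first)))
```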
There's another part here that I glossed over, which I'm hoping some of you caught: the magic of how, from one action to the next, you preserve state. If I created a user named ABC, how do I log in as user ABC? How is this automated? The answer is rather stupid, but simple: just take the latest one. I'll call out the flaws with this, but the point is, pick the latest notion of a username that appears in your state. If I created a user, logged in, did some things, but then logged in with another name, pick that latest one and use it for the rest. It's a heuristic, but it's a really good one. How this works is that part of the life cycle where you take the generated params, mush the state params in, and then call your action. So you have generation, except that during execution there's some state that needs to be merged in to make the call.

And like all things that are very well automated, you need ways to manipulate that automation so you can get some control over it, and hooks are great for that. In this case, before making the call to an action, I have a hook that says: especially when I'm creating users, prepend "gen-user-" to the name. The generated values come through with the state merged in, and with this function I can add one more thing and pass it through, because functions are first class and whatnot. Another example: to make a payment I need a card, and without a card I can't make the payment; but I can say, for all payments, use the default card unless one is specified. And here, in this params function, I have access to the flow, the params, and the entire state as of that time, so I can make the call accordingly.

Another thing you run into is errors. What do you do when you have an error? If it's a real error, well, good, you found an error, that's what a test is supposed to do; now you go and fix it. But what if it's an error you know is going to happen? Like two users with the same name being created. Yeah, I know it's going to be a duplicate user, but I don't want 80% of the tests on my CI to be this, and nor do I care about my CI reporting these bugs back to me, because I know it's a thing. So this is a thing: known errors. Recognize them, and write them down deterministically. With user create, duplicate user is a known exception; with pay, an inactive session is a known exception, or payment already complete: the payment is already done and I'm paying again. I know this is going to happen; but do I just accept that it happened and move on? No, because when it does happen, you want to ensure it happened for the right reason. You don't want to be throwing a duplicate-user error back when it's actually not a duplicate user. So when these things happen, you have enough control to verify that they actually happened. I get a duplicate user, great; now what do I do? Let me actually call the system and find out whether there is a user like that, and then I say it's all right, move on; otherwise, I report it as an error. Similarly, with payment complete, I can look back at all the actions I've made so far and tell whether I've already made a payment call. If so, yes, this is a duplicate, I know it, let's move on.
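Here's a small sketch of how such params hooks and known-error checks might be wired up as data plus functions; the hook registry, the check functions, and the action names are illustrative assumptions rather than the speaker's actual code.

```clojure
(ns example.hooks)

;; Params hooks: run on the generated, state-merged params just before the call.
(def params-hooks
  {:user/create (fn [flow state params]
                  ;; Prefix generated usernames so they are easy to spot.
                  (update params :name #(str "gen-user-" %)))
   :payment/pay (fn [flow state params]
                  ;; Use the default card unless the flow specified one.
                  (update params :card #(or % (:default-card state))))})

(defn apply-hook [action flow state params]
  (if-let [hook (params-hooks action)]
    (hook flow state params)
    params))

;; Known errors: for each action, error -> a check that it happened for the
;; right reason. If the check fails, the error is reported as a real failure.
(def known-errors
  {:user/create {:duplicate-user
                 (fn [{:keys [system]} params]
                   ;; Ask the system whether a user with this name really exists.
                   (some? (system :user/get {:name (:name params)})))}
   :payment/pay {:payment-complete
                 (fn [{:keys [flow-history]} _params]
                   ;; Have we already made a pay call earlier in this flow?
                   (some #(= :payment/pay (:action %)) flow-history))}})
```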
And a beautiful part of this is that you can absorb these known errors into your system: rather than keeping them in your test software, it's better to have them in your real software. Maybe even, you know, let them out to the users in the specification, or as an error catalog somewhere, so you can return error IDs and whatnot throughout your software; it's a good thing. Another beautiful thing you can do, because we're in the business of automating tests and how we run them, is abstract the request engine. Imagine you're doing CRUD or something like that, and you have HTTP-level requests and controller-level requests, or something a little lower than that; you have those three or four layers before it comes down to the model logic. You can abstract that out. If you do, you can run all your tests at whatever level you want: just switch the medium and say, I don't want HTTP, I want controller, and all the tests will run at the controller level, because you know the parameters each action needs and you know how to automate that. If it's a function, great: if all your HTTP calls and whatnot ultimately end in functions, and you know the types for those functions, you can just make function calls. Think of this as a middleware in your test engine; that's really the pattern there. A hook between your tests and your actual system that lets you change things, monitor things, wrap things, or test a different part of the system based on what you give it. So that's it for execution.

Then there's the part where people often get stuck, which is assertion. How do I know what properties to test with? There are three things I'll go into: domain-based invariants, which are mostly here for completeness, algebraic properties, and state-machine properties. So, algebraic properties. There are five; I'm at about 38 minutes here, but I'll try. There's "no error", which is also largely there for completeness, but note there are two columns: some of these properties are applicable at a unit, single-function level, and all the others have something to do with order, which involves more than the single thing you're testing. So there's equality, idempotence, all of these; let's go into each one.

No error: a very basic thing to do that has a lot of value. No exceptions; your system should not throw an exception at any point in time. The kinds of errors you get out of this are things like overflow errors, out-of-bounds errors, your database size limits, and whatnot. And no timeouts: if you're writing a highly available system or a distributed system, you need to ensure that there are no timeouts, or that if there are, you look at them, and that the system is always available. If you're running this as an endurance test on staging or something, this is really helpful.

Then there's equality, which is really effective equality. For example, sign-up creates a user, but it does give the user back as a response, and that should be the same as user get; so if you do sign-up and then user get, you should get the same response. Neither of these calls leaves the system in the same state it was in before, which is why it's effective equality, but these are still properties you can use to test. For example, an update item with deleted set to true, which is basically a soft delete executed as an update, is the same thing as a remove item: different paths, same result.
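To make "effective equality" concrete, here's a minimal, self-contained sketch of such a property with test.check; call! is a hypothetical stand-in for the request engine just described, faked in memory here purely so the property can run on its own.

```clojure
(ns example.equality-props
  (:require [clojure.test.check.clojure-test :refer [defspec]]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]))

;; Hypothetical stand-in for the test engine's request abstraction:
;; a tiny in-memory fake so the property below is runnable on its own.
(def users (atom {}))

(defn call! [action params]
  (case action
    :user/create (let [user (assoc params :created-at (System/nanoTime))]
                   (swap! users assoc (:name params) user)
                   {:status 200 :body user})
    :user/get    {:status 200 :body (get @users (:name params))}))

;; Fields like timestamps legitimately differ between the two calls.
(defn strip-volatile [user] (dissoc user :created-at))

;; Effective equality: sign-up returns the same user that a subsequent
;; user/get returns.
(defspec signup-then-get-are-effectively-equal 100
  (prop/for-all [params (gen/hash-map :name  (gen/not-empty gen/string-alphanumeric)
                                      :email (gen/return "user@example.com"))]
    (let [created (call! :user/create params)
          fetched (call! :user/get {:name (:name params)})]
      (= (strip-volatile (:body created))
         (strip-volatile (:body fetched))))))
```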
Another big one is idempotence, and there are two classes of it. For example, login: a user logs in, and then logs in again immediately after. Do you write an example test for this, or do you not? It's probably worth testing these things. Payments: refund, and then refund again; useful to test that. But that's one kind, where you call the same action immediately after itself, and it covers the times when other machines are doing retries and whatnot against your system. There's also the other kind, where I log in, do a bunch of things, add items to the cart, and then log in again. Because your system is in a different state at that point in time, you don't know whether it'll behave the same way. Idempotence is super important in distributed systems and in critical systems like payments and health-related things. So, as we add more things to automate, we need knobs to make sure we're doing the right thing there. I have a knob for the percentage of flows that should have idempotent actions injected: I take an existing flow and add actions in between to check that they are idempotent, and I can take the same action and put it later on to check the distant case, and I can choose what percentage of the time I want immediate versus distant. In my system I have a blacklist of actions that I know are not idempotent, so I don't test those; depending on your system, you might have a whitelist instead.

Then there's inverse: I do an add and a remove, and it should leave the system in the same state. This is the classic inverse. Assert equal here means I do a cart get, getting the cart before I do both of these things, then I add and remove, then I get the cart again, and both should be the same thing. And there's the normalized inverse version, where pay, cancel, pay should be the same as pay. Then there's commutativity, which is when you don't care about the order of things, and this is crying out for automation: you add one and two, then two and one, and both should give the same thing. Even better, you can add some syntactic sugar around it and say, try these things in different orders. Even asserting that an operation is not commutative is a property: take any of these earlier things, and if they're close but not equal, the fact that they're not commutative is a useful property.

And then there are, of course, state machines. State machines are great; you should model them in code where they make sense, and you should use those state machines in your tests as well. Here's an example of the state machine for the cart. As soon as you log in, you have a new cart; you add an item, it has items; you can cancel it; you pay for it, it's a success; you refund it, it's refunded. I can take this state machine and put it in a matcher. This is a Clojure core.match thing, where I'm saying: this was the previous state, this was the action, that was the result, and this is the new state. These are all the valid state transitions I can go through. Now, since I know all the carts and all the state transitions throughout my tests, I can go back and validate that there were no invalid state transitions, and this was great; it helped us uncover a few important bugs.
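A minimal sketch of that core.match-based transition check, with cart states and actions guessed from the description rather than taken from the actual slides:

```clojure
(ns example.cart-fsm
  (:require [clojure.core.match :refer [match]]))

(defn valid-transition?
  "Is [previous-state action result] -> new-state a legal cart transition?"
  [prev-state action result new-state]
  (match [prev-state action result new-state]
    [:new       :cart/add-item    :success :has-items] true
    [:has-items :cart/add-item    :success :has-items] true
    [:has-items :cart/remove-item :success :new]       true
    [:has-items :payment/pay      :success :paid]      true
    [:paid      :payment/refund   :success :refunded]  true
    [_          _                 _        _]          false))

;; Walk every recorded cart transition and collect the invalid ones.
(defn invalid-transitions [transitions]
  (remove (fn [{:keys [prev action result next]}]
            (valid-transition? prev action result next))
          transitions))
```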
On that note, it's often very interesting to see the results of this, because for the amount of effort you put in, you expect a certain amount of results. I've written this bunch of code, maybe 10 lines, and I'm expecting one or two errors, but I end up getting 50 or 60, and that often throws you off, because whatever work you should have done in a month is done in a day. In hindsight it's a good thing, but it scares you. Domain-based invariants are here for completeness, but note that you have the request, the response, and the state to figure out what's going on. So you can do things like: if I created 10 users and I do a user list, then it has 10 things; lookup by ID, lookup by name, lookup active, all asserted against the same thing; the number of items I sold is the same as the number of items the customers bought, et cetera.

And then there are the patterns in diagnosis. I'll go through this quickly, since I seem to be running out of time. Your test is data at this point. Everything I've had so far, the action spec and the state, if I store it all, I can move it around, and at that point, that is my test. If I think of my test engine as software, then this data is my test, and I can throw it around. What that means is I can take something that runs on CI, make it spit out some data, copy that, run it locally, and ensure it reproduces the bug: I have reproducibility. The second thing is across versions: if I run all my tests against v1 and they pass, and I store them, then the data I have now is my regression test for v2. So this gives me both determinism and reproducibility, or rather, determinism implies reusability.

Another thing: if you have a hundred flows, or a lot of flows, running, you often don't know where things failed, and it's really useful to think of your test data as a trie-like structure where each step helps you navigate to your error really fast. One thing that has helped me a lot in the past has been checkpoints. After a run, you know there have been a hundred flows; out of them, this one flow failed, and there I know the user got up to add item to cart; so I know that they logged in and did a bunch of other things, and I can diagnose the error from there on. These domain-based checkpoints mean I'm telling the machine which things I care about; that's the part where I'm automating my thought process into the software I'm writing. Yeah, three or four more slides; we can have questions outside. One minute to wrap this up.

What I can then have is a timeline, where each action has a UUID; given the UUID, I can tell you the request, the response, and the stack trace. And here's an example of an issue I filed on GitHub with the exact results of what I just showed: bug reports can be automated. This slide is just a summation of all the patterns so far. Bottom line: automation is great, and we should think of automating test software the same way we think of automating real software.

The references: Microsoft's Pex patterns; there are parameterized unit tests in F#, and they have written a framework called Pex around them, and a lot of the algebraic patterns and such are from there; it's very good material. The Datomic team has a piece of software called Simulant, and they do a bunch of these things as well; that code is pretty small, so you can have a look there. And that's it. Thanks for your patience, and thanks for listening.