Good morning, everyone. How do I pronounce your name again? As Niana mentioned, I'm Paul, and I'm a developer. I develop end-to-end .NET solutions at work. Outside work, I volunteer in two awesome organizations: Big Data X and DataKind. At DataKind we use data science to help nonprofits get insights from their data so they can serve their communities better. At Big Data X we aim to improve data engineering literacy here in Singapore and the surrounding region, so we organize free workshops and events toward that goal.

Last year I went to Australia. I usually attend conferences because those are the things that make me happy. I learned about property-based testing from Zach, who is speaking right here about Hypothesis, a library for property-based testing. When I came back to Singapore I was delighted with the library, so I shared it with the community here in a talk on property-based testing with Hypothesis. But today I'd like to focus more on the essentials: what property-based testing really is, if you dissect it a bit.

I define it as a type of testing that asserts on properties describing the relationship between the input and the output of the function under test. Is this clear enough? Yep, exactly. That's why I like examples better.

So let's take a testing example: a simple multiply function. For this test function, do you have any comments? Yeah, don't do this. Don't ever do this. This is implementing the functionality as part of your test. So if this is not the right way, what other approaches to testing can we take? One of the basic ones is giving examples. We give example inputs, like two and three, and then we provide the example output. Here we have two of those examples. Example one is this group. Example two has its own set of inputs.
And then the corresponding output: four times five equals 20. We can choose to refactor it and make it look beautiful. We keep the common logic here, still comparing actual output against expected output, but it's nice to have all the parameters, the factors and the expected output, in one place where we can easily add more. So that's still okay. But what we want to ask is: is there a way to not depend on the expected output? Can we write a test that depends on the inputs alone?

I chose the multiply function not just because it's simple, but also because it reminds me of primary school. In primary school we learned about arithmetic properties: the addition properties, the multiplication properties. When we talk about property-based testing, these are the kind of properties we're referring to: the attributes and characteristics of the function we're trying to test. Not properties as in the syntactic sugar in C#, or properties that you own, like bungalows and stuff.

The properties we're talking about look like this. For multiplication you have the commutative property, where you can reorder the factors and still get the same result. You have the associative property, the multiplicative identity property, and the distributive property. How do we apply this to testing? If you notice here, we got rid of the expected output. We depend only on the input data. Using that input data and our knowledge of the properties, we improve our approach to testing. We can test the commutative property in this fashion: we rearrange the factors, and the same function under test should give the same result. That's the property we're testing, the commutative property.
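As a rough sketch of the two styles described so far, the snippet below puts an example-based table next to a commutative-property check over hand-picked inputs. The `multiply` function, the test names, and the extra factor pair are stand-ins, since the slide code is not reproduced in this transcript.

```python
def multiply(a, b):
    """Stand-in for the function under test (assumed, not the slide's code)."""
    return a * b


# Example-based: each case pairs the inputs with a hand-computed expected output.
EXAMPLES = [
    (2, 3, 6),
    (4, 5, 20),
]


def test_multiply_examples():
    for a, b, expected in EXAMPLES:
        assert multiply(a, b) == expected


# Property-based with predetermined inputs: no expected output is needed,
# only the relationship that must hold between two calls.
FACTOR_PAIRS = [(2, 3), (4, 5), (-7, 0)]


def test_multiply_is_commutative():
    for a, b in FACTOR_PAIRS:
        assert multiply(a, b) == multiply(b, a)


if __name__ == "__main__":
    test_multiply_examples()
    test_multiply_is_commutative()
    print("all checks passed")
```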
And so on: the other properties can be tested with the same approach. We're now doing property-based testing, but if you notice, I haven't used any framework yet. Why would you need a framework for such an example? We're still using predetermined inputs, which means I have to set these inputs manually myself. The thing is, for the multiply function, what is the population of our inputs? Can I test the function exhaustively? It's practically impossible. Each factor can take millions of values, millions multiplied by millions of combinations, and it's impossible for me to write all of these down manually.

So is there another way to deal with this? If your input is only a to z, I can just for-loop over it and test all the data. But if your input space is too big for exhaustive testing, we still want to test. What you can do, instead of testing the whole population, is test a sample at a time. Maybe 100 samples; test with that. If it passes, try another randomized sample. We might not test the whole population, but at least we're moving in that direction. And hopefully, somewhere down the road, some combination will expose a bug or something we can learn from. That is using randomized inputs.

In this example I'm now using Hypothesis to generate a random set of integers for one factor and another set of integers for the other. Hypothesis gives us 100 samples at a time; 100 is the default number of examples it generates. You might ask: why use a library? I can code the randomized inputs myself. Well, there's nothing stopping you from creating your own library or function if you want.
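A hand-rolled version of that sampling idea could look like the sketch below, which uses only the standard `random` module. The helper name `check_commutative_on_sample`, the value range, and the `seed` parameter are illustrative assumptions, not something shown in the talk.

```python
import random


def multiply(a, b):
    """Stand-in for the function under test."""
    return a * b


def check_commutative_on_sample(sample_size=100, seed=None):
    """Check the commutative property on one random sample of factor pairs.

    Returns the first failing pair, or None if the whole sample passes.
    """
    rng = random.Random(seed)
    for _ in range(sample_size):
        a = rng.randint(-10**6, 10**6)
        b = rng.randint(-10**6, 10**6)
        if multiply(a, b) != multiply(b, a):
            return (a, b)
    return None


if __name__ == "__main__":
    # Each run draws a fresh sample; passing a seed makes a run reproducible.
    failure = check_commutative_on_sample(sample_size=100)
    if failure is None:
        print("sample passed")
    else:
        print("counterexample:", failure)
```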
It's just that, why reinvent the wheel when there are lots of libraries doing this already? Even if you're working in another language like Java, C#, or F#, why would you recreate the same library? Before you create your own, look into the open source ecosystem and community. Maybe a property-based testing library has already been built for your language. Maybe it's not perfect, but we're a community; we're here to help each other. If you notice something to improve, you can send a pull request to make the library better, and that helps the rest of the community as well.

Apart from randomizing, what else can a library do? Hypothesis, at least, doesn't give you just randomized samples; it also allows more elaborate input criteria. Here, apart from plain integers, let's say you want only values greater than zero, all positive integers. You can do that, or you can apply arbitrary conditions. Think of formal methods: you can define your criteria in that way.

Apart from that, a while ago we were doing example-based testing. Let's say there's a situation or a set of inputs that you really want to test, but you're already doing randomized testing. You can do that too with this property-based testing library: you can add an explicit example. It will run your example plus the randomized samples, so you have the best of both worlds.

And you don't have to stick with the default size of 100. You can increase it by a few hundred. In fact, if you run it in continuous integration, you might want to increase it a lot more, because the more samples you test, the higher the probability that you catch the bug you're hunting for.
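The features just described can be sketched with Hypothesis roughly as follows. The `multiply` function and the test names are assumed stand-ins for the slide code, and the specific bounds, filter, and example counts are only illustrative; `given`, `example`, `settings`, and the `integers` strategy are part of Hypothesis's public API.

```python
from hypothesis import example, given, settings, strategies as st


def multiply(a, b):
    """Stand-in for the function under test."""
    return a * b


# Randomized inputs: Hypothesis draws the integers (100 examples by default).
@given(a=st.integers(), b=st.integers())
def test_commutative(a, b):
    assert multiply(a, b) == multiply(b, a)


# More elaborate criteria: positive integers only, plus an arbitrary
# condition (here, an even second factor) expressed with .filter().
@given(
    a=st.integers(min_value=1),
    b=st.integers(min_value=1).filter(lambda n: n % 2 == 0),
)
def test_product_of_positives_is_positive(a, b):
    assert multiply(a, b) > 0


# An explicit example always runs alongside the random ones, and
# max_examples raises the sample size above the default.
@settings(max_examples=500)
@given(a=st.integers(), b=st.integers(), c=st.integers())
@example(a=2, b=3, c=4)
def test_distributive(a, b, c):
    assert multiply(a, b + c) == multiply(a, b) + multiply(a, c)
```

Saved as a test module, these functions run under pytest like any other tests.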
You might ask: property-based testing is really good, can I use it for everything? This is my thought process. I'm always open to new ideas in case things change, but currently this is how I see it. When testing, I first ask myself: does the function I'm testing have testable properties relating to the inputs I provide? If there are none, I just continue with the usual example-based testing. But if I can derive a generic property that I can use as a high-level way of testing, then I take a property-based approach. After that, I ask myself: can I afford to test all possible inputs to this function? If I can, if the input is maybe just a to z, then I can for-loop over it and use predetermined inputs. However, if the combinations of inputs are so many that I cannot test them manually, then I leverage randomized inputs and test a sample at a time.

Now that we have randomized testing, I don't want to be triggering it manually. You don't want to be manually hunting for bugs. You want it set up so that while you're sleeping at night, something is working for you, hunting the bugs, so that when you wake up in the morning, if there are problems or bugs, you just get a report: hey, there's something wrong with your function, you should fix this. You can have a system like that without having to pay for it. It does the work for you, so why not?

Hence we have hunting bugs with CI. We have the normal CI; you can keep it. It will do its job, which is early detection of problems, and that's fine. What we want to add is another CI pipeline dedicated to bug hunting. I call it the bug-hunting continuous integration.
The difference here is that instead of being triggered by merges, pull requests, and so on, I want this one to be scheduled. If the CI platform I'm using allows it, I want to schedule it as often as possible: maybe every hour, or every day if hourly isn't allowed. The purpose is to hunt bugs. Even if I'm away or sleeping, it will hunt bugs for me. It will try those randomized inputs and then probably give me a result showing that, with certain combinations, the function fails.

So let me do a demo. I have some screenshots, but I'm not going to use them. They're only a backup in case the internet goes off, or the electricity has problems, or the world as we know it ends. I can't do much about the other two, but you know what I mean.

What we have here is a sample for arithmetic, the one we discussed. We have the multiply function, and we also have a set of property-based tests with randomized inputs, so you have random integers and then this. What I'm going to do now is integrate this using... Who uses GitHub here? Anyone using GitLab? Which one do you like most? I use both. For GitHub, I'm using Azure DevOps because it has nice integration. Azure DevOps, by the way, is free. I don't know why they put Azure in the name, because usually when people think of Azure, they think of the paid version. But yeah, all of my setups are free.

Let me add a simple CI for this. Inside Azure DevOps I can build a new pipeline by clicking that. Then I go to GitHub and select my... This one. Multiplication. Because it's done in Python. Multiplication GH demo. Once I have that selected, I just need to configure it. For this one I'll need to cheat a bit; I have something configured already. Let me just... What it does, once you have this configured, is automatically check it into your GitHub repository. So I'm just going to copy this and paste it here.
I included the test step here, where I use pytest to run the property-based tests written with Hypothesis. That's part of the steps. Then I'm just going to save and run. I'll commit it to the master branch. Once that is done, I should be able to see it checked into the repository, and once it's checked in, I should be able to see the steps in the CI. It will pull the necessary files and information: the normal CI steps you usually see, including building the artifacts and also the testing. It's currently preparing for the job to be queued. While waiting, I can take some questions if there are any in the audience.

It's doing the normal CI, what CI needs to do. It's quite fast installing the assemblies. I did this setup in Azure DevOps, and I also did it in GitLab; I'll show you my GitLab setup later. What we're trying to do now is set up the CI first. Once the CI is done, you don't want to be kicking it off manually, so you need to explore whether your CI platform can schedule the trigger. I'll show that to you as well.

So it's done. What I want to show here is that it has a testing step. In the test output, it's so small, but if I scroll down you should be able to see the Hypothesis statistics. For the first test I'm using 200 examples; the rest use the default, so you have 100 passing examples. This is fine. This is good.

Once we have this, the next thing is to set the schedule so that it triggers on its own every day or every hour. Unfortunately, I think for Azure DevOps the finest granularity it allows me for scheduling is once per day. That's okay. Once per day is better than nothing. I can just increase the sample size to 1,000 so I can test more.
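One way to give the scheduled run a larger sample without editing the tests is a Hypothesis settings profile, sketched below. The talk does not show how the example count was configured, so the profile name `bughunt` and the `HYPOTHESIS_PROFILE` environment variable are choices made for this illustration; `settings.register_profile` and `settings.load_profile` are Hypothesis APIs.

```python
import os

from hypothesis import given, settings, strategies as st

# Hypothetical profile for the scheduled bug-hunting pipeline.
settings.register_profile("bughunt", max_examples=1000)

# The scheduled job would export HYPOTHESIS_PROFILE=bughunt (an assumed
# variable name); any other run falls back to the built-in default profile.
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "default"))


def multiply(a, b):
    """Stand-in for the function under test."""
    return a * b


@given(a=st.integers(), b=st.integers())
def test_commutative(a, b):
    assert multiply(a, b) == multiply(b, a)
```

The two `settings` calls would typically live in `conftest.py`, so the normal pipeline keeps the default 100 examples and only the scheduled one pays for 1,000.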
So I'll go to the releases. Oh, sorry, it should be builds. In the multiplication pipeline, I'll edit this. Under the settings you have triggers, and in the triggers you should see the setting for schedules. Here: Scheduled. I'll add a schedule. It allows scheduling every day. The thing I want to uncheck here is "only schedule builds if the source or pipeline has changed." For bug-hunting CI we don't care whether the source code has changed, so I'm going to uncheck that. Then, let's say, every day at 2 a.m., while I'm sleeping, something is doing the job of hunting bugs for me. I'm going to save this. Save and queue. It will save it on its own.

GitLab has similar functionality. You can set up your CI there too. But instead of only allowing once per day, GitLab lets you run every hour. If you notice here, once I set it up (this is GitLab, by the way), it's building and hunting bugs for me every hour. So it's your choice, depending on whether you're currently hosting in GitHub or GitLab. And these are free public repos.

If I go to my builds, this is where I can monitor the build results as well. For this multiplication GH demo, you can see every build result from here, like the test results I showed you earlier. If something fails, you should be able to see it failing here as well. Let's say Hypothesis found that your function failed for some combination; then you go and fix the bug.

With that, let me bring it to the summary. What we've done so far is explore various ways of testing a function. We started with basic example-based tests, providing example outputs. After that, we refactored them into a parameterized form. Then we explored at least two approaches to property-based testing.
We used predetermined inputs, and then randomized inputs for when we can't exhaustively test all the possible inputs to the function. We also explored the bug-hunting CI pipeline, a pipeline separate from your normal one. Let your normal pipeline do its thing, and use this pipeline to help hunt bugs triggered by certain combinations of inputs. You can reach me here. That's my Twitter and GitHub handle. I've shared the links here for the previous Hypothesis demo, if you want to dig more into the Hypothesis property-based testing framework, as well as the demos I used a while ago. And that's it from me. Thanks.

Okay, I think we have time for one question. Yes? Yeah, you can give the mic. Yes?

Can you give a real example of property-based testing? Because with multiplication, it seems the test doesn't really catch a bug like, let's say, integer overflow. Because you test commutativity and all this, you might not catch it, right?

Yeah, that's true. Apart from randomized testing, there's also an approach called fuzz testing. In fuzz testing you usually have randomized inputs, and the goal is to throw as much input as possible at the system so that it breaks. That is another type of testing. For property-based testing, you're thinking of a predetermined, deterministic result: you want to check that the result you get coincides with the property of the function you're testing. What I'm sharing here is that property-based testing won't replace all the testing you've done so far. You still have your example-based tests; you still have your fuzz testing for that purpose. Plus, now we have another approach, property-based testing, which we can add to the suite of tests we apply to our functions. So the example here was just multiplication, right?
But another popular example for property-based testing is, let's say, a list. You want to reverse a list. One property there is that if you reverse the list twice, you should get back the same list you started with. If the order is one, two, three and I reverse it, it's three, two, one; I reverse it again, and it should come out as one, two, three. So the input and the result of reversing twice should be equal. And of course there are a few other approaches, like the diamond kata, which also teaches you how to spot properties from the inputs and outputs. Usually you look at the input to the system and the output and check whether there are properties relating them. As for the one you raised, because people also say that, hey, overflow is a property too: it might be case by case. I would categorize that as more appropriate for fuzz testing, if that's the goal, but for a property I want to have a deterministic expectation.

Thank you. Okay. Thank you for the question. Thank you very much, Paul. Please give him a round of applause. Thank you, everyone.
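The list-reversal property described in the Q&A can be sketched with Hypothesis like this. The `reverse` helper is a hypothetical stand-in for whatever function is being tested; it was not shown in the talk.

```python
from hypothesis import given, strategies as st


def reverse(items):
    """Stand-in for the function under test: returns a reversed copy."""
    return items[::-1]


@given(st.lists(st.integers()))
def test_reverse_twice_is_identity(items):
    # Reversing twice must give back the original list.
    assert reverse(reverse(items)) == items


def test_reverse_example():
    # The spoken example: 1, 2, 3 becomes 3, 2, 1, then 1, 2, 3 again.
    assert reverse([1, 2, 3]) == [3, 2, 1]
    assert reverse(reverse([1, 2, 3])) == [1, 2, 3]
```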