Okay, so I want to welcome everyone to this meeting of the Houston Functional Programming Users Group. Today we have one of our own speaking, and I'm going to let him introduce himself. He is our relatively new liaison for organizing and improving all the technology and making sure everything actually runs, and we are very, very grateful to him. I appreciate it. I'll turn it over to you.

Okay, cool. I'll talk a little bit about myself first. My name is Ahmed, and I'm a programmer here. Professionally I work in a couple of languages, C# mainly, and I play with a few others on my own time. So, this talk today is about property-based testing in F# using FsCheck. And in reality, even though I'm using F# and FsCheck, it's going to be about property-based testing in general: what it is, how it fits into your testing strategy, and the mindset to bring to it. I'll be showing everything with F# and FsCheck, though. And if anyone has any questions or anything, please speak up and let me know. So we'll talk about the concepts behind property-based tests, and about invariants, which are central to what this is all about. I'll walk you through an actual toy example to show you the features, and then how you'd use this in a real case. And a caveat, again: I'm not an F# or FsCheck expert; I've gathered this from the various languages and libraries that include this kind of testing. So, before starting: is everyone here familiar with the general idea behind automated testing?
[Audience members indicate they're comfortable with automated testing.] Okay. So, to run FsCheck, of course, you first install it. FsCheck can be used standalone, and that's how we'll use it here, because I'm running everything in a script and calling it directly. But FsCheck also has support for xUnit and NUnit, so if you're using those frameworks, there are additional libraries that fit in very nicely. I'm going to be running it directly just because of the way I've set things up.

So we have FsCheck installed, and now we need some code to test. I want a toy example: something simple enough that we can focus on the testing process, but that still has enough interesting edge cases to give us something to work with. So I'm going to write a division function, which is essentially the same thing the built-in operator already gives you; I'm going to write my own intentionally not-so-good version, and we'll use it as a way to show the different capabilities and features of FsCheck. I'll do some example-based testing on it as well, and then we'll move on.

So: divide. I said, okay, I want a division function. I'm going to take x and y, divide x by y, and give back the quotient. So I define this function; let's say this is my function. Okay, now I want to start testing this function. Normally, the way we write tests is what we call example-based tests, which for a lot of people are just "tests", because that's the default, right? And essentially what happens is you go ahead and call the code under test.
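The toy function and the example-based test described here might look roughly like this in F# (a sketch; the speaker's exact code isn't shown in the transcript, so the names and the specific example values are mine):

```fsharp
// An intentionally naive integer division function, as described in the talk.
// Note that it throws System.DivideByZeroException when y = 0 -- which
// becomes relevant later in the talk.
let divide (x: int) (y: int) : int = x / y

// An example-based test: call the code under test on one specific data
// point and assert the expected output.
if divide 4 2 <> 2 then failwith "expected divide 4 2 = 2"
printfn "example-based test passed"
```

With enough assertions like this over well-chosen inputs, you build confidence one data point at a time.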
You call divide on specific data points, and you compare the result to what you expect. So essentially you're asserting: this is an example input, I know this input should produce this output, here we go. And at the heart of it is some kind of assertion. So here I run my test, I run an assertion: if I divide four by two, I should get two. Okay, it passes. We're done, we have one passing test, we're good to go, right? Of course, if that were all we did, we'd have bigger problems than not using property-based testing. But essentially, with enough of these types of example-based tests, we get more confidence that the code we have is working the way it should.

Now, this is kind of the default: whenever you talk to people about any kind of unit test or automated test, a lot of the time it's example-based tests. We don't really think of any other kind of test. And it has some advantages: we can confirm the code is correct, at least for very specific data points, and we can test a function based on what we know should go in and come out, without needing any view into or insight about how the function works internally. But the disadvantages are that we have to provide all of our data points ourselves. We have to provide a representative enough set of data points that we feel covers the input space well enough to flush out the bugs. And it becomes easy to miss edge cases or different code paths, or to let our own biases creep into the tests, even when we're not trying to be biased.

[Audience member:] Can I interrupt with a very quick story? [Speaker:] Sure. [Audience member:] Back in the 90s, I had been hired as a researcher, and I ended up being tasked with writing tests.
And as I was writing those tests for the system, I was writing all these tests for things that couldn't possibly happen, and I was told to stop writing those tests. And then the impossible happened, wouldn't you know it. I don't remember the details anymore; I just remember thinking, I told you so. [Speaker:] That's a really good point, because there is that danger of thinking, okay, this is always the way it will be used, so I'm not going to test anything else, maybe not even including things we could or should test. And that's one of the problems with example-based tests: sometimes the people choosing the test cases don't realize where the interesting cases are.

So, what are our alternatives? Now, before I go on: I used the term "alternative", but in reality, property-based tests and example-based tests live side by side. You use them together; one feeds into the other. It's not an either-or situation; you use whichever is appropriate. So what does a property-based test look like? Let's look at an example. And here's where you get to see some currying; we're in F#, so currying is idiomatic. In a property-based test, you're testing for properties that have to universally hold, regardless of the input. Another way of looking at it is this: you're going to write a test, but you don't get to choose what the data is, and the test has to pass anyway. So you look for things called invariants: things that have to hold no matter what your data is. So in this case, we start thinking: division. What's something that universally holds for division, no matter what? One obvious one is the identity property of division: anything divided by one is equal to itself. x over one equals x.
Okay. So what I do is I write this property. It basically says: this is a property, and it has two arguments. One of them is the function to actually call, and the other is the input x. Now, in this case, I'm not hard-coding the actual function; I want to be able to pass in different functions. I want to pass in divide, but I also want to pass in some other functions later. So I say: this property takes a function and an argument, and it asserts that the function applied to that argument and one equals that same argument. Anything divided by one is itself. That's my identity property.

And the other part of it is this call here: Check.Quick. What this does is tell the property-based testing framework: go ahead and run this over a set of random values, and confirm it holds for every single one of them. So in this case, I say: take this identity property and pass it the division function. I'm just passing the one argument, because of currying: the property takes two arguments, and partially applying it gives a function that expects one more argument, which is left unspecified. The property-based testing framework, FsCheck, will provide values for the other parameter, and it knows what kind of values to provide because it knows the type.

So I'm going to run this. And you can see right here: it passed 100 tests. It takes a moment to run, but that could just be the notebook. So it basically ran 100 times with random values for x, and it confirmed that the property held on every single run. [Audience member:] Can we see which values? [Speaker:] Yes; in fact, later on I'm going to show you how we can actually see all the values that were generated, for troubleshooting. [Audience member asks another question.] Sure; I'll give you the short answer now and the long answer in a bit.
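The identity property and the check just described might look roughly like this (a sketch in FsCheck 2.x style; the names are mine, and the `FsCheck` NuGet package is required):

```fsharp
open FsCheck   // NuGet package: FsCheck (2.x API assumed)

let divide (x: int) (y: int) : int = x / y

// Identity property: for any function f and input x, f x 1 should give
// back x. f is the function under test; x is left for FsCheck to supply.
let identityProp f (x: int) = f x 1 = x

// Partially apply the property to divide; FsCheck generates the missing
// argument from its type and runs the check 100 times by default.
Check.Quick (identityProp divide)      // prints: Ok, passed 100 tests.
// Check.Verbose (identityProp divide) // same, but prints each generated value
```

`Check.Verbose` is how you see every value the generator produced, which comes up again below.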
[Audience member asks about constraining the generated values.] Yes, you're anticipating it, but yes, there is support for filters. I'm going to show you how to use those in a little bit, because we're going to run into exactly the situation that needs them. So you're anticipating; yes, you'll see how to do that.

[Audience member asks about the range of the random values.] So, this is a really good question; in fact, it's something that bears a little bit of attention. FsCheck does take random values, but I've noticed that it picks random values constrained within a narrow range. It's not actually generating values across, say, the entire range of integers; it's drawing from a much narrower range. Some of them are negative, some positive, some around zero. And I think it's doing that intentionally: it's trying to pick values that are likely to flush out issues. In fact, when we look at the verbose output later, you'll notice it's generating a very, very narrow range. You can give it an end size to make it generate a much broader range of numbers, but if you do that, it's much less likely to hit, say, zero. So the answer is: yes, it's random, but it's constrained to a really narrow window of integers in the generator. That was my reaction at first, too. Later on I thought, okay, I kind of see why: with a really broad range of numbers, the chances of hitting interesting points, like zero, are much, much lower. So I think it tries to stay in that window where it's likely to run into anything problematic.
[Audience member:] Sorry about that; I thought I read that the original QuickCheck in Haskell did this by default: it sort of analyzed the input range and heuristically picked boundaries it felt would be interesting, like automatically including zero just to check, and maybe max integer, just to check. [Speaker:] Yeah, I can't say for sure how FsCheck does this. FsCheck is definitely inspired by QuickCheck; QuickCheck is kind of the granddaddy of these frameworks. And it may very well be choosing these values because it decides they're interesting. But in running them, unless I actually change the end size, which is a parameter that says what the largest value it should generate is, I've noticed it never goes beyond a very narrow range of numbers. There is a default max, and you can configure it. And once you see the values, you kind of see why it stays in a narrow area. When we run the verbose examples, you'll see the kinds of values it generates, and you'll notice they're very biased towards a small window of numbers. And yes, it was something that bugged me for a bit.

[Audience member:] And by default, how many runs is it, 100? [Speaker:] By default it's 100, and you can change that too. [Audience member:] I'd rather have, say, eleven values from across the whole range than a hundred in a narrow band. [Speaker:] I think the general idea is that you want enough samples that you're likely to hit unexpected behaviors if they're there, but not so many that it bogs down your tests. I guess they chose 100. But the fact is you can change the number of tests; there's a max-test parameter on a configuration object, and you can control it.
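The knobs mentioned in this exchange live on FsCheck's configuration record, passed to `Check.One`. Roughly, assuming the FsCheck 2.x field names (`MaxTest`, `EndSize`):

```fsharp
open FsCheck   // NuGet package: FsCheck (2.x API assumed)

let divide (x: int) (y: int) : int = x / y
let identityProp f (x: int) = f x 1 = x

// Start from the default configuration and override the knobs discussed:
// the number of random cases and the upper bound on generated sizes.
let config =
    { Config.Quick with
        MaxTest = 1000      // default is 100 runs
        EndSize = 10000 }   // default keeps generated values small

Check.One (config, identityProp divide)
```

Raising `EndSize` widens the window of generated values, at the cost the speaker mentions: special values like zero get hit far less often.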
But I think it's about finding a balance, where you get a large enough sample to at least figure out whether you run into any issues. So, good questions. Any questions from anyone out there? Feel free to interrupt if you have questions. Okay, those were my interruptions; yes, I just spoke about myself in the third person.

So again: we leave at least one argument to be randomly generated. And we can have multiple arguments that are randomly generated; FsCheck will provide them all according to their types. At the very least, it respects the types. And it gives us more test coverage: we got 100 tests, and the framework is going to be picking all these different values that we might not necessarily even think of picking. But the cost is that we end up with less depth, less test depth. In an example-based test, we actually confirmed that the division was correct for that example. In this case, identity is a very shallow property: we got 100 tests, but there are a lot of broken functions that could pass this particular test. There are more sophisticated invariants that can give us more depth, but we can't always be so lucky as to find those kinds of invariants. So we trade depth for breadth. And of course, we always have to think of an invariant: what is it that universally holds in the code I'm testing? Sometimes that can be a non-trivial task.

To give you an idea of the shallowness of this particular property, here's a fake division function. It takes two arguments and always just returns the first argument, ignoring the second. And that's why I also parameterized my identity property: so I could pass in different division functions.
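The "fake division" might be as simple as this sketch (the name is mine). It ignores its second argument entirely, yet still satisfies the identity invariant, which is exactly what makes that property shallow:

```fsharp
// A deliberately wrong "division": ignores y and returns x unchanged.
let fakeDivide (x: int) (y: int) : int = x

// The identity invariant f x 1 = x still holds, because fakeDivide
// returns x no matter what the second argument is.
let identityProp f (x: int) = f x 1 = x

// With FsCheck installed, Check.Quick (identityProp fakeDivide) would
// report a pass across all generated values, despite the obvious bug.
```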
And of course, if I run this, we're going to see that, lo and behold, it passed, even though this is an obviously fake division. Again, it's a shallow property. So we get test breadth, but we also want to figure out how we're going to get depth. Do we add more properties? Do we make our invariant more sophisticated? We're going to start digging into that in a little more detail.

So, the first key with property-based testing is that we need to start thinking about invariants. We want to look at our code and start asking: what are the universal properties that hold for my code? Maybe it's even a property that holds only in a particular window; maybe there's a constraint, but it's something that at least holds for a sufficiently large subset of the input. You really want to use your insight into your own code, but if you're not too sure, here are some stock invariants, and there are more out there. Some common invariants:

Identity, which is what we just tested: given your function, there's some value y that always gives you back your argument. And if you have multiple arguments of different types, you may have to squint a little and massage things. A constant element: there's some value for your x or your y that causes your function to return a fixed point, the same constant every time. Of course, we know about commutativity: hey, can I switch my arguments and still get the same result? Idempotence: if I apply my function, and then apply the function again to the result, do I get the same result? Or do successive applications of the function beyond the first one have an effect?
We've got an inverse: does there exist some function g that gives me back my input, given any output? Is this function reversible, and do I know what its inverse is? And we even have this idea of an oracle: is there another function that does the same thing this function does, even if only over a smaller subset of the input? An example of an oracle: if you're writing a function that's supposed to be a more efficient version of a known function, you can at least use the known function to test that your more efficient version is correct.

[Audience member:] Is there a reason associativity isn't listed? [Speaker:] There is a very good reason associativity isn't listed: I forgot. And this is definitely not an exhaustive list; there are definitely more. You can have associativity, and there could be another dozen I didn't even think of.

[Audience member:] I hadn't heard of "oracle" before; your description is perfect, but why is it called an oracle? [Speaker:] Okay, I'll give you the short answer and the longer answer. The short, true answer is that the website where I got this list from calls it an oracle. There's actually a website, and I'll link it, that has a list of good properties to use and goes into more depth on them; I've got a further-reading section where I can paste the URLs. The longer answer is that the term "oracle" is used in some circles, like academic circles, for a function that can give you a result, some kind of prediction, about a query. And the idea is that the function you're testing against serves as that kind of oracle.
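An oracle property, along the lines described here, compares the function under test against a trusted implementation over generated inputs. A sketch (names mine; the built-in `/` operator stands in as the oracle for the toy divide):

```fsharp
let divide (x: int) (y: int) : int = x / y

// Oracle property: the function under test must agree with a known-good
// implementation (here, the built-in / operator) on every generated input.
// Note a generated y = 0 would throw on both sides; the filters shown
// later in the talk are the way to exclude such inputs.
let oracleProp f (x: int) (y: int) = f x y = x / y

// With FsCheck installed: Check.Quick (oracleProp divide)
```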
[Audience member:] I'm going to be a bit of a butt here, but how is an oracle a property, and not just a different version of standard unit testing? [Speaker:] Well, it's a property because you're not giving it specific data points. You're saying: for all x, over 100 random data points, this oracle evaluated at x has to be equal to my function at x. Remember, with an example-based test, you're giving it specific data points. [Audience member:] Okay, and that's my point. This is coming from a real example that we can talk about offline, over alcohol. But basically: I've got software that produces a particular output, I've written new software, and I'm just going to compare them and see if they give the same results. They give the same results a dozen times, a hundred times, a thousand times, and therefore I conclude that my new software produces the same results as the existing software. [Speaker:] Correct. [Audience member:] But you're not proving it, I guess, is my point. [Speaker:] Oh yeah, correct. That's not a proof; you're randomly sampling the input rather than exhausting it. Everything here is just running over a sample of your input. Even these properties, any one of these properties: none of this is a proof, because it's just, okay, I ran it a hundred times. And yes, by many developers' standards a hundred runs might as well be a proof, but any mathematician would laugh us out of the room, because it's definitely not a proof. A hundred samples is never going to be a proof. [Audience member:] Absolutely. [Speaker:] Or a thousand, or a million. Unless you actually run through every single input, or you use an actual proof language and prove your program correct.
And you've got stuff like F*, which is kind of neat, and a few other proof-oriented languages, but that's a whole other thing, where you actually do that kind of symbolic analysis and get: hey, this was actually proven. But no, none of this is actually proof. [Audience member:] Okay, I understand all of that. I think what's interesting for me is that until now, I wouldn't have thought that getting the same results from two different software packages could be considered a property. Identity is clearly a property, commutativity, associativity; but I hadn't thought about this oracle property, maybe for that reason. I hadn't thought of that as a property, and that's where I got a little stuck. [Speaker:] Right. So what I would definitely say is: even though we're calling them properties, and they definitely bear a resemblance to mathematical properties, we don't want to read too much into that or expect them to be exactly the same, because, as you both pointed out, we're not even proving any of these properties. We're just saying: hey, we ran it a hundred times, looks like it holds, let's move on. I might even put "properties" in quotes, sort of. [Audience member:] That makes sense to me. Like: there are multiple ways of calculating the mean. So if this calculates the mean, and I test it against that one, then I have more confidence that the way I'm calculating the mean works. [Speaker:] Right, and that's exactly what it is: it just gives you more confidence in these things. [Audience member:] For example, that sounds like what you do when you're refactoring, when you're refactoring some huge piece of code that you don't have a lot of tests on. [Speaker:] Exactly.
And you change this thing, and you think you've covered all the cases. Or even for an entire rewrite of a system: a lot of the time you've got the old system, and you can write as many of these as you want, just to make sure that your new system, your new code, matches the old one everywhere. You have a function f1, you refactor it into function f2, you run this on them, maybe set your test count to, say, 10,000 and have a really nice deep run, and then basically say: okay, looks like my refactor was a success, or at least it definitely gives me confidence. [Audience member:] When I refactor, I'm using a static language, so it's like: yes, it still compiles. I don't actually test whether it works. [Speaker:] Want to set our sights a little higher? If you get nothing else from this talk, just set your sights a little bit higher, just a little bit higher than that. [Audience member:] No, no, this was helpful. [Speaker:] These are all good. I definitely want you to bring these up, because this is exactly the kind of stuff we want to talk about; I'd rather this not just be a monologue. So this is good. And you're absolutely right: refactoring is a classic example. The oracle and refactoring is a beautiful pairing. And then again, you sometimes have solutions that are hard to find but easy to verify. We know about these NP-hard problems: hey, I know that at least this much has to hold for a valid solution, so I can test for it and see what happens. That's an indicator. And again, there are definitely more of these; like you pointed out, associativity, and others. But this list will get you started. And a lot of the time you may actually figure out your own ad hoc property from your own understanding of the code; definitely go with that. But these are things to lean on if you're not sure.

Okay. So now let's talk about commutativity.
So we know commutativity is an invariant that holds for some operations, some functions. Like addition: addition is commutative. Now division, of course, is not commutative: generally speaking, x divided by y is not equal to y divided by x. And another example, just to take it away from the realm of mathematics, because this isn't just about numbers and math, you can do this on anything: appending lists is not commutative. x appended to y is not the same as y appended to x, in general. So we say: okay, commutativity doesn't hold here; let's have an invariant that says this function is not commutative. So I write another property here, called notCommutative. And again, I'm going to pass in a function, just like before, so I can test my fake division and my real division, along with my arguments x and y. And I'm just saying: f applied to x and y should not be equal to f applied to y and x. And then I'm going to go ahead and run it first on my fake division. And here we go: now we see what failure looks like. Behold the face of failure. In this case, you can see here that it says: yeah, this doesn't hold, and here's why. And it gives me a little bit of an error report along with the data where it failed. We're going to break down what this means, what a StdGen is, what a shrink is, all that stuff; we'll really get into that. But right now we get a sense of at least what a failure looks like. And we also get another important point: if you have a very shallow property, you can add another shallow property and test them in sequence, and there's a good chance they're going to start flushing out issues.
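The not-commutative property being described might be sketched like this (names mine; FsCheck 2.x style assumed):

```fsharp
open FsCheck   // NuGet package: FsCheck (2.x API assumed)

let divide (x: int) (y: int) : int = x / y
let fakeDivide (x: int) (y: int) : int = x   // ignores y

// A "must not hold" property: swapping the arguments should change
// the result for every generated pair.
let notCommutative f (x: int) (y: int) = f x y <> f y x

Check.Quick (notCommutative fakeDivide)
// Falsifiable whenever x = y (fakeDivide x y = x and fakeDivide y x = y
// are then equal); FsCheck reports the shrunk counterexample.

Check.Quick (notCommutative divide)
// Fails fast: a generated y = 0 raises DivideByZeroException on the
// swapped call, and x = -y makes the two quotients equal.
```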
So even one shallow property may not get you that far, but two shallow properties can get you a surprising amount of depth and flush out issues. Fake division passes our identity property, but the moment I threw commutativity at it, it failed. And in this case, you can see from where it failed that you'd suspect it fails elsewhere too. And if you run it a few times, it generates different sets of values; you also see that narrow range of values again. [Audience member:] That was after 92 tests? [Speaker:] Yeah, exactly, it took 92 tests to get to this point. And that's why you add more of these shallow properties: one of them has a very good chance of flushing something out.

Now, when I run this on the actual division, instead of the fake division, it fails faster. It says: falsified after 2 tests and 1 shrink. The original values were (0, 3), where it failed, and then it shrank them to (0, 0); and at (0, 0), it shows me the exception. So shrinking is an attempt by the property-based testing framework, and most property-based testing frameworks try to do this, where, when it finds a failing test case, it tries to walk it back to the simplest case in which the failure still occurs. But sometimes it will actually flush out a different failure along the way. In this case, it found (0, 3) failing, then shrank to (0, 0), and now we see it telling me: I got a division-by-zero error. And intuitively, you can see why we're getting this: we're trying to test commutativity, we're generating random data, and my denominator might well be zero in the random data that's being generated.
And this brings us towards what you were saying, Brian; we're going to see how to address that. So in this case: okay, I got a division by zero. That's not cool. [Audience member:] So on that test, it was trying to do zero divided by three, and then checking whether three can be divided by zero? [Speaker:] Yes, exactly. Zero divided by three is fine; three divided by zero, not good. And sometimes your shrinks will go in such a way that you start with one error, and the shrinking hits a hidden, different exception, so the shrink actually reveals a different error from your original. That happens sometimes with the way shrinking proceeds. But in this case, it was the division-by-zero error each time, and it shrank to (0, 0). So by reversing the arguments, we've realized we can't just ignore this; we've got to deal with it.

So, over here, one thing you notice, well, not necessarily the first thing, is that when it gives you a failure, it gives you a StdGen with these two rather intimidating-looking numbers. Those numbers are random-number seeds that you can use to reproduce that exact run. So if you decide, hey, I want to reproduce this run after I fix the code, you can take these seeds and feed them into the configuration, and I'll show you how that works. And of course, "Original" tells you at what point the data failed: this is x and this is y. "Shrunk" narrows it down to (0, 0), and of course we see the division-by-zero exception. As it was shrinking, it hit the exception every time, so it just kept shrinking until it got to the smallest values. In some cases the shrink may land on a different failure, say shrinking to the point where it hits a division by zero you hadn't seen, but here it was division by zero each time, shrunk to (0, 0). [Audience member:] But I thought it said only two tests?
So it was two random tests: one succeeded, because the property does hold in a lot of cases, and the second one failed. And this should also give you a little insight into why the narrow range of numbers helps: if we had a really wide range of numbers, the chances of hitting zero, or even two values related in the right way, could be very low. Having run this a few times, I started seeing why they want to keep the range narrow.

So, we've seen what shrinking does. The idea behind it is that it tries to help you get insight into what caused the error by finding the simplest case in which it happens. Okay. Now, the StdGen, those random seeds we saw: we can use them to reproduce a particular run, and we have a function called Check.One, where we can feed in a configuration object to customize our tests. Or, if we wanted to, we could just capture that particular failing case as an example-based test. Remember, your property-based tests and your example-based tests are not in competition; one can feed into the other. If you realize, hey, this just revealed an edge case I hadn't thought of, you can say: guess what, you're now going to join my standard unit tests, because we need you; this is an edge case. You can definitely use property-based testing as a discovery mechanism for example-based tests.

Okay. So let's get into how we actually do some of the things we've been talking about. So here's Check.One: we call Check.One, we have our configuration object, and we can customize it here, in this case by setting it to the particular run we want to reproduce. So let's go over here, and we see this is the particular run.
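Reproducing a run from the printed seeds might look like this (a sketch; the seed numbers below are placeholders, not the ones from the talk, and the `Replay` field with `Random.StdGen` is the FsCheck 2.x API):

```fsharp
open FsCheck   // NuGet package: FsCheck (2.x API assumed)

let divide (x: int) (y: int) : int = x / y
let notCommutative f (x: int) (y: int) = f x y <> f y x

// Feed the two StdGen numbers from a failure report back in, so FsCheck
// replays the exact same sequence of generated values. (Seeds made up.)
let replayConfig =
    { Config.Quick with Replay = Some (Random.StdGen (123456789, 987654321)) }

Check.One (replayConfig, notCommutative divide)
```

Saving the seeds from an interesting failure lets you rerun that exact case after a fix, before (or instead of) promoting it to an example-based test.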
I wanted to pin this down, because every time you run this, it's going to give you different runs. In this case, you can see here that (1, -1) failed, and it was not a division by zero; the shrinker walked it back until it hit a different error. I want to revisit that point: sometimes shrinking can reveal a different error. So if I go over here and say, hey, I want to repeat this exact test run: why did (1, -1) fail? Well, our property says division is not commutative: x divided by y should not equal y divided by x. But one divided by negative one is negative one, and negative one divided by one is also negative one. So in this case they happened to be equal: if x is the negation of y, division is commutative for that pair. Right, because this check asserts non-commutativity. No, no, that's fine; most of the time we think in terms of invariants holding, so this one is a little backwards. All right. So here we say, let's go ahead and set our seed and run it, and we see that this reproduced the exact run we had before. So if we wanted to, we could just save whatever the StdGen gives us, and then we can always reproduce our test run. When we need to fix something in our code, we can keep the test runs that are of interest and rerun them. Can I ask a quick question? I'm interested in this.
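Assuming the FsCheck 2.x API, replaying a run looks roughly like this; the two seed numbers are placeholders standing in for the pair printed in a real failure report.

```fsharp
open FsCheck

let divide (x: int) (y: int) = x / y
let nonCommutative (x: int) (y: int) = divide x y <> divide y x

// Placeholder seeds: substitute the pair from your own Falsifiable report.
let replayConfig =
    { Config.Quick with
        Replay = Random.StdGen (1145655947, 296144285) |> Some }

// Check.One (replayConfig, nonCommutative)  // reproduces that exact sequence
```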
Is there the ability to run the test until you get a failure instead? Sorry: if you're looking at non-commutativity, you might not want to prove that it's non-commutative for every pair of numbers; you might want to say that for some pair of numbers it's non-commutative. Is there some way of turning this around, so instead of making sure everything passes, you look for the case where at least one of them fails? Okay, so let me make sure I understand the question. In the case of commutativity, there are ways of customizing the test with a filter so that it limits the range of what's being tested, so only values that satisfy certain conditions make it into the tests. Cases like division by zero, or x equal to y, get excluded, and the non-commutativity check runs only on the scenarios where we know it should hold. Was that the question, or did I misunderstand? I guess my question would be, here's a variation. Say I have a test for the greatest common divisor of two numbers, the Euclidean algorithm, and coming at it from a completely naive point of view, I might want to say that at least for some pairs of numbers this value is greater than one, right? Because for many pairs of numbers, if I took a whole slew of random numbers that weren't really well chosen, I would probably get a lot of ones as the greatest common divisor.
So I'm just curious, and I think I only played around with FsCheck a long time ago, whether there's a use case for saying there exists at least some pair of numbers. I don't know if it would work well in a unit-testing scenario; I'm just curious if that's another thing one might consider, or whether it's useful. Okay. There might have been; I'd need to check the documentation, but what you're saying rings a bell. There might be some kind of test that allows this. Don't quote me on that, but I can dig around and see. In general, though, for property-based testing you probably want to lean towards the for-all case rather than there-exists. But I definitely see what you're saying. The big danger with the there-exists case, of course, is that if your random sequence of numbers just doesn't happen to produce the case, even though it does exist, that's a problem. That's the danger in testing that way. Yeah. Or maybe you could say run until it fails; I don't know, that might be dangerous too. It's going to be one of those things that's harder to pull off, because if it's one case among, say, billions, that's a tough situation. It almost works out better if you have a symbolic checker that can analyze what's implied by the code; it seems like something a little more suited to a theorem prover. Yeah, you're probably right. I was just curious.
I couldn't think of a good use case except the GCD calculation, but you could drive the test cases out of the intermediate steps of the GCD algorithm and sneak up on it: feed it partial GCD results and see if it notices. Right. And in fact, FsCheck does give you the ability to customize your data generation. So if you do decide to go that route, you have the support to generate data that's related in certain ways; or you generate a large enough set of data where those relationships make the interesting case much more probable. That could be a way. What I think about now is the definition of what constitutes a property. If it constitutes a property, it should be something you can model, not just something you wait for to emerge randomly. And that's the danger with any kind of there-exists, right? With for-all, you at least know what you're relying on. When you say property, do you mean for-all? I guess that's the question. Generally, yes: behind the scenes, a property is essentially a for-all; you're quantifying over everything. And again, there might be ways of doing something like this, because what you're talking about, Chris, I do remember, maybe from the documentation, seeing something like that. But it does seem to me that in general it would be harder, and there would be a lot of issues around it, because a property is this implicit for-all. It can be constrained by certain conditions, but it's still a for-all within the range we're testing. Yeah.
I guess I'm just curious, because whenever I hear a logical quantifier like for-all, I wonder if there's an equivalent there-exists. But I can see what you're saying; I agree that you sort of run into halting-problem territory. I'd need to play around with negating the existential, so that it becomes a for-all I expect to fail. But again, you still run into the question of when you'd actually hit that scenario. Right, you do the manipulation in logic: there's a transformation that turns a there-exists into a negated for-all. When you say there exists x such that P(x), aren't you saying it's not true that for all x, not-P(x) holds? Yeah. So in theory you could do that here, but you still run into the same question of when you're going to hit that case. The same concern applies, though you could do the logical manipulation. Yeah, that's what I was talking about: negating it. And the whole idea of property-based testing is picking the test cases smartly, so you're not trying to cover the entire range, and you're not depending on randomly hitting something; you're cutting out whole swaths of the input range and testing the important parts via properties. That's why the for-all is important, right? At least you know you're testing over everything, rather than hoping a particular condition comes up. Yeah. Generally speaking, we do want to lean more towards for-all in these tests. That was a really good question, Chris, thank you. I'll check afterwards, because this is triggering something I think I read somewhere.
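The negation trick in this exchange can be made concrete; this is my own sketch, not a built-in FsCheck feature. To probe "there exists a pair with gcd greater than one", assert the negated for-all ("every pair is coprime") and read a falsification, together with its shrunk witness, as evidence for the existential.

```fsharp
// exists (x, y). gcd x y > 1   is equivalent to
// not (forall (x, y). gcd x y <= 1)
let rec gcd a b = if b = 0 then abs a else gcd b (a % b)

// The negated "for all": claim every pair is coprime. Any counterexample
// a property-based run finds is a witness for the existential.
let allPairsCoprime (x: int) (y: int) = gcd x y <= 1

// With FsCheck:  Check.Quick allPairsCoprime  // expected to be falsified
```

This inherits the weakness discussed above: if the generator never stumbles on a witness, the negated for-all "passes" and the existential looks false.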
So I'll let you know if I come up with anything. Thanks, I really appreciate it. It's a good conversation for sure. Oh yeah, I'm digging this; I really like the back and forth. This is awesome. So you're doing a simple example here, but does property-based testing come into its own in system-wide things, where you have, you know, 27 things this could be? I want to interrupt, because I think that's a super important question, but I also want to let Ahmed get through his slides, because I've been interrupting too. I really don't mind the interruptions, I like this, but I do want you to be able to finish, and that is a super important question. Short answer: we're going to get to a meatier example I used in an actual test case. These short examples are here to illustrate the features, and then I'll get into a meatier case study from a market research context. So, now we're asking whether we can see the values that are actually generated. When you call Check.Verbose, it will list all of the data that gets generated. You see here the index number of the data point, a colon, and then the generated data. In this case it generates just one data point per test, because we're checking an identity property that takes a single argument. I'll scroll through so you can see everything. Over here, you can see it's a very narrow range of numbers; it's not going anywhere near the full range of integers. We never go higher than, what is it, 90? And there's minus 70, minus 84, right? So we're in a pretty narrow range.
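A minimal sketch of the verbose run being described (my reconstruction; the identity property takes a single argument, which is why each listed test shows one generated value):

```fsharp
let divide (x: int) (y: int) = x / y

// Identity: dividing by one gives back the numerator.
let divisionIdentity (x: int) = divide x 1 = x

// With FsCheck installed:
//   open FsCheck
//   Check.Verbose divisionIdentity
// This prints each generated case with its index, making the narrow
// default range easy to eyeball.
```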
And again, we could change the EndSize, but by default this isn't going anywhere near all of the integers. We could do a few more runs, but the chance of hitting a really large number in this run is pretty small. You can see from the first pass that the values are very small; 99 is apparently the highest it generated in this pass. Oh wait, did I actually break a hundred? But you get the idea: we're definitely not getting high ranges of numbers here. So it's assuming pure functions; it's not going to rerun zero multiple times? Correct, yes, exactly. And try to phrase a universal property where the property changes after you run it, right? That's one reason we do functional programming. Right, absolutely. But in that case, what I'm saying is, it's the old adage: you've got zero, one, and many. So it should just do zero, one, and two, and we'd be done with it. Why is it going to run a hundred times? Just do zero, one, and two every time. Computers are too fast. But yeah, so this gives you an idea of the narrowness of the range. Okay. And you were asking about customizing the number of tests. We can call Check.One and just modify the config object to set MaxTest to a thousand, which means: instead of running a hundred tests, run me a thousand. It takes a moment to run; that could just be the polyglot notebook. I imagine it's faster
if you run it directly in Visual Studio or straight from the dotnet command line. But watch as the thrilling spinner spins... and there we go. It tells us it passed a thousand tests, right? So you can customize your test count there. Now, we talked about how the data gets generated in a very narrow range. I can tell it: give me a bigger generator with EndSize, and that pretty much also changes the distribution. So I can run this, and it goes ahead and runs all of it, and it really does give us a wider distribution; later on I have an example where I go through both of these. But basically, once you do this, you get a broader distribution, and good luck hitting zero now. Instead of randomizing between, say, minus a hundred and a hundred, where over a hundred runs we have a pretty good chance of a pair being equal or a value being zero, now we're at, what is it, a hundred thousand? What are the odds you hit zero, or an equal pair, in that range? And that's the other point, because remember, we tested non-commutativity, and non-commutativity fails. But when we ran it with the large EndSize, it passed. Same bad property, but it passed here, and it passed here again, because the EndSize was so broad it wasn't hitting those problematic values. You see what's going on? So that should give you an idea of why, within a narrow range, you're more likely to hit these problematic data points. But definitely use your understanding of the domain to decide whether to adjust further, and what the patterns are likely to be. So: filtering data. This gets into what Brian was talking about, right?
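The two knobs just mentioned can be set together. This assumes the FsCheck 2.x Config record, and someProperty is a placeholder for whatever property you are checking.

```fsharp
open FsCheck

// Widening both the number of runs and the generated range:
let wideConfig =
    { Config.Quick with
        MaxTest = 1000       // 1000 cases instead of the default 100
        EndSize = 100000 }   // let generated sizes grow far past the default

// Check.One (wideConfig, someProperty)
// With values spread this widely, hitting y = 0 or |x| = |y| by chance
// becomes unlikely, which is how a bad property can quietly pass.
```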
Brian was asking: what if we want to tell it to run but avoid certain types of data? For example, commutativity: we know it fails in some cases. So what I can do is run this with a filter. Here I have this ==> operator, which lets me put in a predicate, and if the predicate passes, it goes on to the test, which we wrap in lazy. We do it this way because, when the property-based test runs and the predicate fails for a generated value, it generates another one, but it doesn't count it toward the total. So if it generates, say, 100 values and 50 of them fail the filter, it'll generate 150 values to get you 100 valid tests, right? That's why you don't put the predicate inside the body of the test, like "oh, if these two are equal, just pass", because then a lot of your test runs would be wasted. You'll see exactly what this means when we get into the details. So in this case: here's my filter, I pass it in and parameterize with it, and if the filter passes, we run this, which is my non-commutativity check. And here we know that division by zero is a problem, so my filter is: make sure the denominator is not zero. So I say Check.Quick, here's my non-commutativity with a filter, the filter being that the denominator is not zero, and go run this on our division. Now I know it will generate 100 samples, and it's guaranteed that y is not zero in any of that hundred. If a sample comes up with y equal to zero, it generates another one and doesn't increment the count. And so here we run it.
And of course, we already know we're going to fail, because we've got other scenarios: minus two and two, since x and minus x are going to commute. And it shrinks to zero and one because, exactly, we're also dividing y by x, so x being zero still blows up. So I'm like, okay, I need a meatier filter: let me make sure that x is not zero and y is not zero, because in the commutativity check each of them ends up as a denominator at some point. Okay, cool, and we run it. Division by zero is no longer an issue, because we filtered those out, but of course we still have the case of two and minus two, and two and two. Okay, so x should not equal y; cool, that catches that case. But then we get minus two and two. So let's make our filter nicer and require that the absolute values are not equal; that way we also catch the case where one is the negative of the other. And now we run it, and finally it passes, right? And of course, there's still the question of the function itself. Sorry, I got a little lost. The first argument is the filter, and the second argument is the function under test? I'm just making sure. Yeah, I supplied the filter as a lambda; I could have defined it separately, which probably would have been a little easier to read. Sorry about that. No, no, I'm used to lots of parentheses. I'm used to practicality. I'm used to R, dude. Come on. Oh, nice. And there aren't nearly enough parentheses in R. That is true. That's so true. So yeah, that's our filter. Now I can also run this with verbose so I can see what is generated. So I'm running the same thing with the same filter, but this time I want to see all the data that's being run, right?
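The final filtered property can be sketched like this (my reconstruction of the walkthrough's endpoint; the helper names are mine):

```fsharp
let divide (x: int) (y: int) = x / y

// The final filter: both operands nonzero, and absolute values unequal
// (so one operand is not the negation of the other).
let filter (x: int) (y: int) =
    x <> 0 && y <> 0 && abs x <> abs y

// With FsCheck installed, ==> discards generated pairs that fail the
// predicate without counting them toward the run total, and `lazy`
// keeps the body from evaluating (and dividing by zero) on rejected pairs:
//   open FsCheck
//   let nonCommutativeFiltered x y =
//       filter x y ==> lazy (divide x y <> divide y x)
//   Check.Quick nonCommutativeFiltered
```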
And so here you see: minus two, one, cool; minus three, two, looking good; and so on. Now I want to scroll down to... Sorry, now I know what I got confused on with the non-commutative-with-filter. I think I missed where you defined that function. Oh, sure. Yes, I'm just trying to understand it. Sure. So the non-commutative-filter function takes the filter predicate, it takes the function under test, and it also takes the two values to be fed into that function. And since it's curried, I don't have to worry about those two values; the property-based testing framework will supply that range of values for me. Yeah, love currying. Right. And then over here we can see all of the data points it's generating. Now, take a look at this; this is important. At item nine, the ninth data point, it generated a zero and a minus four. Remember, none of them should be zero. That fails the filter, so it generates another value, and the next run is still numbered nine, because it's saying: that didn't pass the filter, let me generate another value at the same index. That way it still ends up with a hundred valid data points. That's why you don't want to put the filter inside the body of your function: you could say "oh well, if it's this, just pass", but then maybe out of your hundred tests only fifty really run. This way it always runs a hundred. Now, if your filter is super tight and your range is narrow, it might tell you: hey, I can't generate enough tests.
So you do want to be careful there, but you can see that's exactly what it's doing: it generated another pair after the zero, minus four, and it didn't bump the test number. Then eleven, minus eight: okay, this is good, and it took the next test number. Then nine and minus nine: well, we're not supposed to run that either, with the absolute values equal, so it didn't move on to ten. Then minus six, ten: that's valid, and now it goes to eleven. So it attempts more than a hundred generations, until it gets a hundred tests that actually pass the filter and run. That, Brian, is roughly the answer to your question about filtering the values beforehand. Okay. And again, if your filter is really strict, that's when you may want to start thinking about expanding your EndSize, because otherwise it may just not have enough data to work with. Okay. So we've been working with some pretty weak invariants, but they've caught a surprising number of issues, or at least assumptions we were making about our code. Is division technically broken? We were just making assumptions. But maybe these are the cases where we can make our code more robust. Maybe someone calls this without even thinking about division by zero, which, yes, we'd like to think they should, but who knows. At least this shows us the scenarios: are we happy just letting division by zero throw? Do we want our own exception? Do we want something like a maybe type, where we tag the result? At least this reveals cases that shape how we want our interface to behave. And we can also have stronger invariants. For example, when you divide two numbers, you get back a (q, r) tuple, right?
Quotient and remainder, and they satisfy this equation: quotient times y plus remainder equals x. It's almost a spec, though there are a few cases where you can fake it out by returning things like (0, x); still, it's a much tighter invariant. So we can use a more involved invariant like this and see whether it holds: the quotient times our denominator, plus the remainder, should equal whatever the numerator was. But again, it's easily faked: if I create a division that simply returns (0, x), like in this case, it passes. But here's the nice thing: if I throw in weak invariant after weak invariant, we can catch it. Okay, this one seems to work; now let me throw the identity at it, and the identity flushes it out: no, this isn't good. So even a sequence of weak invariants, or a strong invariant with a weak one thrown in, can catch some issues. You've got a choice: make your invariant more involved, which is fine, or have a succession of weak invariants that collectively narrow in on what you're trying to specify. And of course, we can just tighten our spec: okay, this equation should hold, and also the absolute value of r has to be less than the absolute value of y, to rule out the cases that fake us out. So we can make our spec much more involved. We could even have a really elaborate invariant that amounts to total correctness, an invariant that actually defines division. That's the holy grail: if you find it, you're getting depth and breadth at once, but you can't always swing it.
And with this tighter invariant, we try it and see that we can't fake it with our simple fake division anymore: now that we also constrain r and how it must relate to y, it catches that the fake division fails. Okay. And maybe we could be even more precise about constraining r; is this a sufficient constraint? You decide. Okay. Invariants always have to be ANDed together, right? Yeah, because if you combine invariants with ORs, you're basically letting one of them fail when it shouldn't. Absolutely. Okay, but this makes for a messy equation. It definitely can; the equations get ugly, and they get involved. But that's the thing with property-based testing: you get a real trade-off. You can also have one invariant after the other, and that's fine too. Exactly. It is a valid trade-off, because at this point it's: do I go for the really hairy, ugly invariant that actually gets me total correctness, but is maybe so hairy that I make an error in it, or do I go for the weaker invariants that don't give me total correctness but are good enough and easy to understand? You could test each of them in turn, right? Fine, it's the same thing: if all of your simple invariants collectively amount to that complex invariant, yes. Cool. So, you were asking how this works with larger systems. Now, I don't know if this really qualifies as a larger system, but this was something I actually did in a prior role.
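The weak-versus-tight invariant comparison can be written as plain F# (my sketch following the description; the function names are mine):

```fsharp
// divide returns a (quotient, remainder) pair; the tightened spec
// demands both  q * y + r = x  and  abs r < abs y.
let divide (x: int) (y: int) = (x / y, x % y)

// The weaker invariant alone: easily faked.
let weakInvariant (div: int -> int -> int * int) x y =
    let (q, r) = div x y
    q * y + r = x

// Adding the remainder bound rules the fake out.
let tightInvariant (div: int -> int -> int * int) x y =
    let (q, r) = div x y
    q * y + r = x && abs r < abs y

// The fake from the talk: always return (0, x).
let fakeDivide (x: int) (_y: int) = (0, x)
```

Real divide satisfies both checks; fakeDivide satisfies the weak one (0 * y + x = x) but fails the bound abs x < abs y whenever the numerator is at least as large as the denominator.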
It was at a market research firm, and the details are proprietary, so it's all anonymized, but this is a high-level picture of what I did, to show you where property-based testing came in. So basically, the firm I was at... Were you using F# at this point? In this case, I actually used an ad hoc approach: I manually wrote the property-based testing inside my test suite. At that time, we weren't using a framework for it. Oh, yeah. Well, that's what I think too: once you understand the mindset of property-based testing, you realize that, sure, you're not going to get things like shrinking, but you can basically write a loop into your test, randomly generate data, and assert on it there. And nowadays, when I do property-based tests, it's often on an ad hoc basis: I realize, hey, I can test this, it's a very useful property, let me throw it in for a certain number of runs in my unit tests. That's how I use property-based testing. So at that time, it was definitely very much "let's just do it manually". We had some survey data, and essentially we had a data visualization application. People could ask things like: how many people in Houston, age 18 plus, ate fast food in the past week? Or how many people plan to buy a car and also have an income of, say, 100k plus? You could ask all these different questions, and it would give you those numbers back. We had the usual unit tests, but I also wanted to start testing the behavior of this under some real data.
So that was what it was supposed to do, and you wanted to verify it really did that when you asked those questions? Right, exactly. Even though there were unit tests that would check that, I wanted to also test against some actual live data, where I wasn't concocting contrived examples. So is all the data anonymized, or is it like you're saying: do you know that the person there is in the zero-to-50k range, do you know their exact income? And do you bin it every time you ask these questions? In our case, we actually had the micro data, the respondent-level data. For every person, we knew all of their habits, and we knew the weights, how to weight them. So the data might be based on, say, 1,400 people in Houston. Their identities are protected, but those 1,400 people have weights, and we project them onto the Houston population. So you would actually know: yes, for this one person, I know all of their habits. Even though the numbers you see here are weighted numbers, the micro data represented each individual person. Well, I'm asking, and in this case it doesn't really matter, but if zero-to-50k is one property and 50-to-100k is another, then you'd want to test at $49,999 and $50,001 and make sure it's working, that you're not off by one, right? Between $49,999 and $50,000 is a big deal. I mean, in our case we'd call it zero-to-50k, but in reality it was zero to 49.999k, and then 50k and up. The brackets really were defined like that. Yeah.
I mostly presented it this way because I thought it looked a little more pleasant as zero-to-50k, but you're absolutely right: with zero-to-50k and 50-to-100k, did you just double-count somebody at the endpoints? It depends on what you're doing at the boundary, and whether that's a huge problem. Well, that raises the question. I'm not clear whether your point is just that the brackets here aren't listed as mutually exclusive, or whether it's what the checking system actually checks right at the boundary, because that is super, super important. So, my checking system was not checking at the boundary. The brackets were mutually exclusive, and it was this mutual exclusivity of the brackets that factored into the property-based test. Right, and again, if this were, for example, applying for scholarships, with a hard cutoff at a certain income where the law says no or yes, it would be critical that you're not off by one, and you'd test exactly that point. Yeah. But in our case, it was definitely zero to 49,999.99 and then 50 to 100k. The endpoints were just made to look cleaner; internally, nothing was duplicated. Oh, no, but that answers the question, because the way I'm presenting it, it definitely looks like it could be. We're asking about how it was actually implemented, but let's ask about that at the end and move on. That corresponds with a question that I had: is that the actual data? No, no. There are only a thousand people making 50 to 100k? Oh yeah, no, this is fun data. Those are rich people, in a way.
Yeah, so there's a technical term for this data: manufactured. The actual data is obviously something I can't reveal, for obvious reasons, and obviously the real numbers would differ — I really hope there are more than a thousand people in each of these buckets, right? So this is totally manufactured data, just to give you an example of the kinds of things we were looking at. Even the tests themselves here are just stub functions, just to show you the structure of the tests. But within that, you had these different demographics, each bucket sums up, and that's the test? There you go. That was the property test: I want to test this without worrying about what the data is, I want to be able to have live data coming in, so I need to know what universal property holds. For the demographics, every bracket — which are mutually exclusive — has to add up to the total. No matter what, and the habit could be any composition of factors: ate fast food and planning to buy a car, whatever, right? So this is like a pivot table, then? Something very similar, yeah — a little bit more than that, but yeah. Exactly. So the test came from the structure of the data? Yeah, exactly. And the testing was basically — and this is just some dummy code that shows you what the structure of the test was — you have your market, you've got some habits, you've got the different demographics. Here I've got a randomized demoSum that just returns an incorrect value 10% of the time, so we have a fun failure to look at, right?
We've got the total adults, which I've basically just stubbed in. But essentially, what was tested was: for the demographics of any given market — for a given demographic and habit — when we sum them all up, they have to equal the total. That was really what the test was. And so I would randomly pick a market, or if we actually released a new market, I'd say, okay, we're going to work on this market, but I'm going to randomly pick some set of demographics, or even some set of targets, as we called them — like fast food, whatever — test across all demographics, and run a property-based test. Okay, so I'm going to be the developer who yells at you about unit tests: when could that fail? When could it fail because of data errors? So basically, what happened when we ran this was that the code was fine, but because I was running it on live data, it actually found when the data was getting corrupted. It was so successful at that that it became part of our data validation process. People would be like, hey, wait, what's going on? Oh, wait a minute, this data's bad — how did you figure that out? And I'm like, well, I've got this — hey, guess what, it's part of the validation pipeline now. So you check it when data comes into your system? You're getting individual data items with all the features? Well, we had some processes in house, so in addition to getting the data, somebody might have coded something wrong, some corruption might have happened, something got truncated, someone threw in a minus sign — there were cases where corruption could happen, because there was a fair amount of in-house processing.
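The invariant described above — mutually exclusive brackets must sum to the weighted total — can be sketched as a property check. The talk used F# and FsCheck; this is a minimal stand-alone sketch of the same idea in Python, using only the standard library, and every name (brackets, targets, the `demoSum`-style helpers) is invented for illustration:

```python
import random

# Hypothetical micro data: each respondent has a survey weight, an income
# bracket, and a set of habits ("targets"). All names here are made up.
BRACKETS = ["0-50k", "50-100k", "100k+"]
TARGETS = ["fast_food", "car_purchase", "vacation"]

def make_market(n=1400, seed=0):
    rng = random.Random(seed)
    return [{"weight": rng.uniform(0.5, 2.0),
             "bracket": rng.choice(BRACKETS),
             "habits": {t for t in TARGETS if rng.random() < 0.4}}
            for _ in range(n)]

def weighted_total(people, target):
    # Weighted count of everyone with this habit, across all brackets.
    return sum(p["weight"] for p in people if target in p["habits"])

def bracket_total(people, target, bracket):
    # Weighted count of everyone with this habit in one bracket.
    return sum(p["weight"] for p in people
               if target in p["habits"] and p["bracket"] == bracket)

def property_brackets_sum_to_total(people, target):
    # The invariant: mutually exclusive brackets must add up to the total.
    total = weighted_total(people, target)
    summed = sum(bracket_total(people, target, b) for b in BRACKETS)
    return abs(total - summed) < 1e-9

# Run the property over randomly picked targets, as a PBT runner would.
market = make_market()
rng = random.Random(1)
assert all(property_brackets_sum_to_total(market, rng.choice(TARGETS))
           for _ in range(100))
```

When the incoming data is corrupted (a truncated row, a wrong sign), the sums stop agreeing and the property fails, which is exactly how this test doubled as a data validation tool.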
So the DBAs hated you for finding errors in their data? No, they didn't mind it. I mean, I didn't have a mob waiting for me with pitchforks and torches or anything. And the clients were very happy, because at least they didn't get that bad data — we caught it early. The DBAs didn't come ask you to write their validations for them? Well, I worked very closely with them. The people doing this weren't quite DBAs — they were programmers in a survey sense, but they essentially played the role of DBAs. And I did use this to write validations for them, and I wrote other validations for them too; we worked very closely, and they were definitely on board. It made a lot less work for all of us in the long run. So this was a scenario where the property wasn't looking at any abstract invariants; it was insight into the structure of the data that implied the property. We ran it, and it also became a data validation tool, which was a very unexpected win. And of course, it turned out the code was fine too. And I could have gone further with this. At the time, I just put in one filter, or maybe a handful of filters — three or four — and combined them.
But in reality, you can combine your different demographics — say, both going to buy fast food and going to buy a car, or going to take a vacation — so I could have gone further and randomized on that too. If I had wanted to, I could have used F#'s type system — algebraic data types — to define a little Boolean logic language for combining all our demographics, or rather our targets. And because it's a type, FsCheck can generate random Boolean expressions in that little language for us. So if I run this, we can see it generated this random expression: okay, if I OR the filter "exercised past week" with "exercised past week", this should pass. And the nice thing about this is, one, it now lets us test our own internal Boolean mechanisms — they're really more set-theoretic than Boolean — where we make sure we're combining all our targets correctly. So that's another nice test we would have had if I had done it this way. It also gives us another idea for a test, because as I run this, I realize: hey, another property we could check is that not(not(filter)) should be equal to that filter. So I could have gotten another property just from looking at some of the random expressions it generates this way.
I could have used that as another property and tested it — going further than I actually did, because our system wasn't as sophisticated — but the point is that FsCheck can generate trees of arbitrary types. Sure. Would the shrinker turn not(not(filter)) back into the plain filter? Right. Well, did it actually notice an error there, or did it just get lucky? Oh, well, in this case my error is just being randomly generated, purely so we'd have something to look at. Yeah, and you can see it shrinking — shrink, shrink, it's passing, shrink — oh, it failed. It doesn't know I'm telling it to fail 10% of the time, so the poor shrinker doesn't realize what's going on. The shrinker needs a beer. The shrinker needs a beer and a better client, yes. But essentially, once you have that, you can start checking the structure of your filters themselves, and that can be independent of your data. So — this is where I was hoping and expecting you would go — Booleans are easy, and I'm trying to think about more complex structures, where you simply use lists or strings. My inclination is, yeah, that's all going to work, but I'm trying to figure out: are there places where this is going to fail? Can it? What happens when QuickCheck-style generation falls short? Yeah. I mean, it'll definitely generate lists and strings and so on — random strings, yeah. Right. It would generate the empty string.
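The two ideas above — deriving a generator from an algebraic data type for the filter language, and the not(not(f)) == f property — can be sketched together. FsCheck derives the generator from the F# discriminated union automatically; in this Python sketch (target names invented, stdlib only) the recursive generator is written by hand to show what is being derived:

```python
import random

# A tiny Boolean filter language over "targets", mirroring what an F#
# discriminated union would define. Expressions are nested tuples.
def gen_expr(rng, depth=3):
    # Randomly build a tree: leaves are targets, nodes are and/or/not.
    if depth == 0 or rng.random() < 0.3:
        return ("target", rng.choice(["fast_food", "car", "vacation"]))
    op = rng.choice(["and", "or", "not"])
    if op == "not":
        return ("not", gen_expr(rng, depth - 1))
    return (op, gen_expr(rng, depth - 1), gen_expr(rng, depth - 1))

def eval_expr(expr, habits):
    # Does a respondent with this habit set match the filter?
    tag = expr[0]
    if tag == "target":
        return expr[1] in habits
    if tag == "not":
        return not eval_expr(expr[1], habits)
    if tag == "and":
        return eval_expr(expr[1], habits) and eval_expr(expr[2], habits)
    return eval_expr(expr[1], habits) or eval_expr(expr[2], habits)

# Property from the talk: not(not(f)) selects exactly what f selects.
rng = random.Random(42)
for _ in range(200):
    f = gen_expr(rng)
    habits = {t for t in ["fast_food", "car", "vacation"]
              if rng.random() < 0.5}
    assert eval_expr(("not", ("not", f)), habits) == eval_expr(f, habits)
```

The property holds by construction here; the value of the randomized expressions is that they would also exercise a real, hand-written filter-combining engine in ways no hand-picked example would.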
You'd get the empty string, probably strings of a few characters, maybe some funky characters. But there's also more involved custom data generation: if you're concerned, or you know roughly the kinds of data it should generate, you can nudge it. This example over here is a little hairier, but it shows that you can do more custom data generation. So far I've pretty much been leaning on the type system to give me the structure I need, maybe with a filter beforehand to filter out items, but you can actually guide it and tell it: hey, I need you to generate data where this field is related to that one, and this value has to satisfy this particular constraint, and so on. It's definitely not as clean-looking, but this is the most powerful point, where you're telling it: generate a couple of integers, filter them, give me this demographic with this habit, and this has to be in a set and satisfy this, et cetera. And then you run a prop-for-all on your generated filter and run your property against it. There are details on the FsCheck site; it definitely gets a little more involved. But it does give you that ability, because the moment you start getting into strings and the like, you might want strings that, even though they're random, still look a certain way, right? I'm not sure whether they have, say, email address generators built in, or whether that's something you'd generate yourself.
And this is where you go into advanced mode: okay, I need to start generating my property inputs myself. You can see here it starts generating tuples; the numbers are constrained to fall in certain ranges, and the strings are constrained too. Here it's actually picking a string randomly from a list, but it's guaranteed to come from that list. So you can see it does at least constrain the string. Is there any equivalent of this for dynamically typed languages? Yes — JavaScript has one, Racket has one, and I think Clojure has one. Yeah, Clojure has one. Okay. Because I'm having a hard time understanding that — I can understand the statically typed case, but dynamically typed seems much more complicated. Yeah, because in dynamic languages you don't have the static types to rely on. For example, this example here is able to generate these arbitrarily complex expressions because the library uses the type system; it knows the shape. In those other languages, like Racket or JavaScript, you have to provide your own generator, because the library doesn't know what the data looks like. Yeah. Okay. But it's easier here because he's using discriminated unions, right? Yeah — but you can use regular types in F# too. And F# does not do reference comparison on records; it compares the values to determine whether one record equals another. So you can run a process where, when you run the test, the output needs to have a certain property — the score, or whatever.
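The "advanced mode" described here — constraining numbers to ranges, relating one generated field to another, and drawing strings from a known list — is what FsCheck's custom generators (its `gen` computation expression) give you. As a hedged illustration of the same idea, here is a stdlib-only Python sketch; the field names and value ranges are entirely invented:

```python
import random

# Custom data generation: instead of filtering fully random values after
# the fact, the generator enforces relationships and ranges by construction.
def gen_demographic(rng):
    low = rng.randrange(0, 100_000, 10_000)             # bracket lower bound
    high = low + rng.choice([10_000, 25_000, 50_000])   # guaranteed > low
    market = rng.choice(["houston", "dallas", "austin"])  # from a known list
    return {"market": market, "low": low, "high": high}

rng = random.Random(7)
samples = [gen_demographic(rng) for _ in range(100)]

# Every generated record satisfies the constraints by construction —
# no generated value is wasted on a post-hoc filter.
assert all(d["high"] > d["low"] and
           d["market"] in {"houston", "dallas", "austin"}
           for d in samples)
```

The design point is the one made in the talk: building constraints into the generator keeps every sample usable, whereas a filter over unconstrained random data can discard almost everything.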
And so you start running tests on the properties of your record types, instead of just the simple "here is a number". And it starts to reveal things — like, this thing needs to have this property because of that. I'm not sure this maps onto exactly what you're saying, but in Ruby you can simply ask whether an object has a property, it's very easy. So the idea is you run your property-based testing and then check the properties of your objects. And every language I've looked at has at least one property-based testing framework — Python's got one. Yeah — but the capabilities are going to vary based on, for example, the type system and how much of a lift you can get from it. But that's a good point. Thank you, Miguel. So we're pretty much at the summary, near the end of the talk. We get a large number of test cases with property-based tests; the unbiased test input helps flush out our assumptions, or even our blind spots. Sometimes the test input reveals edge cases we can feed back into example-based tests, and sometimes it even completes the spec. Again, it's really a mindset for how we think about code: we look at it in terms of invariants — what set of invariants defines, or narrows in on, the behavior of the code? And in my case there was a bonus: the ability to test almost anything, including live data. You can even do property-based end-to-end testing; I've done it in a few cases, where I was looking at what was fed into, say, an input box, and I was able to do it in a way that connected through to the server. But of course, writing the tests can be harder.
You get breadth, but you don't get depth unless you're lucky enough to find a complete property. And some of the frameworks aren't as easy to use; not every framework is straightforward. And then here I have a further-reading section, which I can paste into the chat or email to anyone who's interested: the FsCheck documentation specifically, plus the learning resources where I got the list of useful properties to test — they go into more detail on what those are; the author goes into some good depth. And of course Haskell's QuickCheck, the grandparent of them all. Racket has its check library, Clojure has test.check. And there's a whole book on property-based testing in Elixir. It's sad — I really couldn't find many books on property-based testing, only the Elixir one, and one other, for R. I haven't looked at the R implementation. If anyone's interested, I can paste any of these links. I'll include them when I post the video. Cool — yeah, I'll mail all of these as well so you can check them out if you want to go further. I can even send you the notebook if you want it, because this is basically just a polyglot notebook — install the Polyglot Notebooks extension in Visual Studio Code and you're good to go. I have a question, not really related: did you enjoy working with the polyglot notebook in Visual Studio Code? I played around with it for a little while and it didn't really appeal to me, and I'm curious whether I was missing something. So, I enjoyed it, but I use notebooks — whether Polyglot, Colab, or Jupyter — purely as presentation and learning tools.
I never actually use them for any real coding. It's only when I want to do a presentation, because I can have live coding, make changes, and share my results. I love them for that, but I probably wouldn't use them for much else — unless maybe I was writing a document about something I was exploring and had a little code sample to support it. The code is never really the focus in these things for me. What was your use case — what were you doing? Well, for me it was an open source project; the person who started it was using a notebook as the readme. It sounds like an awesome idea, and it actually is pretty good for the users, but for somebody trying to contribute, it's a pain. Yeah, I'm totally with you. In fact, I used to hate Jupyter in general, and it was only when I started getting into these cases — doing a little bit of my own research, focusing on that and presenting it — that I went: oh, I get it. It's good for this, but it sucks for everything else. That's my view too: use the tool for the thing it's for. Exactly. Any time I'm doing real coding, even in that scenario, I'd be elsewhere. And if I'm doing anything in, say, Python, I'd probably just go straight to Colab — whenever we do our Just Enough Math group and all the code is in Python, it's always Colab, and I just share the links. That's the other thing with Polyglot: you have to install the extension, make sure your kernels are set up, and I think make sure your version of .NET is up to date. It's not rocket science, but still.
Then you can go ahead and use it. It's a little extra work, whereas Google Colab is just: click the link. Do you have a Gmail account? Everybody does. It's free. And the sharing is another thing it does really well. But that's a good question — this was actually my first time presenting from a polyglot notebook, just because the F# support was better there. I thought it worked pretty well for the presentation. Yeah. All right, thank you. We have some time for questions. I want to start with Lester, because I cut you off with a big question, and then everyone can chime in. Are these recorded questions? We are in the recorded question period. I'll tell you when we cut off the recording, and that's when the fun starts. Okay. How does this fit into a test-driven development workflow? That's a really good question. I've only toyed with using this in a test-driven development context, so these are absolutely just initial impressions: you could do it, but you're going to need to think more deeply than test-driven development normally requires. With example-based tests you at least know what your samples are. And a lot of the time with test-driven development — well, sometimes — you're still trying to figure out what function you really need. So when you're trying to figure out the invariants of a function before you're even sure what the function is, that can be a real challenge. But my initial toying with it suggested it was promising.
And one day I do want to try an actual non-trivial project done entirely with test-driven, property-based testing, because I'd like to know what the experience is like. When I initially tried it, it seemed pretty promising, and I'm interested in going further with it, but you definitely have to think a lot harder. Well, it also isn't really deterministic testing. Correct — you get a different run each time. You go along and then suddenly it's not working — oh my gosh, it may have been broken for a while and I've been building on it. Right, because with the random sequence it was generating, maybe it never hit that one bad test point. On the other hand, if you're doing example-based testing and picking your own test points, maybe you miss some of those errors too — especially since we tend to write what we think should be right. That's the other thing about this: it forces you to make your functions more robust, because it throws all kinds of data at them, including data programmers may not even test for because they assume no one would ever do that. It forces the issue: this fails on zero. "No one would ever pass zero." Well, it fails on zero — what are you going to do about it, right? So it does force you to be more robust. How does this relate to fuzzing, where you're trying to break something? With fuzzing, you don't really know what to test; you don't really know the boundaries until you discover them accidentally, at least if you're trying to break into a system. But here, you know all the boundaries to test, and you can design for them.
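The "it fails on zero" point — and the earlier remark that both assertion failures and thrown exceptions count as test failures — can be shown with a tiny hand-rolled property runner. This is a Python sketch of the mechanism, not FsCheck's actual implementation; the `my_divide` function and the round-trip property `divide(x, y) * y == x` echo the toy division example from earlier in the talk:

```python
import random

def my_divide(x, y):
    return x / y  # naive version: blows up on y == 0

def check_property(prop, gen, runs=500):
    # Minimal property runner: a returned False falsifies the property,
    # and a raised exception counts as a failure too.
    rng = random.Random(0)
    for _ in range(runs):
        args = gen(rng)
        try:
            if not prop(*args):
                return ("falsified", args)
        except Exception as exc:
            return ("error", args, type(exc).__name__)
    return ("ok",)

# Random integers quickly surface both kinds of failure: the zero nobody
# "would ever pass", and float rounding in the round-trip property.
result = check_property(
    lambda x, y: my_divide(x, y) * y == x,
    lambda rng: (rng.randint(-10, 10), rng.randint(-10, 10)))
```

With 500 random pairs in this range, the run is effectively certain not to come back `("ok",)` — either `y == 0` raises `ZeroDivisionError`, or floating-point rounding falsifies the round-trip first.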
It doesn't sound like — I mean, you could define the boundaries to test, but it doesn't sound like the system itself has any sense of boundaries, except that it restricts the range so it's likely to hit zero. That seems different from people fuzzing looking for bugs: for them it's a black box, so they don't have good ideas about where those boundaries are. Yeah, I don't think fuzzing is quite a subset of this. With fuzzing, you typically don't know anything about what the output is supposed to be. You know what it's supposed to be, right? But you don't know what it's doing — you don't know the boundaries. You don't know the boundaries; you learn them as time goes on. Yeah, you learn them, right? And a failure is a good sign — that's the whole idea of fuzzing. And my understanding — and this is the fuzzing scenario I'm most familiar with; there are several — is that folks will also use fuzzing-like techniques to test test quality: did you write a meaningful test, or did you just throw something in? Let's perturb this and see whether your test is worth anything. But there is definitely an automated, randomized element that's very similar to what's happening here. This question comes from what I was actually working on today. Imagine you have a really large range of possible valid values, but it's got giant holes in it. I'll be more specific: think about how Unicode works. You know how Unicode works, right? Nobody knows how Unicode works. Just imagine. In general, there's one level — almost all of the values in the Basic Multilingual Plane are filled, right? But above that, you've got these giant areas with holes.
So the problem I was working on today is: I'm taking this bag of bytes and turning it into a string, and it's been asserted that this is UTF-8. So I should be able to check whether it handles valid UTF-8. The problem is, how do I generate the appropriate range of values to test the right set of cases? I don't think there's actually a really good answer for this in property-based testing — and my example-based testing for it kind of sucks too. The first thing I do, by the way, is generally put in a character from Linear B's syllabary, and then another one, as part of the test. Yeah, there are published lists of nasty strings like that for exactly this kind of testing. And then there are the ones with a Turkish character that looks exactly like a regular character but isn't. That's true. I'm working with bytes, right? And the point of this is: in UTF-8, once you get beyond the Basic Multilingual Plane, you have values that can be very difficult to deal with if your underlying system is actually using UTF-16. And I know the framework I'm using is UTF-16 under the covers, but it's still supposed to handle these correctly. What I'd like to be able to do is generate a whole bunch of bags of bytes that are actually valid UTF-8. But that's a non-trivial thing to do with just filtering, though. Well, in a way it is, but — if you send something that's validly UTF-8 encoded, and it just happens to be a code point that doesn't exist as a character yet, the system just says: sorry, I don't understand that, right? Is that what you're testing? Well, partly — I wouldn't know whether I got something invalid without trying to interpret it. Yeah.
In general, it's a reasonable test to say: pick from this whole set of values, because even if a code point isn't assigned — if there's no character there — the system should still be able to do something with it, right? The problem is you can't always tell, because when it comes back, the rule is literally that if you get an invalid sequence, you're supposed to turn it into the replacement character. Okay, so you would like to test both the non-existent code points and everything else? Well, I would really like to be able to say: just pick me a random Unicode value, whether it's assigned or not. The reason I suggested those particular characters is that they have really good font support — if the font support isn't good, I can't see that I'm getting the right value, right? There are lots of characters I know are in Unicode, but you have to be very careful that the font you use actually lets you verify them visually. Part of the problem you're running into with "I don't understand, here's a question mark" is that it's a total function, right? It always returns something, so you can't test for it not returning. And it returns a specific question mark character? Yeah — if the input is invalid, it's supposed to be converted into a specific replacement character, not just a regular question mark. That's the failure glyph you always see on the web. Yeah, but I don't know what the code point behind that is. Well, here's the thing: if your font doesn't support a character, it's supposed to show you the little box but keep the character itself; if it's an invalid sequence, it's supposed to convert it into a special value that is also rendered like a question mark — which is why, visually, you can't tell which of the two is going on.
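The wish expressed here — "pick me a random Unicode value, whether it's assigned or not" — has one hard constraint worth making concrete: surrogate code points (U+D800 through U+DFFF) can never appear in well-formed UTF-8, even though they sit inside the numeric code-point range. A minimal sketch of such a generator in Python (stdlib only, not tied to any particular PBT framework):

```python
import random

# Generate random Unicode scalar values valid in UTF-8: any code point in
# 0..0x10FFFF except the surrogate range U+D800-U+DFFF.
def gen_scalar(rng):
    while True:
        cp = rng.randrange(0, 0x110000)
        if not (0xD800 <= cp <= 0xDFFF):
            return cp

def gen_string(rng, max_len=8):
    # Strings of random length, including the empty string, mixing
    # assigned and unassigned code points and characters beyond the BMP.
    return "".join(chr(gen_scalar(rng))
                   for _ in range(rng.randrange(max_len + 1)))

rng = random.Random(3)
for _ in range(200):
    s = gen_string(rng)
    # Every generated string round-trips through UTF-8 without errors.
    assert s.encode("utf-8").decode("utf-8") == s
```

Unassigned code points survive the round trip just fine, which matches the point above: a system should be able to do *something* sensible with them, and only genuinely invalid byte sequences trigger the replacement-character rule.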
It's not really important to figure that out here. Okay, so I guess there's a question I didn't catch: when something fails, what is it catching? Is it catching an exception every time? So it's catching an assertion failure, and it will also catch exceptions — if an exception is thrown, that's considered a test failure as well. So it's a Boolean or an exception. Yeah — and when the assert fails, does the assert itself throw an exception in a property-based testing framework, or really in any testing framework? I think it does. It should. Yeah. Okay, so it's basically testing by catching exceptions in general. Yeah — which is messed up when success is signaled by an exception. Speaking as a sociologist, I will say success is almost always an exception. No — I'm sorry, I apologize to everybody, but this is something I have ranted about for literally 25 years; the first time I remember complaining about it was in '99. Every message queue API I've ever used: when you're reading messages out of the queue and you've successfully read all the messages — there's nothing left — it throws a freaking exception. That's not a problem! Dude, I succeeded. I'm the reader; I'm finished. I'm done. And every single one of them for the last 25 years throws an exception, so you end up writing a little wrapper function around it. Okay, I think we're past time. I just came in — did the party just start? So I'm going to ask if anyone, especially online, has a question they want recorded, ask it now, or I'm going to stop the recording, and then we'll have the after-party questions.
I was just going to ask a quick question. One of the things I noticed is that a lot of the values that tend to fail aren't just random values — they're values that satisfy certain algebraic relations. They satisfy some linear relationship, or some polynomial relationship. And this probably has to do with the fact that a lot of the time we're worried about divide-by-zero errors, right? And divide-by-zero tends to happen when some collection of terms in a denominator equals zero, which is only satisfied when a certain polynomial equation holds. So it seems like the best way to do this kind of random testing is not to select uniformly — especially when you have multiple numbers — but to prefer selecting numbers that satisfy small-coefficient polynomials. Is there anything that does that? When I say polynomial, I mean: if you're selecting a pair of numbers, you might want to prefer x = y, or x = -y, or x + y + z = 0, something like that. So for this — am I still sharing? Yes, you are — I think you'd probably lean on custom data generation, because at that point you can not only generate custom data, you can also have some of the data be a function of data that was previously generated, as you go through the pipeline.
So I think a situation like this is probably one of the use cases where you'd specifically do something like that. More generally, any time there's some kind of relationship among the inputs that you want to enforce, and you don't want to do it by way of a filter, either because the relationship is a bit more involved or because the filter would get so few hits that you'd run out of test cases, you want to ensure by design that your data has those relationships, whether it's a polynomial relationship or really any kind of relationship. I think you'd probably be using custom data generation there. I'm thinking, from the point of view of completely random generation, there's probably a research problem out there about what the best sampling strategy is for completely random testing. When you're looking at tuples, even for something like a login, it probably makes sense to test, in my case, first name Chris, last name Chris, because that might fail somewhere in an incorrectly designed validation system that says the first name can't be the same as the last name, or something like that. So I'm curious whether there's any prior art on that, or whether it's even a consideration. There seem to be a lot of interesting deep problems in this area, so I think it's really cool that you gave the talk. Well, thanks. No, I think your question is really interesting, and I think a lot of it would also depend on the task.
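The filter-versus-by-construction trade-off described above can be sketched like this (again FsCheck 2.x-style API; `filtered` and `constructedProp` are invented names for the sketch):

```fsharp
open FsCheck

// Filter approach: the ==> operator discards every randomly generated pair
// that misses the relation. For a rare relation like x = -y, almost all
// generated pairs are thrown away, and the run can exhaust its test budget.
let filtered (x: int) (y: int) =
    (x = -y) ==> lazy (x + y = 0)

// By-construction approach: generate only pairs that satisfy the relation,
// so every single run exercises the property.
let constructedPairs = Gen.map (fun x -> (x, -x)) (Gen.choose (-1000, 1000))
let constructedProp =
    Prop.forAll (Arb.fromGen constructedPairs) (fun (x, y) -> x + y = 0)

// Check.Quick filtered        // most candidates discarded by the filter
// Check.Quick constructedProp // every generated pair is a real test
```

This is the "ensure it by design" point: the second version encodes the relationship in the generator itself instead of hoping uniform sampling stumbles onto it.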
Because the distribution you'd want depends on the domain. For example, say you're trying to test an application whose inputs include a user's weight. We know that weight generally follows a normal distribution, and we know roughly what the mean weight will be, so we can say: we know what the distribution looks like, and we want to draw from that distribution when doing random testing. But if you're dealing with a different type of data, say something with a heavy-tailed distribution, then you've got a different situation. So a lot of it is going to be task-specific. I definitely agree that there are probably good distributions for tests, but I think they're domain-specific, because different domains just require different distributions. It's a really good question, though, because it would be really cool if there were an out-of-the-box way to specify some stock distributions: say, make this a normal distribution with this mean and that standard deviation. Or sample the first 50 values from one distribution and the next 50 from another, something like that. Yeah, that's a really good point; taking the distributions into account is something we really need, and it's really insightful. It would definitely be domain-specific, but it would be cool. And I wonder if it could be something where we just write our own custom generators, at least for some stock distributions, and then throw them into this pipeline, because I'm guessing we could pretty much plug in custom generators that carry whatever distributions we want to lean on.
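A "stock distribution" sampler like the one speculated about above could be sketched in plain F# with the Box-Muller transform, which turns two uniform draws into a normally distributed value; a function like this could then back a custom generator. The function name and the weight parameters are illustrative:

```fsharp
open System

// Box-Muller: convert two uniform (0,1) draws into one N(mean, stdDev) draw.
let normalSample (rng: Random) (mean: float) (stdDev: float) : float =
    let u1 = 1.0 - rng.NextDouble()     // shift into (0, 1] to avoid log 0
    let u2 = rng.NextDouble()
    let z = sqrt (-2.0 * log u1) * cos (2.0 * Math.PI * u2)
    mean + stdDev * z

// e.g. user weights clustered around a mean of 75 kg with sd 15 kg:
let rng = Random(1)
let weights = List.init 10000 (fun _ -> normalSample rng 75.0 15.0)
let sampleMean = List.average weights   // lands near 75.0
```

Swapping in a different sampler here (heavy-tailed, mixture of two uniforms, and so on) is how the domain-specific distributions discussed above would slot into the same pipeline.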
But yeah, that's me speculating about what we could do here. Thanks, really nice talk. Very much appreciated. Very, very good. Okay, I'm going to cut off the recording.