So, next up in our list of continuous practices, we have the topic of continuous experimentation. What this means is that we continuously run experiments with our users, and we look at experiments in the scientific sense of the word. Take, for example, drug testing in clinical trials in medicine, like you currently see with the COVID vaccines: in the easiest case you have two groups. This can get very complex. One of the groups gets the actual drug, the other group gets a placebo. Neither group knows which one they got, so they are not sure whether they received the real drug or the placebo, and then you somehow measure the effect. So you try to see whether the drug has any kind of effect over the placebo. And this is pretty much what we would like to do in software as well. The difference is, of course, that we don't test drugs; what we do instead is test features. We could, for example, test feature A here, and there test a slightly different version of the feature, feature A prime. We want to see whether implementing it one way is better than the other, and measure the outcome. That's essentially the idea of continuous experimentation, and we do this continuously at runtime in our production system. That's maybe the big difference to a clinical trial: we don't do this with a special group, we run it live, so to say. There are different variants of doing that. There is, for example, something known as a canary release. The name comes from the old mining days, when miners would take a canary bird in a cage down into the mine. If the bird died, that was an indication that some toxic gases had been released, and they would try to get out of there. That's essentially what it is: a warning sign that something is wrong.
And what you do is, basically, you have a new release, let's say a new feature that you release, release N, and you have the current version. You try to keep pretty much everything as it is, but a certain percentage of the users, maybe 5% (that's a common number), get the new release, while the others keep the old, stable version that you're running. This means that if something is seriously wrong in your release, if there's a bug you haven't found, there's a good chance you will get some kind of warning pretty soon: lots of errors are happening, people are complaining. It's an early warning, but 95% of your users are fine. So you make sure that you don't crash your business by releasing it to everybody; you just get an early indication. That's a so-called canary release. Then there's something else called A/B testing. This is essentially what I mentioned earlier: you have a version A and you have a version B. These could be different features, the same feature in two different variants, or the old version and the new version. And instead of a 5%/95% split, you do 50-50: half of the users go to A, the other half goes to B. That's why it's called A/B testing. That's another option. Ideally, of course, compared to the canary release, you do this when you already have an indication that these versions are stable and there aren't many errors. If you did this with the experimental release N that we had up here and there are errors, there is a fairly high chance that a large part of your user base gets really, really annoyed, because things don't work.
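As a rough sketch of the canary split described above (hypothetical names; in practice the routing decision usually lives in a load balancer or API gateway rather than in application code), the decision could look like this:

```python
import hashlib

CANARY_FRACTION = 0.05  # 5% of users see the new release


def route_user(user_id: str) -> str:
    """Pick a release for a user.

    Hashing the user id (instead of rolling a fresh random number on
    every request) keeps the same user on the same release.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    return "release-N" if bucket < CANARY_FRACTION * 100 else "stable"
```

Because the assignment is deterministic, a user who hits an error in release N keeps hitting it, which makes complaints and error logs easier to trace back to the canary.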
And then finally, there is something called a dark launch. Here we don't actually expose the new feature, let's say our new release, to anybody. 100% of the users still use the stable version, but in the background you replicate, kind of duplicate, the traffic: whatever actions the users are doing, you also send to the other version, just to test it. The user never gets any answer from that version and never sees that anything is wrong, but you are essentially simulating that the users are using your new release, and you see whether that would cause any errors. The good thing about this is that the user doesn't notice anything; it doesn't feel slower to the user, for example, because you might run this on a different server. But you are getting a real usage scenario, the real actions that users perform; you don't do anything artificial. That's a dark launch. These are some of the variants; I'm sure there are many more, but you should definitely have heard of the first two. The dark launch is a bit less common, but it's good to know them all. Now, if we talk about all of this, what is important to know? I think what you might first be interested in is the technical aspect: how do we actually implement this? It looks advanced and well thought through, and hopefully it is, but on the implementation side it can be pretty simple. Something we often have is, for example, a so-called feature toggle. A toggle is a switch that is on or off, and a feature toggle is literally just a statement or a flag in your configuration, or anything like that, that switches a feature on or off. This could be as simple as an if statement in the code: if a certain variable is set, then enable feature A, otherwise don't.
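A minimal feature-toggle sketch (the flag name `ENABLE_NEW_CHECKOUT` and the two checkout functions are made up for illustration): one flag, read from configuration, decides which code path runs.

```python
import os

# Toggle read from the environment; any configuration source works
# the same way. The flag name here is hypothetical.
NEW_CHECKOUT_ENABLED = os.environ.get("ENABLE_NEW_CHECKOUT", "false") == "true"


def old_checkout_flow(cart: list) -> str:
    return f"old checkout, {len(cart)} items"


def new_checkout_flow(cart: list) -> str:
    return f"new checkout, {len(cart)} items"


def checkout(cart: list) -> str:
    # The feature toggle really is just an if statement.
    if NEW_CHECKOUT_ENABLED:
        return new_checkout_flow(cart)
    return old_checkout_flow(cart)
```

Flipping the flag in configuration switches all users to the new flow without redeploying the code.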
Or if, for example, you want to have 50% here and 50% there, you just choose a random number between 0 and 1: if it's smaller than 0.5, you send people to feature A; if it's larger than or equal to 0.5, you send them to feature B. It's just an if-else statement. So it can be as simple as that. Of course, you can do this in a more advanced way. You could do it on the network level, where you have, for example, different nodes: one version is on one node somewhere in the cloud, the other is on another node, and you have some kind of routing that sends 50% of the traffic here and 50% there. So there are different ways to do this, but it can be done on a very basic level; it doesn't necessarily have to be advanced. If you have a lot of these, at some point you might need to consider performance; if there are lots of ifs and elses just to decide which features to enable, that can be questioned, but generally it doesn't have to be very complicated. That could be enough. The other technical aspect is this one here: how do we actually measure? Nowadays there are a lot of tools that help you with this, that automatically collect usage data, for example, but you might also have to implement this yourself. That means going into your code and actually adding things: using libraries for measurement, starting with basic logging maybe, but you have to have some way of measuring. A bit later in this module we will also quickly look into measurement theory in software engineering, just to give you a quick introduction to what it is we might want to measure here. So you have to have some way of telling which feature is actually better, or whether something is wrong. And I've thrown in a number of terms already. I've mentioned, well, that there are lots of errors. So can you measure "lots of errors"? There are also other ways of getting a similar answer.
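The random 50/50 split plus basic logging could be sketched like this (a toy version; "converted" is just a stand-in for whatever outcome you decide to measure):

```python
import logging
import random

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("experiment")


def assign_variant() -> str:
    # Random number in [0, 1): below 0.5 goes to A, otherwise B.
    return "A" if random.random() < 0.5 else "B"


def record_outcome(variant: str, converted: bool) -> None:
    # Basic logging as a first step toward real measurement; a real
    # setup would feed a metrics library or analytics tool instead.
    log.info("variant=%s converted=%s", variant, converted)
```

So the split itself is a one-line if-else; the harder part, as discussed, is deciding what `record_outcome` should actually capture.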
For example, you run a survey with the users: did they like it? Or you just look at how many people are leaving the website, or how many people go to the next page or not. So there are different ways here. But then finally, and this is very important, there is the scientific aspect. This is actually why I mentioned medicine here: these kinds of experiments have been studied in detail, and there are a lot of guidelines on how to do them properly. If you ignore them, there's a good chance that your results will be pretty bad, and that for a number of reasons. For example, we discussed in elicitation and requirements that you want to involve different stakeholders, for example when you do interviews; the same applies here. You want to make sure that the 50% and 50% are similar groups. So this is about sampling: how do you make sure that you get comparable people in both groups? One thing you do in clinical trials, and should probably do here as well, is simply random selection. It should not be the case that the 50% who get feature A are all in North America while the other 50% are in Asia or Europe. So you need to be careful here. Similarly, if the 5% who get your new release in the canary release are sort of the early adopters or the power users, maybe those are exactly the people who don't care that much about bugs, so they won't tell you a lot about the kind of issues you're getting. This is something you need to be very careful about. And the same goes if you start mixing different features: if you test features A, B, and C here and features B, E, and F there, then you get into controlled experiments, and there are very different structures for how to do them. You need to be very careful that you actually measure this properly.
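One common way to get that random-but-reproducible assignment (a sketch with hypothetical names): hash the user id together with an experiment name, so group membership is effectively random with respect to geography or user type, stays stable for each user, and is independent between experiments.

```python
import hashlib


def assign_group(user_id: str, experiment: str) -> str:
    """Deterministic 50/50 assignment, salted per experiment.

    Because the hash mixes in the experiment name, the A/B groups of
    one experiment don't line up with those of another, which matters
    once you run several experiments on overlapping features.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"
```

This gives you the random selection from clinical trials in a repeatable form, but note it only addresses sampling; it doesn't replace the other experimental-design guidelines mentioned here.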
This is something I cannot fully cover in this course, but if you want to do this at an advanced level, I would definitely recommend that you read up a bit on experimental science, on how to actually conduct these experiments in a meaningful way. Because in many cases this is not as simple as it might seem. When you see these results and these sketches, it always looks very basic, but it very quickly becomes extremely complicated. So if you want to do this, I definitely recommend you look up the details.