So, I want to talk about hypothesis testing. First, really quickly, if you were here yesterday, you saw this: the important, relevant part of my background. I was a technical lead at IMVU. For a little over a year, I ran the technical side of the marketing team, which meant five or six product managers showing me project proposals, and our project proposals started with a hypothesis. So a lot of what I was doing was discussing how best to test some of our best marketing ideas. When I started running that team, we were losing something like $400,000 a month, and when we were done, we were $500,000 positive a month. So it was a huge, huge success, with lots and lots of testing. And when I was at Canvas, I was doing a lot of product management, because when you're in a startup with only a few people, you have to wear a lot of hats. So I've run a lot of hypotheses, probably on the order of 300 over the last six or seven years. You can find me at TimothyFitz.com. If you can't find me right after this, feel free to just send questions.

There's a really famous experiment that Google did, or infamous, where they tested 41 shades of blue. This is not what I'm going to talk about. It is actually a really useful technique; it's called a multi-armed bandit. If what you're trying to do is optimize a really quickly testable variable, something like did users click, did users sign up, was there an instant action, you can just use Google Analytics and it will do the right thing. I'm not going to talk about that. I want to talk about hypotheses: how you learn about products and how you learn about your users.

A good hypothesis is documented, testable, actionable, interesting, and surprising. All right, you're good. You've got it. I don't have to say anything else. Oh, okay. I guess I'll talk a little more then.

You don't have to read this slide. We tend to think of documentation around projects as having an objective and requirements and maybe user stories, and these often grow into many, many pages. Then developers go and implement those things, and we ship it and call it a success because we checked off all the boxes. When you start to do hypothesis-driven development instead of project-driven development, you have to completely throw away this document. The document needs to start with the hypothesis, and you need to drive everything you're developing from that. It's a pretty big change. So write down the hypothesis, write it down up front, and share it with everyone.

Something really surprising here is that the details really, really matter. Tiny code changes can affect the outcome of your test: where you collect the metrics, how you collect them, how users interact with it. All of these things can give you slightly different answers when you're trying to test a hypothesis. The only way I've found to prevent this from screwing up your hypothesis is to teach everyone on the team how to run hypotheses and how to test them, and to get all of the developers who are actually going to implement the analytics or the test itself to understand why you're running the test in the first place, and all of the motivations. Because inevitably, they're going to come across some small bug or some small issue, and fixing it or not fixing it is going to make the difference between a successful test and an unsuccessful test. And worse, it can make the difference between getting a bad result, a result that tells you the opposite of what's true.

So your hypothesis needs to be testable. There's a bunch of questions you have to answer up front, and remember, you have to document all of this up front, because you'll probably miss some of these things. Are you collecting the data? Can you collect and analyze it quickly? Is it statistically significant? I don't have enough time to explain statistical significance; that's a whole other topic, and there are whole books on it. But it's very, very important that you understand the accuracy of your tests and that you're getting enough users in to test your hypothesis. Or else you might just have an error term that's larger than the effect you're actually looking for.
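(He doesn't have time to cover statistical significance in the talk, so here, as an editorial aside, is a minimal sketch of the kind of check he's pointing at: a standard two-proportion z-test comparing conversion rates between a control arm and a test arm. The function name and all of the counts below are invented for illustration.)

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF via erf; two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical numbers: 500 of 10,000 control users converted,
# versus 560 of 10,000 in the test arm.
z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these made-up numbers, a 0.6-point lift on a 5% baseline with 10,000 users per arm comes out borderline (p of roughly 0.06): exactly the situation he's warning about, where the error term is about as large as the effect you're looking for.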
But quickly, because I think this is important and people don't get it: there's a term for this now. When I started doing hypothesis testing, we didn't have a term for it, and people would think, well, let's develop for six months and then run the hypothesis of "did what we built work?" That's not a good hypothesis. High tempo testing is a much, much better system. This slide is from a growth hacking website. They started doing high tempo testing, and it almost immediately started to show results. Now, a lot of their hypotheses are not being confirmed; not every shot they take works out. But when you're doing three experiments a week, you're learning really, really fast.

So I want to give you an example of what I consider a bad hypothesis. Imagine you've got a one-on-one chat app, and you say, okay, let's add group chat. It's going to take about 12 months of development, and we think it'll result in a 25% increase in three-month retention. Well, great. When are you going to learn whether this is true or false? A year and a half from now? How long does it take to get three-month retention numbers? And how accurate is your estimate that this is actually going to take 12 months to develop? As written, this is a really hard hypothesis to test.

Here's a better way of doing it. (These hypotheses I just put up are actually kind of bad too, but I can only fit four bad ones on one slide; I can't fit four good ones.) Imagine you break that test up into four smaller ones, and you run the first one. Do users want to chat with strangers? Yep, they do. Good, so we continue to develop this feature. Do users want to chat with strangers about specific topics? Oh, no. We invalidated one of our hypotheses. Now we just don't do the rest of the work. We get to say, okay, what have we learned, and how does that affect the thing that we're going to build? This doesn't necessarily mean abandoning the original idea, but it means using what you've learned to change as you build, not after the fact.

There's nothing more frustrating than spending a year of development time, then running a hypothesis test and finding out that what you just built did not materially affect the product. And that negative result, and how bad it feels, causes product managers to just lie, to find any way to make those numbers work out. The results of testing your hypothesis have to cause an action. Otherwise, why did you run the hypothesis test in the first place?

So what people usually think that looks like is something like this: adding a new payment method will increase overall revenue by 5%. If this hypothesis is confirmed, then great, we're going to keep the new payment method. That's just barely actionable. It's not really interesting, because the action you take has to be important and dependent on the outcome of your hypothesis test. So what does that look like? Well, here's what I find happens in practice, and this is, I think, the number one most common mistake people make when they switch from project- or plan-based development to putting the hypothesis first and actually testing things out. The hypothesis is that adding a new payment method will increase overall revenue by 5%. If that hypothesis is confirmed, we'll keep the payment method. If it turns out we only get a 3% bump in revenue, well, we're going to keep the payment method anyway, because we spent all that time developing it and we have it now, so why would we throw it away? Why did you run that hypothesis in the first place? Now, no one's going to write this down up front. So if you don't have the actions you're going to take in response to the results of that hypothesis written down, you run the risk of just doing whatever you were going to do anyway and ignoring the results of the hypothesis. I've seen that over and over and over again. And why? Because good product managers can rationalize any action from any outcome. That is an amazing skill, and a problem.

Here's what I want to see: adding a new payment method will increase overall revenue by 2%. If confirmed, we keep the payment method. If denied, we delete the payment method. Written down up front.
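(To show what "written down up front" could look like in practice, here's a hypothetical sketch of a hypothesis record with the decision rule pre-committed. The 2% threshold and the actions mirror his example, but the structure itself is an editorial illustration, not something from his slides.)

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """A hypothesis documented up front, with both actions pre-committed."""
    statement: str       # what we believe will happen
    metric: str          # what we will measure
    threshold: float     # the bar for calling the hypothesis confirmed
    if_confirmed: str    # the action we commit to on a positive result
    if_denied: str       # the action we commit to on a negative result

payment_test = Hypothesis(
    statement="Adding a new payment method will increase overall revenue",
    metric="revenue lift vs. control",
    threshold=0.02,  # 2%: the break-even bar, not the hoped-for 5%
    if_confirmed="keep the payment method",
    if_denied="delete the payment method",
)
```

The point of a record like this is that the `if_denied` field exists before the data comes in, so a 3% result can't be quietly rationalized into "keep it anyway."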
Now, 2% is probably a really low bar for this feature. But maybe that's the break-even bar, where we think the cost of maintaining this payment method over time is roughly paid for. And obviously the product manager wants that 5%, and is probably pitching the project on the idea that it might get 5%. But this realism in our hypotheses, and realism in the actions we're going to take, is very, very important.

So that's how we get interesting. But I promised you surprising. I probably don't have to explain this slide; it's pretty obvious. This is a Bayesian formal definition of surprise. I love that they actually named a unit the "wow": "the total number of wows experienced when simultaneously considering all models is obtained through integration over the model class." Pretty straightforward. It's actually really interesting; you should read about it. In a nutshell, their mathematical definition formalizes the intuitive idea that the amount of surprise something gives you is related to the change in your beliefs. Let's say you bought a car. You ordered it online, it's going to show up, and you really expected it to be red, because you clicked the red button. And it shows up blue. That's really surprising, and now your beliefs about what kind of car you're going to drive for the next five years are very different. That's what we're looking for. We're looking for the result of the hypothesis to change our beliefs about our users and about our product. If the results of your hypothesis tests aren't frequently and significantly changing your beliefs, then your hypotheses aren't useful. They're not providing you with the data, the information, that you actually need to be driving for. And this will cause you to seek out very different types of hypotheses.
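(For reference, the formal definition he appears to be quoting is Itti and Baldi's Bayesian surprise, which measures surprise as the Kullback-Leibler divergence between your posterior belief over models $M$ after seeing data $D$ and your prior belief:)

$$
S(D, \mathcal{M}) \;=\; \mathrm{KL}\!\left(P(M \mid D)\,\middle\|\,P(M)\right) \;=\; \int_{\mathcal{M}} P(M \mid D)\,\log_2 \frac{P(M \mid D)}{P(M)}\, dM
$$

With base-2 logarithms the unit is the "wow." If a result leaves your beliefs unchanged, the posterior equals the prior and the surprise is zero, which is the formal version of his point: a hypothesis test is only valuable to the extent that it moves your beliefs.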
So one of my favorite types, I call it a coin-flip hypothesis: you get everybody in the room who knows about your product, you ask them what they think the result of testing that hypothesis will be, and half the room says one way and half the room says the other. Running that test will teach you a lot about your product. It also means that if you have a really strongly held belief, and you test it, and it comes out opposite of what you believe, that's a really strong and powerful indicator that there's something you don't understand about your product. So it's very surprising, and you should follow that. You should seek it out. You shouldn't try to find a way to make it look like it's actually just dirty data or a tooling problem. These surprising events are the most important results from your hypotheses.

Having said all of that, another really common mistake: people let a single hypothesis test invalidate their whole vision. Say, okay, we're going to do a startup. We really believe that transportation needs to be revolutionized. So we're going to walk around college campuses and ask people what they think they need in terms of transportation. We ask 10 people, and they all go, I don't care about transportation; I walk from my dorm to the classroom; that's fine. All right, I guess transportation doesn't need to be revolutionized. We're good; we're going to focus on something else. Now, if you believe something really strongly, then you should need strong evidence to counteract that belief. So if you get an invalidated hypothesis, that's interesting. You should seek out another hypothesis to confirm it. You should test it again and again, and test it more specifically. Because odds are you didn't invalidate your whole vision, but it's very possible that you need to change something about the way you're going to implement your vision.

People criticize Lean Startup because they see lean startups that try 19 different big visions over the course of a year, and I think that's actually a really terrible way of implementing it. Have your vision. Feel strongly about it. Be passionate about the people you're trying to build for. And then use those hypotheses, and their results, to slowly change that vision into something that better maps to the world around you as you learn about your users.

A good hypothesis is documented, testable, actionable, interesting, and above all else, surprising. Thanks. Questions? Do you want to just yell it? [Audience question, partly inaudible: isn't it better to test big ideas in parallel rather than sequentially?] Right, so you'll start out with this great big grand vision, you'll do one test, get really disappointed and sad about that vision, and then try a new great big vision. And that leads to incredible churn and frustration; it's not the right way of doing it. If I understood, you're asking about doing things in parallel, set-based development. Set-based development is great when you can afford it. Most startups can't. If you have two or three people, you really don't want to try two or three things in parallel for very long. However, if you're at a larger organization: set-based development is great when you have a problem that you understand but you don't know the right solution. One of my favorite examples is Toyota with the Prius; they had something like 12 engines they were working on in parallel, and then they picked the winner. It's not great if you have 12 problems in parallel, because then you're not going to find out which problem is the bigger one.
How do you test something that isn't in the world yet, and people don't know about it, but you have a strong feeling it will work? For example, touch screens. If you had asked someone, they would have said, no, I don't want a touch screen; keys are fine, comfortable typing. But Steve Jobs and his people had a vision that it would work. How do you test that?

That's a good question. One of the precursors to Eric Ries's concept of the Lean Startup was Steve Blank's idea of customer validation. It's a really good book; tweet at me and I'll send you a link. One of his key points was that you should find someone he called an early evangelist: someone who has the problem you want to solve, is already trying to solve it poorly themselves, and wants to pay you to have it solved. Those people exist for almost any problem. And if you can find them, you can just give them your product for free, and they will repay you more than you could possibly imagine by telling you all of the things they actually need. So if you want to create something that doesn't exist in the world, you need to find the person who wants it. And if no one wants it, then maybe what you're trying to build is not quite right. Often what actually happens is that those early evangelists want something slightly different from what you're trying to build. And if you target them first, that'll get you further along the adoption curve Amy Jo was just talking about, to the point where you can actually build the more mainstream product that you want. Sort of like Tesla building the sports car for the really rich people first, and then eventually, we all think, they'll build the electric car for the masses.

With hypothesis testing, one of the biggest challenges I run into is confirmation bias. How do you deal with confirmation bias when you're doing hypothesis validation?

The question is, how do you deal with confirmation bias? Can you give me an example of exactly what you mean? An example: in my product, I look at what I'm already convinced are too many steps to do something. So I design an experiment that looks at how many people drop off during the steps, and it confirms that people are dropping off, and hence we should fix this problem. But I don't think the story ends there, because you also need to look at how many people come back and complete it in spite of it being long, and other things like that. Sometimes we jump ahead: we have a conviction, and we just use the data we've got to prove that that's the problem.

Yeah, that's an interesting problem. One answer is that there's almost nothing you can do. At the end of the day, everyone's biases are reflected in the product, and you have to accept that; the whole point of running a hypothesis test is to figure out what your biases are and what's wrong with them. But if you're not aware of them? Right, right, if you're not aware of them. The other side of that is having a cohesive thesis about how your product will work. You say: okay, here's how we're going to make money. Here's our funnel: so many users are going to get here, then they're going to do this action, that's going to keep them around for this many days, and we're going to charge them this way. Here's a model. Now we're going to collect data to see whether that model is right and how accurate it is. And then every hypothesis we run has to tie back to that model, because that's how we think about our business.
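(As an illustration of that kind of model, here is a hypothetical back-of-the-envelope funnel in Python. Every stage name, rate, and dollar figure is invented; the point is only that each hypothesis can be forced to claim which number it moves.)

```python
# A hypothetical funnel model; none of these names or rates are from the talk.
funnel = {
    "visit -> signup":     0.10,
    "signup -> activated": 0.40,
    "activated -> day30":  0.25,
    "day30 -> paying":     0.08,
}
revenue_per_payer = 30.0  # dollars, assumed

def projected_revenue(visitors: int) -> float:
    """Multiply the funnel through to estimate revenue from a cohort."""
    rate = 1.0
    for step_rate in funnel.values():
        rate *= step_rate
    return visitors * rate * revenue_per_payer

baseline = projected_revenue(100_000)
# A hypothesis has to claim which rate it moves, and by how much:
funnel["signup -> activated"] = 0.44  # "onboarding tweak lifts activation 10%"
print(baseline, projected_revenue(100_000))
```

In a sketch like this, the payment-method hypothesis from earlier would have to claim a lift in revenue per payer or in the "day30 -> paying" rate, which is what keeps every test tied back to the model of the business.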
Then you can't really get into the position where you're running a hypothesis that doesn't matter. Or you might, but at the end of it the result will be, oops, that didn't matter, and you'll be able to say, okay, that was the wrong thing; it was too specific, or we missed the bigger problem. So it really just means: take the thousand-foot view, write it down somewhere, and tie everything back to it.

So if I understand correctly, you're saying there's a high-level hypothesis, an objective, and then everything you're doing at a feature level might have hypotheses around it, and those need to tie back to the bigger thing?

Yeah, exactly. Like with the chat example: you want to add group chat, and it's a year and a half. That should actually be a bunch of small hypotheses. You see this in science. We don't confirm a hypothesis from one study; there are many, many hypotheses that are all related, and when you do a meta-study across all of them, you can say, okay, given 20 confirmed hypotheses, we have really strong belief that this aspect is true. Other questions from anyone not at that table? Okay, back to the table.

Okay, the next question. Yes, obviously we never stop at one hypothesis; we continue and do more than one. But what if we don't have data? What if we have very few customers? Say we've already built a product that has reached a sufficient stage, but we have hardly any customers. We're probably using it with a niche subset of people, very small, tiny, or ourselves. And then there are all these hypotheses about what will make eventual or potential users, when they come, stick around. If you don't have data, is there any point in doing hypothesis testing?

If you can't get data, you can't validate or invalidate a hypothesis. (I mean, statistically significant data.) Right, yeah. And it's important to know if you're in that position, where you only have one or two customers. But at that point, I would say your job should be to go find more people. Because if you only have one or two people who want what you're trying to build, then that is in and of itself invalidating: maybe you don't have the right idea.

Correct. And most of the time, the people I've worked with seem to think it's okay: once we build the product, people will come. We know people will come, so let's build it properly first. I know I can throw money at it; if I throw $100,000 at it tomorrow, I know my site will go down. So make sure it works properly first.

Yeah, I find that type of overconfidence in correctness, in vision, ends after about the second hypothesis test. And then you get the opposite, the "oh, everything I thought was wrong, I'm terrible at what I do" kind of problem. Startups are, yeah, a lot of emotional whiplash. Thanks a lot. I'll take one last question.

If you're building a product that scratches your personal itch, if you're solving your own problem or you're dogfooding, do you still need to do hypothesis testing?

It really depends on what you're trying to build. There are many really happy and successful lifestyle businesses with a small number of very delighted customers, where the founders are scratching their own itch and happily building what they want to build. They don't do hypothesis testing, and I'm not going to say that's wrong.
That solves a lot of problems in the world. But if you're trying to build something big, if you're trying to do it at scale, if you're raising venture capital, if the stakes are high, then yes, you have to do hypothesis testing. Did I answer your question, or did I go off track? Absolutely, yeah. Cool. Thank you very much.