I want to introduce the next speaker, Sean Ellis, and his talk, "The Mega-Study Approach to Behavioral Science."

Hi, I'm Sean Ellis, and I'm a research project manager at the Behavior Change for Good initiative. Over the last decade and more, the application of behavioral insights to public policy has spread across the globe. Ideally, this policy advice should be based on field experiments. But running field experiments involves huge fixed costs and is slow. Even when we have field studies to look at, comparing effect sizes across studies is like comparing apples to oranges, because no two studies measure the same objective outcome. And it's not always clear which behavioral insights are actually robust, both because of the replication crisis, which we heard about earlier, and because of the file drawer problem: many studies that found null results were never published, so we never learn that those null results exist.

So what's the solution? At the Behavior Change for Good initiative, we have come up with the mega-study. A mega-study is a very large field experiment in which smaller sub-experiments are run simultaneously with the same dependent variable. Where a traditional field experiment might test a handful of different ideas or interventions, a mega-study tests anywhere from several to dozens. The benefits are that it allows for comparability of results across studies, so we're comparing apples to apples. It lowers fixed costs, because a single central organizer helps implement all of these different studies. It reduces the risk of learning nothing useful from a field experiment, because we're testing many things at once. And that also helps to eliminate the file drawer problem: since we're finding some significant results, we can ensure that the null results are published alongside them.
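As a toy illustration of the design just described (not any actual study data), here's a minimal sketch in Python: several message arms plus one holdout control, all measured on the same binary outcome, so every arm's lift is directly comparable. The arm names, rates, and sample sizes are all made up.

```python
import random

random.seed(0)

# Hypothetical arms: name -> true uptake probability (made up).
# A traditional field experiment might test one or two of these;
# a mega-study runs them all at once against a shared holdout control.
arms = {
    "control (no text)": 0.30,
    "commit to a flu shot": 0.32,
    "protect family and friends": 0.33,
    "a shot is waiting for you": 0.35,
}

n_per_arm = 5000
results = {}
for name, p in arms.items():
    # Same dependent variable (got vaccinated?) in every arm.
    uptake = sum(random.random() < p for _ in range(n_per_arm)) / n_per_arm
    results[name] = uptake

baseline = results["control (no text)"]
for name, uptake in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name:30s} uptake={uptake:.3f} lift={uptake - baseline:+.3f}")
```

Because every arm shares the outcome and the control group, the lift column ranks interventions apples-to-apples, which is the core comparability benefit of the design.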
Mega-studies can be run as tournaments that bring together interdisciplinary teams from different fields, so that we're not just testing the ideas of one discipline and getting a silo effect. Mega-studies also allow for behavioral phenotyping: we don't just find out what works, we can drill down and see why it works, and for whom it works. And lastly, mega-studies can vastly accelerate the pace of scientific discovery, because instead of running one study after another over several years, we can run many studies simultaneously.

So where might this new tool for testing many ideas at once be most useful? One of the most prominent topics of the last few years has been vaccination uptake. Before COVID-19 vaccines even came out, a large chunk of the US population simply wasn't interested in getting one. Creating COVID-19 vaccines as quickly as we did was a massive scientific accomplishment, but that's only half the battle, because we need people to actually take them. So with my remaining time today, I'm going to briefly discuss two mega-studies we ran in which we tried to increase vaccination uptake.

Our first was with Walmart Pharmacy in the early months of the pandemic, where we tried to increase uptake of the annual flu vaccine among about 690,000 pharmacy patients. In this mega-study, we tested 22 different text message strategies. For some patients, we sent a text asking them to commit to getting a flu shot. We told others to get a flu shot to protect family and friends. And we told still others that a flu shot was reserved and waiting for them. What we found was that all 22 text message interventions successfully and significantly increased uptake of the flu vaccine relative to the business-as-usual holdout control, the folks who didn't receive a text message.
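The per-arm comparison against the holdout control can be sketched with a standard two-proportion z-test. The counts below are invented for illustration, not the Walmart numbers:

```python
import math

def two_proportion_ztest(x1, n1, x0, n0):
    """Two-sided z-test comparing arm uptake x1/n1 to control uptake x0/n0."""
    p1, p0 = x1 / n1, x0 / n0
    pooled = (x1 + x0) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    z = (p1 - p0) / se
    # Two-sided p-value from the normal CDF (erfc form).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up counts: 32% uptake in one text arm vs 30% in the holdout.
z, p = two_proportion_ztest(x1=9600, n1=30000, x0=9000, n0=30000)
print(f"z={z:.2f}, p={p:.2e}")  # z is about 5.3, p far below 0.001
```

With one shared control, the same test runs once per arm; in a real analysis the p-values would also need a multiple-comparisons correction across the 22 arms.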
Our top-performing intervention, which consisted of an initial text and then, three days later, a follow-up nudge, was our waiting-for-you language. It invoked a sense of ownership over the vaccine, and possibly even a sense of loss if people didn't go and get it. Subsequent studies by Hengchen Dai and Mitesh Patel found that this messaging also applies to the COVID-19 vaccine.

So with the rollout of the new COVID-19 bivalent booster this fall, we were interested to see whether we could build on that successful intervention and push vaccination uptake even higher. This is a very recent study of ours, run with a large pharmacy chain in the United States. The central question, building on top of the waiting-for-you language, was whether offering free rides to and from the pharmacy added value. We were interested in this because small transaction costs, like the cost of taking a Lyft, an Uber, or the bus, can matter a lot and can prevent someone from actually going to get a vaccine. We also know that vaccine accessibility has been a widely discussed challenge, so much so that there were large investments in mid-to-late 2021 to provide free rides so people could go get the vaccine. And we know that those who live farther from vaccination sites are less likely to get vaccinated.

In this mega-study, we randomly assigned about 3.6 million pharmacy patients to one of nine experimental conditions, eight treatments and a control, with our key outcome variable being receipt of a bivalent booster within 30 days of the first message. In addition to our baseline waiting-for-you message, we tested the free Lyft ride.
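Why such enormous samples? With 3.6 million patients across nine conditions, roughly 400,000 per arm, even sub-percentage-point lifts in a binary outcome become detectable. A back-of-the-envelope minimum-detectable-effect calculation, assuming a hypothetical 30% baseline rate (not a figure from the study):

```python
import math

def min_detectable_effect(p, n_per_arm, z_alpha=1.96, z_power=0.84):
    """Approximate minimum detectable lift between two proportions at
    baseline rate p, with 5% two-sided alpha and 80% power."""
    se = math.sqrt(2 * p * (1 - p) / n_per_arm)
    return (z_alpha + z_power) * se

for n in (1_000, 10_000, 400_000):
    mde = min_detectable_effect(p=0.30, n_per_arm=n)
    print(f"n per arm = {n:>7,d}: detectable lift ~ {mde * 100:.2f} pp")
```

At a thousand people per arm, only lifts of several percentage points are detectable; at four hundred thousand per arm, lifts of around a third of a percentage point are, which is the scale at which text-message nudges tend to operate.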
For some other patients, we informed them that they lived in a county with high COVID infection rates, and for others, we provided resources to combat misinformation. What we found was that, as in our flu shot study, all of our text messages significantly outperformed the holdout control. However, providing a free ride to and from the pharmacy added no value over our baseline. Our top performer and two others, however, did. The top performer suggested a concrete plan to go get vaccinated: the day of the week, the time, and the pharmacy where the patient last received a vaccine.

Now, going back to what I mentioned at the start of the talk, running field experiments, and especially running mega-studies, is time intensive and expensive. So what if we could just get laypeople or experts to predict which of the intervention ideas we come up with would perform best? If they are good at predicting, perhaps we don't even need to run these studies. What we found was that laypeople vastly overestimate the effect of these interventions and are poorly calibrated at determining which interventions would outperform others. We found the same with experts: they were also poorly calibrated, though a little less extreme in their overestimation.

One of the other strengths of mega-studies I mentioned early on is that we get to determine not only which interventions work, but which interventions work for whom. For this COVID-19 bivalent booster study, we found that our interventions had a stronger effect on men, older patients, those who had already gotten a COVID-19 booster, and patients on Medicare.

So the key takeaways from this particular mega-study were that forecasters, both laypeople and experts, anticipated that the free rides intervention would outperform all others, and it did not.
Forecasters were also poorly calibrated and over-optimistic about our interventions in general. We also found some new communication strategies that appear most promising, and all three are a type of personalization: suggesting a plan with the date, time, and location matching a patient's last vaccine; communicating that infection rates are currently high in a patient's county; and sending the message on behalf of the patient's local pharmacy team.

That brings me to the wrap-up. Mega-studies, as I've briefly demonstrated, can accelerate scientific discovery, compare apples to apples, reduce risks and fixed costs, and facilitate interdisciplinary collaboration. In the two mega-studies I touched on, economists, psychologists, statisticians, computer scientists, medical doctors, and public health experts all collaborated. And mega-studies can lead to policy recommendations that are better than what we can generate from scientific intuition alone.

Mega-studies do have limitations, however. They're more difficult than most field experiments and often more costly. They require tremendously large sample sizes, which can be hard to get hold of. They require centralized coordination, which in many cases means a staff to actually run them. And there are statistical complexities in estimating and analyzing dozens of treatments at once, most prominently what's known as the winner's curse: you tend to overestimate the effect of your top-performing interventions and, by the same token, underestimate your bottom performers.

Thank you. And thank you to our team scientists, funders, staff, and partners.

Do you have any questions for Sean? John King, National Student Aging.
One of the advantages of these studies, as I believe was the case in the pharmacy study, is that you can take advantage of information already in the system about individuals. For example, you were able to report out instantaneously that Medicare was a predictor of taking up the vaccine. One question is whether that's just a proxy for age; in other words, does it predict uptake the way age-related risk would? But another question is whether your system is able to implement fancier designs, like adaptive designs. You'll find out that your bottom 11 arms are all losers, so you might as well re-randomize people to something else as the study goes along. It's similar with factorial designs. I'm trying to think of ways to tamp down the total N you need, because the great thing is you've got almost perfect power; the bad news is that you pound some of those interventions into the dust, and you don't need to know that much.

No, yeah, we're actually looking at that for future studies: being able to analyze data in real time so that, exactly as you said, for interventions that just aren't performing, we could re-randomize people to the interventions that are, so we could better power them. That would be really important in studies where we don't have such a large sample.

Thank you, Sean, thank you.
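The adaptive idea discussed in that exchange, dropping clearly losing arms at an interim look and re-randomizing the remaining sample across the survivors, can be sketched like this. The arm names and rates are invented, and a real adaptive design would also control error rates at the interim analysis; this only shows the reallocation mechanic.

```python
import random

random.seed(1)

# Hypothetical true uptake rates per arm (made up).
true_rates = {"arm_a": 0.30, "arm_b": 0.31, "arm_c": 0.35, "arm_d": 0.36}

def run_batch(arms, n_per_arm):
    """Observed uptake for each arm in one batch of participants."""
    return {a: sum(random.random() < true_rates[a] for _ in range(n_per_arm)) / n_per_arm
            for a in arms}

# Interim look: run a first batch, then keep only the top half of arms.
interim = run_batch(list(true_rates), n_per_arm=2000)
survivors = sorted(interim, key=interim.get, reverse=True)[:2]
print("interim:", interim, "-> keeping", survivors)

# Second batch: remaining participants are re-randomized across the
# surviving arms only, concentrating power where it matters.
final = run_batch(survivors, n_per_arm=4000)
print("final:", final)
```

Dropping weak arms early shrinks the total N needed for a given level of precision on the arms that matter, which is exactly the concern the question raises for settings without multi-million-patient samples.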