Hi, my name is Jasper Yang, and today I'm excited to talk about efficient multi-wave sampling in R with the optimall R package. R has a number of great tools for the analysis of survey data, including Thomas Lumley's survey package, but fewer tools are available for the design and implementation of surveys. optimall is an R package for the design process of stratified sampling surveys. It is particularly useful for multi-phase and multi-wave designs.

The package's objective is to streamline the design and implementation of stratified survey sampling in R. To accomplish this, optimall contains features that automate basic survey steps, including defining strata and selecting samples; simplify optimum sample allocation with the Neyman and/or Wright allocation algorithms; reduce the potential for error and enhance reproducibility; and provide a structure for organizing multi-phase and multi-wave surveys.

optimall is currently available on GitHub, and it can be installed with the install_github function from devtools using yangjasp/optimall. It will be submitted to CRAN soon.

To illustrate the utility of the package, I will introduce some of the main functions in optimall, beginning with basic ones that may apply to any survey. First, we have split_strata and merge_strata, which can be used to define strata based on local within-stratum quantiles, global population-level quantiles, or specific values of continuous auxiliary variables. They can also be used to define strata based on levels of categorical variables.

Next is the optimum_allocation function, which is the namesake of our package. Using Neyman allocation or Wright allocation, the latter being more exact when the solution is limited to integer values, this function allows users to determine the allocation of samples across strata that minimizes the variance of the resulting estimate.
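To make the idea of optimum allocation concrete, here is a minimal Python sketch of textbook Neyman allocation, in which each stratum's sample size is proportional to its population size times its within-stratum standard deviation. This is an illustration only and not the package's own code: the function names, the example strata, and the largest-remainder rounding step are assumptions made for this sketch, and Wright's exact integer algorithm is not shown.

```python
import math


def round_to_total(values, total):
    """Round fractional allocations to integers that still sum to `total`."""
    rounded = {h: math.floor(v) for h, v in values.items()}
    leftover = total - sum(rounded.values())
    # Hand the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(values, key=lambda h: values[h] - rounded[h], reverse=True)
    for h in by_remainder[:leftover]:
        rounded[h] += 1
    return rounded


def neyman_allocation(sizes, sds, n):
    """Allocate n samples so that n_h is proportional to N_h * S_h."""
    weights = {h: sizes[h] * sds[h] for h in sizes}
    total_weight = sum(weights.values())
    exact = {h: n * w / total_weight for h, w in weights.items()}
    return round_to_total(exact, n)


# Made-up strata: population sizes and within-stratum standard deviations.
sizes = {"A": 500, "B": 300, "C": 200}
sds = {"A": 2.0, "B": 5.0, "C": 10.0}
print(neyman_allocation(sizes, sds, n=90))  # {'A': 20, 'B': 30, 'C': 40}
```

Note that the smallest stratum receives the most samples here because its variability is highest, which is the behavior that distinguishes Neyman allocation from simple proportional allocation.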
The output of optimum_allocation is a data frame specifying how many samples should be taken from each stratum. This or any other manually specified allocation can be executed using the sample_strata function to randomly select the specific units to sample.

To enhance these features, optimall includes a Shiny app that users can use to reactively observe the effects of different strata cut points on strata sizes and optimum allocation. Once the desired cut points have been determined, the app prints the code to replicate the strata creation process so that it can be easily implemented again in the workflow. This image shows a screenshot of the optimall Shiny app. On the left-hand side, there are drop-down menus and sliders for adjusting inputs to the split_strata function, which in turn affect the optimum allocation shown on the right side.

In addition to these broadly useful survey design functions, optimall contains tools specifically created for use in multi-wave designs. For those unfamiliar, these designs conduct sampling over a series of iterative waves, updating estimates after each wave to better approximate the true optimum allocation. The function allocate_wave facilitates this iterative process by determining the optimum allocation for the current sampling wave, taking into account units that have already been sampled in previous waves. The allocation that it produces is approximately optimal for the current wave, even if some strata have already been oversampled.

Together, these basic functions can automate most of the multi-wave survey design workflow. Even so, survey designs that involve many phases or waves can be difficult to organize. For this, optimall provides a framework for survey organization that also simplifies calls to basic package functions. The system is based on what we call the multi-wave object, which holds the survey information for every phase and wave of the study in one place.
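The wave allocation step described above, where earlier waves may already have oversampled some strata, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the function names and example numbers are hypothetical, the target is plain Neyman allocation for the cumulative sample size, oversampled strata are simply closed and the remaining budget is re-allocated among the others, and stratum size caps are ignored.

```python
import math


def round_to_total(values, total):
    """Round fractional allocations to integers that still sum to `total`."""
    rounded = {h: math.floor(v) for h, v in values.items()}
    leftover = total - sum(rounded.values())
    by_remainder = sorted(values, key=lambda h: values[h] - rounded[h], reverse=True)
    for h in by_remainder[:leftover]:
        rounded[h] += 1
    return rounded


def allocate_wave_sketch(sizes, sds, already, n_wave):
    """Return how many NEW units to draw per stratum in the current wave."""
    n_total = sum(already.values()) + n_wave  # cumulative sample size after this wave
    open_strata = set(sizes)
    locked = 0  # units already spent in strata that turned out to be oversampled
    while True:
        weights = {h: sizes[h] * sds[h] for h in open_strata}
        total_weight = sum(weights.values())
        budget = n_total - locked
        # Neyman target for the cumulative sample among the still-open strata.
        target = {h: budget * weights[h] / total_weight for h in open_strata}
        over = {h for h in open_strata if target[h] < already[h]}
        if not over:
            break
        # Oversampled strata get no new units; re-allocate among the rest.
        open_strata -= over
        locked += sum(already[h] for h in over)
    extra = {h: target[h] - already[h] for h in open_strata}
    new_draws = round_to_total(extra, n_wave)
    return {h: new_draws.get(h, 0) for h in sizes}


# Made-up example: stratum A was oversampled in an earlier wave.
sizes = {"A": 1000, "B": 1000, "C": 1000}
sds = {"A": 1, "B": 2, "C": 3}
already = {"A": 30, "B": 10, "C": 10}
print(allocate_wave_sketch(sizes, sds, already, n_wave=50))  # {'A': 0, 'B': 18, 'C': 32}
```

In this example the cumulative Neyman target for stratum A would be about 17 units, but 30 have already been drawn, so A is closed and the 50 new units are split between B and C in proportion to their size times standard deviation.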
The multi-wave object includes metadata, samples, sample data, and sampling designs for each wave. The slide also illustrates how each of these survey pieces is held together in the multi-wave object for simple organization and replication if necessary.

In summary, the optimall R package is a new tool for streamlining the design and implementation of stratified survey sampling in R. It is designed to be useful for surveys ranging from simple to complex. To learn more about the package, see the package documentation, our GitHub page, or our paper preprint, which is now available on arXiv. Also, please feel free to contact me with any questions or comments.

Finally, I would like to thank my co-authors, Bryan Shepherd, Thomas Lumley, and Pamela Shaw, for their guidance and contributions to this project. I'd also like to thank Gustavo Amorim for his useful feedback during the review process. This work was supported in part by the U.S. National Institutes of Health and a Patient-Centered Outcomes Research Institute grant. Thank you all for listening.