Hi, my name is Juniper Simonis. I use they/them pronouns, and I'm going to be speaking today about our package, LDATS. LDATS stands for Latent Dirichlet Allocation coupled with Time Series analyses. This work is done in collaboration with a team of researchers based out of the Weecology lab at the University of Florida.

Our motivation in developing the LDATS package was to create analytical tools for studying multidimensional ecological time series. Specifically, the project was started to analyze the dynamics of the Portal Project rodent community, a group of 20 species of rodents, pictured here, in the desert Southwest that have been sampled monthly for over 40 years. This time series shows each species' counts individually over time and makes the changes in temporal patterns evident. Just eyeballing these lines, we can see periods in the time series where both the species-level and the community-level dynamics change. For example, it's really only in this last period that we see the advent of very large magnitude cycles.

A main goal of our work is to associate changes in ecological dynamics with external stressors like climate change, invasive species, and human-mediated landscape alteration. For example, this plot shows the five-year precipitation and temperature averages at the Portal site since 1980, indicating clearly that the site is experiencing warmer and drier weather over time. In evaluating the impact of these stressors on ecological dynamics, research indicates that we really need to explore and consider the potential contributions of stochasticity, autocorrelation, cyclic dynamics, gradual changes, and abrupt shifts, known in ecology as regime shifts.

This work was instigated by Erica Christensen as part of her dissertation at the University of Florida, and she and her co-authors established the methodology underlying the LDATS package, published in this 2018 paper in Ecology. The name LDATS doesn't appear in that paper, though; it was coined later to refer to the methodology of Latent Dirichlet Allocation in time series analyses. We've since been generalizing and expanding the methods, and so we're considering changing the name to something more general, like latent decomposition algorithms coupled with time series analyses.

In turning the Christensen et al. method into the LDATS package, we were particularly interested in standardizing, generalizing, and optimizing the methods, and encapsulating them in a simple, `lm`-style top-level API, where the analyst can input just a couple of arguments and have the full model run, the idea being that we keep all of the horses under the hood, so to speak. And indeed, we have developed LDATS as a package. It's presently on CRAN as version 0.2.7 and gives the user a simple top-level API via the `LDA_TS` function (a sketch of a call follows below). We're engaged in ongoing work to expand the methods, but the package already allows the user to conduct analyses akin to Christensen et al. with some additional options.

To dive a bit more into the statistical underpinnings: LDATS uses a two-stage approach, where the first stage is focused on reducing the dimensionality of the data, and the second stage involves fitting the reduced dimensions in the time series analyses.
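[Editor's note: a minimal sketch of a full `LDA_TS` call, to make the top-level API concrete. The function name and the `nseeds`, formulas, and change point arguments come up later in the talk; the specific argument values, the `nchangepoints` spelling, and the use of the package's `rodents` dataset with its `newmoon` time column are assumptions based on the v0.2.7 documentation, not slide content.]

```r
# Minimal sketch of the lm-style top-level API (v0.2.7 argument names;
# `rodents` and timename = "newmoon" are assumptions, not slide content).
library(LDATS)

data(rodents)                        # example Portal rodent data shipped with LDATS
mod <- LDA_TS(data          = rodents,
              topics        = 2:5,   # candidate numbers of topics (guilds)
              nseeds        = 2,     # LDA seeds per topic count ("replicates" in 0.3.0)
              formulas      = ~ 1,   # time series predictor formula(s) to compare
              nchangepoints = 0:4,   # candidate numbers of change points
              timename      = "newmoon")
```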
As you might imagine given the name of the package, the initial dimension reduction technique we've been using is Latent Dirichlet Allocation, shown here in plate notation on the right. It derives its terminology from an original application of the method to analyzing textual corpora, where M is the total number of documents in the corpus and N is the total number of words; each word belongs to a term, which belongs to a topic, and those identities are described via a couple of parameters that allow a simple decomposition of the document-by-word matrix into topics. So you have a matrix decomposition: a document-by-topic matrix multiplied by a topic-by-word matrix. What's really important here is that words can belong to multiple topics.

In the ecological parlance, we think about decomposing the sample-by-species matrix into a sample-by-guild matrix multiplied by a guild-by-species matrix. Species can belong to multiple guilds, but now we have a reduced-dimensionality response variable for the time series: guilds, as opposed to species.

As the rodent data highlighted before, raw ecological time series can display a range of possible dynamics. This is also true after decomposing the data, which you can see here, where we've used LDA to decompose the 20 species down to four topics, or guilds. Again, we only see this high-magnitude cycling near the end of the time series. So in order to analyze these kinds of time series, we need methods that have multivariate responses and flexible predictors that allow for qualitative and quantitative changes.

Focusing first on the response model: rather than reinvent any wheels, we're leveraging existing multivariate regression tools for the initial LDATS model, as published in Christensen et al. We use a softmax-based regression via the nnet package, which is part of the MASS family of packages. However, we have hit some limitations with the softmax in applying it to some new data sets, and so we're incorporating additional response models into the package. They aren't quite ready yet, but they'll be coming soon in the next release, version 0.3.0; in particular, we're looking at simplex-based methodology via the compositions package.
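[Editor's note: a sketch of the matrix decomposition described above, fitting an LDA and recovering the two matrices. This uses the topicmodels package, which LDATS builds on for this step; the object slots and the reuse of the `rodents` data are assumptions rather than slide content.]

```r
# Sketch: sample-by-species ~ (sample-by-guild) %*% (guild-by-species).
# Slot names follow topicmodels' LDA class; details are assumptions.
library(topicmodels)
library(LDATS)

data(rodents)
dtm <- rodents$document_term_table           # samples x species count matrix
lda <- LDA(dtm, k = 4, control = list(seed = 42))

gamma <- lda@gamma                           # samples x guilds (topic proportions)
beta  <- exp(lda@beta)                       # guilds x species (species probabilities)
recon <- gamma %*% beta                      # approximates the row-normalized dtm
```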
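[Editor's note: and a sketch of the softmax response side, continuing from the objects in the previous block. nnet's `multinom` accepts a matrix response, which is how a multivariate guild response can be regressed on covariates; the covariate names are hypothetical, and this illustrates the general approach rather than LDATS's internal code.]

```r
# Sketch of a softmax (multinomial logit) response on the LDA output,
# continuing from `gamma` and `dtm` above. Covariate names are hypothetical.
library(nnet)

covariates <- data.frame(sin_year = sin(2 * pi * seq_len(nrow(gamma)) / 12),
                         cos_year = cos(2 * pi * seq_len(nrow(gamma)) / 12))
fit <- multinom(gamma ~ sin_year + cos_year,
                data    = covariates,
                weights = rowSums(dtm))      # weight samples by their total counts
```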
Turning our focus now to the predictor components of the model: following Christensen et al., the LDATS methodology uses Bayesian change point regression based on an indicator approach. As you can see here in a figure from Ruggieri's paper describing the method, where it's applied to NOAA temperature data, the model can identify qualitative changes, as change points, in the face of additional cyclic dynamics, which is really what we're interested in here. This is important in the context of our ecological data as well, where we expect there to be potentially no change points or many, with potentially variable dynamics between them. Those dynamics are then governed, in a regression sense, by covariates and regressors, and the regressors might change between these qualitative change points.

So, following established methods, we take a sequential approach to fitting the time series, where we first estimate the posterior probability of the change points and their locations, unconditional on the regressors. This also provides estimates of the regressors, but they are conditional on the change points. The second part of the sequence is then deconditioning the posterior probabilities of the regressors given the uncertainty in the change points. We can see this play out here in the rodent data, which we've decomposed and then analyzed: the model has selected four change points, and there's variable cyclicity, in the strength and magnitude of the cycles, between each of these change points.

One major challenge in fitting this kind of model is handling sharp edges in the likelihood surface, which can cause optimization routines to get stuck in suboptimal parameter space. If we look at a two-dimensional representation here, with the high-density areas shown in gray, you can see multiple local optima that the search algorithm could get stuck in. To address this, we use a method known as parallel tempering Markov chain Monte Carlo (ptMCMC) over the top of the multinomial regression within each segment of the model. ptMCMC adds auxiliary chains to the search algorithm that are hotter and are therefore able to explore the parameter space more freely; swaps with those chains let the main chain get unstuck and much better search the parameter space, in the context of change points in particular. We've coded the application of ptMCMC for LDATS within the package itself, to make it easy for folks driving the truck to not have to worry about the horses.

Again, the idea with the LDATS package is that you have a top-level API where the user can simply run the `LDA_TS` function and have it do a full suite of analyses. Notably, users can run, compare, and select among multiple models from a single function call using, for example, the formulas and nchangepoints arguments, which are fully crossed. All the details of the algorithm, the search parameters, and so on are handled in control lists, which allows us to have a very robust API with relatively few arguments. There will be one backwards-compatibility-breaking change coming in version 0.3.0, where we generalize one of the argument names from `nseeds` to `replicates`, but otherwise we don't anticipate the top-level API changing in the future.

The package comes with a set of vignettes: one that shows a full application of the package, soup to nuts, in a simple analysis; one that compares the current methodology in the package to what was done and published in the Christensen et al. paper; and a third that details everything under the package's hood, for folks interested in seeing how all of the parts of the model are put together computationally. The vignettes are part of the package on CRAN and are also included on the package website, which is hosted on GitHub Pages and hyperlinked in this presentation.

I obviously don't have time to delve too deeply into the package, but I did want to highlight some important functionality and ancillaries within it. For example, the LDA_TS function computes a number of internal summaries that are available via simple list component selection. Here we have an LDATS model that's been run, and if we want to look at rho, the change point locations, we get an MCMC table that describes the change point parameters. Likewise, we can look at eta, the regressors between the change points; because this is a multivariate model we have a lot of parameters there, but we still get a nice, simple summary table. Similarly, we have top-level plotting functionality: you just run plot on the object that comes out of the model, and you get figures akin to what's in the Christensen et al. paper, with the decomposition on the left and the analysis of the time series on the right.
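[Editor's note: a sketch of that list-component access, continuing from the `mod` object fit earlier. The component names are a reading of the v0.2.7 object structure, so check `names(mod)` on your own fit; the `plot` call itself is the package's top-level method described in the talk.]

```r
# Sketch: pulling MCMC summaries via list component selection
# (component names are assumptions from v0.2.7; check names(mod)).
ts_mod <- mod$"Selected TS model"
ts_mod$rho_summary   # change point locations (rho)
ts_mod$eta_summary   # between-change-point regressors (eta)

plot(mod)            # decomposition (left) and time series fit (right) panels
```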
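[Editor's note: and, since ptMCMC came up above, a toy illustration of the chain-swap rule that lets the cold chain escape local optima. This is a generic parallel tempering step, not LDATS's internal code.]

```r
# Toy parallel tempering swap (generic illustration, not LDATS internals).
# Chain k targets p(theta)^(1/T_k); hotter chains (larger T) see a flatter
# surface. A proposed swap between chains i and j is accepted with
# probability min(1, exp((1/T_i - 1/T_j) * (logpost_j - logpost_i))).
swap_accepted <- function(logpost_i, logpost_j, T_i, T_j) {
  log_alpha <- (1 / T_i - 1 / T_j) * (logpost_j - logpost_i)
  log(runif(1)) < log_alpha
}
```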
Now, what I just showed is what's available in version 0.2.7 of LDATS, which is presently on CRAN. It's very well tested, and it's archived on Zenodo. But, like I said, we're also doing a lot of active development, which is on GitHub as a work-in-progress PR right now. We're including additional capabilities for simulating data, predict functionality is going to be introduced in the next version, and a really big addition is flexible model selection methodology, for example via hold-out cross-validation methods.

A lot of this development of the next stage of LDATS is being driven by a larger project known as MATSS, which stands for Macroecological Analyses of Time Series Structure. The MATSS project gives us a massive compendium of ecological time series that we can use to evaluate the presence of regime shifts very broadly speaking, but it also allows us to test the robustness of our package; we're currently applying the methodology to a whole suite of data sets to see where it fails and where it succeeds.

If you're interested in joining our team working on LDATS, we are more than happy to have you jump on board. We've got a pretty active repo on GitHub, with information on how to contribute there, and like I said, the package is up on CRAN.

With that, I'd like to say thank you very much and acknowledge our funding and my collaborators. Have a wonderful day. Take care.