All right, it looks like you can see my slides. Thanks for the welcome. Good, let's get started. I'm talking about the censored package. Tough act to follow. Usually, survival analysis is applied to data from situations like this, and I assume that this is what most people here are working with. When I talk to more general audiences, I try to get across what censored data is, and to please use survival analysis for it. But I don't think I need to do that here. So instead, I'm going to bring a little reminder that not all censored data are variations of time to death.

I came across one of my favorite examples in a previous job at a data science consultancy, where one of my colleagues chatted to me about his latest project: a newspaper company that wanted to know how many copies they should deliver to their various outlets. He was musing that if there were copies left at the end of the day, they would know how many were taken. If there were none left, they would still know how many were taken, but they wouldn't know how many more would have been taken if they hadn't run out. So in that case, it's the observation of the demand for the newspaper that is censored, not a time.

With that little reminder of the wide applicability of censored data and survival analysis, let's get to why I'm here today, because I'd like to let you know that we're extending support for survival analysis in tidymodels. If you're not quite sure what tidymodels is: it's a framework and a collection of packages for modeling and machine learning which uses tidyverse principles. Max and Julia just gave a keynote on tidymodels at rstudio::conf and talked about what they mean when they say that tidymodels should make your modeling effective, safe, and ergonomic.
If you want to know more about that, I suggest watching it; the videos are out. The part that matters most for us here is that tidymodels also gives you a consistent interface to various models. You're probably well aware that R has an abundance of riches in different types of models, survival models included, and it comes with a nearly as abundant variety of interfaces. While the first is great, the second can be a little challenging if you're trying to focus on your modeling problem and don't want to spend your cognitive energy on looking up what that particular argument in that particular package and function was called.

So that's the place where we started with extending support for survival analysis: the models themselves. That part of the tidymodels framework is covered by the parsnip package, which is designed to give you consistency both in how you specify and fit models, and in how you predict and what you get back. censored is a parsnip extension package specifically for survival models, and I'll be talking here about how these design principles play out for survival models.

So, the first part: specify and fit. Let's start with a quick reminder of how you specify models in parsnip, because you need three things. The first is the model type; in my example here, that's a random forest. The second is a computational engine: what do you use to fit that model? In our case here, it's the ranger package. It often is another R package, but it can also be a tool outside of R, like TensorFlow via keras, or Stan for Bayesian models. The third element in a parsnip model specification is the mode. Here I set that to regression; you could also use a random forest for a classification problem. These three elements are what makes a parsnip model specification, and you will encounter them in censored as well.
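As a sketch of those three elements, here is the random forest example just described; the number of trees is only for illustration:

```r
library(parsnip)

rand_forest(trees = 1000) %>%   # 1. the model type
  set_engine("ranger") %>%      # 2. the computational engine
  set_mode("regression")        # 3. the mode
```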
If you already know that pattern, this will be very familiar. So what's different, or what are the additions that we made for survival models? For one, we have a new model type for proportional hazards models, and a new mode for censored regression, to distinguish it from quote-unquote regular regression problems. The third part, you may guess it: new engines, for existing model types but for that new mode. They cover parametric models, semi-parametric models, and tree-based models. The last bit to mention in the specification realm is that we have a formula interface for all the models, which allows you to specify stratification where it makes sense.

With that, let's pull out some data. It is not hospital data; it's dolphins and whales living in captivity in the US, adapted from a Tidy Tuesday dataset from 2018. The response here is the age of the animal, and the event variable tells you whether it is still alive or not. We have some other predictors or explanatory variables: the species, the sex, how many times the animal was transferred between the facilities it was kept at, and whether it was born in captivity.

Let's start fitting, so we can see that syntax in action. Let's take the new model type, proportional hazards; that's the baseline that we start with. We set the engine to use the survival package, and then we set the mode, which is that new censored regression mode. These three elements give you the specification. This isn't a fitted model yet; that's what the fit() function does, where you give it the model object, the data, and the formula. The Surv object is the response, and we'll use species, sex, and transfers as predictors. If you want to stratify this model, you can add that to the formula, which is how the survival package does it: you add a strata() term for born in captivity.
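Put together, the fit just described might look like this. The dataset and column names (`cetaceans`, `age`, `event`, `born_in_captivity`, and so on) are stand-ins for the adapted Tidy Tuesday data, not the actual names from the talk:

```r
library(censored)  # loads parsnip and survival as well

# Specification: model type, engine, mode.
ph_spec <- proportional_hazards() %>%
  set_engine("survival") %>%
  set_mode("censored regression")

# Fit: the Surv object is the response; strata() stratifies
# the baseline hazard by born_in_captivity.
ph_fit <- ph_spec %>%
  fit(Surv(age, event) ~ species + sex + transfers + strata(born_in_captivity),
      data = cetaceans)
```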
And that's the whole model. If this looks a little long to you because you're used to using the survival package directly: then yes, it's a little bit more verbose. If you use survival directly, you need the coxph() function, the formula, and the data, and you're good to go. That's great, and if you're happy with that, you obviously have no need for censored. The idea of censored is making it a little easier to go from one model type to another.

So let's switch from the proportional hazards model to a penalized version of it. I've just moved all of the code to the left, so we have that for the survival package, and if you want to fit a penalized proportional hazards model, you can do that with the glmnet package. It's going to go a little differently, obviously, since I picked it as an example. The biggest difference in terms of syntax is that glmnet does not have a formula interface but rather a matrix interface. So you need to prepare your predictors as a matrix, and it's not enough to just pull out the columns of your data frame: there are some nominal variables in there, and those need to be turned into dummy variables. I do that with model.matrix() here, so that's a little bit of extra work. The response is that Surv object. And since there's no formula interface, if you want to stratify, you need to put that somewhere else; glmnet's choice was to stratify the response, with the stratifySurv() function from glmnet. Once you've done that to your response, you're good to go: hand those two objects to the glmnet() function, set the family to "cox", and set the penalty parameter, lambda here. That's all doable, but it's also a little different, and a little bit of extra work.
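For comparison, the direct glmnet route just described could be sketched as follows, again with hypothetical dataset and column names, and an arbitrary penalty value:

```r
library(glmnet)
library(survival)

# Matrix interface: turn nominal predictors into dummy variables
# with model.matrix(), dropping the intercept column.
x <- model.matrix(~ species + sex + transfers, data = cetaceans)[, -1]

# Stratification lives on the response side in glmnet.
y <- stratifySurv(Surv(cetaceans$age, cetaceans$event),
                  strata = cetaceans$born_in_captivity)

glmnet_fit <- glmnet(x, y, family = "cox", lambda = 0.01)
```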
If you have to prepare your data differently for different models, that's maybe an overhead you're not too keen on, because it distracts you from thinking about the things you might find more interesting. So that's one of the things that we've made easier, in my opinion. On the left I have the piece of code that I showed you initially, how to fit that model in censored; on the right is basically the same thing, but the main thing that changes is the set_engine() line. Instead of choosing the survival package to fit the proportional hazards model, we're choosing glmnet on the right. The only other difference is setting the penalty parameter for glmnet: it's just called penalty, and you don't have to remember which Greek letter is the right one here. The rest is the same; the mode is the same. This looks pretty boring and unassuming, but I'm quite happy that it works: you can use a formula interface to specify that model, and you do not have to prepare the data in a different way.

And that's obviously not the only switch in model type that we want to facilitate and make easier. The rest follows the tagline "more models, same syntax". Even if you want to go to something very different, pick one of the tree-based models and say, well, I'd like a boosted tree for a survival problem and I'm going to fit that with the mboost package: then these are the two lines that you need to change.

That leads us to the question of what is actually available. It's all for that new mode, censored regression. There's the proportional hazards model that we've seen initially; that's the semi-parametric one. The parametric models are in the survival_reg() function. And for tree-based models, you can have decision trees, with different flavors or engines fitting them, bagged trees, random forests, and boosted trees.
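The two switches just described, in code; the penalty value and the dataset and column names are illustrative stand-ins:

```r
library(censored)

# Same model type, different engine: only set_engine() changes,
# plus the penalty argument that glmnet needs.
proportional_hazards(penalty = 0.01) %>%
  set_engine("glmnet") %>%
  set_mode("censored regression") %>%
  fit(Surv(age, event) ~ species + sex + transfers, data = cetaceans)

# Something very different, a boosted tree fitted with mboost:
# only the model type and engine lines change.
boost_tree() %>%
  set_engine("mboost") %>%
  set_mode("censored regression") %>%
  fit(Surv(age, event) ~ species + sex + transfers, data = cetaceans)
```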
That's the collection of models available in censored right now, and it concludes the specify-and-fit part. Let's move on to predicting with these models. Again, first a few words about what tidymodels already does, and then how we build on that for survival models. We give you a quote-unquote tidymodels prediction guarantee, which means: your predictions will always be in a tibble, not a matrix or vector or array, always a tibble. The column names and types are unsurprising and predictable. And the number of rows in the new data and in the output are the same. So if you have missing data that you might not be able to predict on, you just get an NA, and it doesn't silently disappear. That's the same for censored as it is elsewhere in the tidymodels framework.

What's new are additional prediction types. The first one we're going to look at is survival time. I've taken the first three rows here as a small dataset that we can predict on and, just to spice things up a little bit, I'm making that first entry for species an NA. What you need to predict is the predict() function, your model, the new data that you're predicting on, and you need to set the type. For survival time, that type is called "time", and you get a tibble back with a column .pred_time. The columns are typically named .pred, or .pred_ followed by the specific type. And because we started with three rows, we get three rows back, even though one of them is NA.

The other important new prediction type that comes with censored is survival probability. Getting that looks very similar; you just set the type to "survival". But a survival probability is always calculated at a specific time point.
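The survival time prediction just described might look like this, assuming the proportional hazards fit from earlier is stored in an object called `ph_fit` and the data in `cetaceans` (both names hypothetical):

```r
# First three rows as a small prediction set, with one NA to
# demonstrate the prediction guarantee.
new_cetaceans <- cetaceans[1:3, ]
new_cetaceans$species[1] <- NA

# Returns a tibble with 3 rows and one column, .pred_time;
# the first entry is NA because of the missing species.
predict(ph_fit, new_data = new_cetaceans, type = "time")
```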
So you need to provide the time points you want to predict at, with the additional time argument. Here I'm doing that for ages 10, 20, 30, and 40 years. What do you get back? A tibble with a column .pred, and we're sticking to "one row in the prediction object corresponds to one row in the new data". So three rows here, but we had four time points to predict at for each observation, so this is a nested list column with tibbles inside. You can see that they have four rows and two columns each. If we pull one of them out to look at it, we see that the first column is .time, and that will always be the name; it gives you a reminder of the time points that you predicted at. The other column holds the prediction, .pred_survival. Here that's NA, because we couldn't predict for the row with the missing species, but for the second one it has the probabilities.

We're not working with survfit objects here for the survival curve, but if you do want to plot the curves, that should be pretty straightforward. That vector of time points can obviously hold more than just the four in my example; here I'm predicting over one to 80 years of survival for those dolphins and whales. Then I'm going to unnest that column: it takes all the tibbles and puts one underneath the other, making one big table. Because I want to keep the information of which observation each row belongs to, I've added a little mutate() statement beforehand, adding a factor column called id. The unnesting then gives us that long, tidy data frame that we can send to ggplot2 to make our visualization.

To collect the thoughts on prediction: you can predict survival time and survival probability for all of the models in censored. Depending on the engine, you may also get predictions for the hazard, quantiles of the event time distribution, and the linear predictor, if your model has that.
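A sketch of the survival curve workflow just described, again assuming `ph_fit` and a three-row `new_cetaceans` as before. One caveat: in current releases of parsnip/censored, the argument and nested column are called `eval_time` and `.eval_time`; at the time of the talk they were `time` and `.time`:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

preds <- predict(ph_fit, new_data = new_cetaceans,
                 type = "survival", eval_time = 1:80)

preds %>%
  mutate(id = factor(1:3)) %>%   # keep track of which observation is which
  unnest(cols = .pred) %>%       # stack the nested tibbles into one long table
  ggplot(aes(x = .eval_time, y = .pred_survival, col = id)) +
  geom_line()
```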
So, censored is here to provide you with a consistent interface to various survival models, and that consistency applies both to how you specify and fit, and to how you predict and what you get back. I'd love for you to try it out and give us feedback. The best feedback venues are probably GitHub, so leave an issue there (censored.tidymodels.org has a link out to the GitHub repository), or post on RStudio Community. And that's my conclusion. Thank you.

Thank you very much for the presentation. Are there plans to add relevant metrics to yardstick, like the C-index?

There are plans; that's our next step in extending support.

Does this handle just the Cox objects, or does it also handle the multi-state objects, since they're different?

No, it does not so far. If I recall correctly (I haven't looked into this in detail yet), I think that changes the structure of the data and of the specification, and no, we haven't made the framework flexible enough to accommodate that yet.

Okay, another question: is there a link to the slides?

There will be. Currently, there is a link out to a very similar version of this talk, which I gave at rstudio::conf; that's on my GitHub. The recording of that talk is also out, so you can compare the two. The core is the same, because the state of the package is the same since last month.

Any other questions? What else are you planning and developing with this?

We do aim to make survival analysis a first-class citizen in tidymodels. As with the first question, the next big step is adding metrics, an appropriate swath of metrics, which will then allow us to set up the tuning bit, because obviously: no metrics, no tuning. And then, my idea of where we're going with this next is to make the whole workflows piece work.
Currently, you might have to do a little dance with recipes, and you need to know a bit to unlock the potential of workflows if you want to bind pre-processors and models together. But the big steps are metrics and tuning, and then making it feel like any other model, with the same ease as in the rest of tidymodels. Okay, great. Thank you very much.