Hello everyone, from wherever you're joining, thank you so much for being here today for this NCAR Explorers Series lecture, How Ensemble Prediction Systems Can Tell Us What We Know and Don't Know About the Future, with Moha Gharamti and Jeff Anderson. I am Dr. Evie McCumber, and I am an educator here at the National Center for Atmospheric Research, or NCAR. NCAR is a world-leading organization dedicated to understanding earth system science, including our atmosphere, weather, climate, the sun, and the importance of all of these systems to our society. I am really glad to be with all y'all today to learn more about ensemble prediction systems. Also, I'm really, really excited to be hosting our first public in-person Explorers Series event at the Mesa Lab since 2019. So you guys are our first live audience since 2019. For this event, you will be able to ask Moha and Jeff questions following the lecture, and I will help moderate so that we can ensure that we hear from both our in-person and our virtual audience. If you're in person, you can raise your hand and ask your question. If you're virtual, you can ask your question using the Slido platform. If you scroll down on this webpage if you're online, you will be able to see the Slido window just below where you're seeing the livestream video event. If you haven't already, and that also applies to all y'all, go ahead and click on the green Join Event button on Slido, and you can ask questions on the Q&A tab. Moha and Jeff also have a few poll questions for all of us, so for both our in-person and virtual audience, you can respond on Slido. You can use your phone or laptop to go to slido.com and enter the code hashtag NCARMSR, and definitely be sure to join Slido to add your thoughts to our word cloud question: What do you think of when you hear the phrase ensemble prediction? Because we are definitely going to get to that really soon. This lecture is also being recorded and will be available on the NCAR Explorer Series website. With us today, we have NCAR scientists Dr. Moha Gharamti and Dr. Jeff Anderson from NCAR's Computational and Information Systems Laboratory, or CISL. Moha is a Scientist II in CISL at NCAR. He has been an active data assimilation researcher since 2010, working on various data assimilation topics, including variational and sequential schemes, Kalman filters and smoothers, in addition to parameter estimation. His work focuses on proposing novel data assimilation techniques that can help improve earth prediction systems using ensemble filtering. He is a member of the Data Assimilation Research Section at NCAR, which you will hear a little bit more about later on. His research encompasses formulating smart inflation and localization procedures, in addition to their application in the Data Assimilation Research Testbed, DART. To date, his active research areas of application are subsurface hydrology, marine ecosystems, and surface flooding. He received his bachelor's in geological engineering from the Middle East Technical University in Ankara, Turkey. He obtained his master's and PhD degrees from the King Abdullah University of Science and Technology in Thuwal, in the Kingdom of Saudi Arabia. Jeff Anderson completed a PhD in atmospheric and oceanic sciences at Princeton University in 1990.
After a postdoc at the National Centers for Environmental Prediction, he spent the next decade at the Geophysical Fluid Dynamics Laboratory in Princeton, building atmospheric models, developing software infrastructure for climate system models, and exploring ensemble predictions. Since 2001, he has been a scientist at NCAR, leading the development of the Data Assimilation Research Testbed, a community software facility for ensemble data assimilation. He has developed several algorithms that facilitate high quality ensemble data assimilation for geophysical problems. Anderson is a member of the American Meteorological Society and the American Geophysical Union. Before I turn it over to our speakers, who are super excited to talk to all y'all about ensemble prediction, I want to see what your thoughts are on this word cloud. So, Nick and Brad, could you please show us Slido and see what y'all said about ensemble predictions? Almost there. This is live. Yeah, Jeff and Moha, this is great. Okay, excellent. So these are our thoughts. And with that, I'm going to let our scientists get into what they think about what y'all said. Okay, I missed her there. Moha, you can talk about the word cloud too. So I like the orchestra thing because in some sense it is a whole bunch of different players talking about what's going to happen. Uncertainty quantification is a big part of this; we're going to talk about that more. So the key point is that you can't tell everything about the future. You can just get some idea, probabilistically, of what may happen. I like multiple predictive models. We won't quite get to that tonight, but it's a key research area. Prediction with uncertainties down there at the bottom. Exciting. Yeah, that's how we like to think about it. Not so long ago, we used to be told that if you said data assimilation in a crowded room, it was illegal. But no more. Data assimilation is the cool thing. And we'll tell you a little bit more about that. And that leads to all the ensemble stuff we're doing. Weather forecasting is a key application. That is where data assimilation got its start. Moha, do you want to jump in and comment on a few of these? I mean, I think something to do with forecasting is a great answer. Complicated is not too far off. Probability is a great answer, too. So no, good job from the audience. This is a great cloud, yeah. And we're ready to move to the talk. Cool. We're learning this as we go, too. OK, so Moha and I are really excited to be here with you tonight. One of the things about a talk to a more public audience like this is that it gets you out of the technical details and back to remembering that this is, in fact, exciting and fun. Sometimes you get bogged down in too much math and too much other stuff. It is exciting and fun what we do. And what we do is we tell you what the future is going to be, sometimes better than other times. And that's what we'll talk about tonight, why sometimes you know more than others. And again, apologies to many of you who are here in person, because if we'd given you a forecast with good probabilities of what the weather was going to be like outside when you came from the parking lot, you might not have come at all, but no one got too wet. OK, so predicting the future. So there's some pretty simple examples, flipping a coin, what's going to happen? Well, we don't know exactly what's going to happen, but we can say something about the relative probability. 10 heads in a row probably not going to happen, but it could.
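A quick illustrative aside, not part of the spoken talk: the "relative probability" idea is easy to check with a few lines of Python. The sketch below computes the exact chance of 10 heads in a row with a fair coin and then estimates the same number by simulating many repeated throws, which is the same sampling spirit the ensembles described later rely on.

```python
import random

# Exact chance of 10 heads in a row with a fair coin: (1/2) ** 10, about 0.1%.
exact = 0.5 ** 10
print(f"exact chance of 10 heads: {exact:.4%}")

# Estimate the same number by brute force: flip 10 fair coins many times
# and count how often every single flip comes up heads.
trials = 200_000
hits = sum(all(random.random() < 0.5 for _ in range(10)) for _ in range(trials))
print(f"simulated chance:        {hits / trials:.4%}")
```

Reading probabilities off many repeated random trials is exactly what the ensemble forecasts discussed below do, just with a much more interesting "coin."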
Same thing with the dice, we can say something about the relative probability of rolling two dice. Probability of rolling a two, quite a bit less than the probability of rolling a seven, but it can still happen. What's the probability there is ice on the hill? I ride up and down this hill every day, 4,100 times now. And so this can be a life-preserving piece of information to know. We'll talk about that a little bit more at the end of this talk, but how well do we know? What's the uncertainty? Things like this are important. Now, I'm holding my favorite red racquetball with a smiley face. I've already picked one person I'm going to pick on here, one of our summer interns here. So are you ready for this? The throw may be bad. OK, throw it back. Thanks, Robin. Robin just did a prediction of where this ball was going to land, and it was a pretty good one because he got his hands there before it arrived. The entirety of what we do is just the problem of predicting where a ball is going to go. Now, we may not know exactly where it's going to go. You could have dropped it. People do. Who else is here? Oh, John Klein's here. This is a longer one. Let's see how this goes. Oh, good. Although it's throwing off my next slides that are going to come to him. Nice throw back. OK, so there was a prediction problem there. The second I threw this ball to John or to Robin, they didn't know exactly where it was going to go. They had an idea. But they had to refine that idea as the ball moved through the air. And we'll talk about how that works. So we're going to start out with a very simple problem here. That's me. I'm throwing the red ball. It's in two dimensions, so it's nice and easy here. So it goes up, it goes down. Where is it going to land? Now we're going to teach a computer how to figure out a prediction of where that ball is going to land. Now, all of us already know how to do that. And we'll talk about that in a minute. So there's three things that are required to do this type of prediction. And the first one is a prediction model. The model tells us, if we know where the ball is now and how fast it's moving, where's it going to be in a little while? It's going to be back in my hands in one second. For the ball, that model is really simple. It moves in a parabola, and the equations are there. I'm using x for the horizontal position, y for how far up and down it is, u for how fast it's moving horizontally, and v for how fast it's moving in the vertical. And these are equations from basic high school physics. Your brain learned them a long time ago. We have probably our youngest visitor over there, probably already learning in that gray matter that objects on the Earth move under gravity in a parabola. One of the most basic things that humans learn is how to do prediction of something traveling under the force of gravity. Now, the computer doesn't necessarily know that right off. The other issue we have here is, even if we have a good model, and this model is good and simple, you are uncertain when I threw the ball about exactly how I was going to throw it. You were probably worried I was going to throw it over your head. The AV guys are worried that I'm going to throw it into the screen, which I'm not going to do. I'm not going to hurt the screen. But we don't know initially exactly how fast the ball is moving or where its position is when it leaves my hand. And so if we have a computer we're trying to do that with, really the computer may know something about me.
And with new AI stuff, you may be able to go online and find out a whole bunch about me. I actually don't look quite like that picture, but that's my best artistic ability. So you can probably find out online how tall I am and get some estimate of my arm length from some picture of me there. If you search hard enough, if our computer searched hard enough, it would find out that I am a former mediocre tennis player. It might even find videos of me tossing a ball for a serve. And that's not always been such a good thing in the past. And so there's quite a bit of uncertainty about what might happen when I throw stuff. So as an example, you don't need to get up yet, Moha. You're going to have to fetch these in a minute, though. If Moha were over there, oh, yeah, you can go there for now. We'll do it again in a minute. So if I were throwing to Moha, I might do a throw that looks like that. Or I might throw, you can throw it back. Or I might do a throw where I get a little bit nervous and it goes like that. So there's a lot of uncertainty for our computer here in trying to say what the heck Anderson is going to do with those balls. And so it might look something like this. We're done with you for now. I'll get you back up in a minute. So it might look something like this. Yeah, come on. There it goes. So given what it knows about me, the computer model might say, well, he may throw it straight up. I might. I might throw it against the wall. It may bounce before it gets to Moha. There's a lot of uncertainty here. And that goes back to the word cloud. Uncertainty is part of this. So without knowing anything else except that I'm going to throw the ball, there's a lot that you have to guess about. So what are we going to do now? Well, Moha is going to stand up now and help me. Again, we need some assistance. The model and knowledge about the throw is not enough. And so again, if I throw to Moha, oh, that was a bad throw. Let me try one more. If I throw to Moha, OK, thanks. I'm going to get some extra information now. And that's things I'm going to refer to as observations. Another word for observations is measurements. These are things that an instrument, like a thermometer in the atmosphere, or a camera in this room, might take to tell us what's happening to the ball after it leaves my hand. And in this particular case, you can see on that throw, the ball is leaving my hand. About half a second later, you can see it's just above my head in the stick figure there. At one second, it's up there. About 1.5 seconds, it's down here. Moha is getting his hands in position. And in about two seconds, Moha catches the ball. So these observations give us more information as the ball is moving towards us. It helps us improve our prediction so we can adjust where our hands are. And the same thing can happen for the computer. But these pictures we just took, they're not precise. We don't know that they were taken exactly at half a second or one second after the ball was released. We don't know exactly where that camera was located and what the angle was. We don't know exactly how far the ball is in this third dimension towards the camera. All those things make that observation inaccurate, or give it errors. And this is a fact of life. Anything you measure has errors. And I'm not talking quantum errors now, which some of you may be familiar with. The fact of life is that instruments are imperfect; when we measure stuff, there are errors.
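To make the two ingredients described so far concrete, here is a minimal Python sketch, not the code behind the slides: the simple parabola model for the state (x, y, u, v) under gravity, and imperfect camera observations of position only. The release state and the size of the camera error are made-up numbers for illustration.

```python
import random

G = 9.81  # gravitational acceleration, m/s^2

def step(x, y, u, v, dt):
    """The prediction model: advance the ball state (x, y, u, v) by dt seconds."""
    return x + u * dt, y + v * dt - 0.5 * G * dt * dt, u, v - G * dt

def observe(x, y, camera_error=0.3):
    """An imperfect camera snapshot: position only, with random error added."""
    return x + random.gauss(0.0, camera_error), y + random.gauss(0.0, camera_error)

# One 'true' throw (made-up release state) observed every half second.
x, y, u, v = 0.0, 2.0, 4.0, 9.0
for k in range(1, 5):
    x, y, u, v = step(x, y, u, v, dt=0.5)
    ox, oy = observe(x, y)
    print(f"t={0.5 * k:.1f} s  true=({x:5.2f}, {y:5.2f})  camera=({ox:5.2f}, {oy:5.2f})")
```

The camera never reports exactly the true position, which is the "errors are a fact of life" point above.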
So what we're going to do now is try to combine these observations that are not exact with the information we had. And the observations in this case, we'll represent them like this: if that's the true path of the ball in the solid red as I threw it to Moha, the solid red dot is where the ball was after half a second. The plus up there is what our camera observation says: this is where I think the ball was. And associated with that I've shown these circles, which represent the uncertainty in the observation itself. Yeah, the ball really wasn't there. It was kind of close to there. And we have these observations that come sort of every half second for this experiment through time, up to the time where we catch it. And each one of them has a slightly different error. More or less in this case, maybe they have the same error characteristic. So the size of those circles of uncertainty might be the same. There is some math that underlies this. Now, Moha said complex. And the comment on the word cloud back there was complexity, complicated. And on the surface it is, but the beauty of this stuff is that in fact it's vastly simplified. You can bring the problem down to exactly this two dimensional ball problem we're doing. I teach this, when I have a little bit more time than we have tonight, to high school students. And they can master this math. But so you'll know that there is math, and I'm not going to discuss it. This is a Gaussian or bell curve or normal distribution, the equation up there. And it's plotted over here on the screen. Basically what it's saying here is that red plus in the middle again is where the camera observation suggests the ball is. And as we move further away from that, if I have a guess from one of those blue throws of where the ball might be, if it's closer to the observation, that means it's more likely that that really is where the red ball was. So that blue one could be the red one. In this particular case, the one that's closest has about seven times more probability, relative probability, than the one that's the furthest away. And so if we have one of the blue ones that is at that position, we say it's much more likely than the other one there. So now we've got this information. It's not great. And we've got predictions that are not perfect. And we're going to combine them. And so the third thing we need is called data assimilation. It takes this uncertain information from models. It takes uncertain information from observations. And it does statistical magic, pretty simple magic in some ways, but statistical magic, to get more information. We learn more from the data assimilation than we knew from either source by itself. So in this particular case, we can pick one of those blue trajectories for the ball, one of the blue paths. The computer can generate many, many, many of those, given what it thinks about my throwing capability. And we can see how close that one comes to our observation of what really happened. And we can do that for subsequent observations in time. Because this is an and: what is the probability that the ball was this close to observation one, and this close to observation two, and this close to observation three? And 'and' probabilities are all about multiplication. If you roll two dice, the probability that the first one is a 1 is 1/6. The probability that the second one is a 1 is 1/6. The probability that they're both 1s is 1/6 times 1/6, which is 1/36. These are more complicated than 1/6 up here in this multiplication.
But we're just going to multiply these expressions that tell us how likely the ball was, for this example of the blue, how likely it is compared to the actual red truth in this expression up here. So now what I'm going to do is I'm going to come back again to this notion that I had a bunch of different guesses of what the blue throws might have looked like. So one, you don't need to go for this one. We'll just let them go. So one might have been pretty close to what I observed with the red one. And in this case, I'm going to do plots of this. There are many, many examples on the screen. There's 500 shown. We actually did 10,000 and selected the 500 that were closest to the observations. The closer you are to an observation, the bluer it is. So that was a pretty blue one. It was close. This one is not very close. And so it's a pretty light blue one. And so you can see right in the middle where the ball is close to going by the observation, you get dark blue, and you get lighter blue further out. And then there's 9,500 other even lighter blue ones that I didn't put on this plot that would be even further from the observation. So half a second after I threw the balls, when we have this observation that comes from the camera, the computer can say, OK, now I can combine these. And it can do this computation. And so it gets these darker blue, more likely, or lighter blue, less likely trajectories. And then we can use the model to go forward another one and a half seconds. The computer can do that really fast, so it happens almost immediately, way before the ball gets there. And this is our estimate of where the ball might be at the time we about need to catch it. OK, it's a lot more certain, a lot less uncertainty than just that original picture when the ball left my hand. But there's still a lot of uncertainty. The green one, by the way, is an estimate of the most likely place. So maybe where you want to put your hands. After a second, I get another observation. So now it's only a second till I have to catch the ball, but I have two observations. And so I can combine those pieces of information. And from that combined ensemble, that's what we call this group of estimates of what happens, we can make a forecast. And you can see that the uncertainty has reduced. The dark blue ones are in more of a cluster. The light blue ones are not as spread out. And we can repeat this a half second later, at one and a half seconds, and run a forecast with now only a half second lead time, as we call that. So I've got to be pretty close to getting my hands there. The computer has to be doing the same. And finally, we could take the information from that final time too and use that. And you can see here that the blues have been getting closer and closer. So to recap this, and you'll see here, those of you who are familiar with weather forecasts, almost everyone is, the analogy here. At time zero, when I release the balls, we know very little. There's complete uncertainty about where it's going to land. Half a second later, we can make a forecast out to two seconds that has much reduced uncertainty. A second later, we can make a forecast with now only one second lead time that's more certain. Another forecast at 1.5 seconds, even more certain, with only a half second lead time. Just like weather forecasts tend to get more accurate as you get closer to the event, that's the same phenomenon here. Finally, another interesting phenomenon: if you were reading the top of the slide, I've been saying what we know and don't know about the future.
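Here is a compressed Python sketch of the weighting game just described, with invented numbers rather than the actual computation behind the slides: generate many candidate "blue" throws, score each one against every noisy camera snapshot with the bell-curve expression, multiply the scores because the observations act as an "and", and then read a forecast of the landing point off the best-scoring members.

```python
import math
import random

G = 9.81  # gravity, m/s^2

def path(u, v, times, x0=0.0, y0=2.0):
    """Positions of a throw released at (x0, y0) with velocity (u, v)."""
    return [(x0 + u * t, y0 + v * t - 0.5 * G * t * t) for t in times]

def bell(dist, sigma=0.3):
    """Gaussian weight: close to the observation means a large weight."""
    return math.exp(-0.5 * (dist / sigma) ** 2)

times = [0.5, 1.0, 1.5]
truth = path(4.0, 9.0, times)                                  # the 'red' throw
snaps = [(x + random.gauss(0, 0.3), y + random.gauss(0, 0.3))  # noisy snapshots
         for x, y in truth]

# Many candidate 'blue' throws: guesses about how the ball left my hand.
guesses = [(random.gauss(4.0, 2.0), random.gauss(9.0, 3.0)) for _ in range(10_000)]

scored = []
for u, v in guesses:
    weight = 1.0
    for (px, py), (ox, oy) in zip(path(u, v, times), snaps):
        weight *= bell(math.hypot(px - ox, py - oy))           # 'and' = multiply
    scored.append((weight, u, v))

# Keep the 500 best-scoring throws and forecast where the ball is at t = 2 s.
best = sorted(scored, reverse=True)[:500]
landing_x = [path(u, v, [2.0])[0][0] for _, u, v in best]
print(f"forecast x at 2 s: {sum(landing_x) / len(landing_x):.2f} m "
      f"(spread {max(landing_x) - min(landing_x):.2f} m)")
```

The spread of the surviving members is the leftover uncertainty, and it shrinks as more snapshots are folded in, which is the behavior described in the recap above.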
But there's a question too about what we know and don't know about the present. So when we get to two seconds with this computer model, we actually don't know exactly where the ball is. We had these four observations. We had uncertainty in the throwing. And so we don't know exactly where it is. In the same way, we never know exactly what the temperature is on the mast on the top of this building. We just have an approximation. Now, that being said, observations don't give us everything, but they do give us some other pretty cool things. And that is that observations tell us about the unobserved. So in this example, you remember that the model was all about x and y, but also u and v, how fast the ball was moving in those directions. We're only observing with these snapshots from the camera the x and y. But those examples that the computer is running of what the blue balls may have been doing have a velocity in them too. And so we can do a plot like this. After we get the first observation at half a second, the horizontal axis here across the bottom says how fast the ball is moving in this direction. And the vertical says how fast it's moving up and down. This is a forecast for what it's doing at two seconds. And you see negative values here down to around negative 8 on the vertical axis. The ball is moving down into your hands. After half a second, the suggestion is, yeah, the ball is moving down. It's not a bad estimate of the real velocity, which is shown by the red dot. Now, after one second, this happens. I have an estimate of the velocity again. Well, about half of them are going down, and half of them are going up. What does this mean? Audience participation for the in-house people here. Why does something suddenly get so weird? I don't know whether the ball is going up or down, any guesses? Postdocs are required to guess if no one else does. Visiting students, why are half of them going down and half of them going up in this prediction? OK, I actually hear people rumbling for the right thing. That's almost it. It's whether it's actually bounced. So we don't know if the ball has bounced at this point. And so as we get more observations, you can see that we continue to have some uncertainty about whether the ball has bounced or not. Another key point here, and this is for people who actually look at forecasts, we'll come back to this a little bit later. Remember, I told you that the green thing is the sort of best guess of what will really happen. In this particular case, it is still, in a certain way of thinking about it, the best guess. I don't know if the ball is going to be going up or down, one or the other, because the chance that it's exactly bouncing at two seconds is very, very small. But this estimate in the middle that says, oh, well, it's going to be minus four, it may be the best estimate in certain ways, but no ball is possibly going to be going minus four at that time. This is a reminder that when you're looking at forecasts from ensembles like this, you have to think about what these different values mean. It's not always obvious. OK, so I told you that two dimensions is all there is. And we'll come back to that. But the real world has three dimensions. And so really, a ball, it stays in two dimensions here, but there could be a wind blowing in here. It could be going all over the place. And so three dimensions is really important. So we've gotten a lot of help from a lot of mathematicians through the centuries. The big example is Gauss.
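Before getting to Gauss and Kalman, here is a rough Python sketch of the "observations tell us about the unobserved" point, with invented numbers and not the talk's actual code: the ensemble itself carries the relationship between the observed height and the unobserved velocity, so a height observation can be turned into a velocity correction by a regression built from the ensemble. This regression-style update is essentially what the ensemble filters discussed next do.

```python
import random
import statistics  # statistics.covariance requires Python 3.10 or newer

G = 9.81  # gravity, m/s^2
N = 80    # a small ensemble, like the tens-to-hundreds of members mentioned below

# Each member is a guess of the full state 1.5 s after release: height y and
# vertical velocity v. Only y will be observed; v is the unobserved part.
members = []
for _ in range(N):
    v0 = random.gauss(9.0, 3.0)                # uncertain release velocity
    y = 2.0 + v0 * 1.5 - 0.5 * G * 1.5 ** 2    # height after 1.5 s
    v = v0 - G * 1.5                           # velocity after 1.5 s
    members.append((y, v))

ys = [y for y, _ in members]
vs = [v for _, v in members]

# One imperfect observation of height only.
obs_y, obs_err = 4.3, 0.3

# Kalman-style gain built from the ensemble: how v co-varies with the observed y.
gain = statistics.covariance(ys, vs) / (statistics.variance(ys) + obs_err ** 2)

v_before = statistics.mean(vs)
v_after = v_before + gain * (obs_y - statistics.mean(ys))
print(f"mean vertical velocity: {v_before:+.2f} m/s before, {v_after:+.2f} m/s after")
```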
Earlier I mentioned the Gaussian as the bell curve, named after Gauss. One of the things I used to say earlier in my career is I couldn't come up with a single thing I knew that Gauss didn't. I think I actually know a couple things after 40 years now that Gauss didn't in the 19th century. But I'm not sure. I can't talk to him. We got help from a mathematician named Kalman in the 1960s. Kalman was interested in three dimensions. In particular, he was interested in the same problem we just discussed, following the trajectory of something through space, but three-dimensional space. And in particular, he was interested in where the Apollo spacecraft were going in three-dimensional space. And so using Gaussians and other math that Gauss developed, Kalman had access to computers, tiny computers compared to what we have now, but vastly more than Gauss had. And he was able to develop systems that were crucial to getting the Apollo spacecraft to where they were going and back. The observations were very similar to what we had. They were radar pictures of where the spacecraft were at a particular time. And the model was basically a trajectory through three-dimensional space. Now, since that time, Moha and I and a number of other folks in the last 25 years have worked much harder on this problem, too. We have really big computers. And one of the key things we've done are these two points here. The first thing is we've come up with more efficient ways to select guesses for those blue trajectories. So we don't have to guess 10,000 of them to see what a ball is doing. We only have to select tens of them, maybe 100. The other thing is that we've done some little fiddling with the math underneath to get rid of that blue shading. Now when we produce forecasts, they're all equally likely. And we'll come back to that question of ice in a minute and show how convenient that is for trying to understand what we do and don't know about the future. Now, weather prediction. That was on the slides, too. Weather prediction models are vastly bigger and more complicated than that two-dimensional trajectory of a parabola for a ball. So basically, they are composed of huge grids of the values of temperature, wind east-west and north-south, moisture, and other atmospheric constituents, on a huge shell of grids over the atmosphere, something that looks like this picture over here. And there may be millions in the most recent models, even billions, of numbers that tell you what's going on in the atmosphere on one of these grids. And the models can then tell you, if you know that million or billion numbers, what those million or billion numbers are going to be in the future, in the same way that the ball was moving through space. You have to use huge supercomputers like NCAR's Cheyenne or NCAR's new Derecho to do this stuff, but you can do it. Now I'm going to switch gears for a minute. Here's a 2D plot again, and there's a ball moving through it. Now, the ball is clearly not moving on some nice simple parabola trajectory. This is a mess. What this is actually a plot of is the temperature on the mast at the top of this building and the relative humidity on the mast on the top of this building during 24 hours on the 24th of May, so about two weeks ago. Down here, this is temperature. So if you get to the right, it's getting warm. If you get to the left, it's getting cold. If you go up, it's getting wet. If you go down, it's getting dry. And I'll replay this. So what's happening here? It's starting at midnight on this day.
And at that time, it's still kind of warm and kind of dry. As the night goes on, it starts cooling off. It stays relatively dry. Suddenly, it gets cold and wet about midday. That's a long period of rain, as some of you may remember. It ducks back over here around 8 or 9 in the evening. That was another thunderstorm that passed through. And Moha will talk about those events in another way. But for now, this is a two-dimensional plot. It's a very, very complicated model of how the ball moves, but it's just moving. And we have observations of that, which were coming in this case from the mast on the building. We can use exactly the same techniques we used for the ball problem. It's just that the model is more complicated. Now, the other problem is that in that particular case, I've labeled the axes as the temperature here and the relative humidity here. But really, that's not enough, right? We talked about how big these weather models are. So I need some more dimensions. Three dimensions is easy. Maybe the temperature out at the airport is another dimension where this ball is going to be moving. This thing is called a phase space. It has axes that are not the physical space we live in, but they're the axes of other quantities that we can understand. Now it gets crazy. I can't think in four dimensions. I actually know people who claim they can. I can't do it. The east-west wind at Reno, the temperature one kilometer above Salt Lake City, a billion other values in a weather model. There's all these dimensions. I can't picture it, but the computer can. And what's really going on is that there's just a red ball moving through that space, a really, really high-dimensional space, that says what's happening in the real atmosphere. And then we can play the game where we have estimates of that, blue trajectories moving through that same million-dimensional space. That's what a model looks like. Now we have observations for global numerical weather prediction. There are billions of these each day, but they are what they are. They're measurements of temperature. They're measurements of moisture. They're measurements of the light coming up from the Earth to satellites. Not everything is measured. It's a big planet. But we saw that we don't have to measure everything when we use these techniques. If there's stuff that's observed, that's great, but if there's stuff that's not observed, we can still figure out the relationships between them in the same way that we figured out the relationships between the position and the velocity of the ball moving in the parabola. Now we're going to switch back quickly to the question of what about ice on that road? There we go. So the question was, if we have a forecast, and it says five forecasts below freezing and five forecasts above freezing, how likely is it that I'm going to kill myself if I go fast down the hill? And somewhat likely is probably right; even if there's not ice, it's probably somewhat likely I'm going to kill myself going down the hill the way I bike. But yeah, this is not a bad guess. I think that highly likely and highly unlikely got zero votes, and I think that's right. The key point here is that we have to understand many, many things about what really generated these forecasts before we can really know what they mean.
One of the key points I made earlier is that in the techniques we use in modern data assimilation and ensemble prediction, we try to generate forecasts that are equally likely, in the same way that it's equally likely that you toss a head or a tail, or that it's equally likely that you get one, two, three, four, five, or six on a die. And in this case, we might want to interpret this as: if five out of 10 forecasts, half of the forecasts, are below freezing and half of the forecasts are above freezing, maybe we want to interpret it as about a half chance that there's ice on that road. There's many other things we could discuss over a longer period. Models aren't right; for observations, we don't really know what the uncertainty is. All kinds of other crazy stuff happens. But that was good. Can we switch back now and I'll finish off on where I'm going? This is just one more example I'll show of how, even with huge weather models, it really comes back to this idea of a ball moving through space. This is an example that was run with a large weather model, NCAR's Weather Research and Forecasting model. The data assimilation was done with DART, which we'll talk about in a minute. And this just shows a forecast of 100 estimates of where the eye of Hurricane Earl, the storm in 2010, will be as you move into the future. So the different colors show 24-hour time series. And in particular, you could ask yourself, what is the probability that this hurricane is going to strike Cape Hatteras on the edge of North Carolina there? And you'll see that one out of these 100 just barely clips it. And so you could say that the chance is pretty small, but it's not zero. Again, that's how you learn to interpret these. I'm going to skip the details on this slide. This is global numerical weather prediction, something called a spaghetti plot that some of you may know about. This one was produced with the global Community Atmosphere Model that's produced here at NCAR, with DART again. Quickly, what is DART? The Data Assimilation Research Testbed. That's what we do. That's what we make. It's computer programs to do what I've just described for any model and any observations. And as I said at the start, you forget how cool this is. What we get to do is really cool. We get to make crystal balls for anybody who wants one. Most of the people we work with are in atmospheric science, Earth science, and other stuff that NCAR does. And Moha is going to come up in a minute and give a couple of really good examples, 30 more seconds. But other people take this stuff from us, and they predict who knows what. You may remember in the 2016 elections, there was a lot of controversy about who won the Senate and other things. People at Princeton had taken our software and actually used it with models of who's going to vote on what, and observations from polling, to predict what the Senate races were going to look like. You can predict all kinds of social system stuff like that, political stuff, other things. But we love the Earth system, and it's really cool. So Moha, take it away. OK, well, thank you, Jeff. So in this next section of the talk, I'll be talking to you about real-life applications of ensemble prediction systems. I'll give you two examples. We'll start with a flooding and hurricane example. And then in the second example, we'll talk about a full regional Earth system prediction. Now to begin with, let's talk about hurricanes. So hurricanes are tropical storms.
They are formed generally in the Atlantic, off the West Coast of Africa. In the Pacific, they are called typhoons. And in the Indian Ocean, people refer to them as tropical cyclones. There's actually a fun fact about hurricanes: they are the only weather disaster out there that is given its own name. So they are very special. They are usually 300 miles wide. And the eye of the hurricane, which is usually very calm, in the middle of the hurricane, is 20 to 40 miles across. Most of the action you will see along the eye wall, where you would expect to see very dense clouds and strong winds. Around this part of the world, the hurricane season is between June and November. So we are officially in the hurricane season. And on average, 40% of hurricanes that occur in the United States hit Florida. So hurricanes are measured based on the Saffir-Simpson scale for wind speed. And they are only considered major when they are category three, four, or five, where the wind speed can reach up to 160 miles per hour. Another interesting aspect of hurricanes is that their speed does matter. You often find a slow-moving hurricane causing a lot more rainfall and damage than a fast-moving and more powerful one. And statistically speaking, the deadliest US hurricane on record is that of September 8, 1900, in Galveston, Texas. It was a category four hurricane that actually killed 8,000 people, completely destroying the island with 14-foot waves and 130-mile-per-hour winds. Now, on top of their destructive wind speeds and storm surge impacts, hurricanes are actually capable of unleashing 2.5 trillion gallons of water a day. According to the National Weather Service, torrential rains from hurricanes can actually cause rivers, lakes, and streams to flood over their banks and into the neighborhoods within just minutes. This phenomenon is called freshwater flooding, or we'll also refer to it as inland flooding. Now, what I'm showing to the right there, these are the flooded neighborhoods in Texas in the aftermath of Hurricane Harvey in 2017. In fact, Hurricane Harvey brought in almost 20 trillion gallons of water. So inland flooding can cause catastrophic damage. It can lead to landslides, destruction of crops, and more importantly, loss of lives. Now, we can go to Slido and see what the audience had to say about one of our questions. So on average, how much damage does a flooding event cost? That's actually not the right answer. It's unfortunately 5 billion US dollars. That's the average. So it's pretty sad. And now, going back, having said this, I'm asking a question now. In order to save lives, limit damages, protect infrastructure, and minimize the socioeconomic impacts of flooding caused by hurricanes, can we actually build a system that can predict floods using our ensemble prediction techniques and give people warning signals well ahead of time, such that mitigation actions or evacuation can be performed? Now, to answer this question, I'm going to talk about a case study from last year where Hurricane Ian hit Florida. So a little about the hurricane. Ian was a Category 4 hurricane. It made landfall on September 28. It was statistically the fifth strongest on US record in terms of wind speed. Precipitation exceeded 20 inches in various parts of the state. In fact, the highest total rainfall happened in Grove City just north of Fort Myers, and it was estimated at 27 inches.
Now, unfortunately, around 150 people lost their lives, and damages were estimated at a whopping 112 billion US dollars. And just to put into perspective the devastating impacts of the flooding events that happened, I'm showing here to the left four different pictures. So starting from the top left, we see essentially a bird's eye view of an entire family escaping their flooded neighborhood in Cape Coral with their van. And next to that, we see a completely destroyed bridge connecting Pine Island to Fort Myers. Also, we see some flooded highways in Fort Myers. And then in the bottom left, that's actually from Cuba. One day before landfall happened in Florida, we see a family of three escaping their neighborhood on foot because of the flooding. So having seen all of that, here at NCAR, we have actually built an ensemble prediction system that is mainly designed to predict high flows and flooding. This was done in collaboration between CISL, my own lab here at NCAR, and the Research Applications Laboratory. Now to talk about the system, I'll start by showing you the model, the hydrologic model that we use. So the model is called WRF-Hydro. This is the research component of the National Water Model. And it's actually a community-based system that provides predictions of different major water cycle components, such as precipitation, inundation, snowpack, soil moisture, groundwater, and stream flow. It actually provides reliable stream flow estimates across scales. And by across scales, I'm talking here especially about going from zero-order headwater catchments to continental basins, and timewise going from minutes all the way up to seasons. Now on the left, you see here a simulation of stream flow over the continental United States for the water years 2019 and 2020. And the units are CFS, or cubic feet per second. You will be able to see the changes that are happening over that year. And you can recognize one of the major rivers. Perhaps you see over there the Mississippi. Now from there, in order to model the hurricane and the flooding, we take a cutout of that domain. And we focus on the channel routing module of that domain. And I'm showing here on this map to the right the different rivers or streams. You can think of these as rivers or streams. The different colors signify how much stream flow is on each river. So bluish signifies low flow, whereas when it's yellowish or more reddish, it signifies high stream flow. And think about WRF-Hydro as the brain that is giving me the value of stream flow at every single reach in this domain. Now I'm also showing the major cities that were impacted and also these small red circles. So these red circles indicate the locations of the observations from which I collect stream data. So we have the model. And now I'm going to talk about observations. And you see where I'm getting at here, too. So Jeff talked to us about a prediction model, observations, and how we combine them together. OK, so coming to observations, we collect here stream flow observations from different gauges. And these gauges are provided by the USGS, or United States Geological Survey. USGS is a governmental agency providing services in many disciplines, including geology and hydrology. Now the stream gauges are these metal structures you see in the middle panel. With an antenna usually, they are placed on the side of a bridge or a highway. You've probably seen one while hiking or driving. They measure the amount of water flowing in the river, or the discharge.
The data is usually recorded every 15 minutes or 60 minutes. And in the case of a flooding event, actually, the data can be recorded a lot more frequently. Nationwide, the USGS operates a network of more than 9,000 gauges. In fact, you can go on this website if you scan the QR code and actually check the change of the stream flow at any gauge in the country. And as an example, I'm showing here the state of Colorado, and you can see the location of these gauges. They are mostly on the I-25 corridor and on the western side of the state in the Rockies. In fact, I can zoom in and look at Boulder County. And particularly, I'm seeing here Boulder Creek. So I'm looking at the stream flow data from the last week of May in the bottom left. And you see that beyond May 23rd, we have these daily rises in stream flow. And that's because of the rain events. And actually, this is highly correlated with the humidity that was shown on Jeff's plot. And at this point, we can bring up the other question that we asked the audience about the flooding events that happened in 2013 in Boulder. So how likely is it that we'll see that again? I mean, to be honest with you, I don't know the answer. I really don't know. So I would say somewhat likely is a good answer. In fact, I went to the Boulder County website and looked at the flood risks. And I guess the numbers say that in the next 30 years, there is a 26% chance that we can get another flood. Hopefully not, but who knows. And then going back. OK, so we talked about the model, the observations, and now the data assimilation system. As Jeff mentioned, we are using for this study DART, or the Data Assimilation Research Testbed, which is an ensemble community facility for data assimilation. DART supports different kinds of models. And these models, I'm showing them in this schematic to the right. So the different names are associated with different models. And you see they are distributed according to their functionality in the environment. So we have a large number of models that model the atmosphere and then a big suite for the ocean, in addition to different land models. The animation you see here is a kind of low order model that we use to develop data assimilation research. This is, in fact, called the Lorenz model, or the Lorenz attractor, that has this butterfly effect, if you've heard about that before. It's a pretty chaotic model. And it's very useful for us to develop research. Our software is freely available and distributed on GitHub at github.com/NCAR/DART. OK, now we talked about the model, WRF-Hydro. We talked about the observations, and now the Data Assimilation Research Testbed. We want to combine everything together and build our ensemble prediction system for streamflow. And this is what I'm calling here HydroDART. So HydroDART was first built and tested for the flooding event in North Carolina in 2018 due to Hurricane Florence. For this study, we are using 80 ensemble realizations. So these are the different realizations that Jeff was telling us about. I'm showing 20 of them here. Now these can be sampled in many different ways. You can sample them from the model climatology, and think about climatology as a long historical run of the model. We assimilate data from these stream gauges that we saw every hour. And there are 200 of them in the domain. Now the idea is to see whether our prediction system is able to outperform the model on its own in predicting these flooding events.
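To make the "combine the ensemble with the gauges" step concrete, here is a toy Python sketch in the spirit of the ensemble adjustment HydroDART performs, not the production code, and with invented flow values and errors: a small ensemble of streamflow at one gauged reach and one correlated downstream reach, where the gauge observation corrects the mean at both reaches through statistics computed from the ensemble itself.

```python
import random
import statistics  # statistics.covariance requires Python 3.10 or newer

random.seed(1)
N = 80  # ensemble size, matching the 80 realizations mentioned above

# Toy ensemble of streamflow (cfs) at a gauged reach and a correlated downstream
# reach; in the real system these values come from WRF-Hydro forced by rainfall.
gauged, downstream = [], []
for _ in range(N):
    pulse = random.gauss(900.0, 250.0)                 # uncertain flood pulse
    gauged.append(pulse)
    downstream.append(0.8 * pulse + random.gauss(0.0, 60.0))

obs, obs_err = 1250.0, 50.0                            # gauge reading and its error

# Kalman-style gains built from the ensemble statistics.
var_g = statistics.variance(gauged)
gain_gauged = var_g / (var_g + obs_err ** 2)
gain_down = statistics.covariance(gauged, downstream) / (var_g + obs_err ** 2)

innovation = obs - statistics.mean(gauged)
print(f"gauged reach mean:     {statistics.mean(gauged):7.1f} -> "
      f"{statistics.mean(gauged) + gain_gauged * innovation:7.1f} cfs")
print(f"downstream reach mean: {statistics.mean(downstream):7.1f} -> "
      f"{statistics.mean(downstream) + gain_down * innovation:7.1f} cfs")
```

The inflation and localization ideas described next act on exactly these ensemble statistics: inflation widens the ensemble spread before the update, and localization decides which reaches a given gauge is allowed to correct.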
Now the prediction system, HydroDART, utilizes really advanced and state-of-the-art ensemble techniques and uses novel inflation and localization methods for enhanced performance. Inflation: so this, you can think about it as the stream network that I'm looking at. Inflation is used to enhance the model's representation of uncertainty in the ensemble. It's applied right prior to the update. And with it, we actually increase the ensemble standard deviation, or the variability of the ensemble, at each point or at each river in the domain. Localization, on the other hand, is used to limit the impact of the gauges to only nearby streams. Think about it this way: the weather here should not be impacting the weather in London. They are too far away. The same thing here, but this is more interesting because we refer to the method as along-the-stream localization, such that only streams that are upstream and downstream from the gauge are being updated. And because of that, we end up with these interesting, intriguing, tree-like shapes that we see. Another interesting fact about this method is that it removes any correlation between the gauge and streams if they are in different catchments or in different watersheds, as you can see in the inset of the figure there. So with that, I would like to show you some results about the flooding that happened in Florida. So I'm gonna show you different time series of stream flow change over time, or as we call them, hydrographs, at three different locations. So we will start at the eastern part of the state in Deer Park. What I'm plotting here over time, starting from September 15th to October 14th, is the change, or the observed stream flow, that is given by the gauge. So you'll see that the major event is happening around September 29th or September 28th. Now on top of that, I'm gonna overlay the estimates suggested by the model. So this is if I run the model on its own. Now overall, you'll see that the model is generally in agreement with the observed stream flow. Yet one could argue that the flooding period is a bit tighter and the peak of the flood is quite underestimated compared to the observed stream flow. On the legend you see over there, OL is open loop; this is how we refer to running the model on its own. And RMSE is root mean square error, or think about it as the error or mismatch between the estimate and the observations. And the number is just the average of that error or mismatch over the entire simulation period. Now I can go ahead and show what my ensemble prediction system is giving me. So I'm gonna show first the priors, or the forecast estimates, from the data assimilation system. You'll see that these gray members are my ensemble, my prior ensemble members, and the black one is the ensemble mean. You'll see that there is a big improvement in the sense of the flooding period. And when I overlay the analysis estimates, or the posterior estimates, we see that there's clearly a big enhancement over the prediction skill suggested by the model, especially for the falling limb of the hydrograph. Now looking at a different location, we go to Parrish now. Again, this is the observed stream flow. This is what the model is suggesting. You'll see again the model is peaking quite early and the recession in the hydrograph happens within one or two days. The priors, as you can see here, are showing much enhanced performance. They are looking more like the observations.
And when I look at the analysis estimates, or the posterior, they are essentially entirely resolving the observed stream flow. And finally, at Fort Lonesome, we're looking again at the observations. And this is what the model is showing me. So unlike the previous two cases, you can think about this as kind of a false alarm scenario. So the model is suggesting a big flood, but the observation is saying, no, I'm not flooding. The magnitude of the peak is almost three to four times the observed stream flow. This is what I get from my prediction system for the priors, or the forecast. And this is what I get for the posterior. So what that means is that my ensemble prediction system is able to accurately estimate the stream flow, the observed stream flow, suggesting big enhancements, big improvements, compared to the model estimates. Now, we wanted to see what would happen if we summarize the estimates from all of the gauges in the domain. So instead of looking at certain locations, we look at the 200 gauges. And this is what I'm showing in this plot. So these are box plots. Essentially, I have the errors that we saw. They are summarized from 200 different locations in this box plot. So the error is on the y-axis. Red is for the model and the blue is for my ensemble prediction system. So overall, if you look at the entire flooding period, we have a significant improvement in the prediction skill of the data assimilation system compared to the model. On average, if we look at these numbers, I think the data assimilation system is suggesting around a 65% enhancement in prediction skill. Now, if I go ahead and break that number into high flow and low flow periods, one could argue that the model is actually doing a decent job during low flow periods. But when the flood intensifies, you'll see that this is where it becomes very important to start assimilating data in order to improve the estimate and increase the accuracy of the model. And one last plot on this application. So far we've been looking at the estimates that are available every hour. So after one hour, we're looking at a lead time of one hour. I guess what's more interesting is, starting from the priors and posteriors given by my data assimilation system, what would happen if I run those many hours ahead? So you'll see that the red is coming from my data assimilation system and the blue is the model. You'll see that the gains, or the improvements, suggested by the data assimilation system are sustained after five, 10, and up to 18 hours compared to the model. Now, what that suggests is that HydroDART is capable of providing more accurate estimates compared to the model, 18 hours in advance. Now, this is significant and crucial, especially when each minute could matter, when each minute in these kinds of circumstances could mean life or death. And with that, I'm switching applications now from hurricanes, and I'll talk to you about the Red Sea Initiative. So a few years ago, our team initiated a multi-year collaboration between NCAR, KAUST, or the King Abdullah University of Science and Technology in Saudi Arabia, and the Scripps Institution of Oceanography in San Diego. So the goal of the project is to build an advanced high-resolution regional coupled atmosphere ocean biogeochemical forecasting system for the Red Sea. For this, we build a full earth system here. So we're using WRF, or the Weather Research and Forecasting Model, to look at the atmosphere, and the MIT General Circulation Model to look at the physics of the ocean.
And EMBLING, an ecosystem model that looks at the different interactions of microorganisms and how nutrients cycle in the ocean, on top of a wave model. Again, we use our in-house data assimilation system, DART, to assimilate all sorts of different data, from the atmosphere, including radiances, into the ocean. We assimilate physical data that's coming from drifters, in addition to satellite ocean color. And the whole idea is essentially to build a system that can help us identify, for instance, if there's an oil spill, how can we track it? Just like the cases we have in the Gulf of Mexico here in the US. So if there is an oil spill, it can cost you a lot of money. You need to track it and then predict where that spill is going. So talking about this system, I guess the motivation for doing atmospheric data assimilation on top of ocean data assimilation is quite interesting. Because as for the weather in the Middle East, or especially in Saudi Arabia, it's just warm. I mean, nothing interesting, right? Yet certain regions in the area actually suffer from rare yet very extreme rainfall events that actually cause massive flooding on the eastern shores of the Red Sea. It causes flooding in cities like Jeddah. Actually, I, for one, experienced this flooding while I was living at KAUST. In fact, my apartment was completely flooded, and the ceilings literally fell off with the incoming rain water. So I guess we are interested in finding a way of predicting these rare events with high accuracy. So to do that, again, we use DART. We assimilate atmospheric data, including radiances, aircraft data, and balloons, or radiosondes, as you can see in the middle panel, every six hours. On the right, you see there that I'm showing you the ensemble means of specific humidity from a nested atmospheric domain. So we have a lower resolution, wider domain on the top, and a higher resolution, zoomed-in domain at the bottom. And here on the bottom left, I'm showing you typical data assimilation increments that we get for temperature when we utilize the system. Now, as far as the ocean data assimilation component goes, the motivation, like, why do we have to do this? Well, we have to look after the Red Sea. Now, the Red Sea is home to the second largest coral reef system in the world, and thus monitoring the seasonal and decadal variability of its ecosystem indicators is of significant societal benefit. We carry out ensemble data assimilation again. But for this kind of work, we use very, very high, ultra-high-resolution models, especially here. So if you look at the ocean, this is not the typical resolution that you see for these models. But because the Red Sea is kind of small, we're able to use a very high resolution such that you can resolve many different aspects and eddies in the Red Sea. And because of this high resolution, this actually poses kind of a software issue; we had to think smartly and improve our data assimilation software to be able to afford these high-resolution models. And with that, I would like to summarize. So today, Jeff and I talked to you about exciting research we conduct here at NCAR. We strongly believe that data assimilation is a crucial tool for predicting events that affect our daily life. The work that we do is ensemble prediction. So ensemble prediction systems are based on probabilistic forecasts. They combine several estimates, or several realizations, of the model with observations.
They can be used to predict the weather, the circulation in the ocean, and different interactions between the land, the ecosystem, and flooding for hydrology. Even the pandemic. In fact, for the last two years, we actually worked with several collaborators and an intern student here at NCAR. And we actually interfaced DART with an epidemiological model. So we used different infection data to try to predict different waves of the COVID-19 virus, which is quite interesting. Other applications include forecasting, reanalysis, model improvement through parameter estimation, predictability studies, and observation design. And speaking of observation design, last year we started working with USGS on designing strategies for placement of their new next-generation smart gauges in the Delaware River basin. So with that, if you find what we talked to you about today exciting and you have an interesting project, please get in touch with us. We would like to work with you. You can reach us at our DART email, dart@ucar.edu. Or you can check out our web page, dart.ucar.edu. And with that, I'd like to stop. Thank you for listening, and we'll take any questions. We'll take some questions from the audience first. If anyone has a question, please raise your hand. And we do have a mic. So if you'd just raise your hand, I'll walk the mic over too so people online can hear the question as well. We also do have a couple online, so we can start with them first. Oh, he has a question. So one really exciting aspect with forecasting is also being able to partition uncertainty. So in the context of HydroDART and the Red Sea applications, have you guys been able to partition the sources of uncertainty within your model, the prediction framework, and find out the uncertainty source that contributes most to your forecasts and how that could be used to improve the forecast moving forward? That's an excellent question. Do you mind if I take that? You definitely should take that. So as far as the hydrologic model goes, I guess the major source of uncertainty that we found is the forcing. So essentially how precipitation is coming in; you have water coming to the channels from overland and from the subsurface. So when you don't have a good estimate of this forcing and how much rain you are getting, your forecast wouldn't be that great. The other issue, which is less severe than the forcing, has to do with how complicated or how complex the geometric representation of the river is. So we kind of consider a compound channel. But in real life, these channels could be a lot more complicated, a lot more complex. So we have a simplified representation of these channels in our model. And these could also add to the uncertainties in the model. I'm going to take a question online from Erin. Definitely no relation. How do you select which weather model to use in your ensemble predictions? Are they based on different input parameters or are they completely different models? Sure, I'll take that one. So we've already seen some examples here of some different models. Moha talked about a model that was just being used over the Red Sea, so at very fine scales and not a global model. We have other models where a global prediction might be important. So if you want to know what's going on around the northern hemisphere for 10 days in advance, you might need a global model. There are models that do specific phenomena in the atmosphere.
For instance, if we were interested in pollution in the atmosphere and not just basic weather, maybe predicting the smoke along the east coast today, we would need a model that specifically knows how to move smoke around and things like that. So the choice of model depends both on the phenomena that are of specific interest, since different models handle different phenomena, and on the scales of those phenomena. There are models made to run on very large scales, so globally, and models made to run on very small scales. I think you're down to 100 meters or something on the Red Sea, so those grid points are literally close enough that you can see the next grid point over there. For the global example, I didn't show you, but I had a slide where the next grid point over was out in Kansas. So these are the things that kind of determine what you need. Excellent. Any more in-person questions right now? So you mentioned the uncertainty within observation data. Do we have a sense of how uncertain each type of instrument is, and is that put into the model uncertainty? You should take that one. OK, I can take it. So again, excellent question. I guess different instruments have different errors. For the case of the atmosphere, for instance, some of the data we get, like GPS data, is usually very, very certain. We trust it a lot. Other kinds of data, like the wind speeds we observe from satellites, are not that accurate. So it really depends on the instrument. Do you want to add something to that, Jeff? As Moha mentioned on his last slide, on studying predictability and observation design, it turns out that there are ways to extend the methods we have here so they will actually estimate the errors in the observations too as you're doing this process. We have not done a lot of that in DART with the applications we have. There are people here at NCAR who have done research on that, and you could easily incorporate that into DART. In many of the applications we have, that uncertainty in the observations is what we would call sort of a second-order error. We work on a lot of cutting-edge models where people don't know how bad or good their models are. And one of the things I've learned doing this for 25 years now, with about 50 different major collaborations, is that every modeler who walks in is vastly overconfident about how good their model is, and every observationist who walks in is vastly overconfident about how good their observations are. So there is a need to evaluate these things, and it can be automated if you really need to do that. You had talked about 40% of hurricanes hitting the coast of Florida. So with our weather changing and the modeling, where do we see the future of the hurricanes hitting? Is that changing? Is that staying the same? And then the second part of the question: for the other 60%, the same thing. Do we see a change in the modeling of where the other 60% of hurricanes hit? So I guess we're constantly advancing our models. Now, can we use our model to change the track of a hurricane? We can't do that. It's just that we can learn from all of these hurricanes, from the history of these hurricanes, about how we can improve our models, improve the estimates, and know how accurate our models are. Jeff, can you add something to that? Yes, I'll just follow up too. So your question is on sort of longer time scales, about how these things are changing as the climate is changing, I believe.
And what I will answer is that there are people at NCAR who have lots of expertise on that. They sit in the part of the building behind us, in the Climate and Global Dynamics division, and we are not them. We follow their research and we're very interested in it. Sometimes our tools can be helpful for them, but I would say a good request going forward might be to see if we can get a CGD speaker to talk about climate change and things like this. I want to bring up a question online from Elder that has to do with your hurricane theme. Are you familiar with the hurricane model? And if so, how well are we doing with that? For example, the forecasting cone. The hurricane model... I'm not so familiar with the hurricane model. I haven't used it myself. I know people who've used it. Essentially I'm more interested in the impacts of the hurricane, such as flooding. But there are different atmospheric models that actually model hurricanes. And again, talking about the models themselves and how accurate they are, I guess I'm not the scientist to answer that question. Very similar to the previous one. I'll just follow up quickly, too. The operational prediction, research, and other work is done by parts of NOAA. We support that in many ways. One of the key things is that they do ensemble forecasts, and the reference to the forecasting cone here is about forecast uncertainty. So there's a lot of interaction between the research we do and the application processes that they're doing. The bottom line is that it is very, very hard to improve models of anything in the atmosphere, and I think NOAA has been making consistent progress on doing that through a number of innovative things, and that continues to be the case. I would run the mic over, but I can't. So, I'll see if I can phrase this coherently. For decades now, meteorologists and the public alike have gotten these deterministic forecasts. I open my phone, I get the exact high for today, even though we don't actually know it. So how do we adjust both the professionals' and the public's perceptions and workflows to deal with this deluge of all this new data, if we can even handle the amount of data, right? If we drop the dynamical core, we can have ensemble sizes in the thousands. So how does the meteorologist process all that data, or how do they even get that data? With GRIB, it doesn't work. With NetCDF, it's not gonna work, because it's just too much data, right? So how do we deal with the big data aspect, and then how do we deal with that final mile of communication? I don't know, it's kind of two big questions, but. Moha, go ahead on this one. Yeah, so I'm pretty sure Jeff will have a better answer than me, but what I would like to say here is that with more data, we need bigger machines. So remember, Jeff at the beginning said computers can handle this, but at the end of the day, computers are limited as well. With more computational power, we will be able to handle a lot more data and even more frequent data. In fact, some of the new instruments can actually give us data every 30 seconds. Now, if we are able to assimilate every 30 seconds, then, I mean, we pretty much solve the problem, right? I don't think we would ever get a wrong forecast. But that requires big machines. And, you know, big machines means a lot of money. A lot of money means, you know, yeah, the government needs to support that.
I'm gonna take a very different approach to answering that question. Perfectly reasonable answer, Moha. But my feeling is this: the general public is not capable of dealing with this amount of information, and many members of the general public are not capable of dealing with the idea of probability distributions and other things. That's unfortunate, in my view. We should be teaching lots more statistics in high school, but that's a different story. But what we do have, and we're seeing it all over the press right now, is that there are vast amounts of data out there and there are increasingly sophisticated techniques to mine and produce information from them. My vision, and I'm not a machine learning person, although the machine learning people at NCAR do collaborate with us and they're doing great work, is that the real answer in 10 years is gonna be this. I'm gonna go online and say, what's the probability there's gonna be ice at my location right now? And it's gonna mine the data and it's gonna compare it to observations that have previously existed to deal with model error, which we didn't talk much about today, or whatever else. And it's gonna come back and give me an answer in a form that I'm probably gonna be able to configure. For me, it's gonna come back and say 23.2%, and I'm gonna interpret that. For someone from the general public, it's gonna come back and say one in four or something like that. I really think we're heading to a world where we have the capability to do real-time, tailored products like that. The fact remains that there's this idea of calibration and validation, and AI can do part of that calibration. At the end of the day, validation, making sure that forecasts do what you think they do, is still something that I think is an individual responsibility. We all see weather forecasts from our favorite source, and then we have something in the back of our minds about what it means: when I see a forecast that says 70% chance of thundershowers this afternoon in Boulder, what does that really mean to me? It means something different to everyone in the room, and that meaning adapts as you go through time. So I see a synergy between artificial intelligence approaches for understanding these things and people's own individual intelligence approaches moving forward. To piggyback on that question, we have something from Jenna online, and this is for Moha. It says, you mentioned cost constraints regarding modeling, and is there a cost to run models? Did I mention cost constraints regarding models? Is there a cost to run models? Well, I mean, we're obviously paying a computational cost here. Running a forecast requires me to build a mathematical model and put it through the computer, and there's a price to pay when you are running a model. The more ensemble members you need to run, the more expensive it gets in terms of computational complexity, and if I don't have a big enough computer, then I won't be able to run a large ensemble. Now, there are different models. If I run a simplified model, like one of the toy models that I showed on my screen, that is very, very easy; that's actually three variables, going back to what Jeff was talking about. I can easily run a three-variable model, but when I run a hydrologic model that has thousands and thousands of variables, it's definitely gonna be slower than the toy model that I had.
And obviously, if you're running a full Earth prediction system, if you're combining an atmospheric model with an ocean model with an ecosystem model and you wanna combine all of that together and run it at the same time, that would be super expensive. And the resolution here plays a big role. As we said, if my boxes in the model are one meter apart compared to 10 kilometers apart, it's obviously gonna be more expensive. Yeah, the lights went out. That's what they used to do when we gave data assimilation talks in the 90s. Isn't it closing time? What is happening? I find it interesting you were talking about predictions in the US, but also in developing countries with the Red Sea example. And I'm curious, one of the problems in developing countries, particularly in Africa, is low-density data networks, both spatial and temporal. And I'm just wondering, if you were advising, say, the World Meteorological Organization on how much the density needs to improve to make a reasonable forecast, depending on what kind of reliability you want, could you do that based on what you know? So I guess this is somehow related to the observation design I have on that slide. Actually, using our techniques, we can study that. You can come to me and tell me, okay, I have a new instrument, or I have a drifter: where should I put it in the ocean? Or, should I bring 10 of them or 100 of them? What is the cutoff point where, if I add more, I won't get any more improvement? You can do these kinds of things with ensemble prediction systems, with the data assimilation algorithms that we talked about. But then at a certain point it becomes quite hard, because it becomes a combinatorial problem: I can place this instrument here and see what my accuracy is, and then say, okay, if I add another instrument here, what does my accuracy become? But then you can also ask, if I change the position of the first instrument, is that gonna change the solution? So it becomes a very big problem. But we can do some of these kinds of things. And of course, the other part of the problem is, again, going back to cost: this would cost money. Flying planes, putting satellites out in space, this is all not cheap. Do you wanna add anything, Jeff? Does that answer the question? We have a question from Tallulah online, and I'm sure they thought I was ignoring this question, but it's very important. Can you use ensemble prediction models to forecast whether the Nuggets will win the NBA finals? Yes. This is all you. Yes, I'll take it first. Well, you go ahead first, because you're the Nuggets fan, and then I'll finish it off. Yes, and I already ran the models. We are winning four-one. We're ending the series four-one, so we're good. So, like we said, if you have a model that tells you what you think is gonna happen in the future, you can. In particular, we had a postdoc in one of NCAR's flagship postdoc programs, named Chris Riedel, who uses similar techniques. He is a huge NCAA fan, and so he does basically all the NCAA basketball games throughout the season, and has been for years. I keep telling him he should do this for money and endow us, but he's more interested in science. But yes, you can certainly do that. Who's gonna win? Nuggets, four-one, I mean. There's no question about it. Yeah, do you have the score? Yeah. Okay, don't tell anyone. But with high probability, the Nuggets are gonna win. We just heard that, so we won't give you the score.
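Going back to the computational cost question from a moment ago, here is a hedged back-of-envelope sketch of why ensemble size and grid resolution drive the cost. The scaling constants below are assumptions chosen only for illustration, not benchmarks of any real model.

```python
# Rough cost scaling for an ensemble forecast: more members, finer grids, and
# shorter time steps all multiply together. Constants are illustrative only.
def relative_cost(n_members, grid_spacing_km, n_levels=50, forecast_days=5):
    horizontal_points = (10_000.0 / grid_spacing_km) ** 2  # points grow as 1/spacing^2
    time_steps = forecast_days * 100.0 / grid_spacing_km   # finer grids need shorter steps
    return n_members * horizontal_points * n_levels * time_steps

baseline = relative_cost(n_members=1, grid_spacing_km=10)
print("80 members at 10 km vs. 1 member at 10 km:",
      relative_cost(80, 10) / baseline)   # 80x: cost is linear in ensemble size
print("1 member at 1 km vs. 1 member at 10 km:",
      relative_cost(1, 1) / baseline)     # ~1000x: resolution is far more punishing
```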
I have a question for y'all. If there are any students listening or watching this event right now, what advice do you have for them if they want to be scientists like you? If they want to do the work you do, what advice do you have for them? Jeff, you've, I guess, supervised and worked with a lot more students than me over the years, so you'd definitely have better advice than me. I am, however, very old, so I lose track of my own stuff. So I think the real thing is a passion for the actual processes you wanna study. As for having the right math, having the right physics and everything else, if you have enough passion, you can work around that, even very late in your career. We've worked with some students who came in and we would call them completely unqualified based on what they'd done in high school and undergraduate work, but they have a real interest. If you have a burning passion to do this stuff, you can learn all the math you would have learned in high school and four years of college by working hard over a year, if you have the passion and everything else. So I think the real thing, if you wanna get into this field, is to have a passion for what you're doing. Yes, and it certainly helps with everything else to take mathematics. In particular, my feeling, and I mentioned this a moment ago, is that statistics is extremely valuable in so many ways, and that's the place where many, many of our students fall short: they come in very strong in physics, very strong in what we would call differential equations and other things, and in many cases very, very weak in statistics. I think people who are strong in statistics, which sometimes does not involve quite as much complexity in the math, can be really competitive in these fields. But again, I think the real thing is, if you wanna do this, find people at a university, or even at your high school; there are opportunities out there to find people who are interested in mentoring you so you can start doing this stuff. Yeah, I mean, definitely talk to people. Talking to people is key for me. I used to be shy until I talked to Jeff. Any more questions? I get a lot of questions from the public. I do the public-facing thing, so I get to pass those questions on to you all. So when you're looking at the forecasting models, or when you're forecasting something and you're looking at the previous observations and conditions that go into that forecast, how far back do you go? How much data do you look at? I'll take it for a minute. So that's a super interesting question. I didn't go into it very closely, but you saw a hint of it in the ball example: data assimilation is actually cyclic in the way that it's done. So I had this guess of what was going on when I threw it, and I moved that forward in the model and used the observation at half a second, and then I moved that forward and used the observation at one second, and so on. It turns out that in real weather prediction, that basically never stops. The leading weather prediction centers that do this for real, say the National Centers for Environmental Prediction in the US, the European Centre for Medium-Range Weather Forecasts, the UK Met Office, their systems have technically used observations over decades in one long stream, doing this through time. Now, the value of those old observations disappears.
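A toy sketch of that cycling, one scalar variable with made-up numbers and a stand-in "model" that looks nothing like a real forecast system, shows how that works: after enough cycles, the estimate no longer depends on where it started, just as it gradually stops depending on very old observations.

```python
# Cyclic forecast/update loop for one scalar variable. Hypothetical throughout;
# real systems cycle full weather or ocean models, not a one-line stand-in.
import numpy as np

def run_cycles(start_value, observations, obs_error_var=1.0, n_members=20, seed=3):
    rng = np.random.default_rng(seed)  # same noise sequence for every run below
    ensemble = start_value + rng.standard_normal(n_members)
    for obs in observations:
        # Forecast step: a trivial stand-in "model" with a little noise so the
        # ensemble keeps some spread from cycle to cycle.
        ensemble = 0.9 * ensemble + 0.8 * rng.standard_normal(n_members)
        # Update step: the same scalar ensemble adjustment as the earlier sketch.
        mean, var = ensemble.mean(), ensemble.var(ddof=1)
        gain = var / (var + obs_error_var)
        new_mean = mean + gain * (obs - mean)
        ensemble = new_mean + np.sqrt(1.0 - gain) * (ensemble - mean)
    return ensemble.mean()

obs_stream = 2.0 + np.random.default_rng(7).standard_normal(40)

# Two wildly different first guesses end up in essentially the same place.
print(f"starting from +10: {run_cycles(+10.0, obs_stream):.2f}")
print(f"starting from -10: {run_cycles(-10.0, obs_stream):.2f}")
```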
The term forgetting is actually a formal statistical term for how that happens: these systems forget the past. But in these methods, there's no need to say, I'm not gonna pay attention to that; you just let it forget on its own and keep going. And so that's kind of how it works. It's kind of magic that you just keep going and going. Did you wanna add anything? No? Perfect. Well, I just wanna take a second to thank both of you for being here and for sharing all about ensemble data assimilation. And I also wanna thank all of y'all for being our first three-dimensional audience that we have had in this room. This has been our first public event, and I know I said this before, since 2019. You can do the math. It's been a minute. And I really, really also want to invite you all again to our next event on August 2nd. It is going to be about how we determine when to evacuate during wildfires and that decision-making process. So if you're interested in more NCAR Explorers Series events, definitely check out our website for upcoming lectures and conversations, as well as to view recordings of past events. We have had events on tropical cyclogenesis in Spanish. We have looked at wildfires and the Sahara Desert. We have looked at data visualization. So there is a wealth of knowledge on that website. And I have to do the spiel about the survey, so if I could get the QR code and the screen for the survey. Nope, not yet. Yeah, I'll just talk slower. It will appear; I trust in them. If you're 18 years or older, please take a moment to fill out our three-to-five-minute anonymous survey to help us better understand the impact of the program and how we can improve our next event. The survey will close on Monday, June 12th. You can find the survey using this QR code. You can also ask one of us if you would like to take the survey using one of our iPads. I really hope to see y'all next time. And let's just take a second to give our speakers a hand, because they did so great breaking down that concept. Thank you so much, everyone. I hope to see y'all again. Have a lovely rest of your day. Thank you for coming.