I'm very delighted to introduce our next lecturer. Andrea Molod is a civil servant and research scientist in the Global Modeling and Assimilation Office (GMAO) at the NASA Goddard Space Flight Center. Her research focus is on land, ocean surface, and boundary layer interactions, and recently she has been leading the GMAO efforts to release the GEOS model. Welcome, and we're very much looking forward to your talk, Andrea.

Hi. Thank you very much to the conveners for inviting me to speak here. So far the talks have been incredible, and I'm looking forward to lots of interaction. Let me share my screen. Are we good?

Yes, we can see it. It's not yet in full screen mode... now it is.

Thanks, great. So I am now the lead of the GMAO Seasonal Prediction Development Group, and here's a list of all the different people inside GMAO who have contributed in one way or another to what I'm going to talk to you about today. GEOS is the general name of GMAO's Global Earth Observing System model, and the GEOS S2S system version 3 is our latest release. Briefly, I'm going to give you my take, which will repeat some of what others have said, on the scientific basis and history of subseasonal and seasonal prediction. I'm going to talk about the back and forth between coupled models and coupled data assimilation, and the motivations; some of this contains the justification for why we choose to run a full Earth system model for the subseasonal. Then I'm going to get you a little bit into the weeds of what our GMAO seasonal prediction model and data assimilation look like, though these are details that every system out there makes different decisions about.

In terms of the scientific basis, we know that the climate system is a forced, dissipative, nonlinear dynamical system, and because it's chaotic there's a finite limit to weather predictability. However, and this however is a big piece of this:
the tropical flow in particular is so strongly determined by the underlying SST that it shows little sensitivity to changes in the initial conditions. We've also got the idea that the ocean and the land evolve more slowly, and so this extends predictability as well. Basically, if we can predict the SST, we should be able to predict the large-scale seasonal tropical circulation, and the subseasonal as well. And there are early studies showing that the SST actually depends not so much on the initial state but on the overlying atmosphere. The key to this whole thing is the signal-to-noise ratio, as others have mentioned, and that's where this potential predictability comes from.

A little more on the scientific basis at seasonal scales: ENSO is the biggest driver, but there are other low-frequency modes as well. The Atlantic SST is important, and the Atlantic meridional mode is a strong source of low-frequency variability in the Atlantic. In the extratropics it's mostly the canonical teleconnection patterns; this is one version of them that you can find on a lot of websites, and the CDC is where I got this one. Then here's another diagram, similar to the one that Frederick showed, illustrating the idea that on the weather timescale predictability drops off quickly after 10 days or a bit more, but if you start taking weekly averages, for instance, on subseasonal scales you get a little bump in predictability before it drops off again, and on seasonal scales it's monthly means, etc.

So, a little bit of my take on the history of seasonal prediction. Starting in the late 1800s, the Indian Meteorological Department began to statistically forecast monsoon rainfall based on the Himalayan snowfall in the previous winter. Then came Walker's work on the Walker circulation and the Southern, Pacific, and North Atlantic oscillations, again statistical models to predict seasonal-scale precipitation.
In the wake of the Dust Bowl came some of the early funding, with Rossby and Namias and others working on wave propagation; that's where the teleconnections to the extratropics start to come into play. The El Nino work started around the International Geophysical Year: people noticed the seabird die-offs off Peru, and from those measurements they started to get the idea that this was connected to some kind of larger-scale phenomenon. In the 70s the Southern Oscillation, El Nino, and the teleconnections all started to pick up; the 1972-73 El Nino had huge economic implications, and hence the importance of being able to predict these things. However, 10 years later, as it says here, the biggest El Nino of the century at the time was going on, and there was still no real consensus that it was even happening. Then came some of the early modeling efforts, and the research in the 90s focused on tropical SSTs. These days dynamical prediction systems are operational: there are many global producing centers across the world, and a few different multi-model ensembles as well.

One note here about what I'm calling the practicalities. The big question is what is predictable at seasonal or subseasonal lead times: time averages, spatial averages, probabilistic measures. We need ensemble forecasts, and we have to assess the reliability, which is the connection between your ensemble spread and your error. You want to know that if your ensemble spread is small, you can trust that answer much better than if it's wide. Calibration is an issue, the simplest form of which is a bias removal, but calibrated forecasts are more reliable in this sense. Multi-model ensembles help; I'm not convinced that I understand, or that others understand, exactly why multi-model ensembles help, but they reduce overconfidence, which is when I have a small ensemble spread but I'm going to make a big mistake.
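The spread-error reliability idea and the simple bias-removal calibration just mentioned can be illustrated with a small synthetic hindcast. This is a sketch with invented numbers, not output from any real system:

```python
import numpy as np

rng = np.random.default_rng(0)
n_fcst, n_ens = 500, 20

# Invented "truth" and an ensemble with a constant model bias plus noise.
truth = rng.normal(0.0, 1.0, n_fcst)
bias = 0.7
ens = truth[:, None] + bias + rng.normal(0.0, 0.5, (n_fcst, n_ens))

ens_mean = ens.mean(axis=1)
spread = ens.std(axis=1, ddof=1).mean()               # mean ensemble spread
rmse_raw = np.sqrt(np.mean((ens_mean - truth) ** 2))  # error of raw ensemble mean

# Simplest calibration: subtract the mean bias estimated from the hindcasts.
ens_cal = ens - (ens_mean - truth).mean()
rmse_cal = np.sqrt(np.mean((ens_cal.mean(axis=1) - truth) ** 2))

# For a reliable ensemble, spread and ensemble-mean RMSE should be comparable;
# calibration removes the error component that the spread never accounted for.
print(f"spread={spread:.2f}  rmse_raw={rmse_raw:.2f}  rmse_cal={rmse_cal:.2f}")
```

After the bias removal, the ensemble-mean error drops toward the noise level, so spread and error line up much better, which is exactly the reliability property described above.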
In general, this little diagram illustrates that the longer the lead time, the longer the averages we need: for weather we're talking about instantaneous fields, by subseasonal we're talking about week-long averages, and for seasonal we need to look at monthly means. This is all about signal to noise.

So, a few ideas here about coupled models and coupled DA. The first question, from my point of view, is why we need coupled models to do subseasonal or seasonal prediction, and here's an example: a very high resolution simulation with the GEOS atmosphere coupled to the MITgcm. Here are the specs for the resolution we ran this at. Strobach et al. (2019) published a paper about a feedback mechanism, a three-to-five-day oscillation, so very relevant for subseasonal timescales, that is an interaction between the ocean surface and the overlying atmosphere. On the left is a diagram of the lag correlation, and on the right is the basic physics of the mechanism: a positive SST anomaly increases the instability and adds buoyancy, but that mixing drags high-momentum air down to the surface and accelerates the winds; higher winds give you more outgoing surface heat flux, which tends to cool; cooler is more stable, which reduces the wind and reduces the upward sensible heat flux, and this goes around and around. The signature of this in the lag correlation plot is the blue curve from the coupled model, showing negative correlations at negative lead and positive correlations at positive lead, which is suggestive of an oscillation. If you look at the green curve, from MERRA-2 itself, there's a hint of it but it's not really cooking, and if you look at the red curve, which is the same model but an atmosphere-only version, you don't see any sign of this oscillation. So you need the coupled model to capture these interactions even at short timescales. Why coupled DA?
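Before turning to that question, a quick aside on the lag-correlation diagnostic just described: the oscillation signature (negative correlation at negative lead, positive at positive lead) can be reproduced with two idealized series a quarter period apart. Everything here is synthetic, purely for illustration:

```python
import numpy as np

def lag_corr(x, y, lag):
    """Correlation of x(t) with y(t + lag); positive lag means y follows x."""
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    return np.corrcoef(a, b)[0, 1]

# Idealized hourly series: a ~4-day oscillation in SST, with the wind
# response trailing it by a quarter period (no real data involved).
hours = np.arange(5000)
period = 96.0
sst = np.sin(2 * np.pi * hours / period)
wind = np.sin(2 * np.pi * (hours - period / 4) / period)

print(lag_corr(sst, wind, 24))    # strongly positive at +24 h lead
print(lag_corr(sst, wind, -24))   # strongly negative at -24 h lead
```

The antisymmetric shape of the lag-correlation curve, rather than a single peak at zero lag, is what distinguishes a genuine two-way oscillation from simple simultaneous forcing.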
We have well-established systems for DA in each fluid, but we're not making the best use of some of the satellite observations that are sensitive to both fluids: we use altimetry for the ocean DA but not so much for the atmosphere, and we use some of the scatterometers for the atmosphere but not so much for the ocean. The idea is that if we were doing this coupled, we could make much better use of the data. Other motivations for coupled DA: if you're running a coupled model you've got to initialize it from some spun-up state, there's physical consistency, and the errors in the atmosphere may be highly correlated with errors in the ocean, for instance.

Since I'm talking about coupled data assimilation, I want to give a little sense of what I'm calling the different flavors of coupled DA. Totally uncoupled would be each system, the atmosphere, the land, the sea ice, the ocean, running an independent data assimilation, driving an independent model, making the first guess for the next independent DA, so everything is uncoupled. In what's called weakly coupled, the essential coupling is through the model itself: the data assimilation runs independently for each system, but those analyses are used to drive a coupled model in which all of the components interact with each other, and that coupled model creates the initial states for the next atmospheric DA or land DA.
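As a toy illustration of that cycling idea, here is a deliberately oversimplified sketch. Everything in it is invented: a scalar relaxation stands in for the coupled model, the observations are constant, and the "analysis" is just a weighted mean of the observation-minus-forecast differences. The point is only the shape of the loop: the model makes the first guess, the analysis computes an increment, and the window is rerun with the increment applied.

```python
import numpy as np

def model_step(x):
    # Toy stand-in for the coupled model: slow relaxation toward 1.0.
    return x + 0.05 * (1.0 - x)

def assimilation_window(x0, obs, inc_weight):
    """One cycle: a free forecast collects O-F along the way, then the
    window is rerun with the analysis increment spread across the steps."""
    steps = len(obs)
    x, omf = x0, []
    for o in obs:                            # first-guess segment
        x = model_step(x)
        omf.append(o - x)                    # observation minus forecast
    increment = inc_weight * np.mean(omf)    # crude stand-in for an analysis
    x = x0
    for _ in range(steps):                   # rerun with increment applied
        x = model_step(x) + increment / steps
    return x

obs = np.full(20, 2.0)                       # observations say truth is warmer
x_free = assimilation_window(0.0, obs, inc_weight=0.0)   # no assimilation
x_da = assimilation_window(0.0, obs, inc_weight=0.5)
print(x_free, x_da)                          # assimilating run ends closer to 2.0
```

In the weakly coupled setup described above, the role of `model_step` is played by the full coupled model, while each component's analysis still computes its own increments independently.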
The strongly coupled flavor, the gold standard, is where the data assimilation and the model are all running coupled; this could be called quasi-strongly or strongly coupled, and the different flavors have different names, but for this we need coupled error covariances, and that's no small task.

All of that coupled DA sounds nice, but there are some cautionary tales. Here's one with strongly coupled DA: there are a lot of quotations here from a lot of different papers that are basically saying the same thing, that those coupled error covariances are a nightmare, very difficult to estimate, and essentially if you try to do it, you end up in some cases doing a worse job than if you had just done the weakly coupled version. Another cautionary tale related to strongly coupled DA is also from the European Centre. If you look at the left and the right (weakly coupled, strongly coupled, quasi-strongly coupled, the flavors all have different names) and compare the heat content exchange between the atmosphere and the ocean, which is the black line, the coupled assimilation and the ocean-only assimilation on the right are very similar to each other. But if you look at the breakdown of which piece is doing what, and which pieces of the budget are dominating in which place, we see very big differences. Which one is correct is not particularly clear from the ocean DA point of view.

I'm also going to give you an example of what I call a cautionary tale for coupled modeling, from our system: on the right is the coupled model, on the left is the atmosphere-only model, and here's the difference in the cloud radiative effect at the surface. The first thing we see is that the errors, the biases, are quite different, and the region in the coupled run where we have a very large cloud radiative effect has a
particularly bad impact on the model. So what's going on here? Well, we figured out that this is related to the way the turbulence parameterization works. The turbulence parameterization in our system is sensitive to the near-surface stability, and the idea is that if the cloud effect is a little too large, just a little, the ocean gets a little too cold, which increases the turbulent mixing, and you open a pathway for the evaporation to reach the cloud top and make more cloud; so we increase the cloud and make it colder, and this spins away. When we first saw this, it got us a cold tongue that ran all the way to the Maritime Continent. So these kinds of bad feedbacks are possible.

Let me switch gears a little and talk to you about our system. We're not an operational center, and so in some ways either people don't know that we're doing this, or we have to explain why NASA is in the middle of this in the first place. The central motivation for us is to have a state-of-the-art system that we can use to demonstrate the use of NASA and other satellite data to improve subseasonal and seasonal forecast skill. For that, we engage in the model development, the analysis development, the initialization strategy, and the coupled assimilation strategy; we also produce near-real-time coupled DA and forecasts, validate, and engage in some predictability studies as well.

This is a slide with a lot of detail, maybe of interest, maybe not, but just a couple of things to point out. In our new system we've gone to a very high resolution ocean: quarter degree, with 50 levels, which is moderate-to-low vertical resolution. There's a new parameterization in there, and I'll say one or two things about it: despite the relatively low vertical resolution in the ocean, it parameterizes the diurnal cycle quite well. There's an interactive sea ice model, and we're running a weakly coupled DA, and so
the forecasts and any retrospective forecasts are going to be initialized from there. I'll tell you a little about our flavor of that, and I'll show you some detail about the observing system that we're using in the ocean. This is a timeline, because for us this reanalysis is going to run from 1980 to the present; it shows the in situ observations that go into the ocean analysis, and in the timeline you see the huge kickoff that happened when the Argo floats came in.

And a little bit of geographical perspective; I'm going to run through a few of these quickly. This is our in situ temperature observation coverage for 1981: basically we're looking at expendable bathythermographs and conductivity-temperature-depth probes, stuff that's being thrown off of ships, and so you see a lot of the ship tracks here. By 1990 we see a similar density, plus some of the moored arrays popping up in the Pacific, the TAO arrays. In 2000, again relatively the same density of bathythermographs, we're starting to get a first taste of Argo (the green ones), and also the moored arrays in the Atlantic, the PIRATA arrays, alongside the TAO arrays. By 2010 Argo has kicked in, and all of a sudden the coverage is like night and day, and the Argo floats are continuing to populate. So doing something in the 80s and doing something today are quite different in terms of available observations in the ocean.

Here's a quick summary of satellite observations: it's basically altimeters that we're looking at for the ocean. When we actually look at the counts of the observations, the number of casts is in the hundreds of thousands; atmospheric DA folks would have a heart attack at how few of these observations there are relative to what's available in the atmosphere.

So here's our flavor of coupled DA.
It's a weakly coupled system: we use the coupled model itself to do all of the coupling. We run our coupled DA in two segments, two sequences. One is the green line across the top, the predictor segment, which is just a straight coupled-model forecast; our data window is five days. Along the way we drop off instantaneous states every six hours so that we can compute the observation minus forecast, giving us the sequence of O-F. After five days of doing this, we collect all the O-F together, do the ocean analysis itself, back ourselves up five days, and compute a set of increments, which are how much the data is telling us the model needed to change. With those increments we rerun the segment with all the increments applied inside of it, and that bottom line is our coupled DA sequence. Again we do this for five days, kick off a bunch of forecasts, and continue forward. Everything here is done with the coupled model, and that's where the weakly coupled DA comes from.

I'll explain a bit of this as well: we're running nine-month forecasts, launched every five days, and as for the ensemble sizes, for the short lead times, the first couple of months, we have a 40-member ensemble, with 10 members running out to the seasonal range. One thing that's also unique for us is that we're running an interactive aerosol model with all of this, partly because we think there's useful skill in aerosol optical depth, or in the pollutant PM2.5, the stuff that gets into your lungs. It may also, under certain conditions, increase weather or subseasonal skill; you can easily think of the impact of dust on tropical cyclone development, or, in the aftermath of a large volcanic event, the impact on
subseasonal forecast skill. So that's our motivation. This is our AOD skill on the left and the PM2.5 anomaly correlations on the right. In general terms, a 0.81 correlation with observations; this is a little inflated because we're using MERRA-2 aerosol optical depth as validation, and that's analyzed in our system. On the right we can see PM2.5 anomaly correlations, even up to a couple of months lead, in the 0.60s and 0.70s, which is certainly quite respectable.

I mentioned this atmosphere-ocean interface layer; it really helps us with the vertical resolution of the system, and we went with it for our version 3. The idea is that you take the top level of the ocean model and break it up into a cool-skin layer, a diurnal warming layer, and a decay down to what we're calling the foundation temperature, and this allows us to capture the diurnal cycle. There's a tech memo by Akella and Suarez that describes all of this.

A few motivations for going to the high resolution. This is the 50- versus 25-kilometer ocean difference in bathymetry: you see that the bathymetry is deeper with the higher resolution, and importantly it's getting us better resolution in some of the throughflow areas, the Indonesian Throughflow, the Florida Straits, etc. Another thing the resolution gets us is resolving surface currents: on the left is the 50-km run, the middle column is the 25-km run, and on the right are observational estimates. You're starting to get the Loop Current resolved here off Florida, and on the bottom you're getting eddies in the Kuroshio that start to resemble observations; we're not resolving eddies, we're not making any claims, but it's starting to look more like what we think it should look like. Also related to getting some of the throughflows better resolved, the salinity is looking really nice: the salinity biases that
we were getting on one side and the other of the Indonesian Throughflow are gone, and it's helping ocean transport as well.

One of the other new elements we have, and this again is one of the motivations for NASA, is assimilating sea surface salinity. Our ocean analysis system that's running now assimilates sea surface salinity. We were raining too much there, and so we were too fresh; the salinity assimilation made it saltier, improved the mixed layer depth, and ended up damping the propagation of the Kelvin and Rossby waves running across the equator, getting us a better El Nino forecast in instances of very strong El Ninos.

I'm just watching the time. Motivation for the change in the ensemble strategy: we developed a whole new forecast ensemble strategy. The old system was underdispersive early and overdispersive later, meaning our spread wasn't big enough early on and was too big later on. Extratropical skill was lower, probably because we had a small ensemble size. The key here, and I'll show you this in the next slide, is that we saw little evidence of additional skill from ensemble size beyond a few months, and so we saved a lot of time with a subsampling strategy.

So the first thing that changed is the forecast ensemble strategy. Our system now uses something called synchronized multiple-time perturbations; Schubert et al. is a tech memo that lays out the details. This is something along the flavor of what Joe referred to as the random field perturbation: perturbations for the forecasts are randomly selected from one-day through ten-day differences in the atmosphere and ocean data assimilation states. The spatial structures turn out to be, and the tech memo shows it, closely related to the optimal perturbations that you would get from the singular value decomposition that Joe spoke about, and we're presuming that they're sampling
preferentially from the perturbations with the biggest growth rates, which is what you want. To show you what some of these spatial structures look like, we did some EOF decompositions. For one-day differences, if you look across the tropics, it has the smell of something that looks like variations in the thermocline; for ten-day differences, we're talking about something that's starting to look like tropical instability waves in the ocean. We do this both in the ocean and in the atmosphere. For the atmospheric states, at one day you're looking at relatively small variations and small spatial structures; at ten days you're starting to look at something that may or may not look like a Madden-Julian Oscillation near the surface, in terms of the scales of the variation in the winds. In the extratropics we looked at the 450-millibar potential temperature, and again at one day we're looking at something that has the smell of synoptic-scale variations, while at five days we're looking at something longer, related to teleconnections.

The other issue with the ensemble strategy is the size. This diagram from Scaife and Smith illustrates very clearly how the ensemble-mean skill increases with the number of ensemble members. We were sitting down here someplace, and our version 3 brings it up to 40, so we're expecting the skill to increase quite a bit. Now, in almost every case that we looked at, we see what we have here at the top. This is the RMS error, so it's the flip side of skill: we want the RMS error to be small. The confidence interval comes down with ensemble size, but you're not getting any skill by increasing much beyond five or ten ensemble members. Every once in a while we would see the RMS error reduce a little more, but almost everything looked like this. And so we decided that for skill
beyond about three months, we really only need 10 ensemble members. To do that, we developed a subsampling strategy: the basic idea is that you run 40 ensemble members out for three months, do a clustering, pick representative ensemble members from each cluster, and extend the forecasts out with the smaller sample. This has saved a huge amount of computer time; we're getting the whole data assimilation and the retrospective forecasts for 40 years done in about nine months, largely because of this. And we're running a lagged burst ensemble: every five days we kick off an unperturbed member and a series of perturbed members based on that random perturbation method, and at three months in we do the subsampling and only a certain group of them are continued.

And just very quickly, these are the kinds of things that we have to look at in order to validate a system like this. I'm a big proponent of validating the mean equilibrium climate of the model we use, because for us that is the saturation level of the error; we reach our saturation level of error in about five or six months, so for seasonal timescales the equilibrium climate errors are the target for the forecasts. This is a bit of a laundry list, but there are a couple of relatively new things we're looking at: the genesis potential index, the cryosphere, the aerosol optical depth, and, as Frederick spoke about, sudden stratospheric warmings. We're also running with a two-moment cloud microphysics, so we have the aerosol indirect effect, and we're also looking at cloud drop number. I'm going to run through this very quickly.

Could you wrap up so that there's some time for questions, maybe in a minute or so?

Yeah, we're there, so I was just going to skip through this. These are the low-frequency mode indices that we're getting, and if you look on the right, in month two, the higher ensemble size really bought us some nice
respectable predictability for some of these other modes. We also look at the observation minus forecast from the ocean. And this is basically the summary of the characteristics and evaluation part: the ocean reanalysis is going to cover 1982 to the present, with public release in the middle of next year, and we've started the calculation already. Upgrading the ocean resolution really bought us a lot in terms of the transport and salinity, the glacial runoff is in the right place, and there's the diurnal cycle; we also have the salinity assimilation. The big change in the forecast strategy is lots of ensemble members for the short range and fewer for the longer range. And so, yeah, thanks.

Thanks so much, Andrea, for this comprehensive talk and your insights about coupled errors and error propagation and how the different components impact each other.