OK, so when I had to prepare this talk it was not easy to put it all together, because I think yesterday we already went through a lot of the material on sub-seasonal forecasting, so there may be a bit of repetition today.

First of all, a bit of history. The history of sub-seasonal forecasting is relatively recent: the early work comes from the late 1970s and early 1980s, and particularly from this person, Kikuro Miyakoda from GFDL, one of the first who really opened the door for extended-range prediction. There were several studies, by Shukla and by Miyakoda among others, which explored predictability at this time scale. They tried to go beyond the deterministic predictability limit, which was assumed to be around two weeks at that time, and they recognized that sub-seasonal prediction can be seen as both an initial value problem and a boundary value problem with external forcing. So, as I mentioned yesterday, it is a bit of a mixture of NWP and seasonal forecasting; it has been called "predictability in the midst of chaos", that's Shukla's phrase.

One of the first case studies with an extended-range forecast, demonstrated by Miyakoda, was an event which took place in January 1977. That's the newspaper from the time, the Miami News: there was snow in Miami, which is of course an extremely rare event, and Miyakoda tried to predict it. The top panel shows the climatological circulation for this period, averaged over all Januaries. The bottom left shows the observations for January 1977, a very different circulation than normal, a very abnormal situation with a strong trough bringing very cold air and snow towards Florida. The bottom right is the forecast from the GFDL model, where Miyakoda worked at that time. Quite remarkably, the forecast for days 10 to 30 shows striking similarities with the observations. So it was deemed a very successful forecast, and it made a lot of people very excited at the time: maybe we would be able to predict the weather up to 30 days ahead.

At that time there was a lot of interest at ECMWF too. In the 1980s, Cubasch, Tibaldi and Molteni made experiments to extend the deterministic forecast to 30 days with the global ECMWF spectral model, and they published some reports on it. They did some case studies, using fixed SSTs, and the results were quite mixed. At the beginning they were very excited, because they found that the forecasts from day 10 to day 30 were generally more skillful than climatology. But the problem they found after that was that if you persisted the day-10 forecast, you tended to get an even more skillful forecast. So there was no point burning CPU time beyond day 10. This shows that at this time range it is much more difficult to beat persistence than to beat climatology. So for a long time interest in sub-seasonal prediction went down, and it is only in the 2000s that it came back.

All that to say that sub-seasonal prediction is still in its infancy. Sub-seasonal forecasting, which is only 10 to 20 years old, is much more recent than dynamical seasonal forecasting, which started 10 to 20 years earlier, in the early 1980s. And 10 years ago, only a couple of operational centres were producing dynamical sub-seasonal forecasts.
That was ECMWF and JMA, basically. Now, most of the global producing centres that were mentioned in the previous talks are producing, issuing, or experimenting with sub-seasonal forecasts.

This is, for example, a slide I got from Arun Kumar from NCEP CPC, which I think is quite a nice one. It shows the NWS (National Weather Service) seamless suite of forecasts, and the left column reflects the "ready, set, go" concept I mentioned yesterday: for the long range you get outlooks, then guidance and threat assessments as you get closer, then forecasts and watches, and then warnings, when you take action, at the shortest time ranges. And then you have all the forecasts for minutes, hours, days, out to three weeks, where the NWS issues forecasts. But in the time range between two weeks and two months there is no official NWS outlook, nothing for the week 3-4 period for instance, whereas there are outlooks for seasons and years. So currently some centres like NCEP are not issuing an operational, official outlook for this time range.

Now, this is a plot that didn't show well for some reason, but I have the equivalent here. It is an experimental product from NCEP, where they have started to produce forecasts for weeks 3 and 4. On the left side they present it as above-average temperature favoured (the orange colours) or below-average temperature favoured (blue); on the right is precipitation, with green for above-average and brown for below-average precipitation. So they present the information as probabilities for week 3 plus week 4 together. If you are familiar with their products, it is very similar to what they already produce for week 2. And this plot is produced not directly from the dynamical model, but by using MJO and ENSO forecasts: it is basically the canonical response to those events. But they have plans to also use the outputs of the dynamical model directly for this. All that to say that some operational centres are just starting to issue forecasts at this time range.

And if I come back to the table of S2S models I showed yesterday, most participants also produce seasonal forecasts: over 10 of the models have dedicated seasonal forecasting systems. Of the 12 GPCs mentioned in the previous lecture, only two are not doing seasonal forecasts, one of them being CPTEC; Météo-France, on the other hand, contributes only its seasonal forecasting system. Some centres use the same model as their seasonal forecast system but with more ensemble members. So the strategies, as I mentioned, are very different between centres; there is no consensus. Some sub-seasonal forecasts use the exact same system as the seasonal forecasting system. At NCEP, for example, it is the same CFSv2 system as for the seasonal forecast, but with many more ensemble members and much more frequent start dates, up to day 45. For UKMO, it is the same GloSea5 system as their seasonal forecasting system, but with twice as many ensemble members, up to day 60. For some other centres, the sub-seasonal forecast is an extension of the medium-range weather forecast; that is the case at ECMWF and at Environment Canada.
At ECMWF, for instance, the medium-range EPS ensemble is extended to 46 days (it should read 46 on the slide, not 32) twice a week, so it is the exact same system as we have for the medium-range forecast, and a different system from the one used for seasonal forecasting. And in other centres it is a separate system, with characteristics dedicated to sub-seasonal forecasting. That was the case at ECMWF before 2008, and that is the case at JMA, which has a separate system.

As I mentioned yesterday, the configurations of the forecasting systems are very different; there is no consensus on the optimal configuration. The frequency of forecasts: some are daily, some weekly, some monthly. The ensemble size: some are large ensembles run once a week, some are small ensembles run daily. The reason some centres choose, for example, four runs a day with four ensemble members is that they have a chunk of their machine allocated permanently to sub-seasonal forecasting, so that is a way to optimize the use of the machine. Whereas at ECMWF we can allocate a large fraction, almost half of the machine, to the sub-seasonal forecasting system for a short period of time. The model resolution varies from about 250 kilometres to typically 50 kilometres; most models are at less than 100 kilometres, typically finer than seasonal forecasting systems. The time range goes from 32 to 60 days depending on the model setup.

And that's something I didn't discuss yesterday, but some models have a coupled ocean-atmosphere system, and some also have active sea ice. This difference comes mostly from the fact that the models which, as I mentioned, use the same system as the seasonal forecast system usually have a coupled ocean-atmosphere, and often an active sea ice model as well. For the models which are an extension of the medium-range forecast, the medium-range forecasts are often not coupled, as was discussed in the previous talk, and they often don't have active sea ice; so the sub-seasonal forecasts produced from those models usually have no coupling and no sea ice. If you look at the list here (I put in red the ones which are not in the S2S database yet), the majority of the models are coupled to an ocean; only three of them are not. And a few of them (one, two, three, four, five) have an active sea ice model; those are, as I mentioned, the sub-seasonal systems which use the same system as the seasonal forecast. For ECMWF and BoM, active sea ice models are planned for next year.

OK, so next, the ECMWF ensemble forecasting system. It is based on IFS cycle 41r1, running at 32 and 64 km resolution (T639 and T319). It is coupled to NEMO, a one-by-one degree ocean model, refined to about a third of a degree near the equator, with 42 vertical levels. And we use HTESSEL as the land surface model, so the IFS is coupled to HTESSEL. For the initial conditions we use four-dimensional variational data assimilation (4D-Var) for the atmosphere and 3D-Var for the ocean. The perturbations are EDA perturbations with singular vectors; we'll come back to that later. We use the ocean analysis for the ocean initial conditions. And the coupled GCM is run with 51 members at resolution T639, which is about 32 km, then 64 km, up to day 46, and that creates the ensemble forecast. So the IFS changes resolution at day 10.
So we run at 32 km for the first 10 days, then we lower the resolution to 64 km afterwards, because it would be too costly to run at 32 km for 46 days. Between day 10 and day 46 it is at 64 km, that's correct. This is a bit of a problem for some users, because for precipitation it can create some inconsistency when you pass day 10, so you have to be very careful in the way you retrieve precipitation. We have plans to extend the 32 km leg to day 15, but that won't remove the problem, it just pushes it a bit further in time. This was necessary to be able to use the same system for sub-seasonal forecasting because, as I said, it would be way too costly for us to run at 32 km all along.

OK, sub-seasonal forecast products. For short or medium-range forecasts, the forecasts are usually issued for instantaneous or daily values: a typical medium-range forecast tells you what the temperature will be in three days' time over a given place. For seasonal forecasting, products are generally seasonal means; at that time range you cannot distinguish the weather from one day to another. There is some work now on producing monthly means, but it seems that beyond months one and two there is very little skill in distinguishing the forecast from one month to the next, so in general the products at the seasonal time scale are seasonal means. Sub-seasonal forecasts are in between. Once again, there is little predictability in the day-to-day variability: we cannot really predict that the weather at day 22 will be different from day 21. But there is some skill in predicting weekly mean anomalies. So most of the products at this time range are for weekly mean periods, which is a good compromise: a period long enough that you have some skill, and short enough to give you useful information about the change from one week to another. You could also issue a monthly mean forecast, from day 0 to day 30 for instance, but then you lose resolution: the first two weeks of the month may be very warm and the two following weeks very cold; on average you get a normal month, whereas the weather was actually very unusual.

So most of the products look like this example, similar to the one we did this morning, where we look at weekly mean anomalies of two-metre temperature. That's the typical product from the ECMWF model. Here we take the ensemble mean of the real-time forecast minus the ensemble mean of the re-forecasts, the model climate. And we add statistical tests on top of it, to see if the ensemble distribution of the real-time forecast is statistically different from the distribution of the re-forecasts, and we shade the areas where the difference is statistically significant; the areas in white are where there is no significance. Of course, the longer the time range, the more blank areas you get, because the model drifts more and more towards climatology and there is less and less signal. There was a bit of discussion yesterday about whether this is a useful product or not: on the one hand you lose some information about the ensemble distribution, but the fact that you apply these statistical tests means you still keep some information; it tells you basically in which areas the distribution is really shifted from the model climate.
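To make this concrete, here is a minimal sketch of such a per-grid-point test. The choice of two-sample test (Mann-Whitney) and the array names are assumptions for illustration, not necessarily what is done operationally:

```python
import numpy as np
from scipy.stats import mannwhitneyu

def shaded_anomaly(realtime, reforecast, alpha=0.05):
    """Weekly-mean anomaly at one grid point, plus a shading decision.

    realtime   : (n_members,)           -- real-time ensemble, weekly mean
    reforecast : (n_years * n_members,) -- re-forecast values for the same week (model climate)
    """
    anom = realtime.mean() - reforecast.mean()            # ensemble-mean anomaly vs model climate
    _, p = mannwhitneyu(realtime, reforecast, alternative="two-sided")
    return anom, p < alpha                                # shade only if the distributions differ

# Hypothetical numbers: a 51-member forecast vs a 20-year x 11-member re-forecast
rng = np.random.default_rng(0)
print(shaded_anomaly(rng.normal(1.5, 1, 51), rng.normal(0, 1, 220)))
```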
And the reason people like this type of plot is that, in comparison to the probability maps I will show later on, it gives you an idea of the amplitude of the signal, the amplitude of the anomalies.

Another type of product is to present the forecast as a probability, for example the probability of a precipitation anomaly in the upper tercile. We define, from the model climatology, the limits of the upper and lower terciles, and we simply count the number of ensemble members which fall in the upper or lower tercile (a small counting sketch follows at the end of this passage). So here it tells you, for example, that there is a low probability of being in the upper tercile in this area, but a high probability of being in the upper tercile for precipitation in those areas. Compared to the previous product, this one doesn't tell you much about the exact intensity of the anomalies.

We also have a series of more sophisticated products. For example, here we have six predefined weather regimes over Europe, and we project each ensemble member onto each of those regimes and simply count the number of ensemble members which can be associated with each regime. In this example, we can see that only two ensemble members out of about 50 go into this regime, 13 go to this one, 11 to this one, 9 to this one, 11 to this one, and only four here. So the model predicts that some of those weather regimes are more or less likely to happen than the others. Once again, in the first week of the forecast you tend to have more ensemble members clustered into one weather regime, and when you go to weeks three and four, you tend to have a more uniform distribution of the ensemble across the six regimes. Yes? [Question about the clustering variable.] For the clusters, it is typically Z500, geopotential at 500 hPa. Well, this is the old product; now we are doing it with the four classic weather regimes: blocking, NAO+, NAO-, and the Atlantic ridge.

Another product is for tropical cyclones, which I already mentioned yesterday. We track tropical cyclones in the ECMWF ensemble forecast, and from that we compute the probability of a tropical storm strike within 300 kilometres; from the number of ensemble members which predict a strike, we can deduce the probability. This tells you, for instance, that there is more than a 50-60 percent chance of a tropical cyclone strike north of Australia. So that's another type of product we can define.

One important thing is also to monitor the sources of predictability; Hai Lin will go into much more detail on that. For example, we have this product, a prediction of the evolution of the MJO over the next 30 days. This uses the Wheeler and Hendon diagram; I think we will see quite a lot of them in the next two weeks. The MJO can be represented as a circuit going from the Indian Ocean to the Maritime Continent, the western Pacific, and the Western Hemisphere, and the distance from the centre of the diagram represents the amplitude of the MJO, so a strong MJO traces a large circle. Here we show a forecast where the MJO starts over the west Pacific, and each dot represents one ensemble member. We can see they are all very clustered, with very small spread, during the first day of the forecast. The magenta dots are day five, where you start to have a bit more spread; day ten has a bit more spread again; and day 20 is the green one, where a lot of ensemble members predict that the MJO will die. OK?
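Coming back to the tercile product above, here is a minimal sketch of the member counting, assuming hypothetical array shapes (a 51-member forecast and a 20-year, 11-member re-forecast):

```python
import numpy as np

def tercile_probabilities(realtime, reforecast):
    """Probability of the lower/upper tercile from a forecast ensemble.

    realtime   : (n_members,)          -- real-time ensemble at one grid point
    reforecast : (n_years, n_members)  -- re-forecasts defining the model climate
    """
    clim = reforecast.ravel()
    lower, upper = np.percentile(clim, [100 / 3, 200 / 3])  # model-climate tercile boundaries
    return np.mean(realtime < lower), np.mean(realtime > upper)

rng = np.random.default_rng(0)
p_low, p_high = tercile_probabilities(rng.normal(0.8, 1, 51), rng.normal(0, 1, (20, 11)))
print(p_low, p_high)   # a warm-shifted ensemble gives a high upper-tercile probability
```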
So, sub-seasonal forecasts and re-forecasts. Once again, sub-seasonal prediction sits between medium-range forecasting and seasonal forecasting. For seasonal forecasting, as was mentioned in the previous talk, you need to calibrate your model because the model bias is way too large: you cannot issue seasonal forecasts directly from the raw real-time model output. For medium-range forecasting, it is actually very rare to have a re-forecast; in the TIGGE database, for example, there are no re-forecasts. Most medium-range forecasts are produced by looking directly at the output of the model, because at that time range there is a bias, but it is small, very small compared to the signal you are predicting, which is usually large. That said, we are starting to use re-forecasts to calibrate medium-range forecasts, and there have been some papers showing that there is some advantage in doing so, but it is not strictly necessary.

So what about sub-seasonal forecasts? Well, the sub-seasonal forecast is more like a seasonal forecast in that regard. The model systematic errors grow very quickly during the model integration: the model drifts very quickly towards its own climate, and after two weeks the bias can be as big as the signal we want to predict. We show here an example: the bias of two-metre temperature for the time range day 26 to 32, the last week of the sub-seasonal forecasting system. In some areas the bias of two-metre temperature, here in August, can be between two and four degrees. So it is quite a large error. [Question.] Sorry, yes, that's the bias with respect to ERA-Interim, that's right. And errors of two to four degrees are as large as the signal we want to predict; often the week-four anomalies we try to predict look a bit like that in amplitude. So we need to remove this bias.

There are two options. One option is to make corrections during the model integration, a flux bias correction: you know your fluxes are systematically wrong, for instance, so you adjust them during the run, usually at the interface of the coupled ocean-atmosphere system. This approach has been quite popular in climate simulations, but it tends to mess up the physics a bit for short and medium-range forecasts, which means that for sub-seasonal forecasting systems almost everybody uses a posteriori corrections instead: you basically run the same model over a certain number of previous years, as a hindcast or re-forecast. You produce a large set of re-forecasts, which tells you what the model climate is; from this model climate you can see where the model is systematically wrong, measure the errors, and correct your forecast. It is not a perfect methodology, because you implicitly assume that the shift of the model forecast relative to the model climate corresponds to the shift of the truth relative to the observed climate. Basically, you assume that your error is linear, which sometimes is not the case: the bias can be very strongly flow dependent. So, once again, for sub-seasonal prediction systems there are a lot of differences in the way the re-forecasts are set up.
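In its simplest form, the a posteriori correction just described removes a lead-dependent model climate estimated from the re-forecasts. A minimal sketch, under the linearity assumption discussed above (the array layout and names are mine, for illustration):

```python
import numpy as np

def a_posteriori_anomaly(fc, reforecasts, obs_clim=None):
    """Lead-dependent a posteriori correction of a sub-seasonal forecast.

    fc          : (n_members, n_leads)          -- real-time forecast
    reforecasts : (n_years, n_members, n_leads) -- same model, same start day/month, past years
    obs_clim    : (n_leads,) or None            -- observed climate for the same dates

    The model climate is computed separately at every lead time, because the
    drift grows with lead.  Implicitly assumes the error is state-independent
    (linear), which is the caveat mentioned above.
    """
    model_clim = reforecasts.mean(axis=(0, 1))   # model climate as a function of lead time
    anom = fc - model_clim                       # forecast anomaly w.r.t. the model climate
    return anom if obs_clim is None else anom + obs_clim   # optionally re-centre on observations
```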
Some use a very long re-forecast set, as I mentioned, some a very short one. Some are produced on the fly (we talked a lot about that yesterday), some are fixed. Some have a large ensemble size, some a smaller one. I won't go too much into that, we discussed it yesterday, but once again, to remind you: fixed re-forecasts, for example at NCEP, BoM and JMA, are re-forecasts produced once and for all, and it is important to understand this. The advantage is that it is more user-friendly; applications people usually like it because you can calculate your downscaling coefficients once and for all, whereas for on-the-fly re-forecasts you have to constantly monitor them, because the model physics keep changing.

Yeah? [Question: why such a variety of start dates?] For example, the NCEP re-forecast is daily, so if you issue a forecast on the 8th of August, you have the exact corresponding date in the re-forecast data set. But some of the fixed re-forecasts, like Australia's, have start dates on the 1st, 6th, 11th, 16th, 21st and 26th of each month. So for the 8th of August, you usually have to take the two closest dates in the re-forecast; here you would use, I would guess, the one on the 6th and the one on the 11th of the month. OK? Yeah, but I would say you would do that for each date in advance, I think; that shouldn't be too painful. That's another example: some re-forecasts are for really fixed dates, JMA too has three dates per month, and then, yes, you have to figure out which are the closest dates in the re-forecast. It's a bit more complicated.

Yes? [Question about the two bias-correction options.] For bias correction, the second one; nobody does the first one, only in climate simulations, because they don't like to run re-forecasts. For sub-seasonal prediction, everybody does a posteriori corrections, which means running a large set of re-forecasts. OK?

And then you have the on-the-fly re-forecasts, like ECMWF and UKMO, which usually guarantee that you have the same dates as the real-time forecast. That is not always true, because UKMO also has three fixed dates per month, but that's another story. The advantage of this methodology is that it ensures you always have the best, the latest version of your model at a given time. The disadvantage is that it may be more difficult for applications, because each time you change the model, your statistical post-processing may have to be adjusted. OK?

So here I will give the example of the ECMWF re-forecasts, which are on the fly. We have the real-time forecast; that's an old example, sorry, from 2014. We produce our forecast on March 27, 2014: we run 51 ensemble members at T639 for 10 days, and then at T319, which is 64 kilometres, up to day 46 (at that time it was day 32). And at that time the re-forecasts consisted of five ensemble members starting on the same day and same month over the past 20 years, from 1994 to 2013, with the same resolution and exactly the same model, so 32 kilometres, then 64 kilometres between day 10 and day 32. And we produce those re-forecasts two weeks in advance, so that we always have a five-week window.
Because for our medium-range forecasts we use a five-week window for some products, like the Extreme Forecast Index. For the sub-seasonal forecasts, those re-forecasts should be as consistent as possible with the real-time forecast. The initial conditions come from ERA-Interim, with an ocean reanalysis and a soil reanalysis; we'll go into that in more detail in two days. The perturbations mimic what we do for the real-time forecasts: we use singular vectors; we use EDA perturbations only from 2015, because we don't have the EDA in the past; and we use the stochastic physics schemes.

Now, a question we often get: all the sub-seasonal forecast products at ECMWF use the same re-forecasts, only one set, the one starting on the exact same day and month as the real-time forecast. And the question is: why don't you use a five-week window to calibrate your sub-seasonal forecasts? With a five-week window you get a much larger ensemble, and a much larger ensemble means a much better way to estimate the boundaries of the terciles or the deciles, so you could get a much more accurate calibration. But the problem is, if I take the example of two-metre temperature, you have a seasonal cycle. If your real-time forecast is at the time when the seasonal cycle is at its maximum, so that is my week zero, the climate of weeks minus one and minus two and of weeks plus one and plus two will be systematically cooler than at week zero. So if you look at the difference between the climate of week zero, which is the one we use now, and the climate you would get from a five-week window, you get warm anomalies over the northern hemisphere and cold anomalies over the southern extratropics. It is not very large, but the anomaly can reach one degree, which once again is almost the same order of magnitude as the signal you want to predict. So if you use this five-week window, you can seriously mess up your forecast. That's why it is very strongly recommended to use the exact same dates to calibrate your real-time forecast (a tiny numerical illustration follows at the end of this passage).

OK, so re-forecasts are very useful for two things: calibrating the models, and skill assessment. A large re-forecast database is very useful for calibration, once again to distinguish between random errors and systematic errors, and also to estimate flow-dependent errors. Some calibrations use Bayesian techniques on a very large number of years, to correct not only the systematic bias but also the spread of the ensemble depending on the flow. And a large re-forecast database is also needed for verification and for flow-dependent assessment, like assessing the concurrent impact of ENSO on a specific phase of the MJO: if you want to see whether your model really simulates the impact of the MJO well when there is an El Niño event, you need a very large number of re-forecasts. The same goes for forecast scores: the signal-to-noise ratio also improves with long re-forecast data sets. And a large ensemble size is important for skill assessment, since some probabilistic scores are affected by the ensemble size. If you verify with five ensemble members, your probabilistic scores are usually much lower; some scores can be very sensitive to the ensemble size, although there can be some corrections.
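Coming back to the five-week window question, here is a tiny numerical illustration of the seasonal-cycle effect, with an idealized sinusoidal annual cycle (the 10-degree amplitude is made up; sharper real continental cycles give offsets closer to the one degree quoted above):

```python
import numpy as np

# Idealized annual cycle of weekly-mean T2m, peaking at the start date (week 0)
weeks = np.arange(-2, 3)                          # the five-week window: weeks -2 .. +2
cycle = 10.0 * np.cos(2 * np.pi * weeks / 52.0)   # 10 degC amplitude, 52-week period

clim_week0 = cycle[2]          # climate estimated from the correct start date only
clim_pooled = cycle.mean()     # climate estimated from the pooled five-week window

# Pooling weeks around a seasonal maximum gives a climate that is too cool, so
# anomalies computed against it acquire a spurious warm shift (~0.15 degC here).
print(clim_week0 - clim_pooled)
```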
Ideally, we should have a very large set of re-forecasts, or hindcasts. The problem is that a large re-forecast data set with a large ensemble size is not affordable, it is too expensive. If you ran a 30-year re-forecast with 51 ensemble members, like the real-time forecast, your re-forecasts would be about 30 times more costly than the real-time forecasts, and it would mean you could not afford to run the sub-seasonal forecasts at 32 or 64 kilometres. In the S2S database there is one model which has a huge re-forecast data set: the one from BoM, the Bureau of Meteorology, which has 32 ensemble members, six start dates a month, for 30 years. But that is also the one with a very low resolution, T47. The models which are very expensive usually have very short re-forecasts. So that's the trade-off.

[Question: is this driven by the signal-to-noise ratio?] Well, it depends. If you are in the long range, your model forecast is very close to climatology, so you really need a large set of re-forecasts to be able to distinguish between the two. If you look at the maps, the areas which are shaded are very small; that's where a large set of re-forecasts can be useful. The issue is real: over a region like the Niño region the noise is much lower, but over Europe in summer, for instance, you can have a lot of noise. And for the NAO, for instance, as has been shown with Adam Scaife, the signal-to-noise issue can be quite large. Anyway, let me move on.

The second point is that long re-forecasts also suffer from inconsistency in the quality of the initial conditions (the pre-satellite period), and they start from initial conditions, from reanalyses, which are not of as high quality as the ones you use for your real-time forecasts. So you have to be aware that the skill assessment from re-forecasts is often a lower bound on the real skill of your system.

Another important point: there has been some discussion about what is more important. If you have a fixed amount of resources for a re-forecast, is it better to have more years, say 50 years of re-forecasts with a small ensemble size, or maybe just 10 years but with a 30-member ensemble? Unfortunately, for sub-seasonal forecasts it is still an open question; we don't know the answer yet. For medium-range forecasts, people have looked at this issue, and they found that it is better to have a large number of years with a small ensemble size, because in the short and medium range the ensemble spread is very small, so if you have very few years of re-forecasts, the distribution will be clustered around a few points. But for sub-seasonal forecasts it is not clear, and that is one thing we plan to investigate in S2S, actually.

So, verification. It is not only important to produce forecasts, we need to verify them. One simple way to do verification, and something we have on a web server which is open to the public, is to verify each individual forecast, as we did this morning. You have the analysis from ERA-Interim, which shows the anomalies of two-metre temperature for a given week, and here we show the forecasts that were issued at different time ranges but verify on the same week. So here is a forecast starting on 19 February 2015, for day 5 to 11.
And that's the forecast that started one week earlier, on 12 February. So, a different start date, not the same forecast, but for the time range day 12 to 18, verifying on the same week. And the last one is the forecast starting on 29 January, for the time range day 26 to 32. So it's just a case study; it doesn't tell you if your model is really good or not. But when you are doing case studies, it is always useful to see how your model performs at different time ranges, and at which time range the model captures a particular signal. This shows, for example, that this cold anomaly was relatively well captured at least up to two weeks in advance, and it was displaced a bit too far to the west three weeks in advance.

You can do that, for example, for an important case, the Pakistan flooding of 2010. Here we verify the worst period, between 26 July and 1 August 2010. This is the analysis, which shows those big blobs of precipitation over Pakistan. And that's the forecast for day 5 to 11, issued on 22 July, showing that the event was fairly well predicted at this time range; the amplitude is a bit lower, but that's an ensemble mean. And that's the forecast from one week earlier, 15 July, which shows the same pattern of precipitation. And even three weeks in advance, for the forecast starting on 8 July, we get those very strong signals.

So it is useful to verify some case studies, but if you want a more statistically significant assessment of your model skill, you have to use a large number of cases. Here we are using a probabilistic skill score called the ROC area. This is quite a popular score in the NWP community, based on the hit rate and the false alarm rate. Basically, the closer to 1 the better: a perfect forecast would be at 1, and if you are below 0.5, it means you are worse than climatology. So here is this skill score over all the cases, all the real-time forecasts since 2004, the beginning of the sub-seasonal forecasts, which is about 600 cases. It shows that for the time range day 5 to 11 there is generally quite good skill. And interestingly, you can see that the tropical regions are the regions where the skill is lower: at this time range you have more skill in the extratropics than in the tropics, which is well known. When you go to day 12 to 18, so you start to enter the sub-seasonal time range, there is of course a big drop in skill: you have very few areas with a ROC area above 0.8, and most of the time you are between 0.6 and 0.8. And here the skill is quite uniform; you have some patches, of course, but overall the skill seems comparable in the tropics and the extratropics. When you go to the next week, day 19 to 25, you have another big drop in skill scores, to between 0.5 and 0.6, with some areas between 0.6 and 0.7. But now it is the tropical areas that actually get the strongest skill, like north-west Brazil and north-west Australia. And if you go to day 26 to 32, the last week, you have very little skill in the northern extratropics. The good news is that you are still always in the red colours, which means you are always doing better than climatology, but you are very close to 0.5-0.6.
So you are very close to the skill of climatology, but slightly better; you are not losing anything. And you still have some skill in the tropical regions. Here you start to be really like what you get at the seasonal time range. So it seems that day 5 to 11 behaves like a medium-range forecast, day 26 to 32 behaves more like a seasonal forecast, and in between is the transition between medium-range and seasonal forecasting. [Question: is this the whole year?] Yes, the whole year. For winter the patterns are similar, but you have more skill in winter than in summer, and the opposite in the southern hemisphere, yes.

So, as I mentioned earlier, there were early attempts to do this 30 to 40 years ago, and the skill scores at that time were better than climatology, so people were quite happy; but the forecasts were doing worse than persistence. Here we have seen that the model outperforms climatology; the question is, is it doing better than persistence? For that we use the ROC again, which is this curve: the closer it is to the top left corner, the better. The diagonal is where there is no skill, and below it you are worse than no skill. The ROC area is the area under this curve: if it is 1, the forecast is perfect; if it is 0.5, you are like climatology. And here we compare the real-time sub-seasonal forecast with the forecast you would obtain by persisting the previous week. For persistence you could persist day 0, the initial conditions, but it is usually much tougher to beat the persisted previous forecast. So here we persist the day 5-11 forecast and compare it to the sub-seasonal forecast for day 12 to 18. And we see that the sub-seasonal forecast actually significantly beats the persistence of day 5-11. So we are doing better than persistence, which means we are not wasting CPU time in pushing our system up to day 18, at least. For day 19 to 32, here we have put the two last weeks together and we compare with the persisted two-week period from day 5 to 18. And here again, it is not by much, but we still beat persistence at this time range.

The right panel is a reliability diagram, which Adrian explained a bit yesterday. It is quite a visual way to look at the reliability of a forecast: it shows, for a given forecast probability, how often the event verified in the observations. For example, each time your model predicts a 0% chance of an event, the event should never happen if your model is reliable; and each time your model predicts a 50% chance, the event should verify 50% of the time. So if your model is perfectly reliable, the curve lies along the diagonal; if your model has no reliability, the curve is flat, along the horizontal. The closer to the diagonal, the better. And we can see that for the forecast at day 12 to 18 we are quite close to the diagonal, and definitely much better than the persistence of day 5-11. Same thing for day 19 to 32: we have a much more reliable forecast than the persistence of the two previous weeks. So this shows that sub-seasonal forecasts, even at this time range, weeks 3 and 4, where the skill is not very high (don't expect too much), are better than climatology and persistence.
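Both scores used in this section can be sketched in a few lines. Below is a textbook construction of the ROC area from hit and false alarm rates, and of the reliability curve by binning forecast probabilities; these are illustrations, not the operational verification code (inputs are assumed to be NumPy arrays of per-case probabilities and 0/1 outcomes):

```python
import numpy as np

def roc_area(prob, occurred, thresholds=np.linspace(0, 1, 11)):
    """Area under the ROC curve: 1.0 is perfect, 0.5 is no better than climatology.

    prob     : (n_cases,) forecast probability of the event (e.g. upper-tercile fraction)
    occurred : (n_cases,) 1 if the event verified, 0 otherwise
    """
    hit, far = [0.0], [0.0]                       # anchor the curve at the origin
    for t in thresholds[::-1]:                    # sweep the decision threshold from 1 to 0
        warn = prob >= t
        hit.append(warn[occurred == 1].mean())    # hit rate at this threshold
        far.append(warn[occurred == 0].mean())    # false alarm rate at this threshold
    h, f = np.array(hit), np.array(far)
    return np.sum((f[1:] - f[:-1]) * (h[1:] + h[:-1]) / 2)   # trapezoidal integration

def reliability_curve(prob, occurred, n_bins=10):
    """Reliability diagram points: observed frequency vs forecast probability.

    A perfectly reliable system lies on the diagonal (50% forecasts verify
    about 50% of the time); a flat curve means no reliability.
    """
    edges = np.linspace(0, 1, n_bins + 1)
    bins = np.clip(np.digitize(prob, edges) - 1, 0, n_bins - 1)   # bin index per case
    fc_prob, obs_freq = [], []
    for b in range(n_bins):
        sel = bins == b
        if sel.any():
            fc_prob.append(prob[sel].mean())       # mean forecast probability in the bin
            obs_freq.append(occurred[sel].mean())  # observed frequency in the bin
    return np.array(fc_prob), np.array(obs_freq)
```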
For precipitation forecasts we get much lower skill scores, but the same conclusion: even if the skill is low, precipitation forecasts are more skillful than climatology at weeks 3 and 4, and generally more skillful than persistence. But the skill varies from parameter to parameter: two-metre temperature, like here, is much more predictable than precipitation.

OK, another point I want to make is that the skill can be very strongly flow dependent. For example, during a period when there is an MJO, we tend to get much more skillful forecasts over the following weeks than when there is no MJO. Here we are looking at the reliability diagram for T850 in the upper tercile, for day 19 to 25, week three. The red curve is when you have an MJO in the initial conditions, which shows very reliable forecasts at this time range for North America, whereas if you have no MJO in the initial conditions the curve tends to be much flatter, much less reliable. And that's something useful for the users: you may know that if there is an MJO in the initial conditions, you can trust the forecast more than if there is none. That is a bit of a difference with medium-range forecasts, where I would say you always have some skill; the skill may be higher in some circumstances than others, but there is rarely a period where you have no skill at all at day five or day six. Whereas here, for sub-seasonal forecasts, we really have windows of opportunity. And that's where you have to be very careful about how well your model simulates the MJO and its impacts, because, as we will see in Hai Lin's lecture, the MJO is one of the main sources of predictability at this time range. And as I showed in the plot yesterday, here we are not looking at the phase of the MJO, we are looking at the amplitude, when we have a strong amplitude. [Question: significant amplitude, outside the circle?] Yeah, exactly, yes. [Question: how frequent is that?] It happens about 50% of the time during this period.

And here is a recent paper by Om Tripathi, a work with him, where we do something similar for sudden stratospheric warmings. As I explained yesterday, sudden stratospheric warmings can also be an important source of predictability at this time range. We look at the skill scores over northern Russia, for example, and eastern Canada. The red bars are the scores when you have an SSW in the initial conditions, and the blue bars are the scores when you have no SSW in the initial conditions; the higher, the better. Those are for week one, week two, and weeks three and four. Of course the skill scores go down from week to week, but you can see that the red bars are always statistically significantly higher than the blue bars. The same over Canada: again, when you have a sudden stratospheric warming, you tend to get higher skill scores over those areas.

OK, so for the last 10 minutes I want to go quickly through model development. One point I want to make is that the models are evolving very quickly. This is the evolution of the JMA model from 1996 to now: they have changed resolution three or four times, and they now have a resolution of about 50 kilometres, whereas they were around 150 kilometres in 1996.
The models are getting much, much finer resolution quite quickly. This is the change in the ECMWF sub-seasonal forecast over the last 10 years. Ten years ago, we were running the model at T159, which is about 110 kilometres, for 32 days, with only 40 vertical levels and the top at 10 hPa. We ran re-forecasts for only 12 years, with five members per week. Now, in terms of resolution, we have gone to T639/T319, that is 32/64 kilometres, so from 110 kilometres down to 32-64 kilometres. We have 91 vertical levels instead of 40, with the top at 0.01 hPa (1 Pa) instead of 10 hPa. Coupling, that's what I'm interested in: originally it was a coupled system, then we merged with the EPS, which was not coupled from day zero, and now it is really coupled from day zero. The re-forecasts have been extended to 20 years, with 11 members, and we are using ERA-Interim. So the models are really getting much finer; we are changing resolution next year, and we will go to around 30 kilometres.

And in terms of skill, the model is also improving: not only has the resolution increased, but the model physics is also improving. This shows, for example, the evolution of the amplitude error of the MJO relative to ERA-Interim. If you are below zero, it means that your MJO is systematically weaker than in ERA-Interim. We see that in the early days of the monthly forecast system, the MJO tended to be very, very weak; the model was not able to maintain an MJO for more than a few days. Then there was considerable improvement; at some point the MJO was even a bit too strong, and now the amplitude is generally within 10% of the ERA-Interim MJO. The skill of the MJO prediction has also improved considerably. This shows the forecast day at which the correlation of the MJO index reaches 0.5, 0.6 and 0.8 (the standard formula is sketched at the end of this passage). In 2002, if we take 0.6 as a reference, the model had some skill in predicting the MJO only about two weeks in advance; now we are at around 26-27 days. So there has been a gain of about 10 days of prediction skill over the last 10 years. MJO prediction has really been one of the big success stories of extended-range weather forecasting, I would say, over the last 10 years. So I think it is very important to understand that the models are really different from what they were 10 years ago, and they are improving very quickly.

And here is the impact of the MJO on the teleconnections over the northern extratropics. Ten days after an MJO in phase 3, the active phase of the MJO over the Indian Ocean, we tend to get this very strong positive NAO pattern. In 2002, when the model MJO was very, very weak, the modelled impact of the MJO was extremely weak, whereas in 2011, in the recent versions of the model, we get patterns which are much more consistent with the reanalysis and with much more similar amplitude. So not only is the MJO improving, but also the teleconnections: the models are now much more able to exploit the predictability associated with those sources of predictability. The other skill scores are also improving: the evolution of the two-metre temperature probabilistic skill scores shows that for day 12 to 18 there has been almost a doubling over the last 10 years.
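Coming back to the MJO skill horizon quoted above: it is typically measured with the bivariate correlation between observed and forecast Wheeler-Hendon (RMM) indices. Assuming that convention, the formula can be sketched as:

```python
import numpy as np

def mjo_bivariate_correlation(a1, a2, f1, f2):
    """Bivariate correlation between observed and forecast MJO (RMM) indices.

    a1, a2 : observed RMM1/RMM2 over all start dates, at one lead time
    f1, f2 : the corresponding forecast RMM1/RMM2
    The skill horizon is often quoted as the lead time where this
    correlation drops below a threshold such as 0.5 or 0.6.
    """
    num = np.sum(a1 * f1 + a2 * f2)
    den = np.sqrt(np.sum(a1**2 + a2**2)) * np.sqrt(np.sum(f1**2 + f2**2))
    return num / den
```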
And the two-metre temperature skill for week four is now actually better than the skill for week three 10 years ago. So there has been quite a nice improvement in sub-seasonal prediction. And this represents the next step: we are going to the octahedral grid, and the next system will have a resolution of around 32 kilometres, probably from next April.

OK, just quickly: the impact of resolution on the skill scores is also positive. When we increase the resolution, we tend to get better forecasts of precipitation and Z500, and we also tend to get better prediction of some extreme events like tropical cyclones. That's the current operational system, the forecast I showed yesterday for tropical cyclone Pam, and with a higher resolution we get much higher probabilities over the right area. If we go to 32 kilometres, or to 16 kilometres all along, we get an even better forecast. So there is quite a sensitivity: some extreme events like tropical cyclones and heavy precipitation events can be very sensitive to resolution and usually benefit from much finer grids.

OK, so the models are evolving, not only with better physics, improved parameterizations and finer resolution; we are also trying to make the model more complex. In the current system we have an atmosphere coupled to an ocean wave model, a land surface with vegetation and snow, and ice. In the future we also plan to introduce atmospheric chemistry, as was mentioned in the previous talk. For the ocean, we plan to add the cryosphere (sea ice), and also a parameterization of river outflow, to close the water budget and link with the ocean model. So, a much more complex system, with the waves and the ocean model fully coupled.

One frequent question is: do we really need a coupled ocean-atmosphere system for sub-seasonal prediction? As was mentioned earlier, for seasonal forecasting you need it, because you need to predict El Niño, which is a coupled ocean-atmosphere mode. For medium-range forecasting, as I mentioned, a lot of models are not coupled for the first 15 days. The reason is that there is very little benefit in coupling, because the SST anomalies don't vary much during the first 15 days, and when you couple, you tend to introduce model errors: the fluxes at the ocean surface are often wrong by a few watts per square metre, which means the ocean drifts very quickly, and you can end up messing up your skill scores more with a coupled system than without. That is why it took a long, long time at ECMWF to finally be able to couple the ensemble system from day zero; we are the only ones doing it now, though others will certainly follow. It is a very, very difficult problem. But for sub-seasonal prediction, one of the main sources of predictability is the MJO, and there have been a lot of papers showing that ocean-atmosphere coupling is quite important here. It is not strictly necessary for a good prediction of the MJO, but it generally improves the propagation of the MJO and the skill scores. For example, here, blue is the skill score when you persist the SST anomalies, green is with the observed SSTs, and orange is with the coupled model. The coupled model beats even the observed SSTs, and the reason is that you need this interaction between the upper layers of the ocean and the atmosphere.
And what is particularly important for sub-seasonal forecasts, more than for seasonal forecasts, is the vertical resolution of the top layers of your ocean. You need a very good diurnal cycle of SSTs, and for that you need a resolution of at least about one metre in the top layers of the ocean. Yes? [Question about simpler alternatives.] Well, we did some experiments with a mixed-layer model; the first attempt, actually, was with a mixed-layer model, which did as well as the coupled model for the MJO. But in the East Pacific, already at this time range of one month, you start to see developments that a mixed layer misses. So for the MJO it is the same, but for some regions where advection is quite important, like the East Pacific, you lose something. And now we are going to 45-60 days, where it is even more crucial to get that right. So I think everybody is going towards a full 3D ocean model. But you don't need high resolution at the bottom of the ocean, which doesn't matter at this time range; that's the main difference with the climate models, maybe.

We are also experimenting with an active sea ice model, the LIM2 sea ice model; currently we just persist the sea ice. And we find quite a bit better skill, in summer and winter, in predicting the evolution of the sea ice compared to the current system. And, I am almost at the last slide, what is quite interesting is that we also get some improvement in the skill scores over Europe when we have active sea ice compared to the control run; there seems to be some positive impact of the active sea ice on the prediction over Europe. We are also testing the ocean resolution: the ocean is one degree now, and next year we should go to a quarter of a degree with 75 vertical levels. And when we run it, we still don't understand exactly why, but we get quite a dramatic improvement. This is over the northern hemisphere for weeks 3 and 4 in winter time, and it is probably linked to a much better resolution of the Gulf Stream, which can have some impact, as has been shown in the literature. The sea ice is also quite improved with a quarter of a degree.

OK, so the conclusions of this talk. Once again, I cannot stress enough that we are still in the infancy of sub-seasonal forecasting. There are still a lot of open questions, and there is no consensus on the optimal forecasting system. The S2S database will be quite helpful for comparing the various configurations, and may help to answer some of those questions; that's one of the goals of the S2S project. S2S forecasts need calibration; flow-dependent calibration, however, would need many more re-forecasts than most of the models actually produce, and that's something we have to keep in mind. The forecasts, and this is another important message, have improved over the last 10 years, but still, week 4 is only marginally better than climatology. And the models are getting more complex, with higher resolution and more components of the Earth system, and we hope that this will keep the skill curve improving. So, that's all. Thank you.