Okay, so I think we can start the last session before tonight's party, and I would like to give you a quick demonstration of the S2S web database. Over the next two mornings we will have lab sessions. The main goal of this first week, my hope, is that by the end of the week you will be fluent in retrieving data from the S2S database, both through the data portal and through the Web API. The second week will then be more about science, with exercises on science. Tomorrow's session will be about simply retrieving data from the data portal, and now I will give you a quick demonstration of the database and the portal. First of all, here are the important links you need to be aware of. For me there is one you need to remember, and personally it is the only one I can remember: the top one, www.s2sprediction.net. This is the website for the S2S project, and it contains easy-to-find links to the three other web pages. The second one, http://apps.ecmwf.int/datasets/data/s2s, is the S2S portal at ECMWF; if you click on it you get directly into the data portal. Maybe we can write it on the board. The next one is http://s2s.cma.cn/index, which is an easy one to remember; that is the S2S portal at CMA. It opened just a few days ago, actually, and I am not too familiar with it, so we won't go through that one; we will use the ECMWF one. Another important link is the last one, software.ecmwf.int, which contains the wiki pages with information on S2S. There is some very useful information there, and it also contains a link to the data portal. We actually plan to change that address soon, to something more like the CMA one, but that is not the case yet.
So, if you go to s2sprediction.net, you get this web page, which, as I mentioned this morning, is the S2S project web page. It contains the latest information, for example the news that the CMA S2S data portal is now open, and you can link to it from here. As I said, you have the coming events and so on, the latest S2S news, and this menu here, Database. If you click on it, you have four links. The first one is the S2S database, with more information. Then you have the data portal at ECMWF, which brings you directly to the data portal; the data portal at CMA; and the model descriptions, which are in fact a description of the S2S database. For me, this is the easiest way to navigate through all of this. If you go to the wiki pages from the S2S web page, you get this page with the latest news and several links: parameters, description, progress status and so on. If you click, for example, on the first item in the left menu, you have the news. If you click on the description, you get the definition of the project, a long text describing the S2S project, with all the details of more or less what I mentioned this morning. If you click on models, that is an important page, because it gives you the latest status and the latest configuration of each model. This is, for example, the status from 1 July 2015: you have all eleven models, their time range, resolution, ensemble size and so on. This is something that will be updated each time a centre changes its model configuration. For example, for ECMWF, in the early stage of the S2S database we were running the model only up to 32 days, and the re-forecasts were produced only once a week, with five ensemble members. Since May 2015 we have extended to 46 days, and so on. So this page represents the latest status.
And if you click on each model, you get an in-depth description of the model physics, the initial conditions, all the information you need to know about that specific model. Do you have information on this page about today's version? Yes, exactly; I can actually demonstrate that here, live. If you go to the model description: S2S, description, models, then ECMWF, and click on it, it will tell you the latest version of the model. It tells you the cycle, which is maybe a bit obscure for most people here, but it tells you it has been operational since 14 May 2015 inclusive. If you click here, it gives you the latest model description: ensemble version, configuration, initial conditions, perturbations, model uncertainty; you have a lot of detail about the model physics. You can also get the old cycle, the one operational before 14 May, and see what it was and what the differences were. So through these pages you can see the different configurations of the model and the changes from one configuration to another. OK, going back to the PowerPoint: those pages are a fairly important piece of information for each model. I will go into more detail later, but that is, I think, the most important thing to know for now. So, getting the data. Step one is to register, and we can do that here, live. I go back to the S2S project page and click on the data portal at ECMWF. That is the web page you usually get when you go to the data portal for the first time, and the first thing it asks is: please log in before downloading data from the data server. It is quite neat. If you press log in, either you have already registered, in which case you enter your user ID and password and log in, or, if you are not registered, as Adrian said, you are welcome to...
not just welcome, actually, more than that: you are requested to register tonight. You click on register now; it is very simple. You just put your first name, your last name, your email address and, importantly, you accept the terms and conditions. I know it is very tempting not to read them, it is a bit boring, but I think it is important to have at least a first look at them. There is some important information in there, particularly about the redistribution of data: if you take data from ECMWF, what are you allowed to do with it? And that is actually relatively strict: we do not allow people to redistribute the data. One reason is that sometimes we notice, after five months for instance, that one variable of one model is completely wrong. We then have to fix it in the database, and if people have redistributed it before that, you get different versions of the data circulating and people get lost. So please have a look at the terms and conditions, then type the verification text here, and you simply register. It is a very simple process. Maybe Adrian will log in, because I don't know my password by heart. Sorry; no, it didn't like it. Maybe we waited too long. OK, so once you have registered and signed in, this is what the web page should look like: no more of that big warning. You will see that on this page you have a menu on the left which asks whether you want a real-time forecast or a re-forecast. We start with real-time. Then you can choose which model you want. As I said, there are seven models available now; we hope to have a few more by the end of the year, and all of them quite soon, early next year. ECMWF is the default, because that is what everybody will use. And what is important, too, is that you have different types of variables.
You have some variables that are instantaneous or accumulated. For example, total precipitation is accumulated; it is available at steps 0, 6, 24, 48 and so on. And some variables, as I mentioned this morning, are daily averaged. That is the case for two-metre temperature, for example, which is an average of six-hourly outputs. If you click on it, you will see immediately that the steps look different: 0-24, 24-48, 48-72, 72-96. They are always 24-hourly. OK, then you can choose the control forecast or the perturbed forecast. The control forecast, as Adrian explained, is the single unperturbed realization. For the perturbed forecast you have another menu where you select the member number. What is nice with this interface is that this number list is adapted to the model you select: if you select ECMWF, you can go up to 50 members, because it is a 50-member ensemble, 51 members if we add the control. But if you choose CMA, for example, you will see just three numbers, because that is a four-member ensemble. The next part is exactly like ERA-Interim: you select the dates you want. There are two ways to select dates: either you specify from date A to date B, or you can select a month, saying for example that you want all the real-time forecasts for January. What is nice is that, for ECMWF for example, we don't produce a real-time forecast every day; we produce one twice a week. But you don't need to know that: the interface automatically knows which dates to take and download. So, going to ECMWF again, because that is the one I know best: here you can select ensemble member number one, step 24-48, and the parameter, two-metre temperature for example, or two-metre dew-point temperature.
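To make the two step conventions just described concrete, here is a small sketch. The helper names are my own invention, not part of any S2S or ECMWF tooling, and the step lists simply mirror the examples above: single-hour steps for instantaneous and accumulated fields, 24-hour ranges for daily-averaged fields.

```python
# Hypothetical helpers (names are illustrative, not from the S2S tools):
# instantaneous/accumulated fields use single-hour steps, while
# daily-averaged fields use 24-hour ranges.

def instantaneous_steps(hours, every=6):
    """Single-valued steps, e.g. '0/6/12/18/24'."""
    return "/".join(str(h) for h in range(0, hours + 1, every))

def daily_average_steps(days):
    """24-hourly ranges, e.g. '0-24/24-48/48-72'."""
    return "/".join(f"{24 * d}-{24 * (d + 1)}" for d in range(days))

print(instantaneous_steps(24))  # 0/6/12/18/24
print(daily_average_steps(4))   # 0-24/24-48/48-72/72-96
```

Written out this way, it is also easy to see why mixing a six-hourly variable and a 24-hourly variable in one request gives a file with different frequencies per variable.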
Like in ERA-Interim, if I take, for example, real-time instantaneous fields, you can see that the parameter steps are six-hourly. But not all parameters are six-hourly. If you select the 10-metre wind, which is 24-hourly, you see immediately that some time steps are greyed out and cannot be selected; you can select 0, 24 and so on. That is a nice point of this interface: it tells you what is available, what you can really download, even if you don't know a priori what the model configuration is. It is a visual way to know what is inside each model. So, once you have done that, you select January, step 24 and the 10-metre wind. Then you have two options. You don't have three, unfortunately, like in ERA-Interim: there, the third option is NetCDF retrieval. There is a way to do offline conversion to NetCDF, but that menu will come automatically later on. Yes? When you are first investigating this dataset, it is a good idea to look at one variable at a time, to get used to the availability. If you click on two or three variables that have different step availabilities, it can be a little confusing when you get the file, because maybe one of those variables is only available every 24 hours, but the interface shows all the time steps because another selected variable is available more often. So it is best to start simple, with just one variable at a time. It won't crash or anything; it just means that when you get the file, you will have a different frequency for each variable in it, which can be a little confusing. Another important point I need to mention is that the list of variables provided is not the same for all models. The list of variables I mentioned this morning is the expected list for each model.
Some centres are providing the whole list, as is the case for ECMWF or CMA, but some other centres are providing only a subset of those variables. Not everyone, for example, provides soil moisture over the top 20 centimetres. To find out, you click here and you see a list of variables, and that is really the list of variables that are available. If I click on another model, let's take Meteo-France for instance, you will see that it is a shorter list. Actually, it is not obvious here; this one has quite a lot. Maybe JMA is a better example; you see it mostly in the daily averages, actually. If I go to the real-time daily averages: this is what you get with ECMWF, quite a long list of daily-mean variables. But if you go to JMA, you will see this list is much smaller. OK, so that is another way to discover what you can retrieve, what is available, and that is why this database is what is called a discovery tool. Something that is also interesting when you select parameters: JMA is a very specific model, because all the other models' forecasts start at 0Z, so day one is step 0-24, day two is 24-48 and so on. But JMA starts at 12Z, so day one is in fact, as you can see, step 12-36, then 36-60 and so on. That is a particularity you can see here. Yes? Well, you can select your date here. If you know your date is either a Monday or a Thursday, you can specify it here. For example, 1 January 2015 was a Thursday. If you just want that Thursday, you specify the range 2015-01-01 to 2015-01-01, and then you get only one single date, and you don't put anything else here. So: 2015-01-01 to 2015-01-01, and I want step 0-24 for two-metre temperature. OK. So here I have two options. One is to view the MARS request.
That, as Adrian mentioned, is something very useful: it gives you an indication of what the retrieval should be, and it gives you an example of a Web API script; I will come back to that later. Or you skip the MARS request and just retrieve. OK, the problem here is that I haven't selected everything, so I select things and go to retrieve GRIB. Then you have two nice options here. By default you retrieve the data globally, but you may want the data over only a small region. Then you can click on area, say you want to change the area, and specify the north, west, south and east bounds of your domain, and you retrieve a smaller domain. Yes? That is a real-time forecast. This is a real-time forecast, not a re-forecast; I will come to the re-forecast later. Why do we use the terminology re-forecast? That is the more or less official term, yes. It is called either a re-forecast or a hindcast, but the official term is re-forecast. But once you are on this website, do you have to go to re-forecast to get those? No, these are all the forecasts that were produced for real-time use. It starts from here: you have all of them since 1 January 2015, all of them until now. Yes? (It's like a double act, isn't it? Like a comedy performance.) Say the date is in November: your forecast ensemble runs into the future, for 46 days. OK? That is a real-time forecast. You can't actually get today's, because there is a lag of three weeks before the release of the data. Then, for the same day in November, there is also November 2014, 2013, 2012 and so on, going back a number of years; how far depends on the model system. Those are the forecasts we call the re-forecasts, or hindcasts.
So what Frederic is showing at the moment is just accessing this one forecast. For ECMWF it is done every Monday and Thursday for the S2S system; NCEP runs its system every single day. So if you wait until after Monday, there will already be a new forecast to download, and on Thursday there will be another one. OK? And then there is the dimension going back into the past, but we will come to that; there is no problem there. The other idea is that this forecast, from the first start date, goes forward 46 days into the future, and the forecast made three days later also goes 46 days into the future. So if you look at the forecasts valid on a given Friday: from the Thursday forecast you have a lead of one day; from the Monday forecast, a lead of four days; from the Thursday a week before, a lead of eight days. So essentially the forecasts are staggered like this, overlapping in time. At the moment, when Frederic selects this date range, he is looking at just this year's operational forecasts going forward in time, a stream of start dates. It is just the forecast; it is not looking at the hindcasts associated with it. OK? Again, it is probably my fault; I probably didn't make it quite clear enough when I showed that diagram earlier. You have almost two time dimensions, as I mentioned, which is in fact the reason the NetCDF option is not there yet: the NetCDF conversion got very confused by having two time dimensions, because you have the time of the forecast, from the start date to the end date,
and you also have the time associated with the hindcast dates, the hindcasts associated with this present-day start date. You remember I had that little diagram: a little bunch of forecasts like this, and then the same again for the next start date. So what we are looking at at the moment is the real-time ensemble for this year, but we have not yet shown how to get the hindcasts associated with that ensemble in order to be able to calibrate it. I will come to that later, yes. Another question? Yes, about January: if you select the whole month, is the result somehow weighted? No, no: it gives you all the individual forecasts produced in January. In this case you get the forecasts from the 1st of January, the 5th of January, the 8th and so on. But it is not a monthly mean? No, that wouldn't make sense, no. Well, here you can select what you want. Here I am in the control forecast; if I go to the perturbed forecast, you will see, for each model, a new box where you set the member number. For ECMWF, with its 50 perturbed members, you can select all of them if you want, or you may want just one ensemble member and select only one; for some obscure reason you may want member number eight. Yes. And again, note that there is no member zero. The ensemble is 51 members, and the first member, the control, is not member zero; it is referred to differently, as the control. That is why in the menu on the left you actually have two entries: control, which is like member zero, and perturbed, members one to 50. We have a 51-member ensemble, but one of them, shall we say, stands out: it doesn't have the perturbations applied to it, while the other 50 do.
Now, do we call the control the deterministic forecast? I'm not really keen on that, because that terminology normally refers to the short-to-medium-range, very high-resolution system; the control has the same resolution as the ensemble. Again, that is why I want to give the overview of the system. Remember, the ECMWF system has a very high-resolution run that only goes up to ten days. It is just a 10-day single forecast, but it puts all its effort into a really fine grid, with smaller grid boxes. That is what is usually known as the deterministic system; on the EPS-gram of the Trieste forecast, that was one of the solid lines. Now, forget about that for the moment. The EPS system is the intermediate-resolution one that goes up to 46 days, and it has 51 forecasts. But, just to confuse you all, one of those forecasts has the same resolution as all the others but doesn't have any stochastic physics: there are no perturbations applied to it. So, in a way, it is treated as the best central forecast, with the others all perturbations around it. Like I said, I myself am not so keen on that way of doing things; we can talk about it offline, but scientifically it does restrict the way you can do those kinds of statistics, because you have to be very careful about how you set up your perturbations. You really want to think of the 51 members as just one big ensemble. It is a little bit confusing, because one of those 51 members is identified separately, because it isn't perturbed, and the other 50 are perturbed; but they all have the same resolution and the same setup, just that one of them hasn't got any perturbations. That is what is slightly confusing. It is not the deterministic, very high-resolution 10-day forecast.
That one is called deterministic because there are no ensemble perturbations; it is just one single run. So must one member be that central one? Do they have a control there? Well, yes and no, in a way; it is a bit of a scientific issue, a moot point. It has traditionally been treated that way: we are not perturbing it, so that is perhaps our central model, our best model, the one we use for the deterministic system. In a way, your perturbations are supposed to be sampling your uncertainty. Let's imagine a very simple example: in your initial condition you might have an uncertainty in temperature of one degree. So you say, this is my analysis, but I have an uncertainty around it. So really it depends what your distribution is. If it is Gaussian, then you have a high probability of being in the middle of the distribution; if your uncertainty were uniformly distributed, then all the ensemble members would be equally likely. You are weighting by the uncertainty, you see what I mean? So you could say the control is supposed to sit at the centre of that distribution, with the others perturbed around it, outlining the ensemble. But what you will often find, if you look closely at the EPS-grams, and it is something that often causes alarm, is that the control can sit almost outside the envelope of the perturbed members, a bit of an outlier. That is because of the stochastic physics and the way it is set up. So it is just another run of the exact same model version, you could say? Yeah, in a way. So it is nuances at the end of the day. If you want to take just one member, you might want to take the control. On the other hand, you might want to look at the ensemble mean, because then you are averaging across your uncertainty.
You want to use either the ensemble mean or the whole envelope. Especially when you get to longer ranges: it is not like rainfall for the next day, where you are quite happy that the deterministic evolution of the dynamics is reliable. At longer timescales you really want to sample that envelope of uncertainty, so taking one member on its own is often not the best thing to do, not the thing you want to do. Remember that example from the Nature paper, all those little stamp maps of the rainfall? If you took any one of those, it is probably going to be quite wrong; at weekly timescales it is probably a much better way to show the information as a probability map. That's it. And in the same way, when Frederic shows results from the S2S system, he wouldn't show the skill at, say, day 27; he would look at weekly anomalies of temperature, you see. You need to look at the statistics across the ensemble. Sorry, a question, yes? For the case of comparing models: should the ensemble sizes be similar between the two models, or can we compare the ECMWF ensemble, with its 51 members, against another model with fewer members? Well, there are different ways to combine multi-models, actually, to produce a multi-model product. Just for comparison? For comparison, I think if you want a clean comparison it is better to have the same ensemble size, because a lot of scores depend very strongly on the ensemble size, particularly the probabilistic skill scores. It is a very good question. One of the issues here is that the ensemble setups are different: you have these lagged ensembles and so on, and your skill, well, it depends on the question you want to answer. Say you wanted a direct comparison between ECMWF and NCEP, the two systems I demonstrated and the two I am most familiar with. NCEP has four members every day.
So in one week you have 28. So probably I would tend towards taking 14 members of ECMWF per start, so that over the week, with two starts, you also have 28 and you are averaging over a week. One system has, as I was saying, a lead-time advantage; the other has bigger starts; but your information over the week then comes from the ensemble size. If, instead, you wanted to compare the actual quality of the model physics and the initialization methodology, then I would take 14 members of the ECMWF Monday and Thursday starts and compare those directly. Again, it is not quite clean, because Monday and Thursday are not three and a half days apart; they are three and four days apart, so it is still not exactly the same. You have to think quite carefully, and this is why Frederic was saying that the S2S database is a little bit more of an advanced-user database: there is much less standardization in the way the systems are set up than in the short-to-medium-range EPS. Can we at least be sure that each model has ten members? No; each model has its own configuration. Four and fifty-one are very different, and some run every day. So there are some models with 4 members per day, 28 per week, while ECMWF has 102 per week. The factor of roughly three and a half is there, but it is not a factor of 25 and a half. Now, when you are averaging over the ensemble, shouldn't the signal come out smaller? Well, that depends. If you average over your ensemble, it is not necessarily the case that your ensemble-mean signal will be smaller. Say you have a modelling system in which the model physics is doing a good job, and you think you have an anomaly of 2 degrees: you have ensemble spread, but that spread might go from 1 to 3 degrees. So it depends.
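The weekly bookkeeping in this exchange is worth writing out. A quick sketch of the arithmetic, using the member counts quoted above (treat them as the speaker's round numbers for these two systems, not authoritative system specifications):

```python
# Weekly lagged-ensemble sizes for the two systems discussed above.
ncep_members, ncep_starts = 4, 7       # 4 members, one start per day
ecmwf_members, ecmwf_starts = 51, 2    # 51 members, Monday and Thursday

ncep_week = ncep_members * ncep_starts      # 28 members over a week
ecmwf_week = ecmwf_members * ecmwf_starts   # 102 members over a week

# Per week the ratio is modest; comparing 102 members against a single
# NCEP day (4 members) would wrongly suggest a factor of 25.5.
print(ncep_week, ecmwf_week)             # 28 102
print(round(ecmwf_week / ncep_week, 1))  # 3.6
print(ecmwf_week / ncep_members)         # 25.5
```

This is why the "factor of 25 and a half" only appears if you compare a full ECMWF week with one single day of the daily system.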
A case where this shows up: there was a pretty nice paper this year looking at the NAO. GloSea5, for the first time, seems to have predictability for the NAO, but while the correlation is there, the signal is much weaker in the ensemble mean, so we need to look and try to work out why that is. Another example where that happens: if you look at the decadal variability of precipitation in the Sahel, a lot of models can get the decadal variations of the rainfall in West Africa, in the monsoon, if you drive them with observed SSTs, but again the signal is much weaker, which means there is something wrong: maybe land-surface interactions are not amplifying the signal in the way you think they should, maybe the aerosol signal is off. But it is not automatically the case that just because you average your ensemble you get a weaker signal. You really want a system in which the ensemble mean has the right anomaly: some members will be stronger and some will be weaker, but you get the right signal in the average. So it is not automatic, although it does happen; you can also get systems which are overconfident, where the ensemble mean is actually larger than the observed signal.
And taking the ensemble mean is not the only way, and not always the best way. You can also look at forecasts probabilistically, which is actually better for an ensemble system at the longer ranges than simply averaging, because if you average and take the ensemble mean, you actually lose information in the process: the averaging removes information from your forecast system. The probabilistic approaches are not used as often as they should be, actually, even within the member states; at least in the beginning, people were not taking advantage of them. So we will be showing some of the ways of evaluating forecasts probabilistically as well. I was trying to give a really simple case earlier, for example with the umbrella; I know I was being a little bit flippant, but I think it shows an important point: you wouldn't just take the mean; you want to look at the probabilistic information contained in those forecasts. Let's continue; if we have more questions we can take them at the end, so that we finish the demonstration. So, as I mentioned before, you can change the area if you want, and you can also change the grid on which the data is retrieved. There is not much point going to a grid finer than 1.5 by 1.5 degrees, because all the data has been archived on this grid, but you may want to retrieve the data on a 2.5 by 2.5 degree grid if you want larger cells; for example, MJO indices are typically computed on a 2.5 by 2.5 degree grid, and that makes the file much smaller and much easier to retrieve. Each model has its own native resolution: some are 50 kilometres, which would be more like a 0.5 by 0.5 degree grid; some others, like Australia's, can be 150 kilometres. But in the database all of them are archived on a common grid of around 1.5 by 1.5 degrees. That is what you get from the S2S database; that is the default.
The menu does offer finer grids, because we use the same menu as for the high-resolution model, which is a 16-kilometre model, but it is not very practical for S2S; I don't see why you would do that, since there is no extra resolution there, so you should stay at 1.5 degrees or coarser. OK, one last thing: when you have finished, you simply select retrieve GRIB and it will deliver the file to your machine. I will go back to the presentation, because there are a few important points to mention. There are two ways to get to the S2S database. What I showed just now was the web interface, which we demonstrated, and as I said it is a discovery tool, very strongly recommended for first-time users. If you have never used S2S, going there gives you a very good idea of what is available and of the structure of each model; as I said, each model is different, and it is a way for you to explore that structure. It is a very visual, easy way to retrieve data, and it is good for small retrievals. Now, if you are a more advanced user, we strongly recommend the Web API, which is basically a way to run Python scripts on your own computer. You need to register, and you need to install the Web API client; if you go to this website, it tells you what to do to install the software on your own computer. Then you run the Python script on your own machine, which fetches the data at ECMWF and brings it back to you automatically. It is very convenient: you can run it on your laptop and you get the data automatically. The drawback of the Web API is that you basically need to know the structure of the model in advance, and that is where the web interface is actually useful, because it can give you examples of Web API scripts. The Web API is very good for intensive S2S data retrievals, so it is recommended for advanced users, and, importantly, the retrievals can be optimized, which is much more difficult from the web interface. I will come to
that a bit later so this is the Web interface I showed and if you go as I said to the script you can get here the Python script it's not very easy to see which gives you an example of Web API command the data you have wanted to retrieve from the Web interface so what you can do is to copy this script and to run it on your own laptop and it will retrieve the data automatically so it's a way to understand the syntax I don't know much about Python but it's very easy to retrieve data from that because you know you see here the list of dates so if you want more dates you can add more you can include that inside the script you can specify the step you want all the start dates you can embed all that into your own scripts and it's relatively easy to understand it's in a master language follow the master language you don't need to know Python I don't know Python but I use it all the time Web API so I mentioned that you can change area on grid so that's another example of Web API script so you have the dates you don't care about class or data set you have the origin ECMF you mean if you retrieve ECMDUF model left up your surface variable the parameter is 165 it's a bit obscure but you can say explicitly that's 2T to meter temperature for 2T on the run on steps you have the list of steps you want to retrieve and that's it and the target is change me we create a file called change me well you can put the name you want yes yes so it's in the way for the Web API tomorrow's lab will show that yes we go more to detail yes we will do that too tomorrow yes so the second one which is important is the re-forecast so this one is quite complex much more complex than the real-time forecast because what is a very important information here is that someone probably want to take home is that here the re-forecast can see archive with two dates so you have three time dimensions the one is the steps so you have the handcast date which is the actual start date of the re-forecast for example 
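For reference, the real-time script just described (origin "ecmf", surface level type, parameter 2t, a list of steps, target "change me") might look roughly like the sketch below. This is a hedged reconstruction, not the portal's exact output: the `ecmwf-api-client` package and the particular dates and steps are assumptions to check against the S2S wiki examples.

```python
# Sketch of an S2S real-time retrieval via the ECMWF Web API.
# Assumes the ecmwf-api-client package is installed and an API key
# is configured in ~/.ecmwfapirc (see the Web API instructions).

def build_s2s_request(dates, steps, target):
    """Build a MARS-style request dict for ECMWF 2 m temperature."""
    return {
        "class": "s2",            # S2S archive class
        "dataset": "s2s",
        "origin": "ecmf",         # one originating centre per request
        "expver": "prod",
        "stream": "enfo",         # real-time ensemble forecast stream
        "type": "cf",             # control forecast
        "levtype": "sfc",         # surface fields
        "param": "167",           # 2 m temperature ("2t")
        "date": "/".join(dates),  # add more dates by extending the list
        "step": "/".join(steps),
        "grid": "1.5/1.5",        # the archive grid; 2.5/2.5 is smaller
        "target": target,         # output file ("change me" by default)
    }

req = build_s2s_request(["2015-01-01"], ["0", "24", "48"], "2t.grib")
# Uncomment to actually fetch (requires registration and network access):
# from ecmwfapi import ECMWFDataServer
# ECMWFDataServer().retrieve(req)
```

Adding more start dates or steps is just a matter of extending the lists, exactly as described above.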
For example, the hindcast date might be 1999-01-01, the 1st of January 1999, and then you have what we call the model version date attached to the re-forecast, which might be, for example, 2014-01-01 for the Bureau of Meteorology. If you go through the web interface you don't need to know this by heart; it will be filled in for you.

So why do we need two dates — why the model version date? The reason is that we want to avoid re-forecasts being overwritten. Take a simple case: NCEP, which has a fixed re-forecast. What is in S2S now is CFS version 2, which has a re-forecast from 1999 to 2010 spanning all the days. But what happens when we go to the next NCEP version, say CFS version 3? It will have another fixed re-forecast spanning maybe more, but at least many of the same dates, and without the model version date it would overwrite the first one, and people would no longer be able to calibrate against the old NCEP re-forecasts. That's why we need this additional dimension, the model version date.

It's a bit complicated. For a fixed re-forecast there is one model version date per model version, so it's easy: for CFS version 2, all the re-forecasts right now are coded with the model version date 2011-03-01, the first date when the model became operational. That's the rule for fixed re-forecasts. For real-time, on-the-fly re-forecasts it's a bit more complicated: there, the model version date has the same day and same month as the re-forecasts used to calibrate the forecast — the same day and month as the re-forecast date, but with the year of the model version.

So let's look at an example — if I go back. (You thought you were here for a workshop on meteorology, and here we are spending all this time on dates.) If you select a re-forecast, you will see the menu is a little different from the real-time forecast. I will go for an easy one first: NCEP, which is a fixed re-forecast. You see this model version date, 2011-03-01; you have no choice other than this one. You can ignore the box at the top here; for NCEP it doesn't add much, because you have all the NCEP start dates, once a day, in the menu on the right. This date box would matter for CFS if you want to calibrate a real-time forecast: you put in the date of your real-time forecast and it tells you automatically which model version date to use. Because — I don't know — on the 1st of January 2019 you may have the choice between CFS version 2 and CFS version 3, and if you enter the date 2019-01-01 it will know that you need CFS version 3 and will automatically give you the model version date of CFS version 3 and the hindcast dates available for it. The rest of the menu is the same as for the real-time forecast: you select your steps and parameters, select your re-forecast dates, and then you retrieve.

(I am coming to that — it's the same, yes, exactly.) So here is ECMWF, for example, which is the only on-the-fly model in this example. You can see that for 2015-12-07, the 7th of December 2015, these are all the re-forecasts available for this model version date, going back over the last 20 years, from 2014-12-07 to 1995-12-07. Here you have a lot of model version dates — all of them since the beginning of January. If you want to calibrate the 1st of January 2015, you click on this one and — it will take a bit of time — it will give you the list of re-forecast dates you have for the 1st of January; here we get the 20 years. The reason I find this menu a bit confusing is that, at present, you need to identify the model version anyway, so you can ignore the top box for the time being. It is there because the on-the-fly model keeps changing: if you enter a real-time date, it will tell you which model version date has the same model version cycle as your real-time date.
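The two dates just described map directly onto a Web API request. Below is a hedged sketch for a fixed re-forecast such as NCEP: the "date" key carries the model version date and "hdate" the hindcast start date(s). The `kwbc` origin code, the parameter, and the steps are assumptions to verify against the S2S wiki.

```python
# Sketch of a re-forecast (hindcast) request: note the two dates --
# "date" is the model version date, "hdate" the hindcast start date(s).
# Values are illustrative; check the S2S wiki pages for each model.

def build_reforecast_request(model_version_date, hindcast_dates, target):
    return {
        "class": "s2",
        "dataset": "s2s",
        "origin": "kwbc",            # assumed MARS code for NCEP
        "expver": "prod",
        "stream": "enfh",            # re-forecast stream (not enfo)
        "type": "cf",
        "levtype": "sfc",
        "param": "167",              # 2 m temperature
        "date": model_version_date,  # e.g. 2011-03-01 for CFS version 2
        "hdate": "/".join(hindcast_dates),
        "step": "0/24/48",
        "target": target,
    }

req = build_reforecast_request(
    "2011-03-01", ["1999-01-01", "2000-01-01"], "hindcast.grib"
)
```

For an on-the-fly model like ECMWF, the same structure applies, but the model version date changes with each real-time cycle, as explained above.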
If you are, for example, on the 1st of June 2015, you may not know which re-forecasts use exactly the same model version as the 1st of June 2015; the list of model version dates shown should be the ones with the same model version cycle as the real-time date. That's how it works — it is, admittedly, complicated. My best advice is: at the beginning, ignore this real-time date box and just look at the model version dates and the list of hindcast dates here; that is simpler. Anyway, tomorrow you will have a practical and you will see how it works.

So, I have been through that, and here is an example of what the command for a re-forecast looks like, which is different from a real-time forecast. You can see that the stream is no longer "enfo" but "enfh", and you have two dates: the "date", which is in fact the model version date — 2011-03-01 here for NCEP — and the "hdate", the re-forecast start, which is the 8th of November 2010.

Now I want to spend some time on an important point, which is efficient data retrieval. It's important to understand how the data is organized at ECMWF. First of all, the data is not archived on disk. This is not the usual FTP site where the data sits in big files with known names and you extract what you need from them; it is an organized database, which means you get exactly what you ask for — if you want one parameter only, you will get a tiny file containing just that data. But the data is archived on tapes, which means that each time you make a request, it is queued, and then a robot works out where your data is and how many tapes are needed, goes and fetches the tapes, extracts the data you need, and sends it to you. (No, it's not a person — although, as I discussed with Adrian, in the old days it was a real person handling the tapes. When I was at GFDL 20 years ago it was actually a person doing it; one night I got a phone call from this guy asking me why on earth I needed the same tape over and over — there was a person mounting that same tape again and again.)

Anyway, what is important to know is that when you send requests, at most three of them will be treated at a time; the rest are queued. So when you submit a request, you may be lucky and get your data within seconds, or it can take a day, and this time is a function of the number of tapes you ask for. If your request is very inefficient and you are asking for 25 tapes, you will clearly be at the bottom of the queue, and people asking for just one tape will have priority over you. That's why you need to be really efficient in the way you write your Web API requests when you make very big retrievals: it matters fundamentally, and an inefficient request can annoy a lot of people — as Adrian said, if you keep asking for the same tape over and over you may get complaints, even if there is no longer physically someone mounting it.

You won't notice this so much this week, because the system does not just read the tape and send the information to you: it also copies the data onto a disk cache. So if we are all asking for the same data, once one of us has fetched it, the rest will get it from the disk cache, and it will be much faster. So don't worry about it this week; but when you get home and start to work more extensively, you will really notice your performance getting killed if you do not work on efficiency, because then you will be reading from tape and no longer from the cache.

There is more important information too. As I said, your request is queued; it's not instantaneous. We have had this problem several times: users wait, get nothing, think the request is dead, resubmit it 20 times, clog the system, and everything goes down. So be patient — and
if you kill the request on your side, that does not mean the request is dead on the other side — it is still there. (Yes — on Wednesday there is a one-hour session on this.)

So, it is important to understand that all the data is archived on tape. You have to limit the number of tapes you request, and try to retrieve as much as possible from any given tape. That's why there is this tree here showing how the data is organized onto tapes: the upper level is the class, then streams and versions, type, year, month, level type, date and time, and so on. Let me go into more detail with the ECMWF example. Here is the way the data is organized, with the ECMWF, BoM, NCEP and JMA data: the upper level is the centre — BoM, ECMWF, NCEP, JMA — which means do not ask for data from two different centres in the same request, because they will absolutely be on completely different tapes. Then you choose real-time or re-forecast; then the type of data — control forecast or perturbed forecast — is the next level; then the level type, single level or pressure level, which are likely to be on different tapes; then model version dates, times and steps, members, levels and parameters. So for one given date there is a very good chance that all the parameters, all the pressure levels and all the members are on the same tape — those tapes hold a few terabytes — so you have a good chance of getting everything from one tape.

You will notice, by the way, that the website gives you a clue about this ordering, so you don't strictly need this slide: the website structure was designed to reflect the hierarchy. The left-hand menu has the centre, and it is no accident that you cannot pick more than one centre — that sits at the top of the tree, and they do not want inefficient retrievals. The lower levels of the tree on this slide are the things on the right-hand side of the page, where you can click several at once, so the website layout gives you an idea of the hierarchy: you can select multiple parameters — they are pretty low on the list and likely to be on the same tape — while on the left-hand side, if you remember, re-forecast versus real-time was one or the other; you could not do both, and that is near the top of the tree.

So here is an example. If you are really efficient, you can retrieve up to one terabyte of data per day: in the US, at COLA, they are able to retrieve a terabyte of S2S data a day. It took a lot of work on optimization, and it is many times faster than without it. Say you want two-metre temperature and SST — a few variables — for NCEP and CMA, for all the re-forecast dates and all the perturbed forecasts. A very bad retrieval would be to loop over the ensemble members and the daily-mean variables one at a time — member 1 for two-metre temperature, member 2 for two-metre temperature, and so on — because you would keep asking for the same tape over and over and over. A better script loops over the models and over the start dates, and retrieves all the perturbed forecasts and all the daily-mean variables together. There are examples on our wiki pages: under the BoM model you will find a web page on efficient re-forecast retrieval, and that is the script this person from COLA actually used. For one single hindcast date he asks for all the pressure levels, all the parameters, all the steps, all the perturbed forecasts — all of which should sit on one tape — and he loops over the dates, retrieving everything he can for each single hindcast date. That is how you go much faster; whereas looping over the ensemble members and all the hindcast dates would mean submitting many, many requests for the same tapes, asking for the same steps many times over.

OK — for more information on S2S data back home: I wanted to point out a few interesting web pages in the documentation. First, if you click on "Parameters", you get the full list of parameters.
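(Returning for a moment to the efficiency example above: the COLA-style pattern — loop over models and hindcast dates, and request all members, parameters and steps for each date at once — might look like the sketch below. The origin codes, parameter list, member count and step range are illustrative assumptions, not the actual COLA script.)

```python
# Sketch of the efficient retrieval pattern described above: loop over
# centres and hindcast dates, and ask for all members, parameters and
# steps in ONE request per date, so each request touches few tapes.
# All specific values here are illustrative assumptions.

MODELS = {"kwbc": "2011-03-01"}   # origin -> model version date (assumed)
HDATES = ["1999-01-01", "2000-01-01"]

def requests_for_reforecasts(models, hdates):
    reqs = []
    for origin, mv_date in models.items():  # never mix centres in one request
        for hdate in hdates:                # one request per hindcast date...
            reqs.append({
                "class": "s2", "dataset": "s2s", "origin": origin,
                "expver": "prod",
                "stream": "enfh", "type": "pf",   # all perturbed forecasts
                "levtype": "sfc",
                "param": "167/sst",               # all wanted params together
                "number": "1/2/3",                # ...and all members at once
                "date": mv_date, "hdate": hdate,
                "step": "0/to/1080/by/24",        # MARS step-range syntax
                "target": f"{origin}_{hdate}.grib",
            })
    return reqs
```

The anti-pattern would be the inverse nesting — one request per member per variable — which hits the same tape over and over, exactly as described above.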
For each parameter you get the abbreviation used in the Web API scripts — what you put in the command line — the unit of the data, and whether it is an instantaneous, daily-averaged or accumulated field, and whether it is 6-hourly or 24-hourly. Once again, this is information you can also get directly from the data portal, but it is sometimes nice to have it in one place.

There is another wiki page which gives you the status: if you click on "Progress status", it tells you what is available now and the latest news about the database. Here it tells you, for example, that 7 models are operational and some others are in progress, and each time a new model becomes operational this web page is updated. Then there is a page with the latest news about the database — what the issues with the data are, which fields are missing, and so on. If you click further, into "Issues with data", you can see, for example, whether there is a problem with one dataset: we discovered recently that the temperatures in coastal regions in the Bureau of Meteorology model were wrong near the coast — sometimes you can get a temperature of 600 degrees at some points. This is notified there so that people are aware there is a problem with this data and are careful when they use it.

Another important one is "Provided parameters". Once again, this is information you can get directly from the data portal by clicking on each model, but here is a table telling you, for each model, which parameters are available. For Russia there is no 10 hectopascal level — that is the only one missing — but if you look at 10-metre U, you can see that one centre, Australia, doesn't provide 10-metre U, and so on. So you can see in advance what is and is not available for each model. It's further down — I need to go to the web page, because it's a long list. So if I go to the project page, to the model description... OK. Well, it's a bit blurry, to be honest — development phase, progress status, provided parameters — and the colours don't appear. A bit bizarre; I don't know why the colours do not appear here. But total precipitation is here: some models are 6-hourly, and you can see a few models are 24-hourly only; that is noted here. If you go to this one, JMA: for JMA there is no 6-hour total precipitation, for instance.

(Question about the precipitation components.) Well, total precipitation is the sum of convective precipitation and large-scale precipitation, but it's a somewhat fuzzy split, because it can depend a lot on the model physics. Is there any specific reason why we would want to use the separate convective precipitation data? There might be some particular uses that could be interesting — I don't know. Many years ago the practice was to separate convective from large-scale precipitation, and hopefully something can be done with it, but it is very model-dependent. That's why total precipitation is available 6-hourly, while convective precipitation is only provided 24-hourly: we don't expect many people to look at it.

I think that's more or less what I wanted to mention, and it's time to talk about the practice for tomorrow. Tomorrow will be an opportunity to do an exercise retrieving data, first from the data portal — I think it's important that people understand how it works — and if we have time, maybe more with the Web API; otherwise later this week. I'll be working on a set of scripts and setting them up on the desktops, so there will be a very simple recipe to get things going here, but I'll make sure you know how to do it yourself. It's possible to set it up on your own desktop, so when you go home you can do the same and work on anything at all; if you're running it from work, you can use it for years. The wiki page is very easy to follow — I could follow it, and I'm not a computer person or anything myself — and it's very easy to set up. Let me bring up the page, though I
don't know if I can click on it from here... yes, it works. So there are some simple instructions here which you have to follow. I did it a long time ago; if I remember, you just have to retrieve an executable — the procedure is easy. That's all you need to know; it's a few standard steps, not a Christmas tree of options. And this is where they explain how to install the key and the libraries. It may look a bit daunting at first, but you will figure it out. The idea is that you can then use your own laptop: there is an example web page with example Python scripts, including S2S examples — we have an example for each model — and you just need to copy one and run it on your computer once the key is installed.

(Yes — if you have access to MARS, then you can use that; it's better, but you need to be logged on to an ECMWF machine, and you need a member-state account for that. The Python Web API is a script you can run remotely, and I actually much prefer it; it's far, far more useful. With MARS you would need to log on, run a MARS script to retrieve the data, then open an FTP session and transfer it back to your home machine — why go to that bother? Just use the Web API: you set it up and make a request directly from your desktop or laptop in your office; you just say "I want this", and the Web API does it all — it goes there, gets the data out of the archive, and transfers it back to your computer. And if you are familiar with MARS, you will recognize the MARS commands: it's exactly like a MARS retrieval command, with a slightly different syntax. MARS access is more for people in our member states with special accounts.)

OK — any questions? I think we'll head off now; you'll get to know each other, and I think you know the groups you are in for tomorrow morning, despite my efforts to make it as confusing as possible. I think we're okay for tomorrow morning. No other questions? Questions on anything? If you have any, stay behind for a minute. Thank you. Oh, yes — I can find out about the shuttles.