I wanted to give a very basic introduction to analysis and reanalysis products and systems. The reason I thought it would be a good idea to do this today is that I also wanted to quickly show you the database structure at ECMWF and how to access these products. The most up-to-date version of the reanalysis at ECMWF, shall we say, is known as ERA-Interim, because it was supposed to be an interim product between ERA-40 and the next full reanalysis (what became ERA5), which in the end has been delayed quite a bit compared to the original timeline. So it turns out that ERA-Interim has been used for perhaps a longer period than was originally envisaged.

How many of you in the room have heard of ERA-Interim before? How many of you have actually used it? A lot of people. How do you mostly access it? Are you taking it directly from the ECMWF database, or are you using it from the Data Library? How many of you are taking ERA-Interim from the ECMWF web server? OK, quite a few of you. So I might skip over the web access fairly quickly then, because it seems a lot of you are familiar with the systems there.

OK, so I will just go over this quite quickly; it's a little bit of an introduction to reanalysis and analysis products. Essentially it's one of the products we most often use for validation and evaluation of our NWP systems. We could use other data sources directly, and people do use those. For example, one might want to use SYNOP stations if you're interested in surface variables. Often for applications we're interested in near-surface variables such as precipitation and temperature, and the applications we will talk about next week would definitely be precipitation-focused. But data availability depends on where you are in the world and how open a lot of the data sources are. This is showing an example from a few years back now, seven years ago, of the amount of data that was typically available daily on the Global Telecommunication System (GTS). That's the system whereby data is beamed around the world in near real time and collected in data collection centres. In some areas which are quite data-sparse this doesn't actually represent the true density of measurements; it's just the measurements that are available freely through the GTS. The same is true in Europe: although the density is higher, it doesn't reflect the actual data availability, because not all station data are connected up and collected in real time.

So station data have advantages: you can have quite a large array of co-located information, especially if you're interested in clouds. You have, for example, sites such as the ARM sites and the Cloudnet sites, which have a whole host of supplementary radar data co-located at the station. But they're not always available locally, especially if you want the whole suite of processes measured, especially the radiation ones, and you have problems with handling data gaps, bad data, and representativeness over complex terrain. And then of course you can use satellite retrievals. Especially for variables such as precipitation there's a whole wealth of satellite retrievals out there, which you might want to use for evaluation of your forecasts, and monitoring coverage depends on whether you're talking about a geostationary or a polar-orbiting satellite.
And so these are combined to try and make retrievals of surface properties, especially precipitation. For other variables it's very difficult to get near-surface information. For example, it's very difficult to get temperatures near the surface with microwaves, because you don't know the emission properties of the surface; it depends on the texture, the vegetation, the soil moisture and so on. So it depends on the variable: for many, many properties it's very difficult to get information close to the surface over land; over oceans it can be easier.

Especially for precipitation it's also really difficult to know what to use and what the advantages of each product are. If you look at rainfall in Africa you'll find a whole array of papers that discuss intercomparisons of various data sets: TRMM, CMORPH, they're all in there, and now GPM is going to be added to the mix; there's a whole wealth of them. Trying to know which one is best is extremely difficult: you'll find papers that recommend TRMM in one area and CMORPH over high terrain, due to its use of microwave channels for ice retrieval. So it's really difficult to know what to use. I've just given some examples, but this list is by no means exhaustive. If you only need monthly means, then merged products such as GPCP can actually be very good, but they're not available in near real time. If you want something near real time for monitoring purposes, then there's CMORPH, and FEWS for example, which is only available over Africa. TRMM no longer: it went down last year, which is another aspect you have to think about when you're using satellite retrievals directly, namely that the satellite has a certain lifetime. TRMM finished last year; now there's GPM, but it only recently came online. So it's obviously a high-quality instrument, but you don't have the history of measurements, you don't have the coherency.

So on Wednesday, when we have the introduction to the Data Library, a lot of these kinds of data sets, TRMM and so on, are also available through the Data Library, and you'll see how you can retrieve some of these and how to manipulate them. It's a nice tool to be able to access these, manipulate them, cut out, shall we say, areas over certain locations, and not have to go through the pain of going to each individual database directly and making your own conversions. And as I mentioned already, a lot of variables that you might want are not actually that easy to get from satellite, especially near the surface and especially over land, so even things as basic as surface temperature can be quite inaccurate.

So station data are good where they exist, but they require careful treatment, and you have your problems with the network density and so on. Satellite data can be useful for a regional view, but uncertainties are large and not all parameters are available. So a supplementary source of information about the climate state is analysis and reanalysis. These systems were originally developed because of the initial-value problem of numerical weather prediction. The sole purpose of analysis systems is to take information from a lot of different measurement sources, station data, satellite data, radiosonde balloon soundings and so on, and combine these into a picture of the atmosphere, and a picture of the atmosphere that's not only accurate but is also representative of a balanced state.
You don't want a picture of the atmosphere which has, for example, a lot of instability, so that as soon as you let your forecast system go, bang, you set off a whole load of explosive convection, tropical cyclones going off everywhere. It needs to be a balanced state, a state that's in equilibrium with your forecast modelling system. So what we want to do is take a wide variety of variables from a wide variety of instruments with vastly different measurement densities, where you need to take care to reject bad measurements, and then combine them into an assessment of the atmospheric state, and as I said it needs to be somehow in near balance with the forecast climate and also in large-scale balance. It doesn't sound very easy, does it? At the end of the day it's an engineering problem, and it's an extremely difficult one. A lot of the improvement in forecast quality over the last two decades has actually been improvement in understanding how to achieve this one task. We tend to concentrate on advances in how we represent the model physics, how we represent the atmospheric processes, but this engineering task of putting together observational information, and supplementing it with new satellites as they come along, has actually been one of the fundamental sources of our advance in forecast quality; Frederick showed one of the charts of progressing forecast quality this morning in his talk.

On data assimilation I've just got two slides of rough background on how it works. I'm not going to go into the details of the equations, just give you a rough idea. Essentially, for each measurement source and type we need to define a radius of, shall we say, influence of each of those data types, both in the horizontal and in the vertical, and that will depend on the data density. So for something like satellite data you often need to thin the data out, because you have very high-density measurements, so you may throw a lot of that information away; something like a radiosonde, on the other hand, is launched very infrequently. You all know what a radiosonde is, yes?
It's basically a balloon that floats up with a little package that sits underneath, which has a GPS connection, and it measures temperature and humidity, and because you know where it is you can assume it's advected with the winds. So it measures four quantities: the U and V components of the wind, humidity and temperature. They're quite clever little things actually, and they cost quite a bit of money; the last price I saw for a Vaisala sonde is about 800 euros, and the thing doesn't come back, so it's quite expensive. That's why there are not many of them around. If you look at a map of Italy, I think we have two sites where they go up each day; one is at Udine, just up the road. They're expensive, and it's the manpower as well: they have to be manually launched, so you need at least two people to man one of these stations. It's not a cheap undertaking at all. And they're quite clever: even the humidity sensors, it's not just a single sensor, there are two of them, and they pulse-heat them alternately, so while one is heated the other measures, and they swap back and forth. That way they don't suffer from icing problems as they go through clouds; there used to be a lot of problems with them icing up and measuring saturated conditions after they came out of clouds. So this whole technology of measuring the atmosphere is extremely complicated; it's not just the satellite side that's complicated.

And even this simple task of defining an influence radius is not that straightforward. You can imagine, with this schematic, just one example: if you have a strong inversion in the temperature I'm measuring, you might simply say my measurement has a certain influence in the vertical. Well, in a well-mixed boundary layer a measurement at one point is maybe representative of the virtual potential temperature throughout that whole layer, but as you get close to the inversion, the measurement here may have nothing to do with what's going on above the inversion. It's the same in the horizontal: when you're going across a front, say, you might find a complete cut-off between the two air masses, so taking information from a measurement here and using it to tell you what's going on over there might actually do more harm than good.

And then, essentially, this is assimilation in a nutshell. Again, don't worry if this is a little bit too complicated; it's just a graphical way of showing what's going on in these assimilation systems. You have a window over which you want to determine, shall we say, the state of the atmosphere; this is typically 12 hours long. During that window you will have (I'll use the mouse to point, so you can see on both screens) a number of observations of a certain property. We have all sorts of observations: brightness temperatures measured by satellite, direct temperature measurements, pressure, humidity, all in different states. You can imagine this is actually a multi-dimensional state; you can just think of it as temperature for the moment, but the nice thing about the assimilation system is that it's bringing all this information in, in terms of different types of variables. The whole system is based around the forecast model, so you can imagine that we start at the beginning from some kind of first-guess state of the atmosphere and we run a forecast forward in time, and it's not going to be the same as the observations.
Why not? Because we have uncertainty in the initial conditions, as we said this morning, and we have uncertainty in the model physics. The whole goal is then to find a minimal perturbation to the initial conditions, such that if we start from these perturbed initial conditions we minimise the distance of the new forecast from the original one, but also from the observations. And the nice thing is that because this final forecast is based around a forecast model, the 3D, or rather 4D because it's evolving in time, picture of the atmosphere will be balanced, and because it's using the model system it's also going to be in close balance with the climate of the model. So you don't want to simply fit through all of these observations.
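For reference, and this is not something shown on the slides, the quantity being minimised in this kind of variational system is conventionally written as a cost function that balances the departure from the first guess against the departures from the observations across the window; the standard 4D-Var form is:

```latex
J(\mathbf{x}_0) \;=\; \tfrac{1}{2}\,(\mathbf{x}_0-\mathbf{x}_b)^{\mathrm{T}}\mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b)
\;+\; \tfrac{1}{2}\sum_{i=0}^{n}\bigl(H_i(M_i(\mathbf{x}_0))-\mathbf{y}_i\bigr)^{\mathrm{T}}\mathbf{R}_i^{-1}\bigl(H_i(M_i(\mathbf{x}_0))-\mathbf{y}_i\bigr)
```

Here x_b is the first-guess state, B and R_i are the background and observation error covariances, M_i is the forecast model run out to observation time i, and H_i is the observation operator, for example the forward model that simulates the brightness temperature a satellite would see.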
Now, we also make a number of other assumptions. When we take this first guess, if we have observations that are too far away from the first guess, then we say, well, we trust the first-guess forecast to be in the ballpark, so this is a bad measurement, and we throw those away; we reject them automatically. Of course this could be a genuinely bad measurement, maybe a temperature of minus 5,000 Kelvin, which is obviously wrong, but on the other hand it could well be that you have a tropical cyclone that's misplaced: you have very severe winds here, and maybe your model first guess has the cyclone over there, so you have strong departures in both locations, and you end up throwing away observations. I don't know if it's still the case now, but the Americans have some planes that are sent out as soon as they know there's a hurricane coming, to drop dropsondes all around the developing system and get highly detailed thermodynamic and dynamical measurements. If you haven't got the system in the right place, your model goes, oh, that's a bad measurement, and just throws all this information out, so it's not used. So there are all sorts of problems with this automatic screening.

The nice thing about this system, though, is that, remember, this is not just temperature; this is a multivariate state. You use forward models, for example, to take the fields of humidity and temperature and, if you have a satellite, simulate the brightness temperature the satellite would actually see, to get the departure. The advantage of this is that your analysis of the temperature is not only impacted by the atmospheric temperature measurements. When we send off a radiosonde balloon it measures temperature, but you shouldn't think of the analysis as a kind of simple interpolation technique, where those temperature measurements give me my temperature field and my wind measurements give me my wind field; all of the information is combined jointly to give this overall global view of the atmosphere. In fact (I think I've got a slide later, which I might skip over in the interest of time), if you look for example at the African Easterly Jet, the winds that are analysed in the system are more impacted by the temperature measurements, which affect the temperature gradients and hence the pressure gradients, than by the wind measurements. Sometimes that takes people by surprise. You would think that if I hid all the wind measurements from the system I would mess up the wind field, but in fact in some locations you make the wind field less accurate by taking out the temperature measurements than by taking out the wind measurements, because the temperature gradients have a much bigger impact on the balance of the atmosphere and the overall wind. That's one of the strengths of this kind of system: it's not just a univariate retrieval of one variable, of just temperature or winds, from different observations. You're not using one satellite to give you winds and another source of information to give you temperature; you're using all of these sources combined, your wind measurements affect your temperature and your temperature affects your wind, and your end product is a balanced state.

Of course, to do this minimisation you need a linear approximation of your model and its adjoint, and it's a multi-stage iterative process: you do one minimisation, then you use the minimised run as your new first guess and go through another minimisation, using the simplified linear physics. So the recipe is: you make a short forecast, as I said, and that's your control; you throw out the bad data; you use the minimisation, with the linear and adjoint model, to find the perturbation; and then you use that as your new control and go through the process again. There are three loops of this minimisation inside the ECMWF system, again if that hasn't changed recently, still three, yes? I'm always worried that they put something in two months ago that I didn't know about, and then I'm out of date. You hope that if you did this infinitely you would eventually converge; that's not always the case, you can have situations where in the loops you jump around and you're not converging, but hopefully you do.

When you finally converge, you have this final forecast through the 4D-Var window, and you can then take snapshots, like photographs in time, and say: this is my analysis, this is my best guess of the state of the atmosphere here. Remember this is our 12-hour window, so maybe we're starting from midnight and going through to midday, and then we say, OK, let's take a photograph here and here. This is simplified, in fact, because the windows are actually slightly offset, slightly staggered, so it's not exactly at the end of the window, but you can say you take a snapshot at 6 and at 12, and then you start the whole process again for the next 12-hour window. It's all done in near real time; it has to be fast and efficient, because you need that forecast out in a few hours for it to be useful.

Now, the reason why I'm showing all this is that there's sometimes a little bit of confusion about fluxes. You take this snapshot, and that will be your temperature or your humidity field. But what about rainfall, or what about radiation? There wouldn't be much point in looking at the rain flux at 6 o'clock, then looking at the rain flux at 12, and saying, OK, that's my rainfall. Why not? Because rain is highly variable in time: it can be raining like mad here, and one hour before maybe it wasn't raining at all, or one hour afterwards. So what do we do? To get the fluxes, such as rainfall and radiation, which you want to be conserved, we want to accumulate them over time.
So what ECMWF does, inside the system, as well as running this assimilation window, is run short forecasts from 00 and 12 forward in time, and those are used to give you the fluxes. Really, to be precise, I should have drawn this arrow starting here, because it doesn't start from 6; it starts from 00 and 12: you start from this picture here and run forward in time. What this means is that when you're using an analysis product, if you use temperature, it's the direct analysis of what you think the atmospheric state is, but when you use "analysis" rainfall, it's not an analysis: you're just using a short prediction, a rainfall forecast. So there is a difference between those two.

And you then have a choice of how you actually retrieve that rainfall. For example, imagine you have 00, 12 and 24: you could have a forecast starting at 00 and running forward 24 hours, and you simply take the rainfall accumulated over those 24 hours from that single forecast. But you could also take a forecast running from 00 to 12 and then the forecast running from 12 to 24, and add the two together. You see what I mean? They both give you a flux of rainfall over the 24 hours, except one is from two 12-hour forecasts and the other is from one 24-hour forecast.

Now, with ERA-40, for example: how many of you have heard of ERA-40, the previous generation? Quite a few of you. So what did people do for rainfall with ERA-40, and why? Any ERA-40 users? Any clues? Do you think they used the two short 12-hour forecasts or the single 24-hour forecast, option A or option B? OK, trick question; my diploma students know I'm always asking nasty questions like that. Is it A or B? No, actually it's C: they didn't do either of those two. What they tended to do is take a forecast from 12 o'clock the day before, throw away the first 12 hours, and take the difference between forecast ranges 12 and 36. Now that seems a bit bizarre, doesn't it? With the two 12-hour forecasts you have an average lead time of about 6 hours. Option A is a bit simpler, because you don't have to bother adding the two fields together, but it's an average lead time of 12 hours instead of 6, because it's going from 0 to 24. With option C we're actually throwing away short-range information and taking hours 12 to 36. Why would you want to do that? Spin-up, exactly. It's spin-up, because ERA-40 is an older generation of analysis system, from 15 years ago, and there was a problem, particularly in the tropics after the SSM/I satellite data went into the system: it wasn't quite in balance. What tended to happen, after about 1987, is that the satellite data would make the analysed state of the atmosphere more moist than the model wanted it to be in terms of its climate, so in the first 12 hours it would just rain it all out. You had a spin-up before it settled down, so you used to throw away those first 12 hours.

Now, the reason why I'm emphasising this is that when I asked who had used ERA-Interim, a lot of hands went up in the room, and you'll see a lot of scientific articles which say "we used ERA-Interim temperature and rainfall analyses". What I want you to do, when you're reading a paper like that, is see how many of those papers tell you how they retrieved the rainfall; count how many of those papers tell you how they retrieved the fluxes.
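To make those options concrete, here is a minimal sketch, not from the talk itself, of how you might build a daily rainfall total from ERA-Interim forecast fields already retrieved as NetCDF; the file names are hypothetical placeholders, and "tp" is the usual short name for total precipitation.

```python
import xarray as xr

# Accumulated total precipitation (tp, in metres) from the two daily runs.
# Hypothetical files: each holds the step-12 accumulation of one forecast.
fc00 = xr.open_dataset("tp_fc00_step12.nc")  # run from 00 UTC, accumulates 00-12
fc12 = xr.open_dataset("tp_fc12_step12.nc")  # run from 12 UTC, accumulates 12-24

# Option B: add the two 12-hour accumulations; each starts from zero at its
# own forecast start, so the sum covers 00-24 UTC.
tp_day = fc00["tp"] + fc12["tp"]

# Option A would be one run from 00 UTC out to step 24 (needs the web API,
# since the web interface stops at step 12):
#   tp_day = xr.open_dataset("tp_fc00_step24.nc")["tp"]

# Option C (the ERA-40 habit): previous day's 12 UTC run, step 36 minus
# step 12, skipping the first 12 hours of spin-up:
#   tp_day = prev["tp"].sel(step=36) - prev["tp"].sel(step=12)

tp_day_mm = tp_day * 1000.0  # ECMWF fluxes are SI, so metres; convert to mm
```

Whichever option you pick, that choice is exactly the detail that should go in the paper.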
Now, it seems like I'm being pedantic, but it's actually really important, because where is your reproducibility? If you want to take that information and repeat the work, you don't know how the analysis data were actually used. It's a little bit of a bugbear of mine, which is why I'm on my soapbox about it: if you write a paper and you're using ERA-Interim, I think it's really important to state how you obtained your fluxes and put it in the paper. "We used the two 12-hour forecasts and added them together", or "we took the whole 10-day forecast and differenced it every 10 days", but you need to put into your papers how the data were used. Look at how many papers actually state that; I think it's less than 1 in 10, so you don't know how they used the data. People tend to describe ERA-Interim rainfall as if it were just like FEWS rainfall or TRMM rainfall, and that's not the case.

[Question: in which respect do you mean a reference for how to use this?] The main thing is understanding the background of how these fluxes are put together, and then, if you are writing a paper, putting into the paper how you used the data; that's the point I wanted to emphasise. Exactly, yes. I'm going to come to that at the end. I'm going to go through this quickly, because I want to leave a lot of time for the S2S database; Frederick, after the break, will be showing the S2S database. I've put up a little exercise sheet for those of you who are less familiar, so you can play with some of the data; I just want to show the interface. The other thing is that if you're using the web interface you can't do the 0-to-24 option, because you can only go up to step 12 there. If you use the web API, the Python method of remote access, then you have both options, because you have more flexibility in specifying your forecast steps. So anybody using the web access can only take the two 12-hour fields and add them together. As I said, I'll skip over that because I've covered it already.

So what does the system actually take into account? We've mentioned this already: pressure information. There is actually very little you can use from SYNOP stations; because of the non-representativeness of the station data and the complex topography, it's quite difficult to use variables other than pressure from the stations. I think some humidity information used to be used during the day but not at night; whether it's used all day round now, I can't remember; it used to be just during the day with the old system. The radiosondes: that's a Vaisala RS92 sonde in Ghana, and you can see how few of them there are. It's not just Africa where they're sparse; it's everywhere, because they cost a lot of money. Some aircraft data are assimilated, but again, of course, you're reduced to using information along the flight tracks. And satellite data: I need to update this plot; this really steep increase does flatten off a little, but the amount of satellite information ingested into the system is still increasing, and this has been a real revolution. Satellites have been around for a long time, but in terms of the amount of information that we use in initialising NWP, it's really taken off in the last decade, and that's because we didn't have the analysis systems before to be able to use this information in a sensible way; it would just be used as imagery for monitoring and so on. So that's really been a revolution, shall we say.
I think you showed a very similar plot to this this morning, Frederick. This shows, for different forecast lead times, day 10, day 7, day 5, the top line for the northern hemisphere and the bottom line for the southern hemisphere. The key thing I wanted to show here is how they converge once that satellite information starts to get used globally: the quality of the forecasts in the southern hemisphere starts to match the northern hemisphere. But beware of interpreting this to mean that we can do away with conventional measurement sources and only use satellites; that's not the message to take home. Why? Because when we launch a satellite, being launched into space is not exactly a smooth process: you're going through massive temperature contrasts, the thing is being rattled around, so once it's launched, no matter how carefully you calibrated it in the lab, its offsets will be all over the place. So how do they calibrate it? They calibrate it against the forecast system. The satellite is not giving you an absolute measurement; it's essentially giving you anomalies. The only thing that ties the analysis down to a truth, at the end of the day, is the radiosondes. So although, yes, the information from satellites has improved the southern hemisphere and brought it up to the northern hemisphere, if you got rid of the radiosondes the whole thing would go haywire: you'd have no way of calibrating your system. It's the radiosondes that tie you down to an absolute value, which calibrates the satellites, which then give you the supplementary information in the areas where you don't have conventional observations.

The other thing I wanted to get across here concerns trusting the analysis. It's used a lot, and sometimes people say, yes, but you can't trust this, it's only a model. That's true to a certain extent, but it really depends on what you're looking at. If you want to look at something that's very dynamical, the winds, then you'll find they can actually be very high quality. The further down the thermodynamic chain you go, towards clouds and precipitation, the more model-dependent the product becomes. So if you're looking at an "analysis" of rainfall, then yes, there is quite a lot of model in there; in fact for the malaria modelling we tend to use the T2M temperature from ERA-Interim, but we'll use some of the retrieval products for precipitation. So it's almost like a league, the Premier League, then the Championship, then the lower divisions, of parameters, in terms of your level of confidence in them. I also had a couple of extra slides with some examples over Africa with data denial experiments, but let's move on.

So what is reanalysis? As we already said this morning, operational systems, used both for the medium range and for S2S, like the system at ECMWF, are updated all the time to take advantage of updates to satellite systems, updates to the model physics, updates to the infrastructure of the data assimilation system. What this means is that the analyses are not coherent in time, because the model system is changing. So if you look at a temperature trend, it could simply be due to the model physics changing over time, and not necessarily a real temperature change. One way to improve on this is to run a reanalysis.
You take one analysis system, go back in time over the data sources, and rerun the whole period. The advantage is that you have a single coherent system: one assimilation system. What would be another advantage of doing a reanalysis? One thing is that if I want to look at the analysis for the year 2000, the operational archive gives me a much older modelling system, a much older analysis system, than I have now. That's right: you have a new model system. If we started a new reanalysis now, we'd take the latest state-of-the-art version of the ECMWF model, from 2015, go back to maybe 1960 and rerun it through time, and we'd hope that the new state-of-the-art system would produce a better analysis for the year 2000 than the ERA-Interim system, because ERA-Interim was the state of the art in 2006, a decade ago. What would be another advantage? We'd have a newer model with newer physics and a better data assimilation system, and we could probably afford to rerun it at higher resolution, smaller box sizes, so the whole system would be improved. Any other ideas? The physics is better, the whole system, all the aspects, the resolution would be better, but there's one other thing, actually; I was wondering if anyone would get it, and when I tell you, you'll say, of course, it's obvious.

Remember that the analysis system, as I described it, is real time. You're up against a time barrier: the information comes in, you have to have an efficient system, you want the forecast out within a few hours. There's no point in me giving you a forecast for this week but delivering it next Monday; it's too late then, you need it now. So if I launch a radiosonde in Udine today, but then I go for coffee, do some other things first, and only at six o'clock think, oh, I'd better send that data now (it's fairly automated in most places, but sometimes it's sent manually, or there's some problem with the infrastructure, with the network, so the data is late arriving), if it misses the window, it doesn't go into the analysis, because that analysis system is real time. You hope to get the data in, but a lot of information comes in afterwards. There may also be research campaigns; AMMA is a good example in West Africa, a massive research campaign. That data is not delivered in real time, but you might have high-quality instruments used in a research campaign whose data the researchers then process, clean up, remove the obvious bad data from, and make available in a format that can be used in these assimilation systems. So when you do a reanalysis, your other big advantage is that you have a lot more information, because a lot of data that arrived late, a lot of research data and other data platforms, can be incorporated that weren't available at the time. Sometimes people miss that aspect of the two systems.

However, what's the disadvantage? It's expensive. You'll find that, compared to the current state-of-the-art system, the reanalysis will tend to be at much lower resolution, and once you start it you can't keep launching a new reanalysis every week or every month. Like I said, ERA-Interim started in 2006 and was state of the art then, but in the meantime there's been another decade of model and assimilation developments at ECMWF which haven't been incorporated into it. The operational analysis, on the other hand, is always using the latest operational system.
It's your highest resolution, using the latest model and the latest observational suite. Another aspect, though, is that it's not always easily available: these operational systems are quite sensitive, and it's not very easy for somebody from the outside world to get access to the operational analysis from ECMWF. They allow ERA-Interim to be used as a resource, but for the operational analysis you can't just go onto the site and take today's analysis as a GRIB or NetCDF file. If you do have access, the operational analysis is ideal if you want to look at a recent case study. If I wanted to look at a high-impact event from 2013, I would use the analysis and not the reanalysis, if I had access to it, because it's a newer system and much higher resolution. If, however, you want to diagnose, I don't know, the NAO, or changes in ENSO variability over a long period, or look at interannual variability, or evaluate your S2S hindcast suite going back 18 years, then you certainly don't want to be using the operational analysis, because 18 years ago the system was very basic; let me just do my date calculation, I don't think it was even using 4D-Var, it was still 3D-Var at that point, so it's a completely different assimilation structure. Then you want to be using the reanalysis. It's not the best of the best, but you have more continuity. But you have to remember, for recent dates it is, I say "obsolete" with no offence intended, an almost 10-year-old system now, from 2006. So it's ideal for longer-term investigations. The reason I put this up is that a lot of people have used reanalysis, but I sometimes find people showing case studies with reanalysis when they would perhaps be better off using the operational analysis, if they can get hold of it, maybe through a collaboration, or if they have a member-state account or something (I'm not quite sure exactly what the access rights are), or another national centre's analysis system.

I've tried to sum this up in this schematic here. With reanalysis you have no improvements in the model, but it's continuous in time: for ERA-Interim the model and assimilation system is fixed at what was operational in 2006. You have all the new data as satellites come online, so you have an improvement in the observational system roughly up to 2006; beyond that it's only partly true, because ERA-Interim doesn't really know how to incorporate a lot of the newest satellite systems, since it hasn't had the updates that allow new satellite products in. The operational system keeps improving in time, but of course you don't have that continuity; you are able to take advantage of the improvements, so you get this increasing gap in quality as you go past 2006, but you have this huge detriment, of course, if you go right back to the 1980s at the beginning. So it's just to emphasise those differences. There was something else I wanted to say about that... ah, just to give you an idea of how often these things are updated, that's what I was going to say: ERA-Interim was from 2006, the previous generation was using a model that was operational in 2000, and then ERA-15... do you remember what cycle ERA-15 was from, roughly what year? I don't remember; it was a long while ago, I'd need to look that up.

So, my take-home messages.
Analysis products are a useful supplement to observations, but you have to be careful about which variable you use: instantaneous fields are from the model analysis, but fluxes are from short-range forecasts, and you have to be careful about how you combine them. And for recent case studies you might often be better off looking at the operational system rather than the reanalysis product.

So, it's ten to three. I'm not going to spend an awful lot of time on this, because some of you are already using the product, but I did want to give a very quick demonstration of the web server, just to give you an idea, without going through the web API, and I have a little exercise sheet online. After the break, when we come back from coffee, we're mostly going to focus on the S2S web interface, which Frederick will introduce, and if any of you have questions about ERA-Interim in the lab, you can pull me to one side and we can do things there; I can show you a few bits and pieces, and also an example for the web API. Does anybody have any questions before I show this? There seems to be a question; even Paola, when we were talking yesterday, was asking me a question about the rainfall, because I find the fluxes are the thing that confuses people the most. You were going to say something? [Comment from the audience.] They won't let you do the 24-hour retrieval either, would they? No, because I normally log on locally and then take it, so I have that option. That's one problem: I'm not completely familiar with the access rights for somebody outside who is not from a member state or a local person, so I apologise in advance if I sometimes tell you something and it turns out you don't have permission to do it. That seems to be the case for the 24-hour rainfall, because when you answered the email you said you normally use the 24-hour retrieval, and on the web you don't have that option.

OK, so let's spend a little bit of time on this; I know it's hot in here and some of you are probably snoozy, it's the first day, but we won't spend too long on it. The first thing I wanted to say is that the web page of the programme stays here after the workshop, and for those lecturers who are happy to have their material distributed, it will appear on here, under links, for each of the talks. But to keep things in near real time, in case some of you want to browse the talks, I'm collecting material on this DODS server; Paola is putting information there for some of the lab classes too. I've written the address up there, because it's a little bit small to see; let's see if I can blow it up... no, it's only changing the font and not the address. So it's clima-dods, HTTP, and then it's "users", which I think has a capital U, yes, and then smr2714, which is the number of this workshop. Treat that as a temporary resource, because once everything goes onto the web page of the workshop this will eventually disappear, but it enables us to do things very simply in near real time during the course of the workshop. I've already popped my lectures there, in the week-one folder; somebody over coffee can just check that I've got the permissions right for an external user, if I click on one of those, and see whether it's visible to you. There'll be some other things there as well, like the Data Library in Python, and in week one there's also a very simple little handout.
Like I said, I don't want to spend too much time on this, because a lot of you are familiar with the analysis, but it's got a very simple little synopsis of the lecture I just gave, just a little description, a couple of examples of what we're going to go through now, and then later on there's also an example of a web API retrieval, and links to the wiki, because I have to say the documentation is improving at ECMWF now. For the S2S, for example, Frederick will demonstrate the wiki pages, which are much more flexible; that was a big shortcoming of the ECMWF pages in the past, that they were very static, and it was very difficult for people to put information onto the web even if they wanted to.

I've also put something on GRIB and NetCDF, and I wasn't quite sure how much this was going to be needed. How many of you are familiar with using NetCDF data? Pretty much all of you. What about GRIB? Less. OK, good, because we're mostly going to be using NetCDF. If any of you have not seen NetCDF, take a quick read through that synopsis, it's just a couple of paragraphs, and we'll do a little bit on the fly; you can pull me to one side in the lab, and I can sit with some of you, also after hours or whatever, to get you up to speed on the NetCDF side of things. I don't want to go through a lesson on what NetCDF is, real basic stuff, for those of you who are already familiar with it, but on the other hand I don't want one or two of you to be left behind if you're not familiar. So have a look at that worksheet if you're not familiar with NetCDF; it has a description, with a few tools, such as a little bash overview of a couple of basic commands just to get you going: ncview and ncdump. Hands up for ncview? OK. Less for ncdump; the same people, of course. They're very useful tools for quick looks.

Climate Data Operators (CDO) users? About half of you. We're going to play this by ear then; we may put in an hour, in a kind of split class, on CDO, and a few little exercises; this was quite difficult to ascertain from the applications. And I've even put a little bit on some simple plots in R as well. Now, the other problem this week is that we had huge conversations about which packages to use for processing. Should we use Octave, or there's R; we can use GrADS, but GrADS doesn't do everything well; things that do everything can be quite complicated, and even those don't do everything; you've got NCL, which has built-in things, but it can be very clunky. It was really difficult to decide, and I'm always of the viewpoint that there's no perfect package; you may be familiar with a certain package and not want to change. So we're going to try to be a little bit flexible, without going down the path of one single package, but if there are requests we can see what we can do; maybe in the second week we could run a little optional lab on R, for example, or on the MJO: NCL, for example, has a really nice built-in package for MJO statistics, with beautiful ready-made examples, although we've also got some Octave scripts for the MJO that Paola has done. That was the really big difficult question. So have a little look at that; it won't all be useful, some of it will be boring, but it's got a little snippet on all of those.
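If you would rather stay in Python than use ncview and ncdump, the same kind of quick look is only a few lines; this is just a sketch, and the file name is a placeholder for whatever you have retrieved:

```python
from netCDF4 import Dataset

# Roughly the information 'ncdump -h' gives: dimensions, variables, attributes.
ds = Dataset("tp_example.nc")
print(ds)

# One line per variable: name, dimensions, shape and units (if present).
for name, var in ds.variables.items():
    print(name, var.dimensions, var.shape, getattr(var, "units", "no units"))
```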
Then I wanted to just quickly show this. It's five to three, and coffee's going to be ready at three, so I'll show it very quickly, for five or ten minutes, just to demonstrate, so that when we come back we can move on to the S2S. I have logged in here, and you can see I've logged in because my name's up here in the top right corner. Let me just zoom in a little. Is that reasonably clear, can you see the writing at the back? Let me see if I can get it a bit bigger. One of the things that you will be required to do for S2S is set up an account. How many of you already have an account at ECMWF for access? OK, good, a fair number of you. For those of you who don't, your homework for tonight (Frederick will show you how) is to log on and get an account. It's free; you just type in your email address and name, you get a kind of password sent back, and you accept the terms and conditions. It's just so they know who's using the data, and you agree not to put it onto your blog and redistribute it to the rest of the world. So your homework: you can't come to the reception until you've done it; no, joking, but tonight, sometime, use one of the desktops or a laptop to log on and get a password. Please do it, it's really important, and make sure it's working, because tomorrow when we're in the lab we want to hit the ground running; we don't want to be in there trying to sort out "oh, I haven't got a password, I can't log in". So please do that tonight if you haven't already.

Once you've done that, this will become quite familiar. The reason I wanted to show this first is that it's similar to the S2S and TIGGE databases, but a little bit simpler. On the left you have a menu of the different aspects of the analysis: we have surface, model levels, pressure levels where the information is available. I'll just focus on the surface for the moment, and we're going to look at daily data, so we have ERA-Interim daily. The interface is fairly simple; there are only a few choices we need to make. The first area at the top is simply the dates we want to retrieve: you can either have a from-to range of dates, or you can select whole months of data at a time, and you can also, in a shorthand way, select a whole year, or, if you want just all the Marches, click on all the Marches, which is quite neat. So if you want to do a DJF you don't have to retrieve whole years of data; you can just click your DJF months and get them for all of the years. ERA-Interim starts from 1979, so it pretty much covers the satellite period. If you want to go further back in time, there is the recently released ERA-20C, which I haven't talked about here; C is the one that brings in the assimilation of observations, and CM is the model only, or have I got it the wrong way round? No: C on its own is the one with the analysis, just to make sure I don't get that mixed up. So there is a much longer 20th-century reanalysis, but it has very limited observations (it doesn't even use the satellite data); the idea is really to make sure it's consistent in time, but it's a lot of model and very few observations. If you're interested in that and want to go right back to the beginning of the 20th century, I can talk about that offline; I just wanted to flag it. For most of our purposes with S2S, where the hindcasts go back on the order of two decades, ERA-Interim is perfect for our needs.
So, in my ten minutes, the only thing I did want to demonstrate is this thing that causes a little bit of confusion here, and that's that we've got two lots of time variables. We've got "time", which says 0, 6, 12 and 18, and then we have "step", and if you put time and step together, of course, you get another word, "timestep". So what's time and what's step? Time is your analysis time; when I was talking about that snapshot, that's what it's referring to. Remember you had those two slots in each 12-hour analysis window, so two of these will be from one window and two from the subsequent window. Step is referring to your forecast step. They're both moving forward in time, but one of them is your static analysis time, and then from each of the 00 and 12 times you're launching a forecast forward in time. So what we're essentially seeing is, for each day, times 0, 6, 12 and 18, and, launched from two of those points, forecasts going forward in time. Each 12-hour window is a cycle: the analysis times are sampling two points in each window, and then you have a forecast that goes forward; it actually goes longer than 12 hours, so if you log on at ECMWF you can go all the way out to 10 days, but on this system we only have steps up to step 12.

So if I want to know what the 12 o'clock temperature is, what would I set for time? 12, does everybody agree? And what would you set for step? 0. That gives you the analysis at 12 o'clock. How else could I get the 12 o'clock temperature, and which time and step would I take? The forecast from time 0 with step 12, exactly: that's the only other way, with this combination, to get the 12 o'clock value, because the forecasts here only go out to 12 hours. So I can have the 12-hour forecast that started at 0, which gives me the 12 o'clock temperature, or I can take time equals 12, sorry, and step 0, which gives me the analysis. So I actually have two options for the temperature.

What about if I want precipitation over the whole day? Let's say the first of January, so I take 1979-01-01; I'm only picking one day, and I want the rainfall for that day. With this particular interface I've only got one way of doing it. How do I get it? [There followed a discussion with the audience, partly inaudible, about rain gauges that accumulate from 06 UTC of one day to 06 UTC of the next.] I know you might want to take 06 of that day to 06 of the next day, but you can't do that on this system, because you only have forecasts from 00 and 12; there's no way of doing it for those times. [Audience:] But do they really do that at the Z times? That's something I've only learned in recent years; I'd never heard of it. [Audience:] It depends on your time zone; it matches the Z times in the UK, where they took 0 to 24. That's interesting, because when you look at satellite retrievals, of course, that's not what is done. [Further exchange inaudible.]
So the way to do it with the web API, basically, is to take times 0 and 12 with the 12-hour range. Now, you'll notice that some of these options change as you click. If I click on step 12, for example, you see how the 6 and 18 analysis times are no longer available, and that's because, of course, we have no forecasts from those times. You'll also notice that if you click on a flux, so if I go to total precipitation and click on that, then, to try and help us not make mistakes, I can't click on step 0 anymore, which would be an analysis; I can only click on steps 3 to 12, which are forecast steps. So maybe one piece of feedback we could pass on: it would be quite nice if this system could be extended out to step 24 for the analysis forecasts, because the data exist, and that would let people really sub-sample properly. OK, we'll see what we can do.

Question? Very good question, thanks; I wanted to mention that and hadn't straight away. When you look at any flux in the ECMWF system, and this is pretty standard, the units are SI, so rainfall is in metres, and it's accumulated from the start of the forecast. So, with what I've got here, if I click on step 12 and on times 0 and 12, I get two forecasts retrieved: one is the forecast starting from 0, at its 12-hour step, and the other is a new forecast starting at 12 that goes from 12 to 24. So I have two values. What do I need to do, average them or add them? Add them, very good, because they both start from a reset accumulation. If, on the other hand, you had a 10-day forecast and you wanted to know the rainfall on the last day, you would have to take the value at day 10, subtract the value at day 9, and use that difference.

It's actually an important point, because of what happens when runs get very long. Have any of you heard of EC-Earth? A few of you. EC-Earth is a climate model that was built by taking the IFS forecast model, the one used in the S2S contribution and the seasonal forecasts, and adapting it for longer time scales. One of the things they had to do there was remove that accumulation, because if you run for many years, the problem with accumulating fields is that you lose accuracy when the numbers get very, very large: you can't difference day 100,000 and day 100,001, because the difference is down in the smallest digits, and you lose the accuracy of the rainfall field. So that approach unfortunately can't be used for really long runs, when you get to decadal forecasting and onwards. That's the main thing I wanted to point out here; I don't want to take up too much time going through the details.
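A little sketch of that add-versus-difference logic, plus the precision point, not from the talk itself; the variable names are illustrative:

```python
import numpy as np

# Two 12-hour forecasts (from 00 and 12 UTC): each accumulation restarts at
# zero at its own forecast start, so the daily total is a SUM:
#   tp_day = tp_fc00_step12 + tp_fc12_step12

# Within ONE forecast the field keeps accumulating, so an interval total is a
# DIFFERENCE, e.g. the last day of a 10-day forecast (hours 216 to 240):
#   tp_day10 = tp_step240 - tp_step216

# Why very long accumulations break down in single precision: a whole day of
# rain is lost once the running total is large enough.
total = np.float32(1.0e5)     # a huge accumulated total, in metres (illustrative)
one_day = np.float32(0.002)   # a 2 mm day of rain, in metres
print((total + one_day) - total)  # prints 0.0: the increment vanished in float32
```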
At the bottom, if you're using the interactive system, you can see the MARS request, and when we look at the S2S, Frederick will show you that there's now also an option with S2S that lets you see the web API Python script, which you can run locally on a desktop to access the database. That's really quite nice, because you can use the web interface to familiarise yourself with how the database is structured, and then, once you're used to that, you can click on that option and get a kind of starting script that you can modify to do more advanced retrievals using the web API, which is much more powerful. So it's the same thing here, but with the MARS request, which you use internally: you click on this and you see what the options are. I'm not going to go into the details now; Frederick will be showing that with the S2S and the Python web API. How many of you are Python users, just out of interest? Much less. I've only written one or two programs myself; don't panic, because you won't need to know that much, it's very straightforward, and we'll see a little bit of Python as well; you can do most of what we want to do with the web interface too. Then, of course, you have the options to retrieve GRIB or NetCDF.

So let's just go back and do this example, and then we'll go to coffee: the atmospheric model, total precipitation, daily, step 12, times 0 and 12. Let's see what happens. It's telling me that it's going to the archive and transferring so many bytes. ERA-Interim, I think, these days is actually all on the cache; I think it's cached permanently. Frederick will tell you more details about that, but most of the data at ECMWF is stored on a tape archive system, and he will be talking to you about how to access that tape archive efficiently, because when you make retrievals from the S2S database you don't want a loop that says go to the tape, take one field, and then, on the next iteration, go to the same tape and take another field; that's extremely inefficient, it log-jams the system, the operators at ECMWF get upset with you, and it's not a good idea. You want to go to the tape once and take everything together if you can. You don't have to worry about that quite so much with ERA-Interim, because it's used so often that they've put it onto a disk cache, which means, first of all, it's much faster, and it also means you have less concern about repeated retrieval requests, because they just go to the disk cache. So it's happily retrieving, and we'll see the end product at the end; we can even have a quick look with ncview.

So I suggest now that we break for coffee. It's quarter past three, so let's come back at quarter to four, and Frederick will give you a demonstration of the S2S database. Then tomorrow, of course, we go into the informatics lab; that's the big room just across the hall, directly across on the far side, the bigger of the two computer labs. Hopefully you'll remember which group you're in for tomorrow morning; I think splitting it for the first session is quite nice, because it's easier to handle if you have questions, it keeps everybody together, and it also means you're not queuing at the bank for quite so long. OK, any other questions? Hopefully that was fairly clear.
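For anyone who wants to try the web API after the break, here is a minimal sketch of the kind of starting script the interface generates, assuming you have the ecmwf-api-client package installed and an API key configured; the target file names are my own placeholders. It fetches the 12 UTC 2-metre temperature both ways discussed earlier: as the analysis (time 12, step 0) and as the 12-hour forecast from 00 UTC.

```python
from ecmwfapi import ECMWFDataServer

server = ECMWFDataServer()

# Settings shared by both requests; param 167.128 is 2 m temperature.
common = {
    "class": "ei", "dataset": "interim", "stream": "oper", "expver": "1",
    "levtype": "sfc", "param": "167.128", "date": "1979-01-01",
    "grid": "0.75/0.75", "format": "netcdf",
}

# Option 1: the analysis snapshot at 12 UTC.
server.retrieve(dict(common, type="an", time="12:00:00", step="0",
                     target="t2m_an12.nc"))

# Option 2: the 12-hour forecast launched at 00 UTC, also valid at 12 UTC.
server.retrieve(dict(common, type="fc", time="00:00:00", step="12",
                     target="t2m_fc00_step12.nc"))
```

Swapping param to 228.128 (total precipitation) and requesting type "fc" with time "00:00:00/12:00:00" and step "12" should give the two accumulated flux fields discussed above, to be added together.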
So this at least gives you access to one simple product that you might want to use for precipitation and temperature in some of your validation, when you're looking at case studies in your projects next week. And as I said, Paola will be showing the IRI Data Library if you want to supplement that with some of the retrieval products for precipitation and so on. Thank you for your attention, and we'll nip up for coffee.