I'd like to start with a question. Who of you has heard of R, the programming language? Who of you has worked with it? Who of you kind of regularly works with it? Okay, that's a few people. That's pretty cool.

I wrote an R package because I was using data from the German Weather Service and had a hard time figuring out what they have available. They have metadata files and stuff, but those don't always exactly match what's actually available. I've needed this for a couple of projects, at some point it became a package, and it's now online and you can use it. I'd like to show you what it does.

The German Weather Service, or in German, Deutscher Wetterdienst (that's what DWD stands for), has a lot of data sets online, over 25,000 files. That's a bit too much to inspect manually. It's also difficult to search through all that. As I said, they're not always completely consistent, also not in formatting, column widths and stuff like that. Here's a screenshot of the FTP server where it's located, and you can maybe see that this is a somewhat medium-length URL and there are a bunch of files available there. With that package, it's pretty easy to harness all that data, download it and use it.

So much for the motivation. I'd like to continue with a few slides on how to use it, a couple of things you can do with it, a couple of application examples, and then a bit on the role of the FOSS community.

You want to get the URL, find out which is the file you want, download it and read it, and then I have a bit on plotting and mapping. The first part is fairly straightforward. You load the library, and then with the selectDWD function you can just put in the name of the station that you want and tell it, for example, I want daily data of the climate observations from the recent data. They always have the data separated into recent files and historical files, so that you can update your data from relatively small recent files.
So that will give you the URL you want to use, and that is what you put into dataDWD to download it. It will give you a couple of messages telling you what it's doing. The nice part is that this is all vectorized, so you can say I want all the stations for a certain combination of parameters, or I want all the data for a certain station, or whatever. It will give you a progress bar and stuff and then store that file in some place on your computer.

The last step is to actually read that. It's a zip file, so you need to unzip it, and it needs to be read correctly and converted. That's what readDWD does: you just put in that file name, and what you get back is a data frame. It has German names, and currently I'm keeping that because that's what the original stuff is. They're not quite consistent: they use English folder names but then German names in the files. It also converts the date formats, and as far as I can tell it works correctly so far.

And then you can use regular R code to plot stuff. For example, I want to plot two columns of it, make that a line, don't plot anything on the x-axis, and make the numbers go upright. Then you can use some other stuff that I also have available to create some nice date axes. It's pretty regular R code that you can then continue to use with that. So it's a bit R-focused, but I guess that's okay. I mean, that's what I work with.

And then I have an interactive map. It's also fairly easy to get that, and you can zoom in and stuff like you can always do with interactive maps. If you click on the points you get some meta information: it will tell you the name of the station, where it is located, how many files are available for that station, things like that. And then you can also get more information to see exactly what files are available there.

So I'd like to show you three applications, things that I've been doing with that.
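The select, download, read and plot steps described above can be sketched roughly as follows. This is a minimal sketch, not the code from the slides: it assumes the rdwd package is installed and the DWD server is reachable, the wrapper name `fetch_station` is hypothetical, the argument values (`"daily"`, `"kl"`, `"recent"`) are my reading of "daily data of the climate observations from recent data", and the column names `MESS_DATUM` and `TMK` are assumptions about the daily climate files.

```r
# Sketch only: requires the rdwd package and network access when called.
# fetch_station is a hypothetical helper name, not part of rdwd.
fetch_station <- function(name, res = "daily", var = "kl", per = "recent") {
  # 1. Find the URL of the matching file on the DWD server
  link <- rdwd::selectDWD(name, res = res, var = var, per = per)
  # 2. Download the zip file; read = FALSE returns the local file name
  file <- rdwd::dataDWD(link, read = FALSE)
  # 3. Unzip, read and convert to a data frame (original German column names)
  rdwd::readDWD(file)
}

# Usage (not run here, needs network access):
# clim <- fetch_station("Potsdam")
# str(clim)
# # Assumed columns: MESS_DATUM (date) and TMK (daily mean temperature)
# plot(clim$MESS_DATUM, clim$TMK, type = "l", xaxt = "n", las = 1)
# berryFunctions::monthAxis()  # date axis helper from the berryFunctions package
```

In the plot call, `type = "l"` draws a line, `xaxt = "n"` suppresses the default x-axis, and `las = 1` makes the axis numbers upright, matching the three plotting choices mentioned in the talk.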
One of them is to get a long-term climate graph. At least in climate science, this is what people like to look at. I'll show you the picture in a minute. It's fairly straightforward. Again, you select what you want: I want monthly data from Potsdam, the historical data. Put that into my climate data frame, do a bit of managing to actually get long-term averages of the monthly averages, and then I also have a function for creating climate graphs available, and it will give you something like that.

Who of you is familiar with climate graphs? A few. So I'll briefly explain what it does. You have the temperature here in red, and that means that over the long term the average, say, July temperature is like 18 degrees in Potsdam, which is in the northeast of Germany near Berlin. And then you have the rainfall on the right axis. This is on a different scale, but it relates to the temperature by a factor of two. This is in millimeters per month, so it's rainfall sums, again averaged over the long term. If you look at that, you can see that in Potsdam things are usually not water limited. In other places in the world, where there's not so much water but higher temperatures, you can of course imagine that vegetation and stuff has more problems with drought, and then you would see this fall below the temperature, and there you would see a yellow region. So it's a quick indicator to get an idea of how vegetation in a certain climatic region is doing in terms of water availability. That's pretty common in geography and climate science. So it's fairly easy to actually do that for any station in Germany. I mean, in the end they pretty much all look somewhat similar because it's all one climate region, but that's not too hard.

Another thing: I was in a task force looking at the Braunsbach flash flood that happened last year at the end of May. Maybe you have heard about it or read about it in the newspapers. It was a pretty amazing flood.
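Returning briefly to the climate graph: the "long-term averages of the monthly averages" step and the factor-of-two relation between the rainfall and temperature axes can be sketched in base R as below. The data here are synthetic and purely hypothetical (not DWD values), chosen so that the long-term July mean comes out at 18 degrees; the column names `temp`, `rain` and `dry` are my own.

```r
# Hypothetical monthly series for 1991-2020, standing in for real monthly data
monthly <- expand.grid(month = 1:12, year = 1991:2020)
wiggle  <- ifelse(monthly$year %% 2 == 0, 1, -1)   # year-to-year variation
monthly$temp <- 9 - 9 * cos(2 * pi * (monthly$month - 1) / 12) + wiggle  # deg C
monthly$rain <- 50 + 10 * wiggle                                         # mm per month

# Long-term average of the monthly values, one row per calendar month
clim <- aggregate(monthly[, c("temp", "rain")],
                  by = list(month = monthly$month), FUN = mean)

# Climate graph convention: 1 degree C on the left axis lines up with 2 mm on
# the right axis, so a month counts as water limited when rain < 2 * temp
clim$dry <- clim$rain < 2 * clim$temp

print(clim)

# Plotting, assuming berryFunctions::climateGraph accepts temp and rain vectors:
# berryFunctions::climateGraph(temp = clim$temp, rain = clim$rain)
```

With these made-up numbers no month is flagged as dry, which is the same qualitative picture as described for Potsdam.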
Probably there's never in human memory been such a thing there before. Likely, actually, in the first part of the 20th century there has been something like that. Anyway, it was a lot of rainfall in a few hours, most of it actually within one hour, and it created a huge flash flood. A very local thing: it's just one village or two villages being washed away by that. Well, actually not completely washed away, but a couple of houses were washed away. Luckily, in that place nobody died. In other places in Germany people actually died. So this was a pretty serious thing.

I'm in a research training group that has the idea that there has to be a task force looking at something like that. This is the event we chose for that, and we wanted to look at the rainfall from that event. Now, of course there was no gauge directly there, but there was something close to it. What you see here in red on the map is the catchment area where the flash flood happened. So it's really small, just six square kilometers. And then you see the sums of rainfall over the day, over the event, on the map. So you can basically see that there is not so much rainfall here, but then close to the area there's like 100 millimeters within one day, which is totally exceptional. You would expect this to happen every few hundred years on average. Now, that's if there is no climate change.

So if you were to do something like this, get the recent data around a certain region, you could get that fairly easily with the map that I showed. Actually, now that we have the map (we didn't have it at the time we created this), we found out we even missed one station here, which would have given a bit more information. And then you can also get the time series of those stations, not of all of them, but a lot of them have hourly data, which is definitely the minimum resolution you need for looking at something like that.
The time series basically show when the event passed by: there's a lot of rainfall here, staying here for some time. It did not move very fast, then it kind of moved off again, which created this horrible flood.

Alright, the third thing I'd like to show you is also on extreme rainfall. The idea is that warm air can contain more moisture. That's a really, really simplified explanation of the Clausius-Clapeyron relationship, and you would expect rainfall to kind of follow the red line, which shows how much potential moisture there can be in air at a certain temperature. What we see is that in fact it goes up even steeper than the red line, but that's fine. What we see is the rainfall intensity that happens in like one of a thousand events, the 99.9% quantile, and that rises and rises and rises, and then it drops off again. Each green line is one of around 150 stations across Germany, and this behavior is kind of regular; it happens at every place. This happens all over the world. There's a bit of research on that; it's kind of a specialized topic, you probably haven't read about it.

There have been a couple of theories why this is happening, like if it's really hot there's not so much moisture available, so that's why it does not rain much more intensively. What we figured out is that at these temperatures there's often not a whole lot of data, because it does not happen so often in Germany that the temperature is so high and there is rainfall. (This is the dew point temperature, so the real temperature, the air temperature, is usually even a bit higher.) So we figured out that if you have small samples, you probably underestimate what you would actually be expecting. Also, because extreme rainfall is so local, it's quite likely that it has not even been observed yet at the measuring stations, but it may really be possible in between. That would be an important question given there is climate change, and given that people in cities do planning for how much water they need to expect within, say, half an hour or something, and how often the drainage system is allowed to fail and stuff like that.

So we developed a technique to get good quantile estimates even in small samples. Then we figured out that other people had already done very similar stuff and there's a lot of theory on it, but it's good if you find out something and then find it's already been accepted. Anyway, if you apply that, you see a completely different picture. It's the same kind of thing here, but then it kind of continues to rise. Now there is a lot more uncertainty going on, and it's somewhat unlikely that you have like 200 or 300 millimetres within one hour. I mean, that's not to say it's impossible, but it's getting very, very unlikely if the maximum ever measured is like 80 or something. But the point is, even though we haven't measured very, very extreme precipitation on very hot days yet, it may be possible. And it was kind of nice to look at this with a whole bunch of weather stations across Germany.

Alright, one slide on how the community helped in putting all of this together. I mean, we're at a FOSS conference. You probably all know Stack Overflow. I benefit greatly from that, and once in a while I post something there too. Parts of the community kind of lobbied the German Weather Service into publishing all the data. I mean, it's tax paid anyway, so why not have it public? They were actually open to that reasoning, and about two years ago or something, I believe, they started publishing all that, so it's available online for free, which is pretty cool. Then there's this whole R package distribution infrastructure that you can use to create and share something like that. It's really great. Also, this map has been pretty easy to create. I guess you're all familiar with Leaflet? I see a lot of nodding. Geospatial, definitely, I guess that's expected. There's an R package linking right to it, and it's just a few lines of code to create that.

So I'd like to finish off with saying that free and open source software is awesome, not a big surprise at a conference like this. The German Weather Service has become pretty awesome. And since you cannot have too many awesomes on one slide, I'll not say that my package is awesome, but I'll say that you can use it to use all of that data very easily. So thanks very much for your attention.

So the question is, what's the origin of the data? There's a lot of radar data also available. I am using observational data from climate stations operated by the German Weather Service, some of them for a very long time. Maybe you noted when I showed the climate graph for Potsdam that it's been around since 1893. I'm a geoecologist, and for any people working in something like that this is very, very rich data, because you need long time series to look at trends, quantify climate change and all that. So this comes from observational stations. There are like 5000 in Germany. They don't all operate anymore; they're expensive to run, and now with the new advance of radar and stuff, all over the world there are fewer and fewer climate stations. But this is all observational stuff, and it's measured by traditional rain gauges, partly ones that had this rotating cylinder with paper on it that has been digitized, partly with new measurements. It's kind of a mixture that you would read about in the metadata.

Do you have any plans to include more from the rest of Europe or the rest of the world, try to unify the format, so that you can maybe make a map of Europe with all stations?

Do I have plans to include data from all over Europe or the world and kind of organize that into a common structure? Currently not, because theoretically I'm writing a PhD and not doing this kind of stuff, so this is a hobby project, which is already kind of large. For America, the data is available fairly easily; there's even an R package for that, from NOAA. So yes, it would be an interesting project. It's a bit beyond my current time capabilities.

Same question, but: are there standards of communication of information?

Are there standards for communicating this type of information?
Yeah, so there are a couple of standards in gathering the data. For example, air temperature, if no elevation is specified, is measured at 2 meters height, stuff like that, which makes it somewhat comparable. Now, of course not all countries follow those standards. I have the general impression that this data that is publicly available has been filtered fairly well. But I have also worked with data from, in that case, the official Austrian weather service, where I had rainfall sums that were like 2,000 or 3,000 millimeters per year, whereas it should be like 1200, 1500 maximum. At some point I figured out that they had a gap in their measurements from, say, July 13 to September 27 or whatever it was, and instead of having NA data there, they just interpolated linearly, which is of course not valid for precipitation, right? It's very unlikely that each day you'll see just a little bit more precipitation than the previous day. So even with standards, you do have to filter out stuff.

What's next for your project?

What's next for my project? Getting feedback from more people, like if you have any ideas about this. Also, I'll be spreading this in the community at my university and in the research institutes close to it. I'm in touch with the German Weather Service, actually, seeing what's coming around, what bugs there still are in there, or what features people request. Yeah, it's an ongoing project. As I said, it's a hobby project, so I really need to watch out not to spend too much time on it.

Perhaps some suggestions: if you use R for geospatial, what are good packages to use?

What are good packages to work with in R for geospatial data? I'm probably not quite the correct person to answer this. I don't do too much of that. But of course there's the rgdal package, and there are packages to couple R with QGIS. I've worked with that a little bit. Yeah, no good suggestions, probably. Oh yeah, sure, the basic package that handles all of this is sp, for spatial stuff, but I don't think I've even accessed that directly. It's all in the background for me. I think there's a vignette for spatial data, yeah, and there is a task view on CRAN. So CRAN is the Comprehensive R Archive Network, where people publish R packages, and they also have task views that kind of give an introduction, like, you know, what are common packages you would use.