 I'm very pleased to be following Alex, and I was really happy to see that among the many boxes he had of sources of big data that inform big decisions there was the Bureau of Meteorology. What I'd like to do today is unpack what sits behind that box a little. He also talked about the definition of big data and, as we all do when researching a talk (and I've had a passion for data for a long time), I looked it up, and there are lots and lots of different definitions. The one that works best for me is Gartner's, which talks about high-volume, high-velocity, high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision-making. That sounds very much like the Bureau of Meteorology; it's what we do, and I'm pleased to credit Gartner because I'm going to steal another of their ideas later in the talk. So the Bureau of Meteorology: our mission is all about environmental intelligence, providing environmental intelligence for Australians, for public safety, societal wellbeing, security, economic prosperity and sustainability. We define environmental intelligence as the conclusions drawn from environmental observations and models to guide decisions and actions by government, business and individuals. So while we're very much a big data organisation (and if you define big data as lots of it, we've been in that game for a very long time), our focus is also on the decisions: making sure we provide the services and products that help our users make their decisions. It's not just about the decisions we make. So who are we? We are a national capability: reliable, resilient, and we've been around an awful long time. We provide national observing systems and we're part of global observing systems. 
We look at the atmosphere, the oceans, land and water, the land surface, and the space environment right out to the sun. So we really are about observing everything around us. We're a 24/7 operational facility. We run forecasting systems for weather and, on longer time scales, for climate, for oceans, for flooding. We have supercomputing and massive data storage capability, with all the headaches and problems that go with that as well as all the opportunities it provides. We require very high uptime on all of our communication systems, and we need to be able to respond immediately to any sort of business discontinuity or disaster recovery event. One of the things we learned very quickly is that the more data you hold, the more resilience you need and the bigger the challenge of recovery when something goes wrong, because you can't just sit around and wait for a day while the systems get better. We have professional forecasting capability across all of the disciplines involving water, climate and weather, and as well as our core capabilities we also outpost people into our user communities. There's just a small sample there, but we have very substantial outreach into our user community. We work with our flood communities, we work with our agriculture communities, to understand their needs and make sure we can plan our systems accordingly. So, big data: where does it all come from? Well, we track everything to do with the earth-atmosphere system: the weather, air quality, volcanic ash, radiation. We track the oceans: tsunamis, tides, storm surges, everything to do with the ocean environment. 
On land, river floods, river flow volumes, groundwater, soil moisture, water quality, water storage, right out to space. And as for weather itself, one word captures so many different things, climate as well, in terms of what we have to monitor and the time scales we have to monitor them on. All of that comes together, not just to deliver our national products and the services we provide to Australians, but also as part of our contribution to the global observing system. If you added up how much is spent around the world by all the national meteorological services, I did the sums a few years ago and it came out at around $10 billion. The Bureau's total budget is not more than $300 million, a spit in the ocean, and for our contribution we get free access to everything that comes from a lot of the met services around the world. So it's a pretty exciting and pretty data-laden world to be in. Every day we collect about 10 million atmospheric observations. They are all available to go into our NWP, our numerical weather prediction systems; not all of them get used, but you need all of them in order to assess which ones you're actually going to use. About five and a half million ocean observations come in on a daily basis, and from the satellite perspective those numbers just keep climbing. We're already receiving data from the Himawari-8 satellite that was launched last year, and in one day that increased the amount of satellite data we were getting from a single satellite by 50 times. That's going to keep on going. So we're getting massive amounts of data that we need to receive, visualise, assimilate, put into our models and apply. 
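To put those collection figures in perspective, here is a back-of-the-envelope sketch of the sustained ingest rate they imply. The observation counts are the ones quoted above; the average record size of 200 bytes is purely an assumption for illustration, not a Bureau figure.

```python
# Rough ingest-rate arithmetic from the figures quoted in the talk.
# bytes_per_obs is an assumed average record size, not a Bureau number.

SECONDS_PER_DAY = 24 * 60 * 60

atmospheric_obs_per_day = 10_000_000   # ~10 million atmospheric obs/day
ocean_obs_per_day = 5_500_000          # ~5.5 million ocean obs/day
bytes_per_obs = 200                    # assumption for illustration only

total_obs = atmospheric_obs_per_day + ocean_obs_per_day
obs_per_second = total_obs / SECONDS_PER_DAY
daily_volume_mb = total_obs * bytes_per_obs / 1e6

print(f"{obs_per_second:.0f} observations per second, sustained")
print(f"{daily_volume_mb:.0f} MB per day, before satellite streams")

# A single next-generation satellite multiplying its data stream 50x,
# as Himawari-8 did, dwarfs the in-situ volumes above.
```

Even at a couple of hundred observations a second, the in-situ stream is modest; the point of the 50-times satellite multiplier is that the satellite side, not the surface network, dominates storage and bandwidth planning.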
Most of it arrives in near real time, and even if we're not using it in near real time we have to accept it in near real time, otherwise we'll fall so far behind we'll never be able to recreate the moment. If you look at just one day's collection for weather forecasting, that's about 15 million observations, about half of it from satellites; the rest comes from surface sources, from ships, from land-based observations and from buoys. Being in the southern hemisphere, where we're so heavily surrounded by ocean, satellites have revolutionised our observations, but those rare observations we have in the water, especially the pressure information we get from drifting buoys, are hugely valuable at giving us the location, location, location element of what's happening where. So the next step: we've got all that data in, what do we do with it? Over the last 30 or 40 years we've increasingly been bringing computers into our work and generating numerical models on different time scales. If you look at the full range of things we're engaged in at the Bureau, we need that information to issue forecasts and alerts on time scales from minutes right through to centuries. As you get to longer time scales the uncertainty grows and the more general the prediction becomes. You can be quite explicit at the very short time frames, and right down the bottom, around the minute time frame, the nowcasting time frame, the observations themselves and the models are almost the same thing. As we get longer and longer, we feed that information into more and more complex models, and those models become more and more generalised as the time frames grow. 
We've got a limited amount of processing capacity in the Bureau, and we've got to balance our computational requirements against the value each product provides to the range of customers we serve. Those that need things on a very short time frame are making decisions on a very short time frame; they need them very quickly, and we've got to turn them around in our computer system very quickly. The longer-term runs can percolate away in the background, but they are essential for the longer-term record as well. We have a supercomputer, a really big one. Our current system delivers about 104 teraflops, for anyone who thinks in floating point operations. We've got a new one, Australis, that will be operational on the first of July this year, and then we'll have a mid-life upgrade to that system in 2019. That'll take us up over 5,000 teraflops. This still leaves us well behind our international competitors. I think when we turn on Australis we might just flip into the top 100 briefly, and then the rest of the world will keep on going and leave us behind. But it's important that we at least stay somewhere near where our collaborators are internationally, so that we can share and exchange models, and so that we can add the uniquely Australian contribution in terms of our data, our understanding of our processes and what that actually means for the broader-scale models. One thing we don't do on our own supercomputer at the Bureau is research; that's done on the national facility, and that's somewhere we have to rely on a collaborative environment. So I've mentioned numerical weather prediction, NWP, a term we throw around a lot in our business. 
It's basically about carving up the atmosphere and the oceans of the planet into little boxes, simulating all of the physical processes in each box, then putting all those boxes together and looking at how each box affects its neighbours and how that story comes together. The processes we have to look at are physical processes around heat, solar and terrestrial radiation, the behaviour of water vapour in the atmosphere, and the horizontal and vertical motion of the atmosphere and the oceans, the ocean temperatures as well. All of these things drive how the model develops, and increasingly what we're trying to look at are the interfaces between them, between the atmosphere and the ocean, and how that drives the development of our models over time. We run models on lots of different frameworks. We have a global model, the outer scale of the frame there, a regional model, and then we come down to smaller-scale models. The smaller-scale models are higher resolution because the boxes in those models are smaller, and they take a lot of compute power to run. ACCESS-G, our global model, has about 21 and a half million boxes, each about 40 kilometres on a side and about a kilometre high. ACCESS-R is our regional model; that's about 56 million boxes, each 12 kilometres on a side and just over one kilometre high. That's what we do now. In the future, as we implement our new supercomputer, that's going to enable us to increase the accuracy of our models, to decrease the size of those boxes, and increasingly to run them at higher and higher resolutions and more frequently. So by the end of 2020 we'll be delivering our global model at a 12 kilometre scale, which is a massive improvement in resolution and in the detail we can provide, and that then scales down to about one and a half kilometres at a city scale. 
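Those box counts can be sanity-checked with simple arithmetic. A minimal sketch, assuming a spherical Earth with roughly 510 million square kilometres of surface area; the implied number of vertical levels and the resolution cost factor are illustrative, not official model specifications.

```python
# Consistency check on the quoted ACCESS grid sizes.
# EARTH_SURFACE_KM2 is an approximation used only for this sketch.

EARTH_SURFACE_KM2 = 510e6  # approximate surface area of the Earth, km^2

def implied_levels(total_boxes: float, horizontal_km: float,
                   coverage_km2: float) -> float:
    """Number of vertical levels implied by a box count and grid spacing."""
    columns = coverage_km2 / horizontal_km ** 2
    return total_boxes / columns

# ACCESS-G: ~21.5 million boxes, each ~40 km on a side, global coverage.
global_levels = implied_levels(21.5e6, 40, EARTH_SURFACE_KM2)
print(f"ACCESS-G implies roughly {global_levels:.0f} vertical levels")

# Shrinking the horizontal spacing from 40 km to 12 km multiplies the
# number of columns, and hence the compute cost per level:
cost_factor = (40 / 12) ** 2
print(f"about {cost_factor:.0f}x more columns at 12 km resolution")
```

The roughly elevenfold growth in columns (before any increase in time-step frequency, which finer grids also require) is why the 12 kilometre global upgrade has to wait for the new supercomputer.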
The atmospheric flow is intrinsically chaotic, so there are limits to how far ahead you can model in a deterministic sense. Small errors in temperature or wind or any of the other initial conditions (and of course we can't get perfect information about the initial conditions; we can't measure everything, we have to sample the atmosphere at discrete points) amplify over time and limit the accuracy of our models. So what we're looking at doing is what we call ensemble modelling, where we start a number of model runs from slightly different conditions and propagate them into the future, so that we get much more information about the uncertainty. I've just been told to hurry up, I talk too long. Okay, this is how you get our information, through our website. I'm only going to show you one product, because we produce lots and lots of products, all of which I think are highly relevant to the agricultural industry. This is our continental water balance model, where we basically take in rainfall information, apply national climatological processing, and deliver a number of products that are of particular value to anyone on the land, particularly those who want to know about water. I'm not going to go any further on that because I need to get back to the big data story. We give that information to you in lots of different ways, through mobile, through the web. So where does this take us in the future? One of the big things we need to think about in terms of big data is that we get so much data, we've got to make some really smart decisions about what data we can actually use. Go back almost a hundred years to this very smart man, Lewis Fry Richardson, who said: perhaps some day in the dim future it will be possible to advance the computations faster than the weather advances, and at a cost less than the saving to mankind due to the information gained. But that is a dream. 
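The ensemble idea can be demonstrated with a toy chaotic system. This is a minimal sketch using the Lorenz-63 equations, the textbook example of sensitive dependence on initial conditions, not any Bureau model; the step size, perturbation size and ensemble size are arbitrary choices for illustration.

```python
import random

def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz-63 equations."""
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

def run(state, steps):
    """Integrate a single ensemble member forward in time."""
    for _ in range(steps):
        state = lorenz_step(*state)
    return state

random.seed(0)
base = (1.0, 1.0, 1.0)

# Build an ensemble by perturbing the initial conditions very slightly,
# mimicking our imperfect knowledge of the real atmosphere's state.
ensemble = [tuple(v + random.uniform(-1e-3, 1e-3) for v in base)
            for _ in range(20)]

forecasts = [run(member, 2000) for member in ensemble]
xs = [f[0] for f in forecasts]
spread = max(xs) - min(xs)
print(f"ensemble spread in x after 2000 steps: {spread:.2f}")
```

Tiny initial differences grow until the members diverge completely; the spread of the ensemble is then a direct, quantitative read-out of forecast uncertainty, which a single deterministic run cannot provide.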
Well, we passed that dream a long time ago. One of the reasons we've been able, in our world, to make some really significant gains in turning those increasing amounts of data into higher-resolution, higher-accuracy products is by working with what I've called our nexus of forces: data, systems, science and services. Our data has increased rapidly, particularly with satellites; our systems, our supercomputing, have got more and more powerful; and as those two have developed, our science has improved and we've been able to deliver more and more exact services. This diagram on the left basically shows you the forecasting skill for a three-day forecast: the bottom line was the Southern Hemisphere 40 years ago, and at the top there is the Northern Hemisphere. Now they're both together and they're both up here. Satellites, supercomputing and science have been absolutely critical in coming together to allow us to improve our services here to be as good as you can get them anywhere else in the world. It's through that alignment that we've been able to do it. And we've got to make sure we apply our resources in the most efficient and effective way, so that we can continue to make smart decisions about the data we bring on board, apply our systems as smartly and as cleverly as we can, and deliver the services we need. So one of the questions is, where does big data become too big? I've heard the word value thrown around quite a few times today, and I'm a great aficionado of the word. When the costs of acquiring and using the data exceed the value you get from it, that's when it's too big, and that's when you've got to make some really smart decisions about what data you're actually going to take any notice of. 
And when you talk about the cost, you've got to look at the cost right across the value chain: from acquisition, through processing, ingestion, quality assurance, storage, archiving and analysis, every step of the way, right through to dissemination of products. But a more effective view of the value chain is really focused on the user. That is about better defining the need and the problem you have to solve; measuring the things that answer the questions; transmitting, accessing and sharing the data, and bringing in data from other sources if you can't get it all yourself; processing it; and then applying it and verifying that it actually meets the need you said you'd meet in the first place. That is the life cycle for big data that's really important to us. If it doesn't deliver, if it doesn't solve the problem, if it doesn't address the need, then it's not delivering value. So, new sources of data. Again, it comes down to being fit for purpose. Does it meet the service needs in front of us? Does it meet our new research goals, where we're trying to look really innovatively at new applications? Will it feed the new models? What are the new and emerging needs and opportunities? There are lots and lots of different observations coming in, but for some of the opportunities you've got to look a bit wider afield. Do we need to be doing it all ourselves? We feel challenged all the time by the Googles and Amazons of this world, but there are things they do really well and things we do really well, and they complement each other. So it's really about making sure we focus our efforts on the things that are most important to our users and to the services we provide. We're often told that crowdsourced data will do it all for us. But again, how do we use crowdsourced data? Is it of value to us? 
And I think potentially it is, particularly when we're looking at verification: has something happened? Has a front gone through? Has a storm happened? How big was the hail? It's good for things like situational awareness, but it's probably not much good to us for numerical weather prediction. On the challenge of supply and demand, maybe the key point is that we used to be in a paradigm where we collected data, applied some science to it and delivered services. What's changing now is that those three clouds are actually overlapping. We work with our users to get the data we need; we collaborate with them on the science we evolve and the services we deliver; and we get feedback on those services, which comes back and helps us improve them. So we've moved into a much more interdependent relationship with our users, so that we can deliver better outcomes for them. And just lastly, we've heard about the internet of things. We are looking there; it's an area we're trying to move into, and obviously we're not the Googles and Amazons of the world, but we're looking at how we can enhance our environmental intelligence outputs in a more timely way. We're exploring some options, we've got some demonstration systems internally, and it's somewhere we're hoping to go in the longer term. I was excited to see this example a friend of mine showed me from an Italian institute of biometeorology: an internet of things approach to developing strategies, technologies and really practical solutions that bring together environmental sensors, food and agricultural information, and energy, in order to develop systems that are very directly applicable to their community. Thank you.