So, welcome everybody. My name is Albert Kettner, and I'm part of the CSDMS facility. I have the pleasure of introducing our next webinar, which is the last webinar of the spring series; we will continue again in the fall. Today's webinar is provided by Professor Reed Maxwell and Professor Laura Condon. Reed is a professor at Princeton, and Laura Condon is a professor in hydrology and atmospheric sciences at the University of Arizona. The presentation will be about adventures in developing community modeling platforms; it's going to be an introduction, I think, to HydroFrame. So with that, thank you Reed and thank you Laura for agreeing to give a presentation, and go ahead.

Yeah, terrific. Thank you so much. I really appreciate the opportunity for both of us to tag team this talk. As was mentioned, I'm Reed Maxwell at Princeton University, and I'm honored to be joined by Professor Laura Condon at the University of Arizona. We're going to talk about adventures in developing community hydrologic modeling platforms, HydroFrame and HydroGEN. And we'll play on the fact that there are a lot of hydro modeling platforms, not to be confused with HydroShare.

So part one is hydrology in the supercomputing age. It's no surprise to any of us on this webinar that we're in the midst of a revolution in computing and data, and computers have advanced substantially over the past half century. This is a really nice pictorial diagram, and it's interactive; Lawrence Livermore National Laboratory put it together (computing.llnl.gov/history). It's a really neat way to think about how computers have advanced over the years. And of course, two national laboratories that were heavily involved in defense work coming out of World War II, Lawrence Livermore and Los Alamos National Laboratories, have also, hand in hand, been real innovators in scientific computing. So we can roll back to 1969 or 1970, when the new workhorse was the CDC 7600, the fastest supercomputer in the world at the time. Now we have the so-called leadership-class supercomputers that DOE maintains. One of the most recent is Sierra/Lassen, which is actually a split machine: Sierra and Lassen are the classified and unclassified systems, and they're something like 20 petaflops, two different machines. It's really just incredible advancement.

And hydrology has progressed as well. If we think about what was available for hydrologic models in the late 1960s, there were hydrologic models, but they were analog models: resistor networks that you plugged together, with each resistor representing a hydraulic property. It must have been incredibly arduous to calibrate, and that was state of the art; we did not really use so-called digital computer models in those times. If we move forward, we now have things like what I'm showing on the right. This is a paper out of Stefan Kollet's group, published in Water a couple of years ago. This was a prototype experimental operational forecast, simulated from bedrock to the top of the atmosphere and running in experimental mode as a proof of concept, showing that you could, over the EU-CORDEX domain, provide forecasts of a full set of hydrologic variables, this is plant-available water that I'm showing here, linked to the numerical weather forecast from the German Weather Service. Just this immensely amazing advancement.
And in fact, as I mentioned, there were no so-called digital hydrologic models in the late 60s, but there was the concept for one. The concept was envisioned half a century ago in the now-famous blueprint for a physically based, digitally simulated hydrologic response model by Freeze and Harlan, published in the Journal of Hydrology in 1969. What's amazing is that the conceptual model I'm showing from Freeze and Harlan's paper is quite modern; it very much represents how we think about hydrologic systems even today. We have different portions of the landscape, ET that changes across the landscape, precipitation, channel flow, all of these different components. Then we drop below the ground surface, and we have groundwater, flow lines, a water table, equipotential lines. Really a very forward-looking approach.

Now, the catch is that the fastest computers in the world when landmark papers such as Freeze and Harlan's were written were much slower than the average smartphone probably in your pocket right now. In fact, in 50 years, computers have gotten nine times ten to the ninth times faster. Seymour Cray's CDC 7600 in 1969 was a 10-megaflop machine; Sierra and Lassen combined are 94 petaflops. Rolling through the cheat sheet: a kilo is 1,000, a mega is 1,000 kilos, a giga is 1,000 megas, a tera is 1,000 gigas, a peta is 1,000 teras, and now we're talking about exa, which is 1,000 petas. Just an incredible, incredible advancement in computing. And if we place an ancient iPad or an average smartphone on that scale, it's in the megaflops to teraflops range.

So Freeze and Harlan had the vision, but they didn't have the resources. Hydrologic modeling has since been accelerated by several different movements. The first is collaboration between computer scientists, applied mathematicians, and hydrologists. One of the main reasons this matters is that the efficiencies gained from numerical solvers have kept pace with the increases in computer speed; we're not solving the same problems using the same applied mathematics approaches we had in the 1960s. Our applied mathematics has advanced tremendously. Secondly, all modern supercomputers are parallel. This incredible advancement in computing speed has come as much from the massive parallelization of computers that happened in the 90s as from advancement on single cores. And if we look at how the speedup is realized, we have to understand that our hydrologic models need to take advantage of it. So I'm showing three different examples of so-called weak scaling: you take a unit problem size, run it on one processor, and then scale that unit problem out, so that each processor carries the same unit problem size even as you run in parallel on, say, 16,000 processors here, or 250,000 processors, or 250 nodes of four GPUs each; the overall problem size grows with the number of processors. And we talk about scaled parallel efficiency: how efficiently are we solving a problem that's 16,000 times larger on 16,000 processors, compared to solving the unit problem on one?
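To make the weak-scaling discussion concrete, here is a minimal sketch, with made-up timings rather than measurements from any real machine, of how scaled parallel efficiency is computed, along with the speedup arithmetic quoted above:

```python
# Scaled (weak-scaling) parallel efficiency: the per-processor problem size is
# fixed, so ideally wall-clock time stays flat as processors are added.
# These timings are illustrative stand-ins, not real benchmark numbers.
timings = {1: 100.0, 16: 104.0, 256: 110.0, 4096: 121.0, 16384: 132.0}

def scaled_efficiency(t_one, t_n):
    """Weak-scaling efficiency E(N) = T(1) / T(N)."""
    return t_one / t_n

for n, t in timings.items():
    print(f"{n:>6} procs: scaled efficiency = {scaled_efficiency(timings[1], t):.2f}")

# The speedup arithmetic from the talk: a ~10 megaflop CDC 7600 (1969)
# versus Sierra and Lassen at ~94 petaflops combined.
print(f"speedup factor: {94e15 / 10e6:.1e}")   # ~9.4e9, i.e., 9 x 10^9 times
```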
And what you'll notice is that not only have we evolved in terms of size, but we've also evolved in terms of platforms; we're now taking advantage of things like GPUs. All of this requires good software engineering.

Now, the second advancement has been open-source software and centralized development, which provides this transparent, community approach. We've had version control for a long time; GitHub has been the engine that makes it very transparent and very easy to use, but there were several iterations of version control before Git. This also brings in things like automated regression tests, so that we now have a suite of tests that a code must pass (a minimal sketch of this pattern appears below). It's transparent that if we make a change, as the code advances, we don't introduce bugs and we don't introduce backward incompatibilities. And it's very easy for the community to see what the code is and what's been done to it; we have a clear, transparent version history. So this is an important piece in building community development.

Another piece that's really important in building community development is intercomparison workshops: intercomparisons between models, verification problems, benchmark problems, and validation data sets. These again bring the communities together, but they also provide standards and build trust in these complex simulation platforms.

Lastly, we could give an entire talk just about data. Data has exploded in the same way that simulations and computing power have exploded. These new data sets have become widespread, and they're open to the community and provided in an open, FAIR way, which has been an important advancement as well, whether they're data sets for comparison, like those on Google Earth Engine or published in Scientific Data, or input data sets, like the Shangguan et al. depth to bedrock, that provide new inputs to our large-scale models. So, I'm up, right?

Yeah. So we started off talking about these advances in computation because I think it's really important to understand that over recent decades, this has fundamentally changed the way we can solve, and even conceive of, problems. And the question is, obviously, we're not just doing computing for computing's sake: what have we learned from this, and how has it helped advance the science? So there's a large hyper-resolution modeling community, just as an example, next slide, that has been pushing the ability to take advantage of these large computing advances to build better large-scale models that can help us answer really big questions, where we know we need to be looking across large scales at high resolution and we need to be including a lot of complex processes. Next slide.

And that's a lot of the work that Reed and I have done with integrated hydrologic modeling: specifically, exploring the role of groundwater-surface water interactions and understanding how they play out in our watershed systems, both in the past and in the future. And specifically, we've looked a lot at groundwater, which, as many of you probably know, is our largest freshwater reservoir. It's 99% of the world's unfrozen freshwater. Fifty percent of the drinking water that we use in the U.S. comes from groundwater, and it provides 60% of the water for our global agriculture.
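Returning to the automated regression testing mentioned above, here is a minimal sketch of the pattern. The `simulate()` function is a toy stand-in, not a real hydrologic code; a real suite would compare full output fields against archived reference results from a trusted release:

```python
# Minimal sketch of an automated regression test: every proposed code change
# must reproduce trusted behavior before it can be merged.
import numpy as np

def simulate(nx=11, steps=5):
    """Toy stand-in for a model run: explicitly diffuse a unit pulse."""
    h = np.zeros(nx)
    h[nx // 2] = 1.0
    for _ in range(steps):
        h[1:-1] += 0.25 * (h[2:] - 2.0 * h[1:-1] + h[:-2])
    return h

def test_known_invariants():
    h = simulate()
    assert np.allclose(h, h[::-1])            # a symmetric problem stays symmetric
    assert h.max() < 1.0 and h.min() >= 0.0   # diffusion only smooths the pulse

def test_against_archived_reference():
    # In a real suite the reference is archived output from a trusted release,
    # loaded from disk; regenerating it here keeps the sketch self-contained.
    reference = simulate()
    assert np.allclose(simulate(), reference, atol=1e-12)

test_known_invariants()
test_against_archived_reference()
print("all regression tests passed")
```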
However, groundwater is also being depleted at an alarming rate. And this is challenging, because it's hard to see and hard to model; it's much more difficult to understand what's going on in the subsurface than in our streams and surface water reservoirs. As a result, it's often not included, or greatly simplified, in most of our management models. So I'm going to provide just a few examples of the kinds of science we've been doing with these models to better understand the role of groundwater in our systems and what that can mean for the future.

This is some work from Lauren Thatch, a former student of Reed's. She did some really cool analysis combining ParFlow integrated hydrologic models with remote sensing to better understand how the human and the natural systems interact. What you can see from this graph is, first of all, that during droughts, we pump more groundwater; that's a big-picture trend we know is true. Next slide. What's really cool about this approach is that by combining the integrated hydrologic model with the remote sensing in sophisticated ways, we can actually use it to better understand the human signal on the system. What you can see here is that farmers in the San Joaquin may have pumped 1.5 times the flow of the Colorado River in the 2014 drought as a result of declining surface water supplies. Next slide.

So that's an example of what we can see with a single study in the Central Valley, but we've also done this nationally. These are some results from our CONUS simulations, where we're looking at streamflow declines as a function of groundwater depletion. We took all of the groundwater pumping that happened over roughly the last 100 years and applied it to the model to try to understand long-term streamflow changes. What we see is consistent with a lot of observations, but we've been able to do it across the entire U.S.: we see really significant losses. In some of our headwater systems, we have simulated stream losses of up to 100%, that is, complete loss of small tributaries, and we see that a lot along the High Plains, which is something that's been documented. And I think we have a zoom-in of the Colorado River coming up, where you can see really significant, these are still 10 to 50%, declines in streamflow. We can connect this to what we're seeing today, with decreased inflows to Lake Powell as we have a warming and drying system; we'll talk about that more in a little bit.

So that was the story of historical groundwater pumping, but we also know that our systems are getting drier. Previous studies focused on surface water systems have shown that the aridity boundary, historically the place where evapotranspiration is balanced by precipitation, with the more arid part of the country to the west and the more humid part to the east, used to line up with the 100th meridian, but it has migrated significantly over the last 100 years. Next slide. And so we can study this with our integrated models. We did some warming simulations, where we applied warming to our baseline scenarios, and we looked at the change in ET and the change in groundwater storage.
And what we see is that as warming happens, it's actually the eastern basins, the more humid ones, that have a stronger evapotranspiration response to that warming, because they have shallow groundwater available. What's really important about this is that it's a one-way trend: as that shallow groundwater gets used up, we become more sensitive to future droughts, because we've already used that buffer. We can see this happening in groundwater storage too, which is the next slide. What's really happening as we warm our system is that a lot of shallow groundwater is being depleted. So just as humans turn to groundwater in a drought, our natural systems, if there's storage in the subsurface, will use that storage as warming happens. Okay, next slide.

So those were just a couple of examples, and we could have given a talk entirely on the science of integrated modeling and what we can learn about groundwater-surface water interactions at both the regional and the continental scale. But really, the point of this section was to highlight that integrated models are powerful tools: they help us see things that we can't necessarily see with separated systems models, or without groundwater processes, and the connections between groundwater, plant water availability, and streamflow are really important, both for understanding managed systems historically and for understanding how systems are going to evolve in the future. Next slide.

Okay, great. So far we've talked about the fact that there have been amazing computing advances that have allowed us to ask different questions and build different models than we've ever been able to before, and we've shown that this can facilitate really great science too. But we would argue that this is not enough, that there's another step to this, and this is where our HydroFrame and HydroGEN projects come in. So, next slide.

I think it should be no surprise to anyone in this group that we're facing really large water challenges that are going to require innovative solutions. A lot of the solutions we've used in the past are really not ready for systems that are changing as fast as what we're seeing today. We know that the quantity and the fluxes of water are uncertain. We know that moving water around is really costly; water is heavy, so we can't just easily re-engineer and reconfigure our systems. And we have to think about things across scales. The picture before was Lake Mead, a huge reservoir gathering water from the entire Upper Colorado basin, but then we have things like individual irrigators making really small-scale decisions, and there are many stakeholders involved, from the national to the local level. And this is really not a problem that's coming up in 20 years; it's happening right now.

So, I mentioned Lake Mead; I think it's a really good, concrete example of what we're up against. There's a picture of Lake Mead 20 years ago, and there's a picture of Reed at Lake Mead last summer. You don't have to zoom in much to see that there's a huge difference in those water levels, and if we went back there today, it would look even worse. This is causing huge problems, as of course I'm sure you're aware. These are just a couple of headlines: we're seeing the first shortages on the Colorado River.
They're actually going to force major cuts; we have shortages declared, and even California, a really senior water right holder on the Colorado River, could see cuts. We have water planners praying for snow, and so far 2022 is not looking a whole heck of a lot better. What's really interesting about this, though, is that if we look at a time series of what's been happening in the Colorado River, we actually haven't had abnormally low snow years. If you look at the lowest snow years on record, those aren't our most recent years; those were in the 70s, or 2012. So how are we in a situation with such historically low water levels? The issue is that our river basins are not simple surface water systems in which snow falls on concrete and makes its way into the river; this is a really interconnected and complex system. As we have temperature changes, that can change the timing of the snowmelt. It can change the amount of water that plants use, and it can change the balance between how much water infiltrates and how much baseflow is provided by groundwater. So it can really change the yield of the basin, which is what we're seeing right now.

That's just one example. Overall, though, we're facing significant challenges as we try to manage these changing systems. We have really sparse observations of the subsurface, and we're understanding more and more how important the subsurface is, especially when we're dealing with systems that are warming and changing. And we have a fundamental issue: a lot of our decision-making tools rely on relationships derived from observations of the past, and the one thing we're sure of is that the future is going to look different. We don't know exactly how, but we know it will differ in significant ways. And we can't just model parts of the system in isolation; we know these systems are complicated and interconnected. So, Reed, do you want to take it from there?

Sure. Can everybody still hear me okay? Okay, great. Everything has automatically switched around on my computer, so apologies if that causes disruption. So what can we do as a community? Well, we have a number of things we can do: we can democratize model access, we can embrace hybrid solutions, we can decrease the gap between science and application, and we can engage the community. And this is what the community platforms that Laura and I, along with the really big team we're representing here, have been hard at work striving for.

So the first thing we're going to talk about is democratizing model access. Our scientific solutions alone are not enough: we have these great science breakthroughs that come from applying hydrologic modeling to computing, but we also have big challenges in computation, data, and workflow, and we need to change those. So the first thing I'll talk about in terms of this approach is the HydroFrame platform. HydroFrame is a National Science Foundation cyberinfrastructure project; we're very grateful to the National Science Foundation, and we're also very grateful to our large number of collaborators and partners. There are something like 10 co-PIs from seven institutions, including some of the folks on this call, and it's been a really great community engagement project.
These include Utah State, CUAHSI, NCAR, and Boise State, in addition to Princeton and the University of Arizona. HydroFrame's goal is to take the large, national-scale model development efforts that we've been undertaking and make them accessible. What I'm showing here are some early results from CONUS-2, simulated water table depths over our CONUS-2 domain, which are the result of just a huge effort to improve topography, pull together better soil and land cover data sets, and build a really robust subsurface, as I'm showing here. We went from five to ten subsurface layers; we have semi-confining or confining units; and we tested a number of different 3D geologic configurations with depth to bedrock, so the transition between alluvial and hard-rock aquifers is in this new model. And we really wanted to find a way to make this easier for the community to use; that's the idea behind HydroFrame.

What I'm showing here is the CUAHSI subsetter. In HydroFrame, you can select a watershed and then automatically prepare all the data and run a ParFlow-CONUS simulation using cloud containers, with a whole range of options (a hypothetical sketch of this workflow appears below). So some of the things you might do: you can automatically subset and then launch ParFlow-CONUS; this is running on the Princeton hydro data center. This is an example of the job dashboard while a job is running, and then you can go in and look at results of the simulation and get different dashboard-type information. It really provides flexible output in this easy, interactive dashboard approach. Now, beyond this, the dashboard is really just the front end for a back end that is API accessible. So we're working hard on connections to HydroShare and connections to other platforms, providing real interoperability between data and simulation. So I guess it's still me.

So we envision a democratized community water platform for the U.S., where these large-scale simulations are easily accessible, platforms and portals allow us to run simulations, and, a piece that I'm going to hand off on, customized machine learning emulators serve a broad range of users. And so I'm going to hand this back over to Laura to talk about embracing hybrid solutions.

Yeah. So we talked about the HydroFrame project, which is where we're really providing access to the physics-based simulations and trying to make these more of a community resource. But we're also working on additional solutions where we combine machine learning with the physically based models. The motivation for this is that we have a growing wealth of Earth system observations and machine learning capabilities. We have so much data; Reed mentioned this at the beginning, we have so much more data today to work with, and we have these advances in machine learning, which can potentially really accelerate what we're doing. However, there are still a lot of gaps. Observations are really valuable, but they can't tell us the whole story, for a couple of reasons. We have local measurements that are difficult to scale; that's especially true of groundwater observations, which are point observations, where we can't take advantage of a network structure the way we can with streamflow. We have a really great quantity of new remote sensing data, but it can't see everything; these products often still rely on models, and the spatial or temporal resolution might be limited.
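As a purely hypothetical sketch of the subset-and-launch workflow Reed describes above: the endpoint, routes, field names, and helper functions below are illustrative stand-ins, not the actual HydroFrame API.

```python
# Hypothetical sketch only: `API`, `/subset`, `/run`, and the JSON fields are
# invented placeholders for whatever the real HydroFrame services expose.
import requests

API = "https://example.org/hydroframe/api"     # placeholder endpoint

def subset_domain(huc_id):
    """Request a watershed subset of the CONUS model inputs by hydrologic unit code."""
    r = requests.post(f"{API}/subset", json={"huc": huc_id})
    r.raise_for_status()
    return r.json()["job_id"]

def launch_run(job_id, start, end):
    """Launch a ParFlow-CONUS run on the subset domain in a cloud container."""
    r = requests.post(f"{API}/run",
                      json={"job": job_id, "start": start, "end": end})
    r.raise_for_status()
    return r.json()["status_url"]

job = subset_domain("14010001")                # e.g., a Colorado headwaters HUC
status_url = launch_run(job, "2005-10-01", "2006-09-30")
print("monitor the run at:", status_url)
```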
And then, in addition, all of the observations we have are fundamentally limited by the fact that our systems are changing. So if we rely only on data, it will be hard to predict changing dynamics that might not look, in the future, like anything we've been able to observe in the past. Models are a great tool to help bridge this gap, and I have a picture of our ParFlow model, an integrated, physically based hydrologic model, in the middle here. You might say, well, why not just make that a machine learning model? I think there's a lot of false competition between machine learning and physics-based models, because the answer is really "yes, and": we need all the solutions we can bring to the table to accelerate and better understand what's happening with our systems.

So in the HydroGEN project, we're combining physics-based models with machine learning, and the reason for combining them is so that we can generate simulations of things that haven't been seen in the past. If we take a purely data-driven approach, we run the risk of having our models really go off the rails when we start to look at the future. As with HydroFrame, this is a really large team and a large collaboration, co-led by Reed and me, so the University of Arizona and Princeton, working closely with the CyVerse team at the University of Arizona, which is a large NSF cyberinfrastructure project, and with early adopters; we've been collaborating closely with the Bureau of Reclamation, which I'll talk more about in a second.

As I mentioned, the real goal of what we're doing is combining the physics-based solutions with machine learning so that we can generate scenarios of a future that might not look like the past. And what's cool about this is that the machine learning emulators can run things up to a thousand times faster than the hydrologic models. Next slide. This is just one example of a convolutional model that we've set up, and the point is that we can use the physics-based model to train the machine learning emulator (a minimal training sketch appears below). If you go to the next slide, you can see that already, with our preliminary results, we can do a really great job: if you look at these two solutions, our machine learning emulator and our hydrologic model, you really can't tell them apart. What this allows us to do is more work on understanding risk and on running large-scale, long-term simulations, because we can cut our simulation times by orders of magnitude. Next slide.

So I mentioned the hybrid solutions: really trying to bring together everything we currently have, all this computing power and all this data, to get better answers faster. But I think there's another issue here, which is where both the HydroFrame and HydroGEN projects come in, and that's trying to decrease the gap between science and application. The challenge is that a lot of the best science is really hard to get out of academia, due to all of the workflow challenges that Reed talked about. So the HydroGEN project is an NSF Convergence Accelerator, which you might have heard of. The idea behind the Convergence Accelerators is use-inspired convergence research, really focused on advancing ideas from concepts to deliverables that can help society.
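To make the emulator idea concrete, here is a minimal sketch of training a small convolutional network to reproduce a stand-in "physics" mapping. The data and architecture are invented for illustration; the real HydroGEN emulators are trained on ParFlow simulations and are considerably larger.

```python
# Train a tiny CNN to emulate a stand-in "physics" operator on 2D fields.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(256, 1, 32, 32)        # stand-in model inputs (e.g., forcings)
y = F.avg_pool2d(x, 3, 1, 1)           # stand-in "physics" output to learn

emulator = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
opt = torch.optim.Adam(emulator.parameters(), lr=1e-3)

for _ in range(200):                   # short full-batch training loop
    opt.zero_grad()
    loss = F.mse_loss(emulator(x), y)
    loss.backward()
    opt.step()

# A trained emulator evaluates in milliseconds, which is where speedups of
# "up to a thousand times faster" than the physics model can come from.
with torch.no_grad():
    rmse = torch.sqrt(F.mse_loss(emulator(x), y)).item()
print(f"training RMSE: {rmse:.4f}")
```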
And so, through this project, we've really focused on taking a user-centered design approach, which means you spend a lot of time talking to potential users and finding out whether what you think is the solution they want is actually the solution they want. That requires a lot of asking questions and listening, as opposed to what we normally do with our scientific results, which is to present them for feedback. So we had to take a step back and say: okay, we think this is what we want to do, but is it actually something that people need? Is it something that they want? And discover what it is they need before we do our designing and our testing. Next slide.

So we did a user-centered design process where we talked to people and asked them what their problems and challenges were. We did more than 20 user interviews to understand the challenges they're facing, and we did group brainstorming and low-fidelity prototyping. We also did workshops for the HydroFrame project to understand people's workflows and what they're doing, and this has really strongly informed our application prototyping and how we go about doing it. Next slide.

This isn't to say it's a perfect process, but I think there are some important lessons learned here. First of all, better science doesn't necessarily lead to better outcomes unless you have significant user engagement and investment in infrastructure. It's not that we always have to be striving for better user outcomes; better science is valuable in itself too. But if you want something that can be used and can help people make better decisions in real time, it's really not going to happen unless you're engaging users early on and you have a way to invest in infrastructure. And when I say infrastructure, I mean things like applications and platforms that users can actually use, rather than just a scientific publication and a pointer to a GitHub repo: things that make everything really accessible for a diverse group of people. And this is not to knock our normal approach to doing science; this is just not how most projects are set up, because we don't usually have support for things like software developers. So one thing that is really cool about both the HydroGEN and HydroFrame projects is that we've been able to put resources toward developing this infrastructure. And I'll hand it back.
So one of the last things we want to talk about is that we can engage the community, and computing provides unique challenges and opportunities for education and outreach. I'm showing two different images here. This one is actually from an in-person workshop we had before AGU in 2019, where, for HydroFrame, we started a user elicitation process and really began to understand what users wanted, which was very valuable. The other image I'm showing is some of the direct education and outreach work we've been doing with K-12 students, here with the stream table. So we engage hydrologists to understand this range of user stories, and we can also connect K-12 students with hydrologic processes.

Now, one of the really valuable tools has been the sand tank aquifer; it's been incredibly valuable. This is the so-called ant-farm sand tank aquifer shown here, where you have water flowing in on one side and out on the other, actually, I had this wrong: it's flowing in on your right and out to the left. It's filled with different materials, and you can put dye in, so it provides a visual example of how groundwater works and some basics of groundwater-surface water interaction, pollution, and so on. And this is Dr. Lisa Gallagher, the education and outreach coordinator for the High Meadows Environmental Institute, who supports all of the projects we've been talking about here and has developed a number of really successful approaches.

Now, what we really needed was a mechanism both to teach hydrologic modeling and to teach hydrologic processes remotely, and to overcome some of the limitations of the physical sand tank. It's an amazing tool, but do more than a few dye injections and it takes hours to clean up; there's no easy way to reset it; if you have a long education event you need multiple sand tanks; they're expensive; and you need to repack them. So there are a lot of advantages to having the ability to do this in a different format. What we did was build a representation of the sand tank aquifer in a gamified interface running a real hydrologic model, so we have a visual representation of the water table and the ability to pump or inject water and adjust boundary conditions and materials, all on the fly, and all live (an illustrative toy version of this kind of state update appears below). This is a collaboration between the University of Arizona, the Integrated GroundWater Modeling Center at Princeton, and Kitware, the software company. You can launch it at sandtank.hydroframe.org; it's running on a large server here at Princeton, and it's an interactive ParFlow model of the sand tank aquifer. You can adjust boundary conditions, change material properties, and run different scenarios. It's even templatable, and we have generated different templates, so you could have a watershed template, or there's a Tucson TCE contamination template; it's a very highly flexible approach.

We also wanted to develop gamified applications to teach machine learning. This is again some work that Lisa Gallagher led, but notice this is in close collaboration with Professor Jill Williams, who directs the WISE women in science and engineering program at the University of Arizona, furthering the collaboration between these two institutions, with a large team including research software engineers at Princeton and folks from Kitware as well.
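As the illustrative toy referenced above, and emphatically not the actual ParFlow-backed sand tank, here is a small steady-state head solver showing the kind of state update a gamified interface performs when a user changes a boundary head or toggles a pumping well:

```python
# Toy 2D "sand tank": relax a Laplace/Poisson problem for hydraulic head with
# fixed heads at the tank ends, no-flow top/bottom, and an optional well.
import numpy as np

def solve_heads(h_left, h_right, well_rate=0.0, nx=40, ny=20, iters=5000):
    h = np.zeros((ny, nx))
    src = np.zeros((ny, nx))
    src[ny // 2, nx // 2] = well_rate          # + injects, - pumps (scaled source)
    for _ in range(iters):                     # Jacobi-style relaxation
        h[1:-1, 1:-1] = 0.25 * (h[2:, 1:-1] + h[:-2, 1:-1] +
                                h[1:-1, 2:] + h[1:-1, :-2] +
                                src[1:-1, 1:-1])
        h[0, :], h[-1, :] = h[1, :], h[-2, :]  # no-flow top and bottom
        h[:, 0], h[:, -1] = h_left, h_right    # fixed heads at the tank ends
    return h

# "Slider" interactions: set the boundary heads, then turn on a pumping well.
base = solve_heads(h_left=10.0, h_right=2.0)
pumped = solve_heads(h_left=10.0, h_right=2.0, well_rate=-5.0)
print("head drop at the well due to pumping:",
      round(float(base[10, 20] - pumped[10, 20]), 3))
```

In the real application, the interface drives ParFlow itself rather than a toy solver like this.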
And we built what's called SandTank ML, a machine learning education and outreach tool. What's really neat about this is that Lisa took it one step farther: we have a character who narrates the application, Dr. Sandy Loam, who walks the user through how to run the ParFlow sand tank and then build a machine learning model for it, walking everybody through the process. These machine learning emulators can be immediately compared to ParFlow, so you can make different choices about the emulators, run ParFlow, and then run the emulator, and get not only a visual sense of how well it does, but also quantitative estimates of errors and things like that. This is also live, running on a Princeton server, at sandtank-ml.hydroframe.org, and you're welcome to check it out. So that's just a quick snapshot. We thank you all very, very much for giving us the opportunity to give you a quick tour of some of the things we've been doing. On behalf of Laura and myself, we'd be happy to take any questions, and we have plenty of time for discussion. Thank you so much.

Thank you, Reed; thank you, Laura. This was an awesome, very inspiring talk, and I wrote down the websites to go play around with. Okay, as Reed said, we're open for discussion. It would be good if you could raise your hands; you can unmute yourself, and you can turn on your video if you want to ask any questions. And while we're waiting for the first question, I actually have one. When I was listening to your talk, you were mostly talking about how to make models and hydrological research resources available to the public, to make them more aware, I guess, and to close the gap between science and the community. But there was less talk about how to involve government, the government bodies that have put in place the water right regulations and those kinds of things, which might affect the water table and the availability of water quite a bit. Can you reflect on the importance of government water rights, and whether anything can be done through science to maybe change some of those water rights?

Yeah, I'll go for it, and then you can augment. So I think the answer is sort of yes and no. Water rights are really legal entities, and of course there's a lot of discussion of so-called paper water, which might be different than wet water, the actual water available. What I would say, and this is something we do quite a bit through the HydroGEN project, is that the goal of HydroGEN is to provide democratized data and short-term hydrologic forecast simulations, three- to six-month outlook-type, risk-based simulations, broadly to decision makers: municipal and city water managers, folks who don't otherwise have these capabilities in house. The idea is to provide a more level playing field on how much water we have right now, how much water might be available in your watershed, and how that might change over the next three to six months. And this really came out of the design process that Laura talked about.
Dr. Lindsay Barrett, who's at the Bureau of Reclamation's Technical Services Center, has been a partner in HydroGEN since the beginning. A lot of what Lindsay's group does is support the science needs of the Bureau of Reclamation, the real decision makers: those who are saying, all right, let's pull a lever on a dam, let's release water, let's not release water, what is the state? It's the TSC's job to support that, and a lot of this really was driven by the needs of that office and how to better communicate with those making these types of decisions. That was a real driver, but this has since broadened to municipalities, small municipalities that have to make water decisions very similarly, and even to urban design, understanding how to better design an urban landscape. There's been a sort of explosion of different use cases.

Okay, thank you, Reed. I see David has his hand up; go ahead.

I would jump in, following a little bit on the government point, and firstly, great presentation; Reed, it's good to see this, and I know I'm helping with some of it. What about other models? You described mostly work with ParFlow. Another part of the government, the National Water Center, is working on a new next-generation model that may also be part of the solutions. And when you think about the Colorado, there have been a lot of National Academies-type reports that have relied heavily on the VIC model. And I know CSDMS is all about model interoperability and many models. So what is your thinking about how to sort out which models should be used?

I'll go first if you want. So we've been doing a lot of work with ParFlow because that's where we started, but the idea is really not to say this is the best model and the only model; this is the model we have resources for and are building tools around. In all of the platforms we've been developing, we've made really big efforts to make everything we do open source and to follow standards that can be community standards, so other models would be welcome to be a part of HydroFrame or a part of any of the things we're doing. And as we mentioned at a couple of points in the talk, the answer is really not that we need one best model; we need everybody to bring whatever tools they have and to make those tools more accessible. So I see what we're doing with HydroFrame and HydroGEN as doing that with the models we're working with right now, but also as trying to provide roadmaps for how we can do it with other models. Everything we've done, we've really focused on being open source and inclusive. Reed, go on if you have more.

Yeah, I definitely want to augment your great response, because I think the big takeaway is that all of the software is open source and all of the data is open access. And, to Laura's point, it's not enough just to satisfy a National Science Foundation, DOE, or journal data management plan. It's really about not only providing these data: we're committing to housing them, committing to housing these model results, providing open access tools for diving in and grabbing those results, and then providing them in a way that they are part of an ecosystem.
So in all of the things we develop, we leverage big projects like xarray and other big data-management projects to make things interoperable, instead of redoing things that are already being done, plus connections to HydroShare and such. A big part of this is to make things more open, but then also to build the ecosystem. And as Laura said, we're a totally open club. A lot of these things are very research oriented; all of the machine learning emulation and all the training is super research-based, and there's a lot of work just to get to that point, so if we had to do multiple models or expand out that scope, it would just be too much. But that doesn't mean it's closed: we provide this platform, and it's free and open access, so somebody else could say, hey, this is great, I want to do this in another domain, similar to what Pierre Gentine is doing with his NSF STC around climate models, or it might be done in other hydrologic modeling domains. It's an open community, and we should all be sharing and building.

Thank you. I see two other questions; Mark first.

Reed and Laura, thank you for your talk. I have a quick, maybe trivial question, but as a research software engineer and a happy CMake user: how did you collaborate with Kitware?

So yeah, we're already CMake compatible and all those things. And glad to hear there are research software engineers in other places, and glad to hear you're using the R in research software engineer. I've been heavily engaged in PICSciE, the Princeton Institute for Computational Science and Engineering, here; I'm on the steering committee and have been a big proponent of Princeton's leadership around research software engineering, so it's a really great thing. Okay, so how did we get involved with Kitware?
They contacted us, and we started chatting a number of years ago; we had shared interests, we wrote a pre-proposal, and then we wrote a DOE SBIR, a phase one, which was a small seed proposal, and things really grew from there. We were really lucky that we could get them interested in a bunch of different things, so there was a lot of work that was done, and a lot of it was by one of Laura's former master's students, who did a long-term internship with Kitware. We modernized: we actually had a full TCL interface to ParFlow that was built when Python didn't even exist, providing the scriptability and key database for ParFlow input. It was obviously much outdated and really needed to be updated in Python and made very Pythonic, and that was one of the big things they did, one of the big outcomes of the SBIR. But then we got them really interested in things like, hey, let's build this gamified interface using a bunch of Kitware tools, so the web services and a bunch of other things that sit on top of the software architecture in SandTank and SandTank ML. We were just really lucky that we could engage software engineers in this. And now with CyVerse we've got a really great collaboration and a really great relationship that Laura has built over the years she's been at the University of Arizona; we've also hired our own software team, split between the two groups and two institutions, but we all function as one larger team. So we've been really, really lucky that everybody's been so amenable to this. Hopefully that was an answer to your question and not just a lot of cheerleading.

It's all good, thank you.

And I see one last question now, from Laura. Go ahead, Laura.

Thanks, Albert. Hi Laura, hi Reed. I really enjoyed your presentation. I'm glad I saw, kind of on a whim, that you were presenting on this topic, and I thought I really wanted to hear this talk. I was interested in the Convergence Accelerator project, because there's a lot of focus at NSF on translational research, and there's the new directorate for technology, innovation and partnerships, and I don't think all members of the academic community see a space for tapping into these new funds for translational research. So I was really excited to see that you were able to find this entry point. And I was curious: you talked about user-centered design and using user interviews to develop prototypes and do user testing. Could you talk just briefly about who that user community is? Is it a non-academic audience, an academic audience, all of it? I'm curious about the user community, the audience for that translation, and what kind of deliverables you're pitching to the Convergence Accelerator program. Thanks.

Sure, I can go first if you want, Reed. So yeah, that's a really great question, and the Convergence Accelerator project has been really interesting. We knew, reading the RFP for it, that it's a very different kind of NSF project, but I don't think we fully understood how different until going through it. And Reed's shaking his head vigorously, because it has been a really interesting experience.
What's really different about the Convergence Accelerator, and there are a couple of things, is that NSF is really pushing, and has you structure your budgets around, support for doing things like user-centered design and software development. In other projects, you might do those activities, but you would need to do them with grad students, which is really hard, because it has to somehow fit into their research and the timeline of a grad student. Within the Convergence Accelerator, we have a user experience team: there are user experience specialists at Princeton, professionals working there, who have experience doing things like running interviews and doing interface design. We have software developers who are just full-time software developers, designing and building the platform. We do also have postdocs and scientists doing research, but their job is really to do research on the machine learning, and that's separate from this process. So I do think it's structured intentionally, very differently, and it's very hands-on: in both phase one and phase two, we've had weekly training sessions on how to do user-centered design, how to think about making your platform sustainable, how you could get income from it, things like that. So I would say it is really intentionally structured differently, and it's very hands-on.

With respect to your question about the community: so far, we've really engaged water resources managers. We've talked to government agencies, federal agencies, local agencies, local water providers, and state agencies; we've talked a lot to people who do consulting in the water area, and that's mostly been our focus. We've also been collaborating with the WIFIRE team in California, which does real-time fire forecasting, and in phase two we've expanded to think more about corporate interests, things like insurance and reinsurance. But I would say it's really time-consuming and a lot of effort, and if there aren't resources and a structure that support it, I don't see how it's possible within our normal academic framework, because it's just not the kind of thing we have time or support to do. So it's really interesting, and I think people who are interested should look into the new NSF programs and how they work, because they really are structured quite differently.

Can I ask a follow-up question, just to follow up on something you said? You said you need all this support, and you talked about people who are experts in this kind of user-centered design, and the software developers. Is that support provided by the Convergence Accelerator, or did you have to have that support in place in order to be competitive in the Convergence Accelerator program?

I think the fact that we had a team trained in that probably helped our proposal; I don't know, I wasn't there when they reviewed it. But I think the key was that it's something we could write into our budget, and we could have hired consultants to do it if we wanted. It's great to interview people, but it's usually time-consuming: figuring out who those people are, setting things up, following up; it's just a big job. Reed, did you have anything you wanted to add?
Yeah, I'll add a little bit; those were terrific answers. The things I'll add: one, it is a really different type of NSF project. Generally, my working assumption, having had a lot of NSF support over the years, is that anything that's a rule against something in a regular NSF project is flipped and encouraged in the Convergence Accelerator, and vice versa. So you're discouraged from having students, because it's a deliverable-driven thing; you're encouraged to have a lot of software professionals; you're encouraged to have a program manager who helps track tasks. The number of things you end up doing that are way outside what a faculty member would normally do is really, it's been very interesting. And I will say that it's been hugely helpful to have the infrastructure at Princeton: that we have a user experience design office, that we already have a really significant research software engineering group, and that we can add research software engineers who are embedded in our project but also have co-supervision through the research software engineering group. A lot of these mechanisms have been very helpful. It's been incredibly helpful to partner with CyVerse, because they operate very differently, and that's been a real benefit. But I will say that what it's enabled has just been really incredible, because what we've built and what we've been able to do is really different and never would have happened otherwise; it would have taken five years to a decade without this acceleration. And really focusing this in and understanding the user need, we built a 17-member advisory board, a very different swath of people who are interested in water and play very important roles, but who would not otherwise be people we would normally engage with and try to build solutions for.

Wonderful. Thank you, Laura; thank you, Reed, for this wonderful webinar.