first speaker. So the first speaker for today is Fred Ogden. He's at NOAA, the National Weather Service Office of Water Prediction, and he's going to talk today about the next-generation water resources modeling framework and opportunities for community involvement. And Fred, if you can start sharing your screen, the floor is all yours. Okay, good morning. Can you see my screen? Yes. Excellent. Well, I'm pleased to present this today. This is an idea that I think will generate some excitement within the community, at least I hope it will. So I'm talking about the next-generation water resources modeling system, which we envision as a community research-to-operations software framework for water prediction. It's maybe a little off topic for CSDMS, because for now it's focused more on the NOAA Weather Service type of mission, but we are working with partners, and we're trying to make the system design allow a great deal of flexibility so it could potentially be used in a land surface or Earth system modeling context. A little background: the National Water Model was originally proposed in 2011 by CUAHSI as a community model. It generated a lot of interest in the academic and research communities, but it lacked agency support. The agencies at the time were very heavily involved in developing their own modeling systems and frameworks, often multiple within a single agency, and the idea just didn't gain any traction from that perspective. I now work for NOAA, and the trajectory of my move from academia to government parallels this framework idea in some ways. In 2014, the National Weather Service recognized the potential for making predictions outside of our traditional River Forecast Center forecast points and decided that a nationwide, continental-scale modeling system was needed.
And in 2016, version 1.0 of the National Water Model was operationalized by the NOAA National Weather Service. It is based on WRF-Hydro, which is Noah-MP land surface column physics running on one grid, routing functions operating on another grid, and a conceptual groundwater component. Now, for the National Water Model, the blue lines here show the lengths of river that are currently forecasted by our River Forecast Centers. Those coincide with observations at USGS stream gauges, and that's about 110,000 river miles for the continental US. There are some forecast points outside the continental US as well, but if you look at the National Water Model, which uses basically the NHDPlus version 2.0 network, we've gone from 110,000 river miles to nearly 3.4 million river miles, and that is an enormous leap in capability from the perspective of flood forecasting. The current operational National Water Model, version 2.1, which just went operational in the last six weeks, is Noah-MP running on a one-kilometer grid for the land surface column fluxes; 2D steepest-descent routing, a quasi-2D diffusive wave method, on a 250-meter grid; and lateral subsurface flow within the 2-meter soil column, which is assumed homogeneous across the continental US and in other places. Water is allowed to leave the bottom of the soil column and go into a conceptual nonlinear reservoir that provides base flow. The water is put into the NHDPlus network and routed using a vector-based Muskingum-Cunge method. Recently we've added basic water management incorporating reservoir releases and their forecasts, where those are available from agencies or our River Forecast Centers, and that has led to a significant improvement in model performance downstream from reservoirs. When we look at the performance of the National Water Model, this is an analysis of the National Water Model version 2.0 retrospective, which covers an approximately 25-year period between 1993 and 2018.
This plot shows the median warm-season, event-scale absolute peak discharge error. We've written code that goes through the entire 25 years of output, identifies the warm season to avoid cryospheric ice and snow processes, then identifies individual events and compares, say, the peak discharge or the runoff volume or any other metric you want to calculate against USGS observations. And what you see here is a pretty significant regional dependence: in the Pacific Northwest and the Northern Rockies, the formulation of the model has median absolute errors within plus or minus 25%. But in most places it's actually greater than that, and the violin plot shows the distribution of the station median error around the continental United States. So we can use this to consider that there's a regional variation in the performance of the model, and that perhaps applying one uniform formulation to the entire continental United States isn't the optimal way to proceed. The literature supports this. It says that hydrologic models formulated for specific dominant local processes consistently outperform general models when they are appropriately applied, and that's related to the uniqueness-of-place concept. Models with fewer parameters that describe dominant processes outperform general models that emphasize process through parameter selection, and that's the parsimony argument. And then there's a lack of comprehensive hydrologic theory regarding stormflow generation, which means there is no one model to rule them all. There are places where different theories of runoff generation or stormflow generation are more correct than others. Now, since the original National Water Model was proposed by CUAHSI, a lot has changed. We've got these enabling technologies.
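To make the event-scale evaluation just described concrete, here is a sketch of the per-station metric: given paired simulated and observed event peak discharges for one gauge, report the station's median absolute relative peak error in percent. The function and variable names are illustrative assumptions, not NOAA's actual retrospective code.

```python
from statistics import median

def station_median_peak_error(sim_peaks, obs_peaks):
    """Median absolute relative peak-discharge error (%) for one station.

    sim_peaks, obs_peaks: paired event peak discharges (e.g. warm-season
    events already identified and matched against USGS observations).
    """
    rel_err_pct = [100.0 * (s - o) / o for s, o in zip(sim_peaks, obs_peaks)]
    return median(abs(e) for e in rel_err_pct)
```

Computing this for every gauge and then plotting the distribution of station medians gives exactly the kind of violin plot described above.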
Computer science and engineering with the open-source development paradigm, and the fact that we've now produced a new generation of programmers who are used to thinking that way. The BMI standard wasn't published until 2013, and it provides a very thin middleware, a very minimally invasive way to think about coupling models. New geoscience data standards like the WaterML 2.0 HY_Features data model, and very clever data storage containers. HPC evolution. And if you consider those advances in the context of hydrologic science and engineering, this figure shows we have a spectrum of different formulations, and each one of them has its own place. Some of them are really nice when you have the data to drive them, but often we don't. So, simplifications: the top left shows what a real watershed looks like. It's got varied land use and land cover; everything varies. Any one of these approximations might be appropriate for that, and some more than others depending on the hydro-geo-climatological setting, which we could call the hydrologic landscape. And then we've got the new 800-pound gorilla on the block, the machine learning approaches, which have demonstrated skill in certain situations as well. So these enablers, plus our experience with the National Water Model, and by "our" I mean NOAA's: the regional variation in performance of the current operational National Water Model suggests regional formulations might be appropriate. And parsimony, Occam's razor, suggests that optimizing complexity is likely to speed simulation while increasing predictive skill. The WaterML 2.0 HY_Features data model will allow model setup workflow unification. And that's really important, because everybody's struggling with model interoperability. Almost every formulation that's invented in academia or research has its own data model, and subtle differences between those kill interoperability.
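The "thin middleware" idea behind BMI can be illustrated with a minimal Python sketch: a toy linear-reservoir model exposed through a handful of BMI-style methods. The method names follow the published BMI convention; the model physics, coefficient value, and variable names ("precipitation", "outflow") are purely illustrative assumptions.

```python
class LinearReservoirBMI:
    """Toy linear-reservoir runoff model behind a BMI-style interface."""

    def initialize(self, config_file=None):
        self.k = 0.1        # outflow fraction per step (assumed value)
        self.storage = 0.0  # stored water depth [mm]
        self.outflow = 0.0
        self.time = 0.0

    def set_value(self, name, value):
        # The framework pushes forcing into the model by variable name.
        if name == "precipitation":
            self.storage += value

    def update(self):
        # Advance one time step: release a fixed fraction of storage.
        self.outflow = self.k * self.storage
        self.storage -= self.outflow
        self.time += 1.0

    def get_value(self, name):
        # The framework pulls results out by variable name.
        if name == "outflow":
            return self.outflow
        raise KeyError(name)

    def get_current_time(self):
        return self.time

    def finalize(self):
        pass
```

The point of the pattern is that a framework can drive any formulation through these few calls, initialize, set forcing, update, read output, without knowing anything about its internals, which is what makes the coupling minimally invasive.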
If we can identify a data model that describes the hydrologic landscape in a meaningful way, we will be able to create unified model setup workflows. That will also provide a consistent method to couple models to a hydrofabric like the NHDPlus. And in terms of computer science, the GitHub unit-testing open-source development model is mature. The existing Basic Model Interface coupling standard is definitely very useful for us. Moore's Law isn't going to save us; we have to run this thing on supercomputers. And machine learning, again, is poised to make very significant advances. A little bit about the WaterML HY_Features data model: it's a pretty simple standard. It divides the landscape into catchments. Things that move water in a 1D context are flow paths. Things that store water, or move water in a 2D or maybe 3D context, are called water bodies. So you can take a real system like the cartoon shown over on the left and break it down into these things. And the connectors between catchments, flow paths, and water bodies are nexuses, the nexus construct, or "nexi." They really represent boundary conditions between different objects, and as you know, a boundary condition can be zero-, one-, two-, three-, or four-dimensional. The flexibility of this standard really allows us to couple dissimilar models using adapters, mediators, and other exchange codes that pass information across model boundaries, to make models with dissimilar dimensionalities work together, which is a common problem. In October of 2020, we had a joint NOAA, USGS, U.S. Bureau of Reclamation, and U.S. Army Corps of Engineers meeting with the Ertig folks to talk about requirements for such a framework, because we'd like this framework for a next-generation National Water Model to be useful to others besides just NOAA; we'd like our federal partners to benefit from it as well.
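One way to picture the catchment-and-nexus vocabulary just described is as a tiny data structure: catchments drain to nexuses, and nexuses are the connectors that carry boundary-condition information between features. The class and field names below are illustrative assumptions, not the actual OGC HY_Features schema.

```python
from dataclasses import dataclass, field

@dataclass
class Nexus:
    """Connector between features; represents a shared boundary condition."""
    id: str
    contributing: list = field(default_factory=list)  # upstream catchment ids

@dataclass
class Catchment:
    """A landscape unit that drains to exactly one downstream nexus."""
    id: str
    to_nexus: str

def build_network(catchments):
    """Index nexuses by id, recording which catchments drain to each one."""
    nexuses = {}
    for c in catchments:
        nx = nexuses.setdefault(c.to_nexus, Nexus(id=c.to_nexus))
        nx.contributing.append(c.id)
    return nexuses
```

Because each model formulation only ever exchanges water with a nexus, not directly with another model, dissimilar formulations can be mixed and matched across the network.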
That would expand our collaboration potential, reduce competition, increase collaboration, and also benefit the research and development communities in academia and industry. So these are the requirements that we identified. We'd like the framework to be model-agnostic with maximum flexibility, so that as models, data sources, and needs change, the framework can adapt; a common architecture to avoid duplication and promote interoperability; open-source development to promote code reuse and development efficiency; an authoritative repository of federal water models; and ease of, and encouragement for, participation by partners and the community. We'd like to apply standards where applicable for coding and coupling, data and metadata, and model verification, validation, and test data. But maybe above all, we'd like it to be friendly to domain scientists and engineers. It will enable sharing of models, data, and results. We could create a library of model codes, data sets, and evaluation tools. We realize a glossary is important because the breadth of hydrology is vast, and then we start coupling it with data science and computer science, so we need to maintain a glossary and define terms to communicate clearly across disciplinary boundaries, and we've made a start at that. We'd like to use mature open-source libraries where appropriate. We propose multi-language support: C++, C, Fortran, and Python are the four languages we're focusing on right now. We'd like it to run on hardware from laptops to supercomputers, which is surely doable. And here's the big one: we have a two-week target to allow graduate students or new employees to add functionality to the framework. That will require excellent documentation and step-by-step examples and tutorials. And it requires knowledge of structured programming, but not a computer science background. That's a key item there.
In this current calendar year, our objectives are to extend the Basic Model Interface as needed to function in the HPC environment, and we're focusing initially on capabilities for load balancing; in fact, we're working with Scott Peckham on that. We want to demonstrate the NextGen framework running a conceptual functional equivalent of the National Water Model CONUS-wide at the AGU Fall Meeting, linked to inland hydraulic routing with reservoirs and to the National Water Model forcing engine. By the end of this year we will implement a topographic wetness index model formulation, develop and implement an infiltration-excess model formulation for testing, and improve snowmelt and other flux calculations. We are working to engage our River Forecast Centers, federal partners, and academia, and to begin next year calibrating regional formulations in collaboration with our River Forecast Centers and others. So to sum it up, we're developing this next-generation water resources modeling framework to engage federal and academic research communities using open-source, standards-based development with a model-agnostic focus and interoperability in mind; linking it with NOAA and other weather and climate models; easing scientific evaluation of coupled models and methods; running on hardware from laptops to supercomputers; and providing a domain-scientist- and engineer-friendly model coupling interface, with that two-week target to add new functionality to the framework. So that's what we're doing. These are the partners we're working with at this point in time. And if you would like to participate, we've already put our code on GitHub.
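For reference, the topographic wetness index formulation mentioned in the objectives is conventionally defined as TWI = ln(a / tan β), where a is the specific upslope contributing area and β is the local slope angle. A minimal sketch (the function and parameter names are illustrative):

```python
import math

def topographic_wetness_index(specific_area, slope_radians):
    """TWI = ln(a / tan(beta)).

    High values flag flat, convergent, wetness-prone parts of the
    landscape -- the premise behind TOPMODEL-style formulations.
    specific_area: upslope contributing area per unit contour length [m]
    slope_radians: local slope angle [rad]
    """
    return math.log(specific_area / math.tan(slope_radians))
```

For example, a cell with a = 100 m and a 45-degree slope (tan β = 1) gives TWI = ln(100) ≈ 4.6; the same area on a gentler slope scores higher.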
So for transparency, we've had a few contributions from people outside of the core team, and our plan is that at AGU, when we roll out the demo, we anticipate having the design more or less complete and the documentation for download, install, and setup in place, so that the community can start to work with it. And there's my email address if you have any comments or questions, and I'd like to thank you. Thank you very much, Fred. Your presentation speaks very much to the coding philosophy of CSDMS, so much appreciated. So we have a few minutes for questions. Again, if you want to ask a question, use the raise-hand feature in the reactions and we will unmute you, or otherwise use the chat. And while people might need some time to start typing, I have a question for you. Towards the beginning, you talked about incorporating reservoir management actions, and I'm wondering how precisely you're doing that in a forecasting situation. Do operators indicate, on a 24- to 72-hour level, how they're planning to operate their dam? Or is it more like they give, throughout the year, a rough seasonal indication of how they plan to operate? That's a really good question. We've done a lot of study of that. First, our River Forecast Centers have relationships with many big reservoir operators, so they do establish a connection and get an idea of how much water will be released over the next 24 hours. Where that's available, we use it. Where it's not available, we did a study of machine learning, going back and looking at a number of reservoir systems to see if we could get a machine learning algorithm to predict releases based on current storage, time of year, and amount of rainfall received. And then we compared that against a persistence model, just assuming that for the forecast period the flow remains unchanged.
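The persistence baseline just described, holding the release at its last observed value for the whole forecast window, is trivial to write down, which is part of why it makes such a useful benchmark; the names here are illustrative:

```python
def persistence_forecast(last_observed_release, n_steps):
    """Forecast that the reservoir release stays at its last observed value."""
    return [last_observed_release] * n_steps

def mean_abs_error(forecast, observed):
    """Mean absolute error between a forecast and what actually happened."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)
```

Any learned model has to beat this score on held-out data before it earns a place in the forecast chain.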
The persistence model actually did better at predicting the flow, and it did considerably better than making assumptions about releases based on just the hydraulic geometry of the dam itself. It's a really challenging problem. But now that I'm in federal service, I'm going to try to push to get data from more dams and reservoir releases. Yeah, and it's just amazing how many dams there actually are. So, yeah. I see one hand, Greg. Yeah, thanks, Fred. Terrific talk and a really interesting project. I'm curious what your thoughts are. I love the idea of a two-week target for somebody to be able to contribute if they're a grad student or new employee. What are your thoughts about the key things that person has to already know in order to be in that position? For example, do they have to have had a course in numerical computing? What do they need to know about hydrology? Do you have a list of, like, once you've studied these basic things, then it's two weeks from first learning about the system to contributing? Yeah, I mean, the new-employee question is easier to answer because we hire people who have that background, right? The new-graduate-student question, you know, maybe not all grad students are going to be able to do that. Perhaps somebody, a new PhD student or a master's student, who's had sufficient training. I think there will definitely still be an academic filter that will keep some people from being productive. But what I'm going through personally is the process of working with our staff: I've never actually written a BMI-compatible model, so I'm learning how to do it myself well enough that I can help prepare the teaching documents to get other people to interact with the framework, because there will be some extensions for HPC application that we're still designing at this point.