Okay, let's get started. Greetings, and thank you for attending this month's science seminar presented by the NSF's National Ecological Observatory Network, operated by Battelle. Our goal with this monthly series of talks is to build community among researchers at the intersection of ecology and environmental science. Today we are excited to have Drs. Quinn Thomas, Freya Olson, and Catherine Wheeler. Before I introduce our speakers, I'll go over a few logistics. We have enabled optional automated closed captioning for today's talk; if you'd like to use it, find the CC button in your menu bar. The webinar will consist of a presentation followed by Q&A. As you think of questions, please add them to the Q&A box. We also have a meeting chat, which we use to share links and other items of interest, but please put questions for the speakers in the Q&A. We'll facilitate discussion at the end, and there should also be an opportunity to ask questions over audio, so if you'd rather speak, we can do that too. NEON welcomes contributions from everyone who shares our values of unity, creativity, collaboration, excellence, and appreciation, as outlined in our NEON Code of Conduct. This applies to NEON staff as well as everyone participating in a NEON event. The full code of conduct is available via the link that Samantha has shared in the chat, and also at the bottom of the Science Seminars webpage. This talk will be recorded and made available for later viewing on the NEON Science Seminars webpage, so if you're interested in revisiting it, it will be available there. To complement our monthly science seminars, we host related data skills webinars on how to access and use NEON data. Registration for those is available on the same Science Seminars webpage if you scroll to the bottom. Here's the seminar webpage on screen: we have our schedule of talks, and if you scroll to the bottom, there are the data skills webinars.
And coming up at the end of October, Freya and Quinn, two of our speakers here today, will give a data skills webinar on ecological forecasting, so there's more cool forecasting material you can check out there. Lastly, if you have ideas for a great talk for the seminar series, nominate yourself or a colleague today by filling in the form on our Science Seminars webpage. I'll show you where that is: if we scroll up on the webpage, there's this "Nominate a Speaker" button. We take nominations at any time throughout the year, and then a panel selects speakers for the next academic year over the summer. Okay, I will introduce our speakers now. Today, as I mentioned, we have Drs. Quinn Thomas, Freya Olson, and Catherine Wheeler. Quinn is an associate professor and data science faculty fellow in the Department of Forest Resources and Environmental Conservation and in Biological Sciences at Virginia Tech. Before starting his position at Virginia Tech, he was a postdoc at NCAR, the National Center for Atmospheric Research here in Boulder, where the NEON folks are based. He's also the lead PI for the NEON Ecological Forecasting Initiative Research Coordination Network. I've probably butchered the actual name, but it's a research coordination network focused on ecological forecasting, which is what they're going to talk about today. Freya Olson is a postdoctoral research associate in Biological Sciences at Virginia Tech; she got her PhD in environmental sciences from Lancaster University and the UK Centre for Ecology and Hydrology, with a focus on lake management and the interaction between lake hydrology and water quality. And Catherine Wheeler is currently a NOAA Climate and Global Change Postdoctoral Fellow at MIT. She received her PhD in Earth and Environment at Boston University.
And so I will turn it over to you all to talk about the NEON Ecological Forecasting Challenge. Over to you, Quinn. There we go. Thank you for that introduction, Eric. Can you see my screen? Yes? Great. Freya, Catherine, and I are really excited to be here and share the framework behind the NEON Ecological Forecasting Challenge as well as results from it. We'll be tag-teaming this seminar as a coherent whole; it won't be three separate seminars. To start, the broader context is that decisions are being made in a rapidly changing environment. These decisions relate to all kinds of ecosystem services and human-environment interactions: algal blooms, endangered species, fisheries, water supply, even forecasting when the fall colors are going to be at their best for viewing the leaves. All of these decisions are being made whether we have good information or not, and the goal of ecological forecasting is to help provide information so that decisions can be made that guide us toward a world like the one on the far right, where we want to be as humans. And so this brings up the idea of predicting nature the way we predict weather: taking the observed world and interacting with models to provide actionable information. The ecological sciences are undergoing a transformation similar to what meteorology and weather forecasting have gone through over the past several decades. To provide a little context, I want to highlight how a weather forecast comes to be. It starts with a numerical weather model that represents the physics of the atmospheric system. That model is run, and data are then brought to bear to adjust the model to be as consistent with the world as possible. The data come from diverse sources of varying quality, but the data do not observe everything in the model everywhere.
And so those data are used to adjust both the observed and unobserved quantities in the model to be as consistent with the present as possible. The updated model, initialized at our best estimate of the current state, is then used to make a forecast into the future. The output of that numerical model is then translated into decision support to produce what we commonly think of as a weather forecast: highs and lows, probability of precipitation, things like that. The real revolution in weather forecasting has come from multiple parts. The first is improvements in our ability to model the weather. The second is improvements in our ability to observe the weather. And finally, our ability to combine models and observations. That combination has really accelerated progress in weather forecasting. The ecological sciences are now in that same sweet spot, with rapid improvements in our ability to model, in our ability to observe, and in our ability to combine models with observations. That ability to observe is a central component of the advancement of forecasting, highlighted by the rise of our capacity to sense all kinds of aspects of nature, of which NEON is a great example. And so ecological forecasting is a growing field. Here we define a forecast as a prediction of the future with uncertainty. There is increasing adoption of best practices for how to produce, represent, report, and evaluate forecasts, and we need those practices for intercomparison. Some best practices are being brought to bear with real gains; others we need to continue to improve upon as a community, to ultimately realize the potential of ecological forecasting to benefit our understanding and management of ecosystems and environmental systems.
And so what we need are more intercomparable forecasts, from a diversity of perspectives and approaches, that engage more partners. This will allow us to fulfill the dream of what ecological forecasting can do for society. To address this goal, we brought two organizations together. The first is the Ecological Forecasting Initiative, a grassroots consortium aimed at building an interdisciplinary community of practice in ecological forecasting. This group launched in 2019 and has grown into quite a large and active group of folks engaged in forecasting from all different angles. This isn't just people running models on their computers; it's people who are really interested in decision support, education, DEIJ, and many other aspects of the forecasting enterprise broadly. The other is NEON, which most folks here are probably familiar with: standardized terrestrial and freshwater data, under ongoing collection, that are freely available across the U.S. And so we created the NSF-sponsored Ecological Forecasting Initiative Research Coordination Network, a five-year project with the goal of creating a community of practice that builds capacity for ecological forecasting by leveraging NEON data products. We're still in the middle of that project's window. What really excited us was this ability to focus the research community on NEON data as a very practical touch point. And to do that, I'll first introduce the project's steering committee, which has been working together behind the scenes on a lot of the decisions and community leadership.
And so the platform we developed to focus the community, so that folks are forecasting the same things, talking the same language, and sharing and co-developing the same tools, is the NEON Ecological Forecasting Challenge. A short paper describing the challenge came out this year in Frontiers in Ecology and the Environment. What the challenge is, is a platform that both challenges, in that it sets the rules, the cyberinfrastructure, and the ways to engage, and empowers, by providing the training, the tools, and the community to submit iterative (that is, repeated) near-term (days to about a year ahead) forecasts of yet-to-be-collected NEON data. This is NEON data that not even the NEON folks know yet, because it has yet to be collected, so it's a genuine test of our forecasting capacity. This talk will focus on what the NEON forecasting challenge is, what the elements of a forecasting challenge are, and some results from running the challenge over the past few years. I want to start with a diagram from the paper about the challenge. It highlights the components an ecological forecasting challenge involves: the inputs to the challenge are data, along with weather forecasts and inputs from the community through training and templates, and all of that feeds through to produce a whole catalog of forecasts that we can analyze to understand patterns of predictability. So I'll start with the data behind the forecasting challenge, which builds off the NEON project. To summarize: 81 sites and, as of when I last checked, about 182 openly available data products, standardized across sites, with a planned 30-year horizon, and the network reached full completion around 2019.
Really important to this challenge, for comparing forecasting capacities across systems and scales, is the fact that NEON measures both aquatic and terrestrial systems, spanning physical, chemical, and biological variables and population, community, and ecosystem dynamics. That allows us to do some really powerful synthesis of our forecasting capacities across these different ecological systems and scales. And forecasting is fundamental to NEON's mission. The NEON strategic plan from 2011 lays out NEON's goals, and you can see that forecasting is very clearly highlighted in two of the three goals NEON set out to achieve. Yet NEON has, understandably, focused on developing robust data products for the community to use, and we view the NEON Ecological Forecasting Challenge as a way to help NEON achieve the forecasting component that is so prominently highlighted in its mission. So starting from NEON data: we download those data and standardize them into time series using automated workflows that run every day, and those time series are specifically designed to be easy to work with in a variety of modeling frameworks. We call them the target data. The target data came out of a community design process. Back in May 2020, we had a virtual meeting with well over 100 participants. As you can guess, that virtual meeting was originally planned as an in-person meeting at NEON headquarters that we had to move online, and it turned out to be a real opportunity to engage a much larger group of folks. During that meeting there was a lot of discussion about which NEON data products would be the most exciting to forecast, and folks narrowed it down to four areas of interest: natural climate solutions (things like carbon storage), water quality, biodiversity conservation, and infectious disease.
And then the community came together and decided on these focal themes, the five areas we focused the forecasting challenge on. There's an aquatics theme that focuses on forecasting water temperature, dissolved oxygen, and chlorophyll-a at the one- to 35-day-ahead horizon across 34 sites; data become available to evaluate forecasts within three days of collection, so that's a three-day latency. We have the tick theme, larval abundance at one week to one year ahead across 47 sites, with about a six-month latency due to the time required to identify and count the ticks. There's a carbon and water flux theme at the one- to 35-day-ahead horizon. You may ask where the 35-day-ahead horizon comes from: NOAA produces an ensemble weather forecasting product that runs out 35 days ahead, which we make available for folks to use. So the carbon and water fluxes from the flux towers are forecast one to 35 days ahead at 47 sites, with a five-day latency between data being collected and being made available to train your model and evaluate past forecasts. There's a beetle community richness theme that parallels the ticks: one week to one year ahead across 87 sites with a six-month latency. And finally the phenology theme, one to 35 days ahead at the 47 terrestrial sites, with near one-day latency because of the way the PhenoCam data are made available in near real time. After these were decided, the teams got together as design teams and worked through all the small decisions that matter in converting NEON data into standardized time series of the particular variables people wanted to forecast. There was a cyberinfrastructure design team that designed the automation that runs the challenge behind the scenes, and a working group focused on standards for how to save the forecasts: column names, things like that.
And then partnerships: folks who were really invested in the forecasting challenge, which launched in 2021; over 600 people across the globe have been involved in this entire process. This slide gives an example of a time series. Here's a file that you can read in from a stable URL to get the most up-to-date aquatics targets. This is actually for a location in Florida that has both an aquatic and a terrestrial site, and you can see the time series of the different variables folks are trying to forecast as part of the challenge. Importantly, these data become available every day, as soon as new data from NEON become available, so you're always able to use the latest information to train or update your models. And we automatically evaluate forecasts as well: your forecast from two weeks ago predicted today's data, today's data come in, and we automatically score your forecast against the new data. What's really great about this is that because the forecasts are by definition out of sample (the data have yet to be collected), the organizers can be fully involved in the forecasting challenge: there's no held-back data that we've seen and others haven't. So the folks who are excited about organizing it, who are really interested in being engaged, can actually do some of the forecasting themselves. And to support the challenge, because in many cases forecasters may want what we call covariates or drivers, we make numerical weather forecasts from NOAA easily available to teams. We created an R package called neon4cast where you provide the date you want the forecast to start, the site, and the variable, and you can download from our cloud storage a time series of a future weather forecast for a particular site and use it as input to your forecasting model. We're doing this automatically behind the scenes every day.
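To make the "read a targets file from a stable URL" idea concrete, here is a minimal Python sketch. The long-format layout (datetime, site_id, variable, observation) follows the challenge's general convention, but the exact URL and column names are assumptions, so the example parses an inline sample in that shape rather than a live file; in practice you would point `pd.read_csv` at the published targets URL.

```python
import io
import pandas as pd

# A few rows in an assumed long "targets" layout
# (datetime, site_id, variable, observation). In practice you would
# read the same layout from the stable targets URL with pd.read_csv(url).
sample = io.StringIO("""datetime,site_id,variable,observation
2023-08-01,BARC,temperature,29.4
2023-08-01,BARC,oxygen,7.1
2023-08-02,BARC,temperature,29.9
2023-08-02,BARC,oxygen,7.0
""")

targets = pd.read_csv(sample, parse_dates=["datetime"])

# Pivot to one column per variable, ready for most modeling frameworks.
wide = targets.pivot(index="datetime", columns="variable", values="observation")
print(wide["temperature"].tolist())  # the water-temperature series
```

Because new observations are appended daily, re-running this download each day is all it takes to retrain or re-initialize a model on the latest data.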
So you actually have available to you all the weather forecasts that have been generated for NEON sites since the fall of 2020, so you can calibrate your model against those as well. To help folks contribute to the challenge, we really emphasize training and templates: training meaning workshop materials, and also going to where folks are, meaning going to ESA or other meetings where people aren't necessarily focused on forecasting but are domain experts who might want to engage; and templates, meaning GitHub repositories or other materials where folks can plug in their ideas without having to worry about all the infrastructure associated with submitting to the challenge. Some examples: we ran a training session at the Ecological Society of America meeting this past summer, and Freya ran a training session at the Global Lake Ecological Observatory Network. The challenge has also underpinned multiple university courses as a class project, which is incredibly awesome, because your class starts forecasting early in the semester, their forecasts actually get evaluated, and they can see how well their models are doing and update them. This iterative learning cycle can occur within the course of a semester-long class. And because the cyberinfrastructure behind the scenes is automatically accepting and scoring forecasts, the students don't have to worry about that part; they just get to be excited about building models and learning from them. We provide templates and tutorial code. And this paper led by Alyssa Wilson really thought about how we can address the opportunities and inequities in undergraduate education and broaden participation in the forecasting enterprise writ large.
And in fact, this is an introduction, or a plug, for the upcoming NEON data skills webinar, which will walk you through how to submit a forecast to the NEON Ecological Forecasting Challenge. It isn't just about the nitty-gritty of that particular challenge; it also teaches the broad concepts in ecological forecasting that you need in order to really think about what it means to produce and interpret a forecast. So it's a great training opportunity in that area. You can register here if you're interested; we really invite you to come. So teams start submitting forecasts using the training and the tutorials. Importantly, because one of the best practices in ecological forecasting is to compare against a null or baseline model that represents some rudimentary understanding of the system, we generate those baselines ourselves. Altogether, combining these baseline nulls with the forecasts submitted by contributors, we are up to over 23,000 forecasts submitted by more than 200 teams. You can see this spread across the different themes, with the number of teams on the right and the number of forecasts on the left. The number of forecasts is so much larger than the number of teams because of the iterative nature of the challenge: each team can submit forecasts every day, and for the daily themes the total number of forecasts builds up over time. We've had wide engagement, and we've been really excited to see these contributions to the forecasting challenge. And one part of that is these baseline forecasts; here's an example of them.
There are two baselines we automatically submit that everyone can compare their forecasts to: what we call climatology, which is the historical day-of-year mean of the NEON data at that site, and persistence, which says tomorrow is the same as today plus some random noise, like a random walk. Here's an example of those two null models for water temperature at one site. The red is climatology, which has a roughly constant spread over the forecast horizon (the time period into the future), while the random walk's spread grows over the horizon as we move further and further from today. These two models don't have a lot of ecological knowledge embedded in them, so they serve as null models for asking whether adding more complex knowledge or models actually improves on these baseline assumptions. So forecasts are coming in, and we're building up this big catalog of forecasts that we score and develop summaries for, so folks can look at how forecasts perform over time. Getting there required the development of standards, available at the preprint link here and in a paper accepted in Ecosphere led by Mike Dietze, for how we format the target files and the forecast files; those standards are what allow automated scoring as new data come in. A score is basically a measure of how well a forecast and a data point compare. We focus on the continuous ranked probability score as our metric, because it rewards both the ability to predict the mean behavior and the ability to robustly capture the forecast uncertainty. I'll illustrate what I mean by that. For example, here is an isocline plot of the score, where each of these lines, say this one right here, is a score of one.
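To make the two baselines concrete, here is a minimal sketch of each, assuming a simple list of historical observations; the noise level, ensemble size, and function names are illustrative choices, not the challenge's exact implementation.

```python
import numpy as np

def climatology_forecast(history_doy, history_obs, target_doy):
    """Day-of-year mean: average of all past observations on that day of year."""
    obs = [y for d, y in zip(history_doy, history_obs) if d == target_doy]
    return float(np.mean(obs))

def persistence_forecast(last_obs, horizon_days, sigma=0.5, n_ensemble=500, seed=1):
    """Random walk: tomorrow = today + noise, so spread grows over the horizon."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(0.0, sigma, size=(n_ensemble, horizon_days))
    return last_obs + np.cumsum(steps, axis=1)  # ensemble of trajectories

# Climatology: two past years observed 20.0 and 22.0 C on day-of-year 150.
clim = climatology_forecast([150, 150], [20.0, 22.0], 150)

# Persistence: the ensemble spread is wider at day 35 than at day 1,
# matching the growing fan seen in the random-walk baseline.
ens = persistence_forecast(last_obs=21.0, horizon_days=35)
```

The contrast in how uncertainty behaves, constant spread for climatology versus growing spread for persistence, is exactly what the plot on this slide shows.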
And this is a forecast where, say, I forecast that water temperature is going to be eight with a standard deviation of four, and then the observation comes in at eight. That forecast did a great job predicting the mean of the observation, so you could call it a bias of zero with a standard deviation of four. But a standard deviation of four is quite uncertain: it says I'm not very confident. I guessed the middle right, but I wasn't confident. And so you can actually score better by being more confident and balancing that against the bias. For example, say my prediction was a little cold, by a degree and a half, but I gave a standard deviation of two degrees: you can actually drop down a half unit or so in the score by better representing uncertainty. This type of forecast score really allows us to capture an important tenet of the Ecological Forecasting Initiative at large, which is the importance of representing uncertainty in a forecast, and that's why we focus on the continuous ranked probability score as our metric. And so those scores, again, this database is building up every day in real time, and the scores are made available for real-time analysis of the submissions on a dashboard, so folks can look at how their forecasts are doing. As those scores build up over time, they become available for us to really think about synthesis, to combine them and do analysis. The real power of the NEON project is that we can focus that synthesis on a lot of different questions. For example, how does our forecasting capacity vary across ecological systems? NEON is collecting all these different data. What about across sites? There are 81 sites.
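The trade-off between bias and sharpness described above can be checked numerically. For a Gaussian forecast there is a standard closed-form expression for the CRPS (lower is better); the sketch below uses it with the same numbers as the slide's example: an unbiased forecast with standard deviation four versus a forecast that is 1.5 degrees cold but has standard deviation two.

```python
import math

def crps_gaussian(mu, sigma, y):
    """CRPS of a Gaussian forecast N(mu, sigma^2) against observation y,
    using the standard closed form; lower scores are better."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# Unbiased but very uncertain: mean 8, sd 4, observation 8.
wide = crps_gaussian(8.0, 4.0, 8.0)
# Slightly cold (by 1.5 degrees) but more confident: mean 6.5, sd 2.
sharp = crps_gaussian(6.5, 2.0, 8.0)
```

Running this, the sharper, slightly biased forecast earns the lower (better) score, which is exactly why CRPS encourages honest but confident uncertainty estimates rather than hedging with huge intervals.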
Time of year: the data are continually being collected. Modeling approach: we can solicit a lot of different approaches, whether simple regressions, our baseline models, the machine learning models people are submitting, process models that embed mechanistic understanding of how the system works, models that combine multiple approaches (say, a machine learning model and a process model together), or even models that simply aggregate other models. All of these can be compared. And we can also think about lead time: how good are we one day ahead? How good are we a month ahead? All of these questions can be addressed with this catalog of forecasts being built up over time. And so we're going to focus here on examining the catalog of forecasts to better understand our forecasting capacities for two themes: phenology, which Catherine will talk about, and aquatics, which Freya will talk about. And so I will hand it over to Catherine. Can you all see my screen? All right, cool. I led the first round of the phenology forecast challenge, where in the spring of 2021 we focused on forecasting greenness of plant canopies at eight deciduous broadleaf forests in the NEON domain, shown on the map to the right. People submitted over 192,000 predictions of greenness, we had over 40 participants in the challenge, and everyone who participated was offered coauthorship on our manuscript; those coauthors are listed here, and we have a preprint available if you are interested. As Quinn mentioned, we were able to quantify and measure phenology using PhenoCam data. PhenoCams are digital cameras positioned to point at plant canopies and take repeated images of them; you can then analyze those images for percent greenness and see how that greenness changes over time.
So here's an example of PhenoCam greenness data collected at Harvard Forest over 35 days, from mid-May of 2021 through mid-June. You can then analyze these data to determine the day the canopy has reached 50 percent green-up, and also the day it has reached 85 percent green-up. These PhenoCam images give you a visual sense of what 50 percent and 85 percent green-up look like. And here are some example forecasts from models that submitted for this specific day. In red we have the day-of-year mean null model, which Quinn talked about; it has a pretty wide confidence interval, but it was able to include most of the observations. Some other models were really narrow and maybe too slow. But overall, most of the models, at least in this example, show that greenness is going to increase at some point during the next month. We were also really interested in how skill changes with lead time. Here we have the CRPS at the different lead times on the x-axis, minus the CRPS at a lead time of one day. Positive values indicate that skill worsens with longer lead time, and negative values indicate that it improves with longer lead time. Overall, for most model classes (we characterized the models into different classes, represented by these different colors), skill worsens with longer lead time, but we actually did see one model class, the models that include covariates such as temperature, that slightly improves with longer lead time, which could be because their uncertainties are changing over time.
We were also really interested in which model classes performed best, had the highest skill, and how they compared to the day-of-year mean model class. Here we have the CRPS of each model minus the CRPS of the day-of-year mean. Positive values, to the right of this line, indicate that the model performed worse than the day-of-year mean overall across the spring of 2021, and values to the left indicate it performed better. As you can see, only one model, shown in green, a day-of-year model plus a historical average of photosynthetically active radiation, performed better than the day-of-year mean across all the sites and the entire forecast period. We also ranked the models by class, with the class averages shown by these vertical lines, and the day-of-year mean performed the best of all the model classes. I think this is really important: Quinn mentioned that the day-of-year mean doesn't have a whole lot of ecological basis to it, but phenology can be very strongly driven by photoperiod, and photoperiod is the same for each day of year, so the day-of-year mean is a really strong null model for phenology. We were also interested in where forecast skill was highest. This plot shows the different sites, with site predictability (CRPS) on the y-axis, where higher values indicate lower skill and lower values indicate higher skill, versus the day of year of 50% green-up, the day the canopy is 50% greened up. We found that, for this specific year and these specific sites, earlier sites had much higher skill, and later sites were much harder for us to forecast. And finally, we were really interested in what part of green-up skill is lowest.
So we standardized all the forecasts by how different the forecasted date is from that site-specific day of 50% green-up. That's shown on the x-axis, where negative values indicate forecasts of days that occurred before the canopy had started green-up, and positive values, say this 20, indicate forecasts of a day that occurred 20 days after the canopy hit 50% green-up. We looked at this for each of the sites, shown in these different colors, with the overall pattern shown in black. The vertical stripes indicate the days of 85% green-up, so you can get a sense of the pace: if a site's day of 85% green-up is far out, green-up took a long time, and if it's close, like these green and blue ones over here, it greened up really fast. We found that across all the sites, predictability was highest during the winter, which makes sense: it's not green, and all the models forecasted that it wasn't green in the winter. But skill was lowest at the very end of green-up, right past these days of 85% green-up. So in the future, we're using these conclusions to hopefully improve the models, knowing that they're pretty bad at forecasting late green-up and the end of green-up, and also to inform other hypotheses in the other rounds: we've added more sites because we were interested in the site differences but didn't have a whole lot of sites initially. We're using these results to develop hypotheses for future rounds. And I'll pass it to Freya. Thanks, Catherine. Yeah, so I'm going to give you a bit of the more preliminary work we've started on the aquatics theme. Catherine's done some great work with phenology, and we hope to replicate some of that and learn more about our aquatic sites as well.
In terms of what data we're going to be looking at, we're looking at this year's forecasts that have been submitted. We've been making a real push to get more engagement with our aquatics theme, and it's been really successful, which is why we're able to do this. We're looking at all three of the water quality variables that were mentioned: water temperature, dissolved oxygen, and chlorophyll-a. The theme covers all 34 sites, which includes seven lakes and 27 rivers and streams. We've been able to collect 43 different forecast models, accounting for almost 180,000 individual forecasts across these sites. We're hoping to run this until the end of the year and do the synthesis on a full year of forecast submissions. To give you a bit more context, not all 43 models forecast all variables at all sites. We do see a bias towards water temperature forecasts, with most of our models forecasting water temperature; only nine of our models do not, and these are generally process-based models for chlorophyll. We do see 18 models that forecast all three variables, which is really exciting. We also see quite a distinct bias towards lake forecasts, with 17 of our forecast submissions only forecasting the seven lake sites; again, this reflects process models that are specific to lakes. Today I'm just going to show you some preliminary analysis of the water temperature forecasts in our lakes. As an example of the types of models being submitted, these are forecasts from late August, each of the coloured lines representing a different model and team, and our observations of water temperature shown by the points. These are our two lakes in Florida, Barco Lake and Suggs Lake, and the water temperature forecasts. I just wanted to highlight a couple of the different model types we're getting. We have process models such as FLARE, GLM, and GOTM.
We have some empirical models, including a lasso and a linear regression. We also have some machine learning forecasts being submitted, including a random forest, and a number of multi-model ensembles which combine forecasts from a few different models. We can evaluate these and look at their skill relative to our null model: again, as Catherine showed, negative values indicate doing better than the climatology null and positive values indicate the model is worse than the climatology null. The forecasts here for Suggs Lake show that we have much more skill than climatology at shorter horizons, but after about 15 days there are fewer models able to do better than climatology. For Suggs Lake particularly, we see that the Prophet model and the random forest do especially well across the full forecast horizon, outperforming climatology for all 30 days. Looking at which model class does better, this is across all of our lake sites for water temperature, and we see a number that outcompete climatology, dominated by our process models and our machine learning. However, this is not consistent across sites. If we break it down by site, what we see is that skill is variable: our Prairie Lake, Prairie Pothole lake, and Toolik Lake have skill much higher than climatology, compared to Crampton Lake, this one in the middle here, where none of our models outcompete climatology. We're also noting that the most skillful model class is not consistent: for Suggs Lake we saw our random forest and our Prophet model do best, but for our two prairie lakes we see the process models doing better.
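The per-site comparison Freya describes — each model's mean CRPS minus the climatology null's CRPS at that site, with negative meaning the model beats climatology — can be sketched as below. The model names and scores are entirely made up for illustration; they are not the actual challenge submissions or results.

```python
# Illustrative mean CRPS per (model, site); all names and values are hypothetical
scores = {
    ("prophet",       "SUGG"): 0.42, ("prophet",       "CRAM"): 0.88,
    ("random_forest", "SUGG"): 0.45, ("random_forest", "CRAM"): 0.91,
    ("process_model", "SUGG"): 0.60, ("process_model", "CRAM"): 0.85,
    ("climatology",   "SUGG"): 0.70, ("climatology",   "CRAM"): 0.80,
}

def relative_skill(scores, site):
    """CRPS difference from the climatology null at one site.
    Negative => better than climatology (the convention used in the talk)."""
    null = scores[("climatology", site)]
    return {m: s - null for (m, st), s in scores.items()
            if st == site and m != "climatology"}

for site in ("SUGG", "CRAM"):
    diffs = relative_skill(scores, site)
    beating = sorted(m for m, d in diffs.items() if d < 0)
    print(site, {m: round(d, 2) for m, d in diffs.items()},
          "beating climatology:", beating)
```

With these made-up numbers, every model beats climatology at the first site and none do at the second, mirroring the Suggs Lake versus Crampton Lake pattern in the talk.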
If we think about how this might translate into overall forecastability, comparing all the models submitted at all of the sites, what we find is that, again, our Prairie Lake and Toolik Lake have more models outcompeting climatology, and this is true across the forecast horizon, with only Prairie Lake able to outcompete climatology across the full 30 days, whereas for Crampton Lake there is no forecast horizon at which the submitted models actually outcompete climatology. Now, these are still quite preliminary findings, and we're still doing some analysis into what it is about these sites that makes them more or less forecastable. Okay, that's me. Quinn? Great. That gives you an example of how the Research Coordination Network has been able to engage early-career researchers and provide them the opportunity to grow both their science and leadership skills by being champions for these different themes, and we try hard to support them in their work. You can see some great synthesis come out of it, influenced by the network, because we're thinking about the analyses in similar ways, which helps us build the cross-theme synthesis we're working towards. Here's an example: if you look at both Catherine's and Freya's talks, one piece of early synthesis is that climatology is hard to beat. If you have observations at a site over a historical time period, that's a pretty good starting place for an ecological forecast at the horizons or lead times we're talking about here, but a variable like water temperature may be better able to beat climatology than a more complex biological process like phenology. So altogether, this ecological forecasting platform is really a focal point for advancing the field of ecological forecasting. There are a lot of ways it helps advance the field beyond just the particular application of forecasting NEON data within this platform.
For example, we've had to work through protocol standards and best practices for submitting and archiving forecasts. The training materials and workshops have grown the field and helped encourage diverse participation. There's supporting software and cyberinfrastructure that have applications beyond the challenge. There's partner engagement through folks like the USGS, NEON, and the National Phenology Network. And finally, this is building up to a multi-forecast synthesis that will add a body of knowledge to the ecological literature. The growth that's come out of the challenge includes the people. For example, we had a conference at NEON this past June; really, it was an unconference, so the participants got to pick what they wanted to work on, and they picked projects that centered on the forecasting challenge as a focal point but weren't all just about forecasting a particular NEON data product. There were some about viewing forecasting challenges through the lens of design justice. There were others about how best to improve, say, the beetles theme, which has received some of the fewest submissions: what can we do to lower the barrier to entry for predicting some of the biodiversity elements? Those kinds of projects came out of this meeting, and this community of people engaging in the enterprise of forecasting will persist long beyond the Research Coordination Network. There's also a whole cyberinfrastructure behind the scenes that is being made available: we've worked out the technology to do the automated components of the forecasts, and this technology is being transferred to other efforts. For example, other networks across the world are interested in using forecasting challenges to engage folks in their particular data streams.
So, some challenges to the challenge. Basically, how do we determine a winner in multi-dimensional space? There are all these axes, time into the future, time of year, variable, and how to weigh them is an open question. We also have to overcome inconsistent submissions. For example, if you submitted only your phenology forecasts in the winter, you'll appear to do really well, as Catherine showed, but if you submit them during the period of most change, which is the hardest to predict, you'll appear to do worse. So inconsistent submissions by a particular forecasting team are something we are working to overcome. Then there's participation across all themes; we're really excited to push the boundaries toward longer horizons, that year-scale level of forecasting. There's also data latency for the non-sensor themes. There's been a lot of engagement in the sensor-based ones because you can actually do something like what Freya did, where she ran a workshop at the beginning of a week-long conference, folks created forecasts, and by the end of the conference you could see five days of evaluation. It's harder to do that with the biodiversity themes, and that could be one of the reasons engagement has been lower. And then there's direct connection to decisions, better tying to management decisions, whether that's management of sampling at NEON sites or being able to transfer the knowledge gained to ecosystems that are more actively managed than the NEON sites. Some future directions are synthesis; increased participation across themes; continuing to reduce barriers through training tools and tutorials; adding additional themes like birds, mosquitoes, and spatial forecasts; expanding beyond NEON to include other observatory networks across the globe; and thinking about adding more decision-relevant targets.
As an example of that, here at Virginia Tech we have a new LTREB (Long-Term Research in Environmental Biology) site that centers on running a forecasting challenge on two reservoirs neighboring each other in the Blue Ridge of Virginia. We're developing a forecasting challenge that builds on the NEON Forecasting Challenge cyberinfrastructure and then actually feeds back and helps improve it, because it's another case study that helps generalize the underlying framework we're using. So with that, I think the question is: how best can we continue to leverage the NEON Ecological Forecasting Challenge to help NEON achieve its 30-year mission? I think that's a great open question for discussion. Finally, we'd like to thank you for your attendance and participation. We look forward to questions, and if you're interested in getting involved, I recommend joining the Ecological Forecasting Initiative via the link here. You can go to neonforecast.org to learn how to submit forecasts, and don't hesitate to reach out to me if you're interested in using the NEON Forecasting Challenge in your courses, because we have some best practices and guidance on how best to do that. And so we'll welcome questions from the attendees. Thank you. All right, thank you Quinn, Freya, and Catherine for a great talk; virtual applause. We have time for a few questions, so either use the Q&A button at the bottom of the Zoom window (you may have to mouse over to see it), or raise your hand and we can unmute you so you can ask your question over audio. We have one question so far in the Q&A box, from Nick Harrison. It says: great talk. Can you speak briefly on what it would take to expand forecasting efforts to streamflow and discharge at NEON river and lake sites?
He's curious what that level of effort would be versus predicting temperatures in lakes, for example. Do you want to take a stab at that, Freya? Yeah, sure. To start, I think it's a great idea. I feel like one reason we've had less participation on the rivers side of the aquatics theme is that discharge is crucial for all of these water quality variables, so getting forecasts of discharge would help us on the water quality side as well. I think one of the key questions is the NEON data availability for that variable; I'd have to defer to Eric on that because I'm not 100% sure what it is. We've seen data latency to be a big driver of participation, so if the data latency is good and the data quality is good, then I don't see why it couldn't be expanded, and it would be a super exciting way to improve other aspects of forecasting as well. In terms of model output and how good the discharge models are for rivers and lakes, I would say they're probably less good than some of our water quality ones; river hydrologists would probably disagree, but I think one of the barriers is that our precipitation forecasts tend not to be so good. The things that drive lake temperatures, like air temperature and solar radiation, we're quite good at forecasting; we're less good at forecasting precipitation, which is a real driver of discharge. But I would be super excited to see a discharge forecast, even if it would just improve my lake forecasts. Thanks. And Nick said, oops, I meant river and stream sites, but I think we get the idea. Yeah, the discharge data is interesting because it's actually a Level 4 data product; there has to be quite a bit of processing for us to create it, so it'd probably be hard to reduce the latency, I think, but that's something we could talk to Nick about.
Okay, we have another question: how much do the ecological forecasts depend on NOAA forecasts or long-term climate data? Just wondering about the implications of climate driving everything. That's a great question. One of the types of analyses we can do is apply the same scoring approach to how good the weather forecasts are at the sites, and see whether the ecological forecasts degrade faster or slower than the weather forecasts degrade as the horizon increases; so we can directly ask that question. Not every model submitted to the challenge uses NOAA weather forecasts as inputs; it's not a requirement. So if those models do better, or perform as well, that can also help address the question. But we are specifically making the NOAA forecasts available because we know a priori that this is an important question: people are going to want to drive their models with weather forecasts, and being able to address the attendee's question is an important part of one of our objectives. All right, real quick, before we go on to the next question, I'm going to put a link in the chat before folks take off, again to the NEON Science Seminar webpage, in case anyone wants to go there and nominate speakers, see the info for the forecasting data skills webinar we're having at the end of October, and all the other good stuff there. Okay, we have a multi-part question from Wayfiring: I have two conceptual questions and one application question. First, what did you mean by bias when you showed the standard deviation graph? Second, in the first section from Catherine, what did you mean by skill? And third, could you please explain the classroom lessons a bit? I want to incorporate those into my class this semester. Thanks.
Okay, given the time we have remaining, I'll try to be brief. Bias meant that the observations did not fall on the central tendency of the forecast distribution. A normal distribution has a mean and a standard deviation; how wrong your forecasted mean was, that's the bias, and the spread is the standard deviation of the normal distribution you use to forecast. So bias is just the difference between the observation and the mean of your forecast. I'll jump ahead to the lesson plans. We have a bunch of different approaches we can point to, both through the Macrosystems EDDIE program, which provides online GUI interfaces for doing and learning about forecasting, and then you can dive a little deeper by using one of Freya's modules to submit forecasts using our code. And we have examples of courses that have been built on that to develop more complex ecological models for forecasting. But don't hesitate to reach out to me about how to use it in a class. And I'll hand it over to Catherine for forecast skill. Skill meant how good the forecasts were at forecasting: higher skill meant the forecasts were better, lower skill meant worse forecasts. All right, I think we're going to wrap it up here; we're coming upon the hour. So again, check out the Science Seminar webpage for future seminars and data skills seminars, and again, let's thank our speakers, Quinn Thomas, Catherine Wheeler, and Freya Olson. Thank you so much; virtual applause. Yeah, thanks, everyone, for attending. This was great.