Hello, everyone. Welcome to the third of three public information gathering sessions, Paving the Way for Continental Scale Biology: Technology, Techniques, and Teamwork for Connecting Research Across Scales, organized under the auspices of the consensus study committee on Research at Multiple Scales: A Vision for Continental Scale Biology. I'm Jack Liu, committee chair and professor at Michigan State University. Next slide, please. I will begin by acknowledging the wonderful committee members for their great help in planning this webinar. The committee's work is being conducted under the auspices of the National Academies of Sciences, Engineering, and Medicine in response to a request from the National Science Foundation. The committee's statement of task is available on the National Academies website, and the public can provide feedback on this project at any time. Next slide, please. I would also like to acknowledge the outstanding staff members at the National Academies for their fantastic help in planning this webinar. The study is a collaborative effort between the Board on Environmental Studies and Toxicology and the Board on Life Sciences. Cliff Duke is the study director. Together with Cliff, Natalie, Kat, Tisha, Kavita, Eric, and other staff members are the driving forces behind this exciting webinar. Next slide, please. Today, the committee will hear six panel discussions by leading researchers working across scales and experts in other related fields. The panels will address a wide range of important and exciting topics, such as coordinated data collection and theories, indigenous perspectives, inclusive training and workforce development, innovative tools and techniques, and theories across different fields. Following each panel, we will take questions from both the committee and the participants joining us via live stream. If you are a committee member or a speaker, please use the raise hand feature in Zoom to ask your questions. 
If you are an audience member joining us via live stream, please submit questions through Slido. You can also upvote questions you would most like to hear, and we encourage questions from the audience, but questions from the committee will be prioritized. Next slide, please. After the webinar, anyone who wishes to submit written comments or other materials that are relevant to our charge should contact Cliff Duke, the study director, or provide feedback through the project website. I would like to emphasize that this is an information gathering session, and comments by individuals should not be interpreted as the positions of the committee or the National Academies. I want to note that this entire session is on the record and is being recorded. Now, it's my pleasure to introduce my committee colleague, Dr. Brian Enquist, professor at the University of Arizona and the Santa Fe Institute, to moderate the first panel, Coordinated Data Collection and Theories. Brian, please take it away. Thank you. Great. Thank you very much. I would like to welcome everybody to our first session panel. This session's objective is to explore coordinated data for continental scale biology, and we're going to be emphasizing the integration of various data streams, open science, network science, collaborative research, and the role of theory in understanding and protecting ecosystems. In this session, we're going to have four speakers. We're going to first hear from Drs. Gillian Bowser from Colorado State University, Theresa Crimmins from the National Phenology Network, and Chris Lepczyk from Auburn University, and finally Daniel Park from Purdue University. My name is Brian Enquist. I'm a professor at the University of Arizona in the Department of Ecology and Evolutionary Biology. I'm also an external professor at the Santa Fe Institute. I would like to remind each speaker that you have 10 minutes in total. 
Ideally, we will give you a warning about one minute before the end so you can quickly wrap up. Okay. And after this session, I'm going to moderate a 20 minute question and answer period. And again, I'm going to prioritize questions from committee members, but we'll also take questions from the audience. And we're going to help monitor questions on Slido. So I would like to introduce our first speaker, Dr. Gillian Bowser from Colorado State University. Dr. Bowser is a wildlife ecologist and associate professor at Colorado State. Dr. Bowser focuses on ecological indicators of climate change and linkages between changing ecological conditions, local community livelihoods, and climate, placing special emphasis on citizen science engagement. To note, all of the panel speakers' full bios are also available on the agenda. And so I have a warm welcome here for Dr. Bowser to speak. So thank you very much. And I look forward to your talk. Thank you so much, Brian. And it is quite warm actually today here in Colorado. I'm going to go ahead and start by sharing my screen. My role today is to sort of open the discussion, which is kind of exciting, and talk about the coordination of continental scale data. I added a subtext: at the scale of very small things like bugs. As we think about the use of this data and emerging technology, the scale of things, where those questions are asked, and how they move are all important. And that's all part of the coordination emphasis of this particular panel and how we use different types of data streams to ask questions. So I want to start by talking about small things with big stories. And why that becomes important is that when we start looking at climate change, we have to be able to go from local to regional to national and to societal. And how we make those moves very often depends on data that is collected at different scales. 
So when we think about things like biodiversity shifts at higher elevations, like the mountains in my home state of Colorado, or you look at warmer species moving or appearing to move and the impacts of all this on society, we have to talk about how the data is collected. And all of these are part of the early warning systems that depend on that data collection. So I'm going to emphasize four key points, and I want to keep it lively and talk mostly about the opportunity. My four key points are about big data sets and small things, which involve a different type of coordination and a different type of thinking, both statistically and scientifically, about how we use that data and how that data is so important to answer questions. So number one, we'll talk about how data is gappy and messy. The data is opportunistic, and people collect data based on what they love. Number two is that the data is biased, and we need to engage and embrace that bias. Participatory science, or citizen science, is based on perceptions, and people perceive what they think is interesting. Number three is that data is beautifully representative of local and traditional knowledge. We have a panel later on that will talk about indigenous knowledge, but here I'm talking about local knowledge. I'm including urban populations, different types of populations that are out there that have knowledge that is important because, again, people protect what they love. And they add data that's critical to answering continental scale questions based on that level. So it's just important to think that data by its nature is cultural, economic, and place-based. And this is important. And then finally, I'm going to conclude with how broadening participation changes that data, and we need to embrace that change. The data is biased, it's gappy, it's messy, it has systematic gaps, but yet it is critical for us to answer questions about societal well-being as well as human well-being. 
Let me start with number one. Data is gappy and messy because small things still tell very big stories, but they're very hard to detect at different scales. And why that becomes important is that when we look at systematic changes behind continental questions, such as high elevation change and the impacts of glacier retreat, drying, permafrost, etc., at high elevations or latitudes, there are huge systematic data gaps. And this messy opportunity actually tells rich stories. So if you look at status and trends and you ask whether high elevation pollination is changing or shifting, or about biodiversity loss, terminology that we hear very commonly in large assessment reports, the problem is that these species are largely unknown and largely hard to detect, and that detecting these species is based on different types of data. And I'll talk a little bit about what those different types of data are. So to get that trend, biodiversity loss or biodiversity change, we need to understand that baseline data. So here's a table from the IPCC report, just to highlight the importance of understanding what's happening in mountain systems worldwide, which is considered high impact, with high confidence that these systems are changing. And they have a huge impact on society as society changes. Yet the simple question of asking about biodiversity loss in such systems is heavily, heavily reliant on different types of data that are often opportunistically collected. So here's a good example from some of the work that we've done in the high mountains of Peru: you can look at small things shifting across scale, and they're only detectable by this data. So as biodiversity shifts across these high elevations, how you detect this is by looking at species and species occurrences. And here's a great example from our work that shows a high elevation species occurring within two feet, or one and a half meters, of a low elevation species that has moved upslope to about 14,000 feet from its normal coastal habitat. 
So you have two different bees occurring in the same area from two different ecological histories, and you can track them in these different data sets. But this is opportunistic data. It's data that shows up across an opportunistic scale. So messy and gappy data, my point one, is critical to tell the story of high elevation pollinator decline. The data gaps vary across different landscapes, from Peru to Argentina to Colorado to different areas, and that data will change at all times. And this data has a very high degree of observer bias. What observer bias is is part of the story, along with understanding that people protect what they see, which gets me to my second point. So number two is observer bias. Observer bias in citizen science is that people protect based on their perception of what is interesting and what's important to the community. And as a pollinator ecologist, I love to pick on the monarch butterfly: a wonderful species, charismatic, easily identifiable, but not the greatest pollinator out there. You'll see why it's important for the community. So we see this bias in these records. So here's a quick example of some of our research that shows, if you look at different types of records, from natural history museums to other sources, that you can find species records across GBIF, the Global Biodiversity Information Facility; iNaturalist, the citizen science data set; iDigBio; SCAN; and a National Park Service list. And all of these are showing different types of data, but yet you can actually get the same species list, the same occurrences, across all of these data. So why that becomes important is you can place those species in the same area and get that information from multiple data sources and have fair confidence that you know what's going on at that local scale. So my point number three is that messy, gappy data is beautifully and locally representative. People protect what they love. 
They see the monarch butterfly as emblematic, or sorry, notable as a pollinator, even though its role in pollination services may be a question. That love is culturally embedded. People protect what they see as important. It could be economically influenced, and it's entirely place based. So I think about this also in terms of access to nature; understanding local impacts and climate change is all translated into what people do. And what people do then translates into how these data sets are biased, and that bias is beautiful. We should accept that bias and understand how to interpret that data at a continental scale, so that that community remains engaged. And that's that local knowledge. So my last quick point is that broadening participation changes data sets at all scales and influences that continental data in important ways. Because people protect what they love, who they are influences what they perceive as important. And you'll find that this shifts in data sets. And it shifts in the perception of whether species are present or not present, or whether the species are abundant or not abundant. These are all part of that bias. And with the broadening participation of different groups, different cultures, different sets of eyes, that bias will increase in that data set. And this is an important thing for us to recognize. So my last thing is to sum up, so we have time. Number one is that the coordination of continental data at the scale of a bug is a discussion about climate action, education, and engagement, and all of these influence the actual data types you receive. Those data are messy. Small things have big stories to tell. Number two, those data are biased. Observer biases are unavoidable. And it's up to us as scientists to understand where and how the telling of that cultural story has regional, economic, and cultural importance to people and to society. 
And number three is that we need to make sure these data are relevant. Scale matters, representation matters. Locally biased data, or this local, traditional, or other types of knowledge, is more important to the community's well-being than data that is disconnected from that local scale. And finally, number four is about love. Broadening participation changes the nature of the data, and that needs to be okay. And that's beautiful, because people protect what they love. They engage in what they love. And when we talk about climate action, when we talk about societal well-being and human well-being, our data need to be at a place where people engage, protect, and contribute to those data sets, so we can answer the question from the beginning about biodiversity loss. Are pollinators declining at high elevation? Are they shifting? And why is that important to that local community? Thank you. Thank you very much. We're finishing right on time. We're now going to switch to our second speaker. Our second speaker is Dr. Theresa Crimmins. She's the director of the National Phenology Network and a research professor at the University of Arizona. Dr. Crimmins' research investigates changes in plant phenology at local to continental scales. She also communicates widely on topics of phenology and climate change. We'd like to invite Dr. Crimmins to speak now. Thank you very much for joining us. Thank you. Thanks, Brian. I'm really excited to be able to share my experiences, both with running a continental scale data collection program and as a user of these data. So very quickly, phenology, in case everybody isn't familiar with this very old fashioned sounding term, refers to when things happen seasonally in plants and animals. So things like when different species of plants open their flowers or when fruits ripen, or when insects hatch from their eggs or migrate. And this applies to both animals and plants. 
We care about phenology because it has a lot of direct consequences for ecosystem structure and functioning, as well as a lot of human and economic impacts, things like agriculture and tourism, as well as our health, things like allergies. And it's a really sensitive measure that is an excellent indicator of how species and ecosystems are responding to rapidly changing climate conditions. And so, as a consequence of this recognition that phenology really is an excellent indicator of response to changing climates, the USA National Phenology Network was established back in 2007. Its primary aims are to collect, store, and share readily with researchers, decision makers, and the general public data and information about when things are happening seasonally in plants and animals, so that that data and information is available to support both scientific discovery and decisions, things like when to treat for invasive species or when to plant or harvest crops, as well as to communicate. We spend a decent amount of time, like Brian said, communicating with non-scientists too about what this term means and why it matters. And our aims are to ensure that the work that we do benefits everybody who lives within the boundaries of this country. We were very fortunate that when the network was established in 2007 and for many of the subsequent years, we were supported generously by the US Geological Survey. But as budgets change, things shifted in more recent years and our funding is a little more variable. We're fortunate right now to be supported primarily by the National Science Foundation, as well as by a number of federal agencies through grants and collaborations. But I want to call this out because part of what this session is about is identifying both challenges and opportunities, and long term funding for long term monitoring is a significant challenge. 
It's just not something that granting agencies or federal agencies or others are that interested in supporting long term. Achieving the aims that we are trying to achieve, being able to support scientific discovery and decision making, requires a lot of data. It requires an awful lot of observations of individual organisms out in the field to document what they are doing, what their seasonal status is. And ideally what we're trying to construct is time series of observations made repeatedly on the same organisms over the course of the season, so that we can construct a picture of when things started and stopped, as well as observations on those same organisms from one year to the next, so that we can identify whether things are changing. So the primary way that we achieve this is through a platform that we call Nature's Notebook, which is intended for use by not only professional scientists and managers, but also citizen scientists or volunteers. And at the program's core is a series of standardized, rigorous protocols, which are intended to generate data of sufficient rigor and quality that they can be used in science and decision making, even though the observations might have been collected by someone without a strong scientific background. And so one of the things we put a lot of emphasis on is training and support for volunteers, because we want to ensure that they're comfortable with what they're doing and that the data that they are collecting and contributing is of high quality. And so that's another thing I want to flag, because that's a major ongoing challenge for implementing this kind of technique for collecting data that is to be used in these kinds of more rigorous applications. One of the key features of the Nature's Notebook platform is that the data collection is not driven by an overarching science question or hypothesis. Rather, it is a platform that anybody can adopt and utilize to collect data for answering questions that they identify. 
And we designed it that way on purpose, because we really felt that if individuals or groups of folks collecting data were doing it for reasons that they identified, they would be more likely to sustain activity for a longer time than if they felt like they were doing something in service of someone else's needs. And so what we see is that there's a whole wide range of types of questions and scales at which those questions are asked and answered. On the left, we see an example from Midway Atoll out in the Pacific, where they've been using Nature's Notebook to track Verbesina encelioides, which is an invasive plant that's kind of taken over their atoll. And they've been trying to figure out the best time of year to manage it. On the right, we see examples of some of the data collection campaigns that we operationalize on behalf of researchers who are looking to collect phenology observations on a larger spatial scale. And Gillian made this point so beautifully. The truth is, the data that are coming in, when you don't specify exactly what you want, tend to be very, what did she say, gappy, biased, spotty, and all of that is true. This is a map of the records that have come in through Nature's Notebook, and the size of the circle represents the number of records. And you can see that we definitely have spatial bias. But that said, there are some wins here for sure. Some of the largest circles represent NEON sites, the National Ecological Observatory Network, which has adopted our protocols and contributes its plant phenology observations to the NPN database. And then some of the other larger circles are what we call local phenology projects or programs, which are organizations or groups that existed previously and decided to adopt phenology monitoring using Nature's Notebook. So that's things like classrooms, or docents at a nature center, or folks working for the summer at a national park. 
And what we see is that the data coming in through those local phenology programs tends to be greater in quantity. The lines here show the records coming in. The orange line is the records from people participating as individuals, and the blue line is the total number of records coming in from local phenology programs. And what we also know is that those local phenology programs not only contribute more data overall, but the data tend to be on more organisms, for a longer duration of time, and more frequent. Those are all things that are good in terms of helping us achieve the kinds of data, the dimensionality, that we really want to be able to deliver, again, to support science and decision making. So we're investing more and more in that particular participation model, and we have found that it's been a really successful approach. And even though the data are opportunistic and lumpy and biased, there have been quite a few publications and projects that have emerged even so. And I call out just a couple here that have been published in the last year in high impact journals and that are addressing novel questions at large spatial scales. The first is regional and the bottom one is at a national scale. And so it's really exciting to us to see that those data are still robust, absolutely, and we're achieving the aims that we set out to achieve. One other thing that we're increasingly doing is taking those data that are coming in from volunteers, using them to construct models that predict when phenological events, seasonal events, are expected to occur in different species, and then operationalizing those as real time or short term forecast maps. And so the one we're looking at here is for red brome, which is an invasive annual grass that's problematic in the West because it carries wildfire where it shouldn't. 
So what we're doing here in this particular map (this screen capture happens to be from May 23 of this year) is showing where, on that particular day, we anticipated the plant to be in vegetative growth or exhibiting flowering. And this matters not only for wildfire applications but also for grazing rotations, because the cows don't like it so much once it starts to flower and go to seed. So we're at the one minute mark. Thank you. One thing that's really cool is that increasingly those kinds of maps are being picked up by the media. And this is an example of a different product that we have that was being shared nationally this past spring to indicate that spring is starting early. Why I mention this is that it's an excellent characterization of how the observations that are being contributed by the volunteers are translated into data products that then hopefully reach them and close the circle, to demonstrate to them how their data are being used. So one other point I wanted to make was just to appreciate that volunteer data are not free. I think the field generally appreciates this, but we invest quite a bit of effort into communicating back to them how their data are used, through webinars, publication summaries, social media, and other events, which is a challenge given that we are a national scale program and we're quite a small staff. So if you could wrap up. Absolutely. This is my final point: there's a lot of opportunity for us to try to lower barriers to increase participation and increase data growth. And so that is something that we are investing in heavily now, without trying to deprioritize data quality. Thank you. Excellent. Right on time. So at this point, let's shift to our third speaker in the session. Dr. Christopher Lepczyk is an ecologist and professor of wildlife biology and conservation at Auburn University. 
Dr. Lepczyk focuses on questions related to conserving nature and biodiversity. He and his lab work on a variety of topics such as conservation planning, urban ecology, and citizen science. We'd like to invite Dr. Lepczyk to begin his presentation. Take it away. Great. Thank you so much, Brian and Jack and everybody, for inviting me to attend and talk a little bit today about some of the work that we're doing. I'd also just like to thank Gillian and Theresa for really a great lead into what I'm going to talk about today. I'll try to keep this hopefully pretty short and cover some of the highlights. So as Brian mentioned, a lot of what we are working on in my lab are questions that use either publicly available data or citizen science data sets to answer large scale questions. But we're also very interested in scaling up and scaling down and using these data sets for conservation and management. So when we think about the promise of continental scale questions, we're really thinking about how to answer ecological questions and advance theory at broad spatial scales. And this is not to say that we don't do small scale work or want to have small scale data sets, but we're really interested in answering what we think are questions that perhaps might be above a landscape or a regional level, and being able to do comparisons across systems. So when we think about continental scale, one of the promises is both scaling up and downscaling. So similar to a lot of climate data, we're interested in not just having small scale sites that we can scale up, but in being able to have very large scale data sets that we can then downscale to estimate local information that's ecologically relevant. So ultimately, in continental scale ecology, we're interested in doing investigations that evaluate both local and broad scale spatial and temporal patterns. 
And I think a big highlight too is that we have lots of wonderful abiotic data sets, many of which are being collected at continental scales or larger, and that we are interested in being able to pair biotic data with them so that we can understand the biotic-abiotic relationships that drive many ecological patterns. So just to give you an example here on the bottom, these are images on the left of sound data from the federal government that we've been using to evaluate how sound relates to bird diversity. That is a national data set that evaluates both natural and anthropogenic sound, and we can tie that to existing data, such as from eBird, to evaluate relationships. So as mentioned in our previous talk, one of the elements that we're really moving into as ecologists is being able to do predictive ecology. So a goal for continental scale ecology is to not just look at the short term or near term, but to start to think about how we can evaluate longer term phenology or other phenomena that could happen, using what we've known in the past. And so prediction is not perfect, but we want to be able to have the kinds of data sets with which we can start to ask longer term questions that have relevance both for theory and for conservation, management, restoration, and policymaking. I think a key part that I'm very interested in, and I think we all want to see, is being able to link systems. So for instance, we are really interested in my lab in understanding the comparative ecology of cities around the world, which is wonderful, but we only have a handful of cities when we think globally about what that information looks like. And we also don't always have good comparisons to nearby rural areas or undeveloped areas. And so we would like to be able to link different types of ecosystems, or systems such as social and ecological, together at large spatial scales, which is something that I think we're already well on the way to doing, but is kind of a big next step. 
And as I mentioned, from my point of view, I think it's really important that when we're thinking about continental scale ecology, we want to be able to use these data to inform conservation, management, and policy. It's not just simply a way to test theory, but it's really to make the world better in the future for biodiversity. So how do we accomplish continental scale ecology? There are already many spatially explicit data sources. So we have a lot of types of data sources, like the Breeding Bird Survey, run by the US Fish and Wildlife Service and USGS. We have eBird, run by Cornell. We have iNaturalist, as just a handful of examples. We also have many wonderful abiotic data sets, such as National Weather Service data. We have USGS stream gauging data. There are air pollution monitoring systems. So while it is not perfect, there is a fairly rich set of data, both abiotic and biotic, which has allowed a lot of these initial studies to be done by many of the people that are speaking today. And one of the real benefits that we've seen in some approaches, such as when we talk about weather station monitoring, is using a standardized monitoring approach. However, when we think about a lot of the biotic data that we're interested in, and Gillian really described this well, we don't always have standardized monitoring technology or approaches. And so that can really result in difficulties in either using or understanding data. So a really important part of accomplishing large scale analyses is to be able to have standardized approaches. So what are some of the challenges we face? I would say one of the main ones is simply incomplete knowledge. And this isn't simply an issue that we've talked about before. We do not know where many species exist in the world today. We have limited information on the biodiversity in the world. 
So we don't expect that we will probably ever know all species or where they are, certainly not as fast as extinction is occurring in many places in the world. But we really have limited knowledge about many different taxa and many locations. And that just has not been the type of information that has been prioritized. So we lack data on things that we can't see, such as microbial systems, below ground systems, etc. We also have quite a bit of taxonomic bias, and this gets back to both our personal interests and the different types of approaches that have been used. Some of these relate simply to the programs that exist, like eBird, which focuses on birds, or the Breeding Bird Survey, versus things that we know people might be interested in or people can collect. But the reality is that we have an overrepresentation of things like birds and mammals in many types of data sets, or certain types of plants. But there are many types of invertebrates that we have very poor knowledge of, and so that bias can lead to an underestimation or a lack of knowledge about patterns we want to evaluate. As we've already talked about, there is unequal sampling across space and time, and this results from many reasons. Some of it is simply because we don't have programs where people don't live in certain places in the United States or other parts of the world. So almost always in the United States, you see the western parts of the United States, before you get to the coast, as being undersampled, often because of a lack of roads or people. And that creates areas where we just don't know quite as much about what's happening in terms of species information. A really large challenge that I think remains, and will remain for the foreseeable future, is who and what collects the data, as well as who owns the data. So many of the data sets that we've already talked about this morning either have open access or may be publicly owned. 
But there are lots of data sets that we may or may not be able to access right now, depending on who is considered the data owner. We often have many different types of projects that we're utilizing to collect data at different small spatial scales around the country, or perhaps large projects that each of us participates in and collects data for. But again, who collects the data and how we collect the data are key to being able to address large scale ecological questions. And so I would say this presents a trade off in terms of how we are thinking about data collection, data needs, and where we go in the future. Just one minute left here. Perfect. Thank you. So I think what we really get to is that we need coordination. We need things like the National Phenology Network. We do need financial resources, as well as resources in terms of locations, transportation, etc. And I do think, and we'll talk about this later in the session and as part of other panels, that data considerations really underlie a lot of what we can and cannot do. And so I think we ultimately revisit this long term dilemma that we have in ecology, which is how do we monitor, because at the end of the day, what we really want is to be similar to something like the National Weather Service, in that we would like to have information collected at regular intervals in space and time so that we can begin to synthesize that and answer questions at different scales. So with that, I appreciate the time and am happy to talk more later. Thank you very much. I would like to now shift gears to our last speaker, Dr. Daniel Park. Dr. Park is an assistant professor at Purdue University. Dr. Park's main research interest and expertise is in plant ecology and evolution. And his group broadly focuses on questions of how biodiversity is distributed across space and time, while developing novel informatics approaches to facilitate the application of big data to ecology. 
We would like to invite Dr. Park to begin his presentation. Welcome. Thank you, Brian. All right. Can everyone see my slides? Yes, we can. All right. Great. So hi, everyone. I'm Daniel Park. I'm interested in the drivers and consequences of plant biodiversity change, as Brian mentioned. And I just want to show you a few things about what we do here. So I'm interested in plants, which form the basis of all terrestrial ecosystems. So the study of plants is linked not only with plants but with other organisms that rely upon these plants. So it's important to know when and where plants are, and when and where they do certain things. For instance, things happen at different times of the year: cherry blossoms bloom early in the spring, and some flowers bloom later during the summer. And then if you travel the globe, you'll notice that different places tend to be associated with different types of plants, which are locally adapted to the conditions that they thrive in. However, these conditions are rapidly changing due to an ever-increasing human population, which modifies the face of the globe as well as changing the climate, the environment, and the conditions that these organisms encounter. Along these lines, we're interested in how these effects are changing patterns of biodiversity in space and time. And here are just a few things we can do using a lot of the data people have talked about before me. First, our lab is interested in biological invasions. And using herbarium specimen data, we found things like how invasive species relate to the native flora, mainly that they tend to be more closely related to native species than are non-invasive introduced species, and that they have higher niche overlap and better dispersal capabilities. And what's important is that this research was facilitated by natural history collections, namely herbarium specimens: pressed, dried plants that were generally originally collected for taxonomic purposes. 
And from these specimens, you can see that I was able to get things like trait measurements, such as the size of the fruit and of the pappus bristles that they use to disperse these fruit; DNA from tissues; and location data on where they were collected, which we used to associate these plants with certain environmental conditions, which allow us to model their niches and distributions. We can also do phenological research with these kinds of data. And we've done research where we've seen things like, with climate change, the temporal gap between the flowering times of closely related congeners will increase in the near future. Or in this case, we also used natural history collections of bees to see how bees and the flowers they pollinate respond differently to changes in climate. And we found that the insects tend to be more sensitive to climate in colder areas, while flowers tend to be more sensitive to climate in warmer areas. And hence in the future, we're going to see kind of heterogeneous patterns of mismatch among these mutualists. Now this research was facilitated by both citizen science, or community science, observations and, again, natural history collections. And on the top row, you can see pictures that I've taken in the field, which are similar to the type of community science observations you'll find in iNaturalist, which capture a phenological event such as leaf unfurling, flowering, senescence, or fruiting. And you can see that these phenological events are all well captured in historical herbarium specimens. And these data together can inform us of how these species are responding temporally to climate. So in terms of scaling these studies up to the macro scale, we're interested in how these spatial and temporal patterns will change in the face of global change. And as you know, the world is burning up this year and has been for some time. 
So we use these data on a macro scale to look at things like the effect of fire, and in this study we collaborated with Brian Enquist here to see how policy driven changes in fire impact biodiversity in the Amazon. And this study in particular utilized data from natural history collections, community science observations, plot surveys, and remotely sensed data from satellites. We've also used data from floral treatment surveys of the flora of the United States to do machine learning predictions of how native versus non-native plant phylogenetic diversity will change in the near future. So with the increase in digitization of natural history collection data and community science data, as well as other more traditional plot survey data, this digitization and online mobilization is facilitating a lot of research, including functional ecology, research on species distributions, morphometrics, and things like automated species ID and artificial intelligence applications, etc. However, regardless of how they were collected, most of these data were actually collected in an uncoordinated fashion. And these data are being used beyond their original purposes, and the scale and the resolution of these data vary greatly, from point observations done on the ground to satellite observations done across kilometers. And again, it was mentioned before, but these data contain large gaps, uncertainties, and biases, and sometimes downright errors, and they are also scattered across many different sources and aggregators, both physically and online. Our lab is interested in categorizing and addressing these biases. You can see, for instance, the scale gaps in phenology data on the left: phenology data mostly come from the global north, and there are huge gaps in our spatial and temporal grain due to the way we observe these data. 
For herbarium specimens, we can see that they tend to be collected around population centers and not where there aren't a lot of people or roads. The climate data associated with these things tend to be greatly biased as well, because of geographic uncertainty. Further, these data are stored in a lot of different places, a lot of different databases that sometimes talk to each other and sometimes don't. Some are fully isolated, some are fully integrated, and most are somewhere in between. And then the taxonomic systems, or how they name these species, can be compatible or incompatible across these databases as well. Then we run into the problem of scale. If we want to scale these data from the local scale to larger scales or vice versa, we need to overcome the problems of spatial scale and temporal scale. Just because of the nature of how we sample these things, using data of different spatial resolutions results in different estimates of the means and medians of different phenomena, such as phenological firsts. Also, because the landscape is heterogeneous, the patterns we observe and the conclusions we draw from these data can change with the spatial scale at which these observations are made. So what do we do with all these problems? The question boils down to how do we better integrate and synthesize data across sources, scales, and contexts. And to do this, the first step is to clean and standardize these data, like in this workflow, which conceptualizes what we do in BIEN, the Botanical Information and Ecology Network. We need taxonomic standardization: we need to call species the same way, and we need to make sure that at the very least the geographic location information is compatible among all these records. And then we move on to validation, to see that these records are indeed correct. And sometimes, for instance, we want to know whether they're native, non-native, cultivated, etc. But this is only the first step. 
And I hope I've convinced you that while natural history collection data and community science observations can expand our knowledge of biodiversity beyond their original purposes, they contain a lot of gaps, biases, and errors that need to be very thoughtfully addressed. And finally, this will be discussed in a future panel, but we should be mindful that many of these collections bear a colonial legacy, and access to these flawed but useful resources is not equitable across the globe. And thank you very much. Thank you very much, Dr. Park. So what I would like to do is invite everyone to stay on. We're now going to move to our question and answer period. We have approximately 15 minutes; we've run over just a little bit on our time. So if there are any questions, please either post your questions using Slido or post your questions using the chat window. And I will ask, is there anyone who would like to pose a question to the group? As we're waiting for our first questions to come in, what I would like to do is pose a general question to each of the speakers. And so if you could maybe take 30 seconds, maybe a minute, to address it. So in terms of the focus of the panel, I'm curious to know from each of you, given what you presented and what you've heard from everyone else, what is needed to advance a more predictive body of knowledge and theory for continental scale biology? And not everyone jump in at once. Well, I'll go ahead and jump in, if everyone can hear me okay. And I appreciate Ines has a question. I think to me it's really about, and I think our last speaker, Dr. Park, talked about that really well, this issue of getting informatics into a place that simply accepts that this data is messy. We need to think about different ways of analyzing the data. We need to think about different statistics that can answer these questions. 
And I think it's almost at a crisis level, because the data is exploding so rapidly. And for example, in the insect world, we can get our data down to species-level accuracy now, to about 85%, just with artificial intelligence. So the data has fundamentally changed. And we need to catch up with it. That would be my thought. Anybody else? I can jump in. I think we need to address some of the problems in the data we have as well. So prioritizing data collection in places where we know we have gaps, in places where we know data are biased, and also thinking about how we collect these data, how to collect them to make them the most useful and compatible with the data that we already have, will be along all of these lines. Sometimes I think it might be useful to have, not overall, but more targeted data collection that is theory driven and hypothesis driven, in concert with the less coordinated data collection, to kind of fill in those gaps and biases. I totally agree with those things, and I'll follow up with: funding to support that would be super fantastic. Yeah, just a top level appreciation for the contributions that these different programs can make, and more coordination among programs, because I think sometimes we have infrastructure redundancies where there could be cost savings. But I think it stems from the fact that things have grown in different ways; the origins of different programs are really distinct and diverse. And maybe taking a really good, high level look at the different programs run by different governmental agencies or privately, and thinking about how that could be better facilitated. And it would fold right into these issues of data bias, imbalance, and messiness as well. Thank you, Teresa. Chris? Yeah, I mean, I think just to add to everybody's comments, for me, one thing I had always hoped for was that we would have the equivalent of NEON, really at a very fine spatial scale. 
And I am a firm believer in hypothesis driven research, but I think the National Weather Service is a great example of why you need monitoring, and I think we have struggled with being able to talk about that in a way that can advance science, because it's a huge funding request. And it doesn't necessarily have some of the immediate payoffs that we would like. I think everybody's talks today have wonderfully illustrated what we've done with essentially existing data that were not intended for those purposes, and what we could really answer if we attempted to unify and really had a top down approach, like what Teresa is talking about, with some combination of programs and a unified approach. Thank you, Chris, these are all excellent points. I would like to shift then to the questions that we have on the line. Ines, I believe you have a question. Yeah, so to all the speakers: you have all pointed out the need for more data to cover those gaps, but also the need for cleaning and standardizing data that is already being collected. So if you had to prioritize between those two, which one do you think is the more pressing need, covering data gaps or trying to standardize the data that we already have? Thank you. I guess I'll jump in again, and thank you for that question. I think, if you had to prioritize, it's definitely more data. There are millions of insects out there that we have absolutely no information about. And artificial intelligence, as I think Brian and Daniel also mentioned, has upped the ante, so that we can get things accurately really fast. So we can't wait for the best data. We can't wait for the best technology. The data is being collected now, and pollinators apparently are disappearing now, and they're shifting now. So I think, you know, really embracing that messy data might be the way of the future. 
It's not hypothesis driven, but it's so critical, and we have millions and millions of species of insects that are fairly unknown. We can't wait for someone to figure out the hypothesis. We need to go out and grab as much data as we can, and artificial intelligence really gets us very close: again, for about 85% of the observations we can get to taxonomic accuracy, which is new. And I think we need to celebrate that. Thank you. I would like to move on. Jeanine, I believe you have a question. Yeah, so I appreciate the commentary on the data gaps, all this data that's coming in, and the problems with upscaling and downscaling. And what I keep thinking about is how do we synthesize all of that in meaningful ways to address the most critical questions. Do we want a really comprehensive monitoring system without a way to synthesize and ask new questions, to discover new phenomena that we hadn't been thinking about? It seems like, without the synthesis, all the monitoring and data explosion doesn't really get us to the point of understanding how and why things are changing and what's going on. So that's a comment, but my question is what do people feel about the role of synthesis, and how should that take place? Do you mind if I hop in? This is something that I've been really grappling with for the last several years, because we've now been running our program for about 15 years, and it's been, the way I described it, kind of a free for all: anyone can use it to collect any data. And it's been widely adopted and widely used, but perhaps it's time to really take a careful look at what we have achieved, and why, and what it's good for, and whether we are missing key opportunities. Should we perhaps pivot and put some emphasis on particular species, or even just seasonal events in particular regions? 
Because I don't want us to get another 15 years down the road and reflect back and think, oh, we missed an opportunity to really better understand key phenomena and change. And so, I don't know, I still feel like it's a balance of both of what you're describing there: continue with opportunistic data collection, because you never know what you're going to discover and you don't want to minimize that, but at the same time we want to make sure that we're also encouraging data collection to yield the most robust and meaningful data resource to answer questions. And it's hard, because some of those questions we can name now and some we can't. We don't know what they are until we're further down the road. So I was really excited to even be a part of this panel and hear what other speakers have been saying, because it feels so relevant and timely. I think, given the nature of that question, is there anyone else on the panel that would like to chime in on that? I would. So, I was really intrigued both by what Teresa just said and by Daniel, about sort of a development of hypotheses that we could use to help direct more citizen science, so that there could be this give and take of top down and bottom up. I was just wondering whether anyone has thought about mechanisms for that. And it sort of comes down to the question about synthesis and then also... So, anyway, that's sort of my question. And I think what I hear in your question about mechanism is how do we bring people together to try to tackle this, and I'm excited to share that we actually have a project that we're undertaking in collaboration with the US Geological Survey, again this year, with the aim to ultimately have a Powell Center working group that will launch next September. 
So that may be important to share with this group as we start to kind of gear up for that. But our thinking was, let's do a lot of the groundwork of baseline data synthesis as best as we can, in various dimensions, and then let's bring together our end users and smart people to think about this more collectively. So that's about as far as we've gotten, but I invite folks to follow up if they're interested in being involved or in learning more. I'll add that I think some of those efforts are already being made, as in the National Phenology Network and Nature's Notebook; they sometimes have projects that are geared at studying certain areas or certain species or groups of species. Likewise, on iNaturalist or other community science initiatives, you can open up projects that are specifically tailored to test certain hypotheses or specifically tailored to focus on certain regions or taxonomic groups. And if those were a bit more widely adopted, I think they would go a long way toward filling in the gaps in our knowledge, although I admit some organisms are less charismatic, or appear to be less charismatic, than others, so getting community participation might be a little harder for some of these things. If I could jump in real quick, I think the danger is, going back to your point, Daniel, about colonization of data. If we go so hard into our hypothesis driven questions... our data needs to be representative of the people; we need to broaden participation in our data, and right now it's really biased. And if we just go back to our natural approach, which is to say hypothesis driven data, we have to synthesize, we need to do all this, what we do is close out those communities that collect the data and that may have questions we haven't thought about, and those are the minority communities or other communities with local knowledge of what's happening in their backyard. What bees are showing up? What invasive species are showing up? Who is missing? 
Who are we not seeing? And the monarch is a great example of that. When you look at a species that reaches communities, anyone can gather monarch data fairly quickly. It may answer different questions, but if you wait to drive a hypothesis, you can also drive out community participation and, more importantly, drive out communities that are not yet engaged and that need to be engaged. So allow the messiness to be part of the data. We can work post facto in our data and get trends. That's what every assessment out there does. All these big assessments create models based on messiness, and they can put it together. But I really think we need to be careful about our big data sets and truly decolonize, in all senses of the word, and recognize that people protect what they love. So if we engage people in collecting data, we will get a fundamentally different set of hypotheses and a fundamentally different set of data, because people will feel that they can contribute something to answer a question close to their own hearts. Thank you, Gillian, that was excellent. Can I ask Natalie how we're doing on time? Yes, we have about five minutes for some more questions. Okay, excellent. I see, Jeanine, you have a question. Yeah, so listening to Gillian, I am thinking about the increasing disconnect between data observation and data analysis. And I mean, this is because the data come from many, many sources; we're often not involved in collecting the data that we're analyzing, and then we're interpreting it without the direct observation. And there are many good things about that, but we're also losing the initial part of the scientific method, the initial observation that allows the hypothesis generation. I'm wondering if people have thoughts on that, about the actual act of observing that goes into the data collection and that is critical for generating hypotheses. I guess I'll just respond really quickly. 
I think it's a new way of thinking, and it's not necessarily changing the scientific method, but it's an acknowledgement of how different types of data come in. And, you know, patterns are an equal way of looking at hypotheses, and opportunistic data should be at the same level, in my view, as hypothesis driven data for us to answer climate change questions, because it's a community problem. It's a societal problem. Society needs the data now. And I think we over emphasize or under emphasize the importance that our own perspective brings to the data when we create a hypothesis. And we don't allow a community member, or a local community, or people who are different from the researchers, to be involved in the process. And I think climate change pushes us to a new place. And how we think about classic, hypothesis-only driven data as better than observational, nonparametric, messy, function analysis, however you want to call it, type data might need to change. And, you know, if you look at assessments, the large scale assessments all do this now. They look at scenarios, and they build scenarios based on data that can be driven in lots of different ways. And this is what society needs right now. We will never have enough people in the field to collect insect data accurately. There are millions of them out there. I can't hire enough graduate students to look at that. But yet that data can be generated. And I think we need to celebrate it and celebrate that messiness. And we can do post facto analysis, but we shouldn't stop that excitement of people chasing monarchs, even though it may not be directly hypothesis driven. Thank you. I guess I'll just jump in, and I agree with Gillian: we should embrace the messiness, and uncoordinated data collection may very well be the way of the future. And, you know, I use that messy data in many ways, and we've done some really cool stuff with it. But I don't think that these two things are mutually exclusive. 
Yes, we should continue to collect uncoordinated data and have people collect data about what they're interested in, what they care about, and what they can collect, when they can. However, there can always be another approach that kind of fills in the gaps, one that can be hypothesis driven, that can be aimed at filling in gaps, and that can be more targeted in various ways. And maybe those types of studies can go deeper and get at mechanisms or biases that we were not previously aware of. I don't think those two things are mutually exclusive; I think they need to both exist, and they can be synergistic. Can I just chime in here, I think maybe in the last minute or two that we have left. So what I'm hearing is really a focus on stakeholder engagement, and who our stakeholders are, and who we're trying to communicate with in our effort to really generate a continental scale view of biology. And it seems that what's really critical, and also missing from a lot of our discussion, is how we best do stakeholder engagement. And so I'm wondering, for one, if I'm hearing that correctly, but then also if anyone has any final thoughts. Brian, I guess one point I would make: I think stakeholder engagement is really important, and I think none of us have really talked about that today. If we want people to care about the environment and ecology, we need to get them involved, and this is a great way to do that. I mean, I don't think it has to be the only way, but I do think that much of my satisfaction, and I think that of all of us on the panel this morning, has come from working with different groups and people collecting data on the ground in the field. But if we want to get the data sets we are talking about, we're going to rely on other stakeholders, and it's probably going to be either no cost or low cost, and that has to be done in some kind of democratic or open way, and I think that to do that. 
It's really critical that we engage these stakeholders, not just the ones we've been working with today or in the past, but future ones, and talk about that just as an open part of how we do continental scale data. And I'll chime in with one thought on that. One of the things that we've really been grappling with, and putting more effort toward, is making sure that we are providing those stakeholders that are participating and collaborating with us in collecting data with the support that they need to be able to get answers to the questions that they're posing too. In a lot of cases, they're uncertain as to how to analyze the data, or how to interpret it, or how to clean it, and they would like to be empowered to not just have to rely on external scientists to be able to do that. And so that's something that we're exploring on our end for supporting participants, but I think the whole field could, again, combine forces a little bit more and prioritize that as well. I think we're really fortunate in the pollinator world because, you know, a stakeholder can simply be somebody that you hand a net to. If anyone's watched a bunch of fourth graders or elementary school kids with butterfly nets, you know that the enthusiasm, excitement, and engagement of your stakeholders is almost immediate. But again, we have the tools now to take that really messy data collected by a seven year old, dump it into a database, and get it to taxonomic accuracy, and that's new. And I think it's important that we think of those people as stakeholders. They're going to chase what they see, and what they see is part of the data set that will help them to be active in terms of protecting what they see. 
So our stakeholders are anybody with a net, or anybody with a cell phone who takes some pictures of who's on their flowers, and that becomes incredible data that keeps that community involved. Just look at milkweed planted in the wild, even if those are on species: that tells you how you can get stakeholders very quickly engaged with the issue of pollinator loss or their perception of the time, even if that data is fundamentally really messy. Thank you.