The airborne observation platform consists of a small Twin Otter airplane fitted with three sensors: a waveform LiDAR, an imaging spectrometer, and a high-resolution RGB camera. All of these sensors provide data at a high spatial resolution of less than one meter. The AOP typically samples an area of around 100 square kilometers for every NEON site, so you can think of this as regional-scale data collection for each NEON site.

This is a photo of how those three sensors are integrated on the airplane. The gold-colored drum that you see here on the left is the imaging spectrometer; it's essentially the same as the AVIRIS next-generation sensor developed by NASA JPL. The green box that's partly hidden here is the LiDAR sensor we use, and the RGB camera is mounted alongside the LiDAR.

I'll spend some time on this slide because it is the key one for this tutorial: we'll be using LiDAR data here. LiDAR stands for light detection and ranging. It's an active instrument, meaning it shoots pulses of light at a high frequency of up to a thousand kilohertz, which translates to one million pulses per second, and the sensor then records the reflected energy. By keeping track of the time between sending a pulse and receiving it back at the sensor, one can calculate how far an object is from the sensor, and in this way LiDAR can be used to characterize the three-dimensional structure of the landscape. There are a couple of popular use cases for LiDAR. The first is determining vegetation heights, which is what we'll be doing in this tutorial. The other common use case is estimating the above-ground biomass of the vegetation.

Let's focus on this diagram on the right. It shows an outgoing laser pulse at the top, and as the laser pulse travels through the air it eventually encounters objects. In this example, the waveform encounters three objects. The first one is a tree: as the laser pulse hits the tree, some part of it is reflected back to the sensor, as shown by this peak here on the right. So this is the reflected energy coming off of the tree. But remember, a tree will also have gaps in its canopy, so not all the energy that hits the tree reflects back to the sensor; some of it penetrates the canopy and goes deeper in. It then encounters a second object, in this case an understory shrub, and the same thing happens: some energy bounces off the shrub back to the sensor, and because the shrub also has gaps in its canopy, the pulse penetrates it as well and hits the third object, which is the ground. So in total, this particular laser pulse encounters three objects, shown by three different peaks here on the right.

Now, NEON provides two different kinds of LiDAR data. One is called waveform LiDAR, and the other is discrete return LiDAR, and I'll go over the pros and cons of both. Let's focus first on the waveform. What we mean by waveform data is that for every laser pulse shot by the sensor, the waveform data gives you the exact shape of the reflected energy coming off of all the objects it encountered along the way. The advantage of waveform data, as you can imagine, is that it provides a lot of detail about the complexity of the vegetation in that area. So it's a lot of information, which is great for scientific analysis.
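As a quick aside before weighing waveform against discrete return data: the ranging arithmetic described above boils down to a single line. Here is a minimal sketch, not NEON's processing code, using a made-up round-trip time purely for illustration.

```python
# Minimal illustration of LiDAR ranging (not NEON processing code).
# One-way distance = (speed of light * round-trip travel time) / 2.
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def range_from_travel_time(round_trip_seconds):
    """Return the one-way distance to the reflecting object, in meters."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# Example: a pulse received back ~6.67 microseconds after it was sent traveled
# roughly 1,000 m each way -- about the AOP's flying height above ground.
print(range_from_travel_time(6.67e-6))  # ~999.8 m
```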
But the downside is that you have to store a lot of data on your hard drive. Remember, we are shooting a million pulses a second, and for each pulse you would have to keep track of the entire waveform; you would easily run out of hard drive space. So to avoid this situation, we provide another kind of LiDAR data called discrete return data. Instead of saving all the information in each peak, the idea is that every time the reflected energy exceeds a certain threshold, shown by this horizontal line here, you record it as a return. So the energy bouncing off of the tree can be summarized by just one point here, which is one discrete return. Similarly for the second object, the understory shrub: if the reflected energy exceeds the threshold, you record that as a return. And again for the ground, you record a return if it exceeds the threshold. By doing so, you compress the large volume of data stored in the entire waveform into three discrete points, and that's where the name discrete return comes from: you're discretizing the reflected energy and thus saving space. So the advantage of discrete return data is that you save space, but the downside is that you also lose a lot of context. That's a quick summary of waveform versus discrete return data, and we'll be focusing on the discrete return data in this tutorial. As you'll see going forward, even though we have compressed our data by going with discrete returns, you'll still run into compute issues, because there are just so many pulses being recorded every second.

Moving on, the second sensor on the airplane is an imaging spectrometer, which provides hyperspectral imagery. You may have heard of satellite-based platforms like Landsat or MODIS, which provide reflectance data for about 8 to 10 bands, making them multispectral sensors. What makes hyperspectral data special is that it captures the entire range from the visible through the near-infrared to the shortwave-infrared regions of the electromagnetic spectrum, roughly 380 to 2,510 nanometers. You sample this entire range at a fine spectral sampling interval of 5 nanometers, so dividing the range by 5 gives you over 400 bands, 426 bands to be precise. That's what makes it hyperspectral: the hundreds of bands available from the sensor. Hyperspectral data has been shown to be effective in characterizing the physical properties of leaves, for example leaf mass per area or leaf water content. It has also been shown to be effective in capturing foliar chemistry, such as the percent nitrogen, percent carbon, or percent calcium present in the leaves. This hyperspectral data is collected at a high spatial resolution of 1 meter. And finally, we also have an RGB camera, which takes photos at a high spatial resolution of 10 centimeters. We provide the camera imagery to give you some additional context when you're analyzing the LiDAR and hyperspectral data.
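To make the thresholding idea from the waveform-versus-discrete-return discussion concrete, here is a toy sketch, not NEON's actual waveform processing: it builds a synthetic waveform with three peaks standing in for the tree crown, the understory shrub, and the ground, and keeps only the local maxima that exceed a threshold.

```python
import numpy as np
from scipy.signal import find_peaks

# Toy waveform: reflected energy sampled along the pulse's path (noise omitted for clarity).
# Three Gaussian bumps stand in for the tree crown, an understory shrub, and the ground.
t = np.linspace(0, 100, 1000)             # arbitrary range/time bins
waveform = (
    1.0 * np.exp(-((t - 20) ** 2) / 8)    # tree crown
    + 0.5 * np.exp(-((t - 55) ** 2) / 6)  # understory shrub
    + 0.8 * np.exp(-((t - 90) ** 2) / 4)  # ground
)

# "Discretize": keep only the peaks whose amplitude exceeds a threshold.
threshold = 0.2
peak_idx, _ = find_peaks(waveform, height=threshold)
print(f"{t.size} waveform samples compressed to {peak_idx.size} discrete returns")
for i in peak_idx:
    print(f"  return at bin {t[i]:.1f}, intensity {waveform[i]:.2f}")
```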
Of the 180 data products that NEON provides, 29 are developed by the AOP team, and those 29 products are broken down into three levels: level one, two, and three. The raw data that the sensors collect are referred to as level zero data, which we do not release to the public. The level zero data are processed to create level one data products, and from level one you go to level two and then level three, so it happens serially. As you go from left to right, from level one to level three, you'll notice that the dataset sizes typically go down. For this tutorial, we're going to be looking at the level one data product, the discrete return LiDAR point cloud. I don't know if you can see my mouse pointer; that's the one we're going to look at. In addition to that, we're also going to use the level three data product called elevation LiDAR, which gives you the ground elevations in a raster format at one meter spatial resolution.

We have a standard protocol in place to determine the ideal conditions for flying over a NEON site. The biggest impediment to a flight is cloud cover, so we aim for a cloud-free sky, and we have been successful in conducting most of our aerial surveys in near cloud-free or low-cloud conditions. But that may not always be possible, and you'll see an example of that in the tutorial today: although we aim for near cloud-free sky conditions, you may inevitably have some clouds here and there, and that can affect the data as well. We fly the airplane at a constant altitude of 1,000 meters above ground level, and we typically cover at least a 10 by 10 kilometer box for every NEON site; that's the minimum, and we often go above that. These flights are conducted at each site at peak-greenness conditions. What we do is, for every site, we look at EVI trajectories derived from MODIS and identify the time interval that corresponds to peak greenness for that particular site. Collecting at peak greenness gives us consistency between annual collections. So let's say you're comparing this year's NEON reflectance data with data collected a couple of years from now; if both were collected during the same time of year, that ensures easier comparisons between the two years. And we typically fly the flight lines in a north-south direction to reduce any BRDF effects.

The graphic that you see on the screen summarizes the number of times each of the 20 eco-climatic domains has been surveyed by the AOP over the period 2013 to 2023. As you can see, places within the continental US tend to be sampled more often; every site in the continental US is sampled three years out of every five. Places outside the continental US, like Hawaii or Puerto Rico, are sampled once every five years, and we do this purely for logistical reasons.

Going back to LiDAR, specifically the discrete return LiDAR: this graphic shows the value of discrete return LiDAR for characterizing 3D vegetation structure. On the left, you'll see all the discrete returns colored by elevation, with blue meaning low elevation and red meaning high elevation. On the right, you see the same data, but this time the returns are colorized by the RGB values from the camera sensor. By adding the RGB from the camera, you're providing a little more context. And for today's tutorial, we're going to visualize this RGB-colorized point cloud in Python.
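As a preview of that visualization, here is a minimal sketch, assuming laspy (installed with a LAZ backend such as lazrs) and matplotlib are available; the file name is a placeholder for one of the colorized tiles in the tutorial data, and the rescaling assumes 16-bit color values.

```python
import laspy
import numpy as np
import matplotlib.pyplot as plt

# Placeholder file name -- point this at a colorized point cloud tile from the tutorial data.
las = laspy.read("colorized_point_cloud_2021.laz")

# Thin the cloud so matplotlib can cope (a 1 km tile holds millions of returns).
rng = np.random.default_rng(0)
idx = rng.choice(len(las.x), size=min(50_000, len(las.x)), replace=False)
x = np.asarray(las.x)[idx]
y = np.asarray(las.y)[idx]
z = np.asarray(las.z)[idx]

# RGB is commonly stored as 16-bit values; rescale to 0-1 (adjust if your file stores 8-bit).
rgb = np.column_stack([
    np.asarray(las.red)[idx],
    np.asarray(las.green)[idx],
    np.asarray(las.blue)[idx],
]) / 65535.0

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection="3d")
ax.scatter(x, y, z, c=rgb, s=0.5)
ax.set_xlabel("Easting (m)")
ax.set_ylabel("Northing (m)")
ax.set_zlabel("Elevation (m)")
plt.show()
```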
The level one discrete returns are used to generate a whole set of level three rasters. One example is the digital surface model. The digital surface model, or DSM, provides information on surface features, such as top-of-canopy vegetation heights. Going back to this graphic, the way we generate a digital surface model is that we take the highest returns recorded, interpolate them, and create a smooth raster at one meter resolution; that is a digital surface model. Think of it as draping a blanket over the landscape: it gives you an idea of the topmost surface. The next raster we derive from the discrete return data is the digital terrain model, and we'll be ingesting this in Python today in the tutorial. The digital terrain model gives you the elevation of the ground with respect to a vertical datum, and it is also provided at one meter resolution. Going back to this graphic: instead of taking the topmost returns, if you take the bottommost returns, interpolate them, and create a surface, that's basically what a digital terrain model is. I'll keep flipping between the DSM and DTM for the same location so you can see the difference: the DSM gives you the top-of-canopy view, and the DTM gives you where the ground is. And if you have the DSM and DTM, you can subtract the DTM from the DSM and get a new product called a canopy height model. For every one meter pixel on the ground, it tells you how tall the canopy is.

So that was a quick overview of NEON and the AOP data. Now what I'll do is share my screen again and we'll go over the tutorial real quick. Bridget, can you see the... Yes. The tutorial, okay, nice. Yeah, I'll quickly summarize the content in the tutorial, and then hopefully we'll have sufficient time for the coding and any questions you may have.

This tutorial is about the Creek Fire, a large wildfire that started in September of 2020. In this tutorial, you'll find that I have provided links to a lot of background content. If you go to the first link here, it talks about the 2020 California fires, and here's a nice graphic of all the fires that happened in California in 2020; this is the Creek Fire right here. There's another graphic explaining how big the Creek Fire was: if you look at the top 20 largest California wildfires ever recorded, five of those fires, shown in red here, happened in that same year, 2020, and the Creek Fire is here at number six. So it's the sixth biggest fire ever recorded in California; it's pretty big. Unfortunately, this fire affected one of the NEON sites in California, called the Soaproot Saddle site. When you have time, you can also watch the video here; it will give you a good idea of what the landscape of the Soaproot Saddle site looks like. I won't go into that now because of time constraints. We also visited the site after the fire event, and we have some photos of the damage caused by the fire. This is one photo taken at Soaproot Saddle; I think we sampled it right after the fire. You can see that most of the understory vegetation is gone, and some of the leaves on the tall trees are gone as well, so it's mostly dead individuals left standing. There's another photo here where you can clearly see the burn scars on the ground and much of the vegetation gone. So that was the background on the Creek Fire. Now, the NEON AOP team fortunately conducted aerial surveys over the SOAP site in 2019 and 2021, so a year before and a year after the Creek Fire. It was rather fortuitous.
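Before getting into the analysis itself, here is one more small sketch tying together the DSM, DTM, and canopy height model mentioned a little earlier, using rioxarray; the GeoTIFF names are placeholders rather than actual NEON product file names, and this is not part of the tutorial's own code.

```python
import rioxarray as rxr

# Placeholder file names -- substitute the DSM and DTM GeoTIFFs for the same 1 km tile.
dsm = rxr.open_rasterio("NEON_DSM_tile.tif", masked=True).squeeze()
dtm = rxr.open_rasterio("NEON_DTM_tile.tif", masked=True).squeeze()

# Canopy height model: top-of-canopy surface minus ground surface, per 1 m pixel.
chm = dsm - dtm

chm.rio.to_raster("CHM_tile.tif")
print(f"Tallest canopy in this tile: {float(chm.max()):.1f} m")
```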
So this exercise aims to study the effects of fire on vegetation structure by comparing the LiDAR-derived relative height percentiles before and after the fire. So if I were to go back to... oh, you probably can't see the slides. Let me stop here. Going back to the slides, what we are trying to do is understand how the vertical distribution of these discrete returns changes before and after the fire. That's essentially what we are trying to achieve here.

Okay. This Python tutorial is broken down into three parts. First, we'll be reading in the discrete return LiDAR data, and I've provided links for the data product. This is where you can download the data, and it also provides information about how the product was created, so feel free to go through the documentation. If you wish to download the data yourself, this is how you would go about it, but you don't have to for this tutorial; we have already provided you with the necessary data and code. And if you're interested in checking out other NEON tutorials, we have a couple that are relevant to this one. For example, if you're interested in using an API to automatically download any of the remote sensing data, feel free to follow this Python tutorial, and if you're more comfortable in R and want to do the LiDAR processing in R, I would highly recommend this R tutorial.

Like I said, all the data and code for this tutorial are provided on this Google Drive link. If you right-click on it, you will see a page like this; make sure you hit "download all" here and it will download the whole drive onto your machine. It shouldn't take too long; it's about 115 MB in size, so maybe a couple of minutes, tops. The only dataset that is not derived from NEON is a shapefile for the Creek Fire perimeter, which we downloaded from the California Department of Forestry and Fire Protection, the CAL FIRE website. You can check it out; it actually has the perimeters of all the fires recorded in the state, but in this case we just use the Creek Fire shapefile. Then I go over the details about setting up your environment, and I'll come back to that in more detail later.

For now, I'll quickly summarize what we're trying to do. Part one of the tutorial is about reading and visualizing the discrete return LiDAR data. These LiDAR data are provided in a format called LAS, which is short for "laser". If you open any LAS file and print out the dimensions it contains, these are all the pieces of information the LAS file will give you: the X, Y, and Z coordinates of the discrete return, so for every point you will have the X, Y, and Z; the intensity of the return; and the return number and the total number of returns per pulse. I'll quickly go back to my previous slide. In the example case I talked about earlier, you have three discrete returns: the first return happens at the top of the tree canopy, the second return is from the understory, and the third return is from the ground. The intensity refers to the amplitude of the reflected energy, the axis along which those peaks are drawn in the diagram. Okay, going back to the tutorial: you have the X, Y, Z, the intensity of the return, and the return number and total number of returns per pulse.
And you can also determine whether a return is a ground return or a vegetation return. The LAS format follows a standard classification scheme, so if you look at this, it will give you the classification of the return as well: it could be ground, low vegetation, medium vegetation, high vegetation, building, and so on. These codes here are the classification types. The file also records the scan angle at which the return was collected. Anyway, this function here imports the LAS file as a dataframe in Python, so when you run this code, you ingest all of your discrete return data into a pandas dataframe with information like X, Y, Z, intensity, return number, and classification for every discrete return.

Now, you'll notice that the X and Y values are not latitude and longitude; they are easting and northing values. When you convert the three-dimensional world onto two dimensions, say a piece of paper, you have to do some sort of projection, and all the NEON data are provided in a UTM projection, Universal Transverse Mercator. The Z coordinates of the discrete returns are reported against a North American vertical datum, a vertical reference against which all the heights are measured. I would highly recommend reading this article here. You muted yourself. Shashi, you're muted. How about now? Can you hear me? Yeah. Sorry about that. I was talking about vertical datums. Normally when we talk about the Earth, we refer to it as a sphere, but in reality the shape of the Earth is closer to an ellipsoid, as you can see, and this North American vertical datum is a reference surface tied to that shape. So when you see a Z elevation of 907.778, it means 907 meters above that reference surface, not above the ground, and that's the case for all the elevations that NEON provides. Even the digital terrain model elevations you'll see later on are measured against that datum. But as ecologists we don't really care about that reference surface so much, because it's essentially an abstract, made-up concept. So, to make the Z values more meaningful, we'll use the ground elevations from the DTM to calculate the height of each return relative to the ground. We'll be doing that later in this notebook; for now, we'll just visualize the discrete return point cloud. If you print the number of returns in a one by one kilometer tile, you'll see that we have about 15 million returns. That's a lot of data.

By the way, I'll quickly share my screen for the QGIS component of this. If you go to the Google Drive, download all the data, and open it in QGIS, this is what it looks like. This is the Creek Fire in California, and the red outline you see is the boundary of the Soaproot Saddle site. On top of that, this is the digital terrain model that is available in the Google Drive folder; it provides the ground elevation for this one by one kilometer tile at a one meter spatial resolution. And on top of this, you can also plot the colorized discrete return point cloud. If I zoom in, you can see the discrete returns here, colorized by the RGB values from the camera. These are all returns, and you can visualize them in QGIS. Similarly, you can open the point cloud data for the 2021 year as well, and if I simply flip between the 2021 and 2019 data, you can already see the impact of the fire just by flipping between the two: burn scars in the 2021 data versus a much greener-looking landscape in 2019. This big dark patch that you see in the middle is a cloud shadow, which goes back to the point I mentioned in my slides: although we aim for near cloud-free conditions, it may not always be possible, and this is one such example. If you're wondering why the cloud shadow has such sharp edges, it's an artifact of merging different flight lines. We don't actually collect our data in one by one kilometer tiles; we collect it as flight lines, which are then merged to create these square tiles, and those edges are an artifact of that merging. So that's a quick visualization of all the datasets using QGIS. Now I'll go back to the tutorial.
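Earlier I mentioned the function that imports the LAS file as a dataframe. The tutorial provides its own helper for that, but a rough sketch of what such a helper can look like, using laspy and pandas, is below; the file name is a placeholder, and the scan angle field name can differ depending on the LAS point format.

```python
import laspy
import numpy as np
import pandas as pd

def las_to_dataframe(path):
    """Read a LAS/LAZ file and return a dataframe with one row per discrete return."""
    las = laspy.read(path)
    return pd.DataFrame({
        "x": np.asarray(las.x),                # UTM easting (m)
        "y": np.asarray(las.y),                # UTM northing (m)
        "z": np.asarray(las.z),                # elevation above the vertical datum (m)
        "intensity": np.asarray(las.intensity),
        "return_number": np.asarray(las.return_number),
        "number_of_returns": np.asarray(las.number_of_returns),
        "classification": np.asarray(las.classification),
        # Older point formats store this as scan_angle_rank; newer ones as scan_angle.
        "scan_angle": np.asarray(las.scan_angle_rank),
    })

points = las_to_dataframe("NEON_SOAP_2021_tile.laz")  # placeholder file name
print(points.shape)
print(points.head())
```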
Yeah, so we have a whole bunch of discrete returns in the 2021 point cloud, and here we visualize them in x, y, and z. I'll move quickly because we don't have much time left, but we have data for both 2021 and 2019.

In part two of the tutorial, we ingest and visualize the digital terrain model using a package called rioxarray. This is how you ingest the data, and then you can print the metadata stored in these digital terrain model files. One important thing to note when you're doing this analysis is which LiDAR sensor was used for the data collection. You'll notice that in 2021 and 2019 we used different LiDAR sensors: in 2021 we used the Optech Galaxy Prime, whereas in 2019 we used the Optech Gemini. When you're comparing data from two different years, it's always good to check whether the same LiDAR sensor was used; in this case, it was not. Older sensors like the Gemini have a wider outgoing pulse width, which results in a poorer range resolution, and poor range resolution makes it challenging to resolve objects close to the ground, such as low vegetation. The Gemini has a range resolution of about two meters, which means it can be challenging to distinguish objects less than two meters apart along the vertical profile. For the Galaxy Prime, the range resolution is substantially better, at around 67 centimeters. For more about this, I have provided a link to the theoretical basis document; you can go through that.

These are the values stored inside the digital terrain model, and again, all these elevation values are with respect to the reference datum I mentioned earlier. Here we plot the DTM: northing, easting, and the elevation values above the datum. This part here is optional; you don't really need it. We are basically trying to recreate in Python the map we saw earlier in QGIS, and the only reason you need so many lines of code is that the layers are in different projections, so most of the code here just brings them into the same projection.

Then part three is the main takeaway of the tutorial, where we calculate height percentiles relative to the ground. First, we calculate the heights of the discrete returns relative to the 2021 DTM. We have a Z value, which is the height of the return above the datum, and we extract the digital terrain model raster value associated with that discrete return, which gives the ground elevation. We then subtract the ground elevation from Z, and that gives the discrete return height above ground.
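A hedged sketch of that height-above-ground step, assuming the points dataframe from the earlier sketch and a DTM GeoTIFF opened with rioxarray (file and variable names are placeholders, and the tutorial's own code may differ): look up the ground elevation of the DTM pixel each return falls in and subtract it from the return's z value.

```python
import xarray as xr
import rioxarray as rxr

# Placeholder file name -- the 2021 DTM tile covering the same area as the point cloud.
dtm = rxr.open_rasterio("NEON_SOAP_2021_DTM.tif", masked=True).squeeze()

# Nearest-pixel lookup of the ground elevation at each return's (x, y) location.
ground = dtm.sel(
    x=xr.DataArray(points["x"].values, dims="points"),
    y=xr.DataArray(points["y"].values, dims="points"),
    method="nearest",
).values

# Height of each discrete return above the local ground surface.
points["height_above_ground"] = points["z"].values - ground
print(points["height_above_ground"].describe())
```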
For this next part, maybe it's easier if I share my QGIS screen. Up until now, we have been looking at one meter rasters, and we want to calculate relative height percentiles for every pixel on the ground. Right now our pixel size is one meter, and this is what one meter pixels look like; the dots here on the screen represent the discrete returns. To calculate height percentiles for every one meter pixel on the ground, you would have to aggregate all the returns that fall within that pixel, but there aren't very many discrete returns in a one meter pixel. So what we do is coarsen this one meter grid to, say, 10 meters. This is what that looks like; I'll flip between the two so you can compare a one meter raster against a 10 meter raster. In a 10 meter raster grid, each pixel contains many more returns, so any percentiles you calculate will be more robust. That is what we're doing in this part here.

I'll share my tutorial screen again. We create a 10 meter spatial resolution raster grid and assign a unique ID to each 10 meter pixel, and then we group all the discrete returns based on the 10 meter pixel they fall into; that was the visualization I showed you in QGIS. So for every discrete return, you record the 10 meter pixel it falls into. We also check that we have a sufficient number of discrete returns per 10 meter pixel, which is true in this case. Finally, we calculate the percentiles for every 10 meter pixel, and this is the final plot that you see here.

So again, what we're trying to do, going back to my QGIS screen: we calculate height percentiles, the 20th, 50th, 75th, and 90th percentiles, for every 10 meter pixel in this map, and we have many such 10 meter pixels. The box plots that you see represent the distribution of those height percentiles across all the 10 meter pixels, and we look at how those distributions change from one year to the next, from 2019 to 2021. The first key takeaway is that the 2019 height percentiles are higher than the 2021 ones, which makes sense because some of the vegetation was lost to the fire. The other interesting takeaway is that the differences in the distributions are more evident in the lower height percentiles, which also makes sense: if you go back to the photos from the site, you'll see that much of the damage happened in the understory, so the lower height percentiles were affected more than the higher ones. There are still some standing trees here, and even the dead trees will be recorded as vegetation heights, so it's plausible that the differences are less pronounced in the higher percentiles compared to the lower ones.
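A hedged sketch of that gridding-and-percentile step, building on the points dataframe with height_above_ground from the previous sketch (the column names, the snap-to-grid approach, and the minimum-return cutoff are illustrative choices, and the tutorial's own code may differ): assign each return to a 10 m cell by snapping its coordinates, then compute height percentiles per cell with a pandas groupby.

```python
CELL = 10.0  # 10 m grid cell size, in meters

# Assign each return to a 10 m pixel by snapping its coordinates to the grid.
points["cell_x"] = (points["x"] // CELL) * CELL
points["cell_y"] = (points["y"] // CELL) * CELL

# Drop cells with too few returns for the percentiles to be meaningful
# (the cutoff of 30 returns is an illustrative choice).
counts = points.groupby(["cell_x", "cell_y"])["height_above_ground"].transform("size")
dense = points[counts >= 30]

# Relative height percentiles (20th, 50th, 75th, 90th) for every 10 m pixel.
percentiles = (
    dense.groupby(["cell_x", "cell_y"])["height_above_ground"]
    .quantile([0.20, 0.50, 0.75, 0.90])
    .unstack()
)
percentiles.columns = ["RH20", "RH50", "RH75", "RH90"]
print(percentiles.describe())
```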
And there's one other thing I wanted to mention; I don't know if you can see my screen right now. In the NEON LiDAR tutorial, I have provided a file called "Summary of natural disasters at NEON sites," and it has a really cool graphic. Can you see it? Yeah, we're still seeing the screen from the website. Oh, okay, sorry about that. In the Google Drive folder that I shared, you'll find a PDF called "Summary of natural disasters." This is a poster that my colleague Bridget presented at the AGU conference last year, and it has a really nice graphic at the center which gives you a summary of all the NEON sites that were affected by some natural disaster, not necessarily just fire; it could also be a flood or a hurricane. I wanted to point this out so you could do a similar change detection analysis for other sites and other natural disasters as well. So anyway, with that, I guess I'll stop here for the hands-on tutorial part of the session. Maybe you can follow the instructions in the tutorial and then let us know if you have any questions.