We are presenting today on the NEON API using Python, and this is part of an ongoing seminar and webinar series, which we'll provide a little more information on at the end if you're not already aware. My name is Bridget, and I will be leading you through some scripts in Python to work with the API today. Claire, also at NEON, will be helping out with the chat and answering questions as they come up, and we'll also leave time for questions at the end. So Claire, if you want to give a quick introduction. Yeah, hi. I'm a data scientist here at NEON, and I work on a lot of our publicly available code resources and that kind of thing. Like Bridget said, I will just be here monitoring the chat, so if you have questions you can chat me there. Thanks, Claire. I didn't give a real introduction, but I work mostly with the Airborne Observation Platform team on the remote sensing data, and also with Claire on the Data Skills team, where I help develop a lot of our tutorials in Python. So with that, we'll get started. Just a one-minute overview of the NEON project: we're the National Ecological Observatory Network. We hope that since you're here you already know a little bit about it, but we offer over 180 openly available ecological data products, including data collected via observational sampling, instrumented systems, and remote sensing. We'll be talking about how you can access this data. You can access it through the NEON data portal, which is a GUI for manually downloading data: you select the sites and the data products you want, then go through a process to download the data, and you can pick out certain files if that's all you need. NEON also offers an API, which we'll get into in this webinar.
We have extensive tools for working with the API in R, which Claire has helped develop and largely maintains; this is the neonUtilities package. But you can also access and work with the API in Python or any programming language of your choice. So, the API in a nutshell: it stands for application programming interface, and it's basically a way that you can automatically download or explore data, even from different programs, without having to go through the whole web interface. Essentially it allows for constructing URLs that will return information about the NEON data in a machine-readable format, in our case JSON. We encourage you to explore the API by going to the website Claire will link in the chat. That page lets you interactively look at how the API is set up and at the different endpoints; we'll get into that in Python in this webinar as well, but it's a nice overview page that we encourage you to look at. Claire's also going to drop the links for following along with this tutorial in the chat. We have a web page, linked from the original webinar page, and we also have a Markdown version, which I'll show in a minute. If you're following along with the live coding, the Markdown lets you copy each code chunk pretty easily, so if you get lost or fall behind, it's a good way to keep the code handy and paste it in. Lastly, we have a lot of information on working with the API in R as well; if that's your language of choice, there's another tutorial on that, plus a lot of resources on neonUtilities. All right, with that, jumping ahead a little bit, I will switch to the live coding. These are the pages you can follow along with for the live coding section: the NEON Science Resources Learning Hub has all of our tutorials, and hopefully that link is in the chat.
And here's the Markdown page, which is in our GitHub NEONScience/NEON-Data-Skills repository. I'll just scroll down real quick: each of these little gray boxes is a code chunk that we'll be working through, and if you hover over to the right, you can click to copy that cell, or code chunk, to your clipboard and then paste it into your programming interface of choice. I'm going to be demonstrating the code from this workshop in Jupyter Notebooks, which is a nice interactive platform for working with Python (it also works with other programming languages), but you're not required to use it, and I'll show a little demo at the start of how to work with it. It's nice because you can add Markdown comments and annotation as you go, which makes it really nice for sharing; that's actually what our tutorials on the web page are built in as well. To open Jupyter Notebooks, I always start with the Anaconda command prompt. It's already near the top here because I searched for it recently, so I just type in Anaconda and open the prompt; if you've installed Python using conda, you should have this program. I created a folder on my desktop called neon-api-python, so I'm going to do all of my coding in that folder. First I'll change directories, or cd, into that folder on my desktop. You can either set up a folder ahead of time or work locally and move things at the end, but we encourage you to have some sort of local organization. Then to open Jupyter Notebooks, I'll just type jupyter notebook. It may take a second, but it will open a separate web page; it's a little laggy here. It's opened this home page, which should be blank if you don't have anything in that folder already. I made a test notebook in here, so I'll ignore that for now, and I'm going to create a new notebook by clicking on New.
I have Python 3 installed, so I'll open a new Python 3 kernel. Before we dive in, I'll give a little overview of how Jupyter is organized. It looks like a web page, but it's actually running Python in the background, and this command prompt back here, if I Ctrl+C or close out of it, will actually shut down the Python instance I have open. So just be aware, while you're working in Jupyter, not to close that screen in the background until you're ready to be done. Up here I'll rename this notebook "NEON API live coding". If I go back to the home page, we'll see that it was created as an .ipynb notebook file. There are all these options here at the top: under File you can save and so on; under Edit is where you can work with your cells, and each of these little chunks is called a cell. You can also change the cell type: by default it's Code, which means Python code, or you can use Markdown, and I'll show an instance of that during this workshop. Another thing to be aware of is the kernel, which is what's running Python in the background. If you're getting a little behind, or you've run a lot of cells and want to clean up your output and start fresh, you can do Kernel, then Restart & Clear Output, which lets you run each cell again as a fresh instance. And lastly, under Help there's a list of shortcut keys. I'll tell you the shortcuts as I use them, but one of the main ones is Ctrl+Enter, which runs the current cell; otherwise you can hit the Run arrow. So first, we're going to import some required Python packages. Hopefully everyone was able to follow the installation requirements, but there are only a few packages actually needed for this. These are: import os.
You can either separate imports with commas or put each on its own line: requests, which is the Python package that works with APIs specifically; json; and itertools, which is a nice package for helping to iterate through different data structures. Shift+Enter, or Run, should run that cell, and hopefully you won't have any errors here. If you do have an error, you may need to install one of these, and I would suggest importing them one by one. Claire, feel free to stop me if people have issues and we can work through them at a convenient time. Okay, so with the API, first we're just going to define the base URL. Everything on the NEON API uses the same server URL, so we'll define that here; we'll call it server, and it's going to be 'http://data.neonscience.org/api/v0/'. You'll see data.neonscience.org over and over again with NEON, followed by /api/v0. We'll run that, and now we've defined that server variable. The first thing we're going to explore is the sites endpoint of the NEON API, which provides information about all of our field sites across the observatory, spanning the entire continent. We're going to start with one of our sites in Domain 17, which is in California, called TEAK. All of our sites have a four-letter code that's an abbreviation of the full name, so we'll define that and add a comment here for the site. And actually I'm just going to pull this out so you can follow along, hopefully this isn't too confusing, and follow with the Markdown as well. Okay, so next we're going to define the URL for that site. We can combine these together in Python with the plus sign, which just concatenates the strings: 'sites' is the endpoint we're trying to look at, and we append the site code we just defined to build the full URL. And then let's just take, oops, I accidentally converted that cell to Markdown; let me put it back to Code.
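To recap the cells so far in one place, here's a minimal sketch of that URL construction (variable names are mine; the base URL and site code are the ones used in the session):

```python
# Base server URL for the NEON API; every request starts with this.
SERVER = 'http://data.neonscience.org/api/v0/'

# Four-letter site code for Lower Teakettle, Domain 17 (Pacific Southwest).
SITECODE = 'TEAK'

# Concatenate server + endpoint + site code to build the full request URL.
site_url = SERVER + 'sites/' + SITECODE
print(site_url)  # http://data.neonscience.org/api/v0/sites/TEAK
```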
So we'll just see what that URL is. Here you can see we're using the sites endpoint, and then we've appended our site code. Next, we can use the requests package within Python to request that URL. Let's set that to a variable called site_request. The package works by using the get method, and we pass the URL in there. Then we can convert it to a JSON object using .json(). And, oops, it should be requests.get, not request. For those of you who aren't familiar with JSON, it's essentially just a nested data format with key-value pairs, and if you're familiar with Python, it translates well into the dictionary structure. So don't worry too much about the details here; for now you can just think of it as nested storage for a lot of different information, and we'll explore it in a little bit. If you are familiar with dictionaries in Python, we can use the .keys() method to see what all is included in here: site_json.keys() shows us all of the keys in this object. Right here we can see there's just one key, 'data'. That's not to say that's all that's in there; basically, everything in this JSON is stored under that 'data' key. We can go to the next level of the structure using the same setup: let's do site_json['data'] to pull out the data object, and then look at the keys under that. Now we can see there are quite a few other pieces of information stored in here: the site code, site name, description, type, latitude, longitude, domain code, all these different things. There are also things we'll get into in a little bit, like 'releases', which refers to the NEON data releases, an annual process where we provide stable, updated data each year.
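Since the live request needs a network connection, here's a sketch of the request-and-drill-down pattern; the site_json below is a hand-written stand-in with the same top-level shape as the real sites response (the coordinate values are placeholders, not real API output):

```python
import json

# With a live connection, the request would look like:
#   import requests
#   site_request = requests.get(site_url)
#   site_json = site_request.json()
# Below, an illustrative stand-in with the same nesting:
site_json = json.loads('''
{
  "data": {
    "siteCode": "TEAK",
    "siteName": "Lower Teakettle",
    "siteLatitude": 37.0,
    "siteLongitude": -119.0,
    "domainName": "Pacific Southwest",
    "dataProducts": []
  }
}
''')

# Everything is nested under the single top-level 'data' key...
print(list(site_json.keys()))         # ['data']
# ...so we drill down with ordinary dictionary access.
print(list(site_json['data'].keys()))
print(site_json['data']['siteName'])  # Lower Teakettle
```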
Then there's dataProducts, which is all the different data products NEON offers at this site. Again, don't worry too much about the syntax here, but we'll use the itertools package, which helps iterate through different structures, to look at some of the items in here. I'm going to go ahead and copy this over so I don't mess it up; I copied and then used Ctrl+V to paste. This is basically going to show us the items in the site_json data we just looked at: instead of just the keys, it shows us some of the values as well. In the tutorial we suppressed some of the output by only going through the first 12 items, but here I'll look at all of them and hit Enter. You'll see it creates this scroll box because there's quite a bit of information here; we didn't want to put all of it on the tutorial web page, but we can go through some of it now. Looking at each of these keys, we can see details about this site. Like we said, the code is TEAK, which stands for Lower Teakettle. We have all this information about the location, and the domain it's in, the Pacific Southwest. Then we can get into the available data releases: we've had three data releases so far, from 2021 to the most recent one in January 2023, and there are URLs for the API endpoints for each of those releases. And lastly, we have all the available data products for this site. This is where it gets a little long, because it shows all the availability, but we won't dive too much into that here; you can just see that there's a lot of information stored in this sites endpoint. So let's go ahead and look at a single data product in here, and go to the next cell down.
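That itertools display is essentially islice over the dictionary's items; here's a small illustrative example (the site_data dict is a made-up subset, not real API output):

```python
import itertools

# Illustrative subset of the fields found under site_json['data'].
site_data = {
    'siteCode': 'TEAK',
    'siteName': 'Lower Teakettle',
    'domainName': 'Pacific Southwest',
    'stateName': 'California',
    'dataProducts': [],
}

# Preview only the first three key/value pairs instead of dumping everything.
for key, value in itertools.islice(site_data.items(), 3):
    print(key, ':', value)
```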
Don't worry too much about which one I selected here; you could actually pick any of them, but for the sake of this, we'll choose one of the AOP, or Airborne Observation Platform, data products. Here we're going into 'data', and then to this last piece, 'dataProducts'. Again, this is just nesting down into the structure, and then the index -3 means the third from the end, but you could also use 0 or any other valid index. Let's take a look at this one; it happens to be the slope and aspect data product, which is one of our products from the lidar instrument. Here we can see some other information, such as the available months for which this data exists, all of the URLs for each of those available months, and lastly the available releases. AOP is a little unique in that at the end of each release we replace the data with the latest release, so it only has one release, even though we've released for three years. We can also look at every single data product that's available, essentially by looping through all the data products in that nested structure, so let's go ahead and do that with a for loop. Here we're looping through the data's dataProducts, and we'll just use print to display some information, including the data product code and data product title. This is just a nice way to see every single data product that NEON offers. You can scroll through: we have a ton of instrumented data products, we have observational data products, and then the AOP data products, whose codes start with DP1, DP2, or DP3 and a .300-series number, so that's a quick way to isolate those. As a quick aside, for the observational and instrumented data products we do recommend working with the neonUtilities package.
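That product-listing loop boils down to the following; the two-entry list stands in for the much longer dataProducts array the API actually returns:

```python
# Illustrative stand-in for site_json['data']['dataProducts'].
data_products = [
    {'dataProductCode': 'DP1.00001.001',
     'dataProductTitle': '2D wind speed and direction'},
    {'dataProductCode': 'DP3.30015.001',
     'dataProductTitle': 'Ecosystem structure'},
]

# Print every product's code and title, one per line.
for product in data_products:
    print(product['dataProductCode'], '-', product['dataProductTitle'])
```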
That's because neonUtilities already has a lot of built-in functionality for wrangling the data after the fact. You definitely can work with those products in Python, but it's going to involve a lot more pre-processing, so just a little note about that. A lot of people tend to work with the remote sensing data in Python, which is why we're trying to make more Python tools available for it. All right, now that we've explored the sites endpoint and some of the data products, we can start querying the data products and actually diving into what's available for each of them. We looked a little bit at the available months, and here we'll try to look at one of the available products. We'll make a request using the requests package again. I'll go ahead and write this on a separate line, since it's a little dangerous changing things on the fly, but our product URL is going to be the same server we defined before; this time we use the products endpoint, and then the product code. And I think I missed that step, so here's another quick trick: if you do Insert, then Insert Cell Above, you can add another code cell. For this example we'll use the canopy height model, or ecosystem structure, data product, which is DP3.30015.001. You can see that here as well; you can just print it out. The canopy height model is CHM for short. Go ahead and run that. I'm going to comment out a line with Ctrl+/, the comment shortcut, on Windows at least. Let's take a look at this URL. We can see it's printed out, and Jupyter has actually already hyperlinked it, so we could click on this and see the whole JSON in the web page. We'll go ahead and explore it in Python, but I just wanted to show that you can also explore each of these URLs interactively on the web.
Next, we're basically just trying to show all the available months of data for this data product at our site, TEAK. This is a little bit long, but it's mostly just a loop to print out some relevant information, similar to some of the earlier loops. Here, if the product code matches the code we just set, in this case the CHM code, we'll print the available months, which are saved under the 'availableMonths' key; you don't have to use the parentheses here, but you can. Then let's also print the URLs for each of those months. Oops. We'll do that in a loop just for display purposes, so it's a nested loop; a little bit messy, but it should give us a nice output. Here we can see we have five available months for this data product at TEAK, from 2013 to 2021, and then the URLs for each of those available months. Again, I'm just going to click on this last one to show you what it looks like. It's essentially a long JSON, a somewhat convoluted structure that contains all of the data files for that available data in 2021, so each of these is a file associated with that data product; AOP tends to have a higher number of larger-volume files. Bridget, we can suddenly barely hear you, or at least I can barely hear you. Thanks, let me try to sort that out. Okay, that was much better. Okay, I'll speak up, thanks. And please interrupt if anyone's struggling; it looks like at least one other person was. So we've got an idea of all the data that are available for a given AOP site, for example. Next we can look more into this specific data product, and I'll skip ahead just a few cells in the interest of time, but let's take a look at the product abstract, which is basically the description of what this product contains. So I'm just going to do a few print statements.
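Before moving on, the availability loop we just ran reduces to something like this; the product dict is an illustrative stand-in for one dataProducts entry, with only two of the five months shown:

```python
PRODUCTCODE = 'DP3.30015.001'  # Ecosystem structure (canopy height model)
SITECODE = 'TEAK'

# Illustrative stand-in for one entry of the sites response's dataProducts.
product = {
    'dataProductCode': 'DP3.30015.001',
    'availableMonths': ['2018-06', '2021-06'],
    'availableDataUrls': [
        'http://data.neonscience.org/api/v0/data/DP3.30015.001/TEAK/2018-06',
        'http://data.neonscience.org/api/v0/data/DP3.30015.001/TEAK/2021-06',
    ],
}

# Only report availability for the product we care about.
if product['dataProductCode'] == PRODUCTCODE:
    print('Available months:', product['availableMonths'])
    # zip pairs each month with its data-endpoint URL for display.
    for month, url in zip(product['availableMonths'],
                          product['availableDataUrls']):
        print(month, url)
```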
I'll do these in separate cells. Oops, actually, I can't quite skip ahead here. So, we'll make a request to this products endpoint; actually, sorry, we already built that, so let's use this product URL from up here. We'll just make a JSON out of that URL. I'm diverging a little bit from what's on the tutorial page here, but you can follow along either way. So I just called .json() on that URL. Then we'll take a look at the product name. And of course, when you change things it doesn't always work; oh, it's because I didn't actually make the request first. So product_request = requests.get, and then we had already defined that product URL here. If you're following along with the tutorial, all I did is define this endpoint with the product code as a variable called product_url, and then I'm making a request to get it. Then we can convert that to JSON, similar to how we did before. Let's try that. Okay, so here we can see the data product name is Ecosystem structure. Then let's also look at the product abstract; I'll just copy this and use the abstract key instead. Here's a nice little blurb that you can read through. I won't go through the whole thing here, but it basically describes how this data product is generated and summarizes that. The canopy height model is generated from our lidar data, and it's essentially the difference between the returns that bounce off the top of the canopy and the ground returns. We create a digital terrain model, which is essentially what you'd get if you stripped all the trees off the forest, leaving just the terrain underneath. The digital surface model is generated as if, for example, you put a blanket over the entire canopy: it represents that surface, which includes all the vegetation. If you difference those two models, you get the canopy height model, and that's essentially how we generate the ecosystem structure data product.
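Pulling those pieces together, the products endpoint takes just the product code; the request itself is left as comments since it needs a live connection, and the 'productName'/'productAbstract' keys are the ones read off in the session:

```python
SERVER = 'http://data.neonscience.org/api/v0/'
PRODUCTCODE = 'DP3.30015.001'  # Ecosystem structure

# The products endpoint takes just the product code in the path.
product_url = SERVER + 'products/' + PRODUCTCODE
print(product_url)

# With a live connection:
#   import requests
#   product_request = requests.get(product_url)
#   product_json = product_request.json()
#   print(product_json['data']['productName'])      # Ecosystem structure
#   print(product_json['data']['productAbstract'])  # long description blurb
```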
The data product JSON that we just created also has some nested variables that we looked at a little bit, but we'll dive into those a little more. Let's look at all the keys, or all of the substructures, stored in this product JSON; you'll see we're repeating a lot of the same types of code. Here we see there's information about the site code, the available months, the available URLs, and then the available releases. Similar to before, we can loop through these. Before, we looped through each of the data product codes and got the associated data product name; here we're doing the same thing, except we're looping through all the available months, and then printing the month and the associated URL. We're pulling out all the site codes but only matching the site code we defined, and then here's a little trick to zip together the months and the URLs so that we can print them out nicely: the available months zipped with the available data URLs. Since zip gives us paired objects, we print each one out separately. Then we'll just extract a single data URL. In this case, let's look at the 2018 TEAK data, pulling out the URL associated with the month 2018-06. Here's all the available data we have for TEAK for that data product and for that month and year. I'll just click on this so we can see what it looks like. Before, we saw all the files; now we see the files and also where they're stored. NEON moved its storage to Google Cloud Storage, so everything is stored under storage.googleapis.com and then some nested folder structure. So actually, if you copy one of these URLs and open it in a new web page, it will automatically download that data. But you can also do that directly through Python.
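The data-endpoint URLs we've been clicking follow a fixed pattern: product code, site code, then month in the path. Here's a minimal sketch of building one (the live request is left as comments since it needs a network connection):

```python
SERVER = 'http://data.neonscience.org/api/v0/'
PRODUCTCODE = 'DP3.30015.001'
SITECODE = 'TEAK'
MONTH = '2018-06'

# data endpoint: /data/{productCode}/{siteCode}/{month}
data_url = SERVER + 'data/' + PRODUCTCODE + '/' + SITECODE + '/' + MONTH
print(data_url)

# With a live connection:
#   import requests
#   data_request = requests.get(data_url)
#   data_json = data_request.json()
#   data_json['data'].keys()  # productCode, siteCode, month, release,
#                             # packages, files
```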
So that's just showing you how you would do it on the web. Okay, back to our notebook. Let's print out the URL that we defined here; that was actually the one I just clicked on, so we'll go there again. Okay, so now we know all the available months, or years, of data for that data product at that site, and next we're going to explore how to find those file locations directly from Python. I gave you a quick preview of how you can find where the data are stored from the web page, but we'll go ahead and do it in Python now. I'm just going to copy this in. So now we're making a request for the available data in 2018-06. Again, we make the request with our endpoint; if we copy this, it should be the same as the URL we just defined, and I'll show you that. Yes, that's the same URL, so I'll just call it data_url for clarity. A nice handy trick with Jupyter is that if you press Esc and then D, D, it will delete a cell, so if you want to test something, that's a nice way to clean up. Let's make the request for that data_url we just defined, and then convert it to a JSON. Right. Again, we'll look at all of the available keys for the JSON dictionary structure we just made. Here we can see, for this URL, we have information about the product code, the site code, the month (that's the publish month), the release (for AOP, we saw it will just be RELEASE-2023), and then packages and files. The packages aren't as relevant to AOP data, so I won't get into them here, but the files are really what we're interested in, in order to download something. So let's take a look at one of the files by looping through all the keys of the first file that's pulled up. Python has zero-based indexing, so that's all we're doing here: pulling out the first file, then printing each key, and '\t' is just a tab.
That's just so we can see it a little more easily. We're going to pull out 'data' and then 'files'; under 'data', the last object was the files. Let's take a look at the first file, again indexed by 0, and pull out each key. Okay. Here we can see that each file has information about the name of the file and its size, and then three fields that are essentially just checksum information, so that when you download, you can compare against the file's checksum. Don't worry too much about that; CRC32C is what's used for Google Cloud Storage. And then there's the URL of that file. So again, like I showed, you can download just by clicking on that URL; you could also download this first file by clicking this link, and you can see it just downloads to local storage. That's pretty powerful: with the API, and I know this looks a little convoluted, we can drill all the way down to where a file is stored and then automatically download it. So, let's take a look at some of the different files stored under this data product. We know that all the files are saved under this 'files' object, so let's do another quick loop through the files. In the tutorial I just sliced out the first 10 files; I'll go ahead and look at all the files here. Again, the slicing was just to suppress the output so we didn't take up a ton of space on the tutorial page. Let's look at the names of all the files. Here we can see they follow the standard naming structure, which you probably aren't super familiar with if you haven't worked with NEON data a lot: they all start with NEON, then the domain (in this case D17, which includes a lot of the California sites), then TEAK, the four-letter site code, then DP plus the data product level number, and we are looking at a level 3 data product.
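Each entry under 'files' is a small dictionary; here's an illustrative stand-in record, looped over the way we just did. The name, size, checksum, and URL values are all placeholders, and the exact checksum field names may differ from the real response:

```python
# Illustrative stand-in for one entry of data_json['data']['files'].
sample_file = {
    'name': 'NEON_D17_TEAK_DP3_317000_4141000_CHM.tif',  # hypothetical tile
    'size': 4200000,                                     # bytes (placeholder)
    'crc32c': '00000000',                                # checksum used for
                                                         # Google Cloud Storage
    'url': 'https://storage.googleapis.com/...',         # truncated placeholder
}

# Print each key and value, tab-separated, like the walkthrough does.
for key in sample_file.keys():
    print(key, '\t', sample_file[key])
```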
Then there are the UTM coordinates of the bottom-left, or I guess the southwest, corner of each of these data tiles; this is the UTM easting and northing. And lastly, there's the name of the product in the file: CHM stands for canopy height model. With our canopy height model data we have a lot of associated metadata, and all of these other files with different extensions are shapefiles (there are four different shapefile extensions), plus KML files we provide along with them, and these essentially just define the boundary of each of the tiles. Apologies that the names don't match exactly here; we are going back in and renaming some of the metadata with the standard convention, but as long as the UTM coordinates match up, those are the shapefiles associated with each of the tiles. For this exercise, just to wrap it up, we're going to download a single canopy height model GeoTIFF tile and then plot it in Python so you can see what it looks like. Next, I'm just going to print out only the .tif files, skipping over all of the metadata shapefiles. We can do that again with a for loop, using an if statement to filter out only the .tif data. I don't want to print out all of the files because there's a ton of them, and we already looked at all of them, so I'll just select the first 50. This time it will only pull up the CHM .tif files: all this is doing is checking, for each file name, whether the string 'CHM.tif' is included in that name, and if so, printing the name of the file and the URL of that file. It's still looping through all the files, including the metadata files, which is why it only prints out a subset, the canopy height model tiles.
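That substring filter, on its own, looks like this; the file names are hypothetical but follow the naming convention just described:

```python
# Illustrative file names following the NEON_D{domain}_{site}_... convention.
file_names = [
    'NEON_D17_TEAK_DP3_317000_4141000_CHM.tif',
    'NEON_D17_TEAK_DP3_317000_4141000_boundary.shp',
    'NEON_D17_TEAK_DP3_317000_4141000_boundary.kml',
    'NEON_D17_TEAK_DP3_318000_4141000_CHM.tif',
]

# Keep only the canopy height model GeoTIFF tiles.
chm_tifs = [name for name in file_names if 'CHM.tif' in name]
for name in chm_tifs:
    print(name)
```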
So again, to download the data you can just directly click on the link, and I'll do this first one: 317000_4141000. I'll just open it, oops, didn't mean to open it, but if you have a program that opens .tifs, you can actually see what it looks like there; I'll show where it was downloaded. I had clicked a bunch of these, so I have a few different canopy height models that I downloaded; looks like I kept clicking the same one. Here in my Downloads folder I can see that canopy height model, and if you use ArcGIS or QGIS you can go ahead and open it there, and this is that model of all the canopy heights over that area within the site. Exactly. If you have rasterio installed in your Python environment, that's a nice Python package for working with raster data, and we'll go ahead and use it to plot the data directly in Python. Sometimes the Python GDAL-based packages are a little finicky, so if you have or had issues installing this, we're happy to help at a later time. So we'll import rasterio as rio, and then we'll import some plotting packages as well: matplotlib.pyplot is the standard one, and rasterio also has its own plotting module, built on top of matplotlib. For this next section, I was basically assuming you didn't move the file anywhere, so if you just saved it in your Downloads and you're on Windows, this will let you pull directly from your Downloads folder. We always recommend keeping things organized, though, so I'll go ahead and move this file, using Show in Folder (it's a little sluggish), into the local folder we've been working in, my working directory. Then we'll simplify the code a little bit: I'll just say chm_tif equals, and I can even just copy this path directly.
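Putting the download-and-plot steps together, here's a minimal sketch. The tile name is hypothetical, and it assumes rasterio and matplotlib are installed, so everything is wrapped in a try block to degrade gracefully if either assumption fails:

```python
# Hypothetical tile name; substitute the file you actually downloaded.
chm_path = './NEON_D17_TEAK_DP3_317000_4141000_CHM.tif'

try:
    import rasterio as rio
    from rasterio.plot import show   # rasterio's matplotlib-based plotter
    import matplotlib.pyplot as plt

    with rio.open(chm_path) as chm:  # DatasetReader, opened in read mode
        fig, ax = plt.subplots()
        # Greens colormap: darker green = taller canopy, white/light = ground.
        show(chm, ax=ax, cmap='Greens', title='TEAK Canopy Height Model')
        ax.ticklabel_format(style='plain')  # no scientific notation on axes
        plt.show()
except Exception:
    # rasterio/matplotlib not installed, or the tile hasn't been downloaded.
    print('Could not open', chm_path)
```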
This is where I've saved the data, or, if it's in your working directory, you can use ./ for a relative path. Let's just make sure that looks right. Okay, so we're in our local directory and it's finding that file. Then we can use rio.open to read it into a rasterio variable that we'll call chm. We can take a look at chm: it's a DatasetReader, and it's opened here in read mode; that's all just through the rasterio package. And then lastly, you can just say show(chm), because we pulled in that show function, and this gives an overview of what that tile looks like. I don't have the scale bar on here, but the light green is the trees and the dark blue is ground. We'll go ahead and make this a little nicer using some plot settings, which I'm just going to copy in, but all this is doing is essentially formatting different things: showing the title, making the y-axis not use scientific notation, things like that, so it just looks a little nicer. And I used the Greens colormap, so the darker green is the taller trees and the light green, or even white, is ground. So with that, hopefully you've got a basic understanding of how the NEON API is structured. I'm just going to wrap up with a couple of things and then we can take questions. I do want to encourage everyone to take the survey, so I'll pull this up and Claire will drop the link. If you have to leave right on the hour, we just ask that you quickly provide feedback, because we're always trying to improve our webinars and make them as useful as possible for the scientific community. Claire's going to drop that in the chat. And just before we open up to questions, I wanted to point out that we have some upcoming seminars and webinars; the next one is in March, on NEON soils.
And then our next webinar will be associated with that, on working with the soil sensor data. So please, we encourage you to follow along, and we thank you for joining us.