Hello, everyone. My name is Dan Buscombe. I'm currently a consultant contractor for the US Geological Survey, Coastal and Marine Geology, in Santa Cruz, and formerly a professor at Northern Arizona University. Prior to that I was a geologist at the Grand Canyon Monitoring and Research Center, which is a USGS office in Flagstaff, Arizona. I got into machine learning as an early adopter of methods being used for seafloor characterization, starting with bathymetry and backscatter data back in 2012. From there I've developed workflows using various machine learning algorithms for various problems in geology and ecology, using imagery primarily from drones and airplanes, some work with satellites, and a lot of work with submarine and subaqueous platforms and subaqueous data as well. So I don't have a background in computer science; my background is in geography, geology, and oceanography. I imagine most of you have similar backgrounds and are coming at this from a similar perspective to me, hopefully. Obviously the workflows you're going to see today are very specific: specific to a particular dataset, written by myself without too many collaborators so far, and specific in the model and the various training strategies we use with it. But along the way I'm going to highlight areas where there's room for expansion, extension, and replacement of various pieces with others. So even though the workflow is very specific to the data and to the particular choices I've made, hopefully I'll impart a few general concepts that you can take home and use to start developing your own workflows, perhaps using some of these materials as a starting point. It's kind of difficult figuring out how many people are actually going to show up to this; we currently have 44, I guess. I'll give it another couple of minutes and then get going. But if anyone has any burning questions or big issues with accessing the materials or anything like that, now would be a good time to unmute yourself and let me know. If you're all silent, I'll assume you managed to get onto the website that I made for this and to navigate to the Google Colab from there. Okay, I'm going to go ahead and share my screen, and hopefully you see my web browser. I've got one eye on my participants list because I need to let people in as they arrive. All right, welcome everyone. I gave the spiel of who I am: I'm Dan Buscombe, I'm self-employed nowadays but I contract for the US Geological Survey, and I'm formerly a professor at NAU. My background is not computer science or particularly remote sensing, but I have gravitated towards both of those topics over the course of my career. In particular, I'm very engaged with developing workflows for applying state-of-the-art or established machine learning algorithms to the science problems I'm working on, and I've worked both above water and below water.
A lot of my machine learning work started out in the fluvial world, looking at bathymetry and backscatter data and using machine learning to convert those into useful products for us. I'm now in the process of translating some of those workflows, ideas, and applications over to subaerial remote sensing. My particular interest is coastal remote sensing, in particular hurricane impacts. A lot of the workflow I'm going to show you today has been researched by myself, but it uses models that weren't originally developed by me. They were developed by computer scientists working at the level of low-level algorithm development, and what you're going to see today is an example of how you might take a sophisticated machine learning algorithm or workflow and adapt it towards your own purposes. We've got 46 people here; I'll keep admitting people as they show up. Everyone should be muted at this point, and if you're not muted, please mute yourself. I will field questions as we go, and you're welcome to ask a question at any time. Please do ask questions, because otherwise it's going to feel really weird for me speaking to my computer for the next two hours. I really hope you'll engage with me and ask questions, so we can have a stimulating conversation as well as a demonstration of a particular workflow. All right; one more person to admit. Yes, we are being recorded, so for that reason, if you can keep your questions concise and to the point, that would be ideal for anyone listening to this afterwards. I don't actually know exactly what CSDMS has planned for the recording, but I believe it will be edited and posted on the CSDMS website, so it will be made public. (That's correct? Yes. Thank you.) The other thing I'd like to mention is that I've made all of these materials public through this website. I haven't yet advertised the website you're looking at, partly because I wanted the CSDMS folks to get first dibs on this stuff, but also because I really want your feedback. I'm hoping to use what I learn today and tomorrow to keep evolving this workflow and this material so it's generally useful for folks, and as soon as I've got it to a point I'm happy with, I'm going to broadcast it more widely. So these materials, and more, will be available through these means. OK. So hopefully you got to this point and saw this website; you saw the email with the instructions, so this isn't going to be too much of a surprise. I want to start by thanking the USGS Coastal/Marine Hazards and Resources Program and the USGS CDI community for supporting me, and also CSDMS, obviously, for inviting me. I actually did a CSDMS clinic last year, and it was a lot of fun. I think a lot of people got something out of it, and so they asked me back. This time I asked for two slots because I wanted to talk in a little more detail about some of these models and have more time to answer questions and to talk through people's specific ideas or concerns. So here we are. Hopefully you're here because you signed up for both sessions, but obviously these materials are available online regardless.
So if you don't feel like you need to listen to me talk, you still have access to these materials. This is for anybody, really, who's interested: anyone who's a professional scientist, especially working in geology and ecology, who has need of image segmentation for whatever reason. I've researched and put together a workflow that seems to work fairly well for generic segmentation problems, at least the ones I've been working on. The particular workflow we're going to see today harnesses a model called a U-Net, which is a deep learning image segmentation model, and I'll talk about what that means. I've found that this particular U-Net, as I've adapted it, works fairly well for lots of different types of image segmentation tasks. That's why I'm confident presenting this stuff to such a diverse group: I think a lot of folks might find utility in this approach. And I feel I can say that because I've used this particular model for big-scale stuff, segmenting lakes and seas and things like that from large-format, relatively low-spatial-resolution imagery like satellite imagery, and also all the way down to very, very close-range photography. So there's something about this particular model that I think works fairly well for the natural sciences in particular, and I'll talk a little about why I think that's the case. That's not to say, obviously, that this is a generic workflow that would transfer perfectly to anybody and anybody's problems or tasks. But still, I think there's something in this model that a number of different folks might find useful. (I'm trying to admit one particular person here and the button isn't working. Here we go.) Okay, we're up to 49 people now. Apparently there are still another 20 people to show up, but I'm guessing we're converging to a point where there are no more people. So, the prerequisites. This is a fairly involved workflow in some respects, because it has a lot of Python code in it. It needs a lot of Python code in order to work, because a lot of that code is wrangling the data, and that's a big part of what I want to present here. Machine learning is all about the data. It's not so much about the model, even though a lot of models get promoted as being generic models, like I just did. It really is about understanding your data and having enough programming ability to coerce your data into a format that's amenable to a machine learning workflow, and in particular deep learning. I'll talk a little about a few of those considerations for taking these materials and applying them to your own data, and hopefully you can ask me a lot of questions about that as well. I'm hoping you all have at least some familiarity and experience with Python. As I say, this is a pretty Python-heavy workflow, and if you get lost along the way, it might be because a particular Pythonic thing hasn't been understood. If you can communicate that to me during the workshop, I'd appreciate it, because as I say, I'm trying to evolve this as we go. This is the first time I've presented these materials, so any feedback you have as we go would be great. What else did I want to say at this juncture?
So we're going to be using Jupyter, which is a convenient way for us to go through these materials, via Google Colab, which is essentially a free-to-use online cloud computer that hosts Jupyter notebooks. Jupyter notebooks contain both Python code and rich text, video, and other media, in order to wrap your code in a way that's easy to teach and collaborate with. But if you're doing deep learning on your own data, especially on very large datasets, the onus is on you to adapt these workflows to a desktop computer. I've deliberately steered away from that for this CSDMS clinic, because installing Python, updating various libraries, and keeping environments clean, useful, and unbroken is actually fairly difficult, especially if you're starting out. That can be a significant barrier, particularly at the entry level of deep learning. So Google Colab is going to be a really useful thing for us. All right, let's get on. I want to go straight to this first lesson, which is what we're going to spend the rest of our time on. I did provide a "getting started with TensorFlow" notebook, which is what you're looking at on this screen. Hopefully those of you with no TensorFlow experience managed to go through it. That notebook is very similar to a number of tutorials you might find online, and I linked to a few of those on the course website, so if you're struggling with some of the concepts and syntax, I'd encourage you to go back to that tutorial or to one of the linked ones. But I'm not going to start with it. I'm going to assume you have a basic grip of TensorFlow, and we can start here, which is day one: constructing a generic image segmentation model. Okay, anyone have any burning questions before I begin? You're just eager to get on with it. All right. Before we get started, I'm going to go through what we are and aren't going to be doing. We're going to work through a particular workflow I've put together and researched that works well for segmentation in general, and especially well for binary segmentation. What I mean by binary segmentation is segmenting, or classifying, every pixel of your image into just two classes: the class you're interested in, which for this first part is vegetation, against the background of everything else. That isn't necessarily the most state-of-the-art way to do image segmentation, in particular for tasks that require multiple classes. That type of segmentation is called multi-class segmentation, where we have more than two categories, and obviously that's where the state of the art is. But I wanted to start with a simpler workflow, partly because of the introductory nature of this clinic, but also because I wanted to discuss and impart a particular philosophy that I have: binary segmentations are actually a great way to start any type of segmentation problem.
That's because you don't always know what classes you need, what classes are going to work, and what classes are actually identifiable, and consistently identifiable, by a human. So it's better to start off treating segmentation problems as a series of binary decisions. What I mean by that is: if you have a particular dataset and a particular need, say you need to segment out a particular thing in your image because you need to enumerate it, track it over time, or whatever, then the other classes you don't care about may not need to be segmented explicitly. They might all fall into a background class you really don't care about. Sometimes that's advantageous, because you have one particular class you're interested in that's very distinguishable against everything else. But you may run into situations where the thing you're interested in could be confused with something else, and you'd need to determine at what point those two things are treated separately or together by the model. If that's not apparent at the moment, hopefully it becomes apparent as we go through this; I'll keep reiterating this point if I remember. So in this first part, we're going to use a dataset that's public. It's out there; it's called Aeroscapes. It's pretty good in that it's fairly relevant to us in a couple of ways. It's not an ecological or geological dataset by any means, but it does have a lot of the land covers and land classes you see especially from UAV platforms. It's also public and freely available, so it doesn't interfere with anyone's research program, and it's a good dataset to play with because of the number of images and the number of classes. Later on, we're going to use another, similar dataset called the Semantic Drone Dataset, which is another publicly available dataset used by folks to develop and benchmark their methods. It's not necessarily used for any specific application, but for our purposes it has a lot of the same classes, and it's a good case study of how we might transfer what we've learned from one particular dataset, in terms of a segmentation task, over to another dataset. That principle, called transfer learning, is something you may have come across, and I want to explain and demonstrate it at the end. So for the first half hour or forty minutes, we'll go through the basics of how to set up a model, how to build a model, how to train a model, and how to test the model, and for the last part, we'll take what we learned from that model and apply it to a different dataset for a similar task. If it wasn't already apparent to everybody: there's no need to install Python on your computer. We're working entirely through the notebook you see on the screen today. But if you did need to install Python on your computer to do this, I'm available at the end to troubleshoot some of that if you're having issues. What else do I need to say at this juncture? I think that's covered.
The main thing I'm trying to get across here is that even though there are many, many tutorials about TensorFlow and toy examples on the web, there are relatively few working, nuts-and-bolts examples of how you might get one of these models to work on a real-world, messy dataset. And that's obviously what we encounter more commonly in remote sensing, especially for geological and ecological applications. So that's really what I've focused on. This workflow doesn't give you a complete understanding of the deep learning methods employed. It talks about some of these things superficially, but the onus is on you to research the models themselves. What I'm focused on is demonstrating what's possible, demonstrating a particular workflow with a model I find to be quite generic, and hopefully what you get out of this is a means to take these workflows and adapt them for your own particular purposes and your own data. All right, I don't see any more people in the waiting room, so I'm going to minimize that. If anyone has a question at any point, please ask; I would love to hear your voice, I really would. So here's the general plan of what we're going to do. We're going to import the libraries we need; we're going to use the Aeroscapes dataset; we're going to look at those data in a few different ways and make some plots; and we're going to set up a model to find all of the pixels in the image associated with vegetation, which is just one class in our multi-class dataset. Then, at the end, we're going to transfer that model to a different dataset, start learning on that second dataset, and demonstrate how transfer learning could be an extremely useful thing for you. All right, here we go. Hopefully you're following along, and as I say, feel free to ask questions. I'm also going to stop at points along the way, especially during model training, and open the floor for discussion. For now, I'm going to run this first cell. To run a cell, you hit Shift+Enter, and if you keep hitting Shift+Enter, it automatically advances to the next cell. This first piece of code is just for seeing how much hardware you have. The notebook is actually set up in such a way that the GPU should already be enabled. If you go into Runtime and then, sorry, not Manage sessions, Runtime and then Change runtime type, you'll see this menu, where we've got the GPU selected, because we're actually going to be training a model on a GPU hosted on this cloud computer. We could also use a TPU for this. If we selected None, all of the processing would happen on the CPU, and that would take an order of magnitude more time, essentially. Most of you, I imagine, have the standard Google Colab account. I actually have a Pro account, which is why I get an option here for a different runtime, but you should just see Standard there, and that's what you should have.
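In case it helps to see it spelled out, the gist of that hardware-probing cell is only a couple of lines. This is generic TensorFlow/Colab boilerplate rather than the notebook's exact code:

```python
import tensorflow as tf

# An empty list here means the runtime type is still set to CPU-only
print(tf.config.list_physical_devices('GPU'))
```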
Under the hood, it's a Linux, Unix-based computer running all this behind the scenes, so we have access to Unix-style commands like nvidia-smi, which tells us what GPU we have. I have a Tesla P100; you probably have something a little smaller. It also tells you how much RAM we have. I've got 13.7 gigabytes of RAM; I don't know how much you have, but whatever you have should be sufficient. Okay, so this next part. The Aeroscapes data are actually available from the dataset's own GitHub page, which I've linked to here, and they make the data available via a Google Drive link. This little bit of code up here is not too important. It just allows us to download a file from a Google Drive: it extracts a particular string, the file ID that's in the URL of the Google Drive link, and uses that to download the data. What's more important is what the data is. The dataset consists of 3,269 images, each 720 by 720 by 3. They're actually fairly small for UAV imagery; you're probably working with larger images. The Semantic Drone Dataset images that we'll look at later are a larger format, probably 12 to 14 megapixels, something like that. The classes we have are generic classes. I didn't want to get too specific for the purposes of this workshop, because you're obviously coming at this from different places, so I chose a dataset with a mixture of natural and unnatural classes. For the most part, the scenes are natural; they're mostly vegetation and background, essentially. Background is, I guess, everything that's not anthropogenic or vegetation in this case. Then there's sky, and a lot of other anthropogenic classes like animals, people, and roads. So we're just downloading this file here, which shouldn't take too long, and then we're using a couple of Unix-style commands to interrogate our data. The first, tar, is for taking a compressed tarball, that tar.gz, and extracting it, much the same way as you'd extract a zip file. Then ls is for listing what's in the file system, and I'm piping that to another Unix command, wc -l, which is basically "word count, lines". It essentially just tells me how many images I have in that particular folder. My red-green-blue photographs are contained in the aeroscapes JPEGImages folder. If we go over to this tab here (I hope you can see my mouse), you'll see our files; you may need to hit refresh. You can go into aeroscapes and see what the subfolders are; we're just going to be using JPEGImages and SegmentationClass. If you're not familiar with Colab: there are a lot of images in here, so it might take a while, but you can actually double-click on an individual image and have a look at it. Now, this is a label image, and it looks completely black because the labels are encoded as integers. We only have something like a dozen classes, so the numbers in there are all small, but most image viewers expect an 8-bit image, with values distributed between 0 and 255, which is why you can barely see it.
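If you want the download-and-unpack step outside this notebook, it boils down to something like the following. The notebook uses its own small helper to pull the file ID out of the Drive URL; the widely used gdown package, shown here, does the same job (the file ID and archive name below are placeholders):

```python
import gdown  # assumption: a stand-in for the notebook's own Drive-download helper

gdown.download("https://drive.google.com/uc?id=FILE_ID",  # FILE_ID is a placeholder
               "aeroscapes.tar.gz", quiet=False)

# In Colab, shell commands run from a cell with a leading '!'
!tar -xzf aeroscapes.tar.gz
!ls aeroscapes/JPEGImages | wc -l    # how many images did we unpack?
```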
Back to that dark label image: you can just about see it if you really squint. But if you look at the actual images, they look like actual photographs with their full dynamic range, like so. So you can tell the sort of imagery we're working with here. It may not be a landscape by any particular technical definition, but we are looking at landscapes and land covers. All right, back down to here. You might be at a different stage than me, but I'm going to keep working through in this way. This is just a piece of code that gives me a list of those images (apologies, the view keeps bouncing back). I've made a list using glob, which is a function for pattern matching: looking at strings and seeing which fit a particular pattern. The pattern I'm imposing here is anything with a .jpg extension for the images and anything with a .png extension for the labels. That comes out as a Python list, so I can use sorted() to sort those images. Sorting is important, because later on we're going to pair these two lists together. Okay. I like to use colormaps that mean something to me. By colormap I mean this, down here: I've got my background image grayed out, and then a semi-transparent overlay on top of that, color-coded per pixel according to class. And I like to use colors that mean something to me: light blue for sky, green for vegetation, gray for roads. To do that, I use ListedColormap. It takes a list of labels and a list of associated colors, and makes me a colormap I can then use within my matplotlib commands, such as colorbar and the imshow command. You'll see that I deliberately used a mixture of the built-in single-letter colors listed here (red, green, black, cyan, blue, magenta, and yellow; similar to MATLAB's color letters), but you can also give it HTML colors. I've provided a link here, and also a link to several places where you might look up different colors. So I'll go ahead and load these in. What it's doing here is just recreating this image. I chose this particular image, image 1000, because it has a few of those classes and shows something interesting, but feel free to change that number and explore a few others. Here, raw is my image and label is my label. (And I just got kicked off by the runtime, odd. That's, I guess, one downside of using Google Colab: they really can kick you off at any time.) Anyway, what I'm showing you here is the image plotted with a grayscale colormap just to make it gray, and then my own colormap defined up here, with an alpha of 0.5, which says plot this with 50% transparency. Then I have to give it my minimum and maximum, because not all classes may be represented in every image, but I want to ensure every color scale is scaled exactly the same way, so my colorbar and my colors show as they should.
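Stripped to its essentials, the listing and plotting code amounts to something like this (the color list is abbreviated and illustrative, one entry per class in class order):

```python
from glob import glob
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from PIL import Image

# Sorted lists, so image N pairs up with label N later on
images = sorted(glob('aeroscapes/JPEGImages/*.jpg'))
labels = sorted(glob('aeroscapes/SegmentationClass/*.png'))

# Colors that mean something to me: a mix of built-in letters and HTML colors
class_colors = ['k', 'r', 'm', 'b', 'c', 'y', '#966414',
                '#808080', '#d2b48c', 'g', '#bebebe', '#87ceeb']
cmap = ListedColormap(class_colors)

raw = np.array(Image.open(images[1000]))
label = np.array(Image.open(labels[1000]))

plt.imshow(raw.mean(axis=-1), cmap='gray')      # gray out the background image
plt.imshow(label, cmap=cmap, alpha=0.5,         # 50%-transparent class overlay
           vmin=0, vmax=len(class_colors) - 1)  # same color scaling for every image
plt.colorbar()
plt.show()
```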
And what I've done here is exactly the same, but this time I've put it inside a for loop and I'm giving it a few different indices of images within my set. So instead of just image 1000 as above, I'm cycling through several. (And of course my runtime got disconnected, so I have to go back to the start. This is one way Google Colab can be frustrating, but it's great for the purposes of teaching. So here I'm just untarring this again. It doesn't actually matter for the tar command; it will just overwrite everything. If you were using unzip here, it would prompt you to ask whether you want to overwrite, but because it's tar, it doesn't matter. I'm only going through this again because I got disconnected from my runtime; if you're in any doubt what that means, check up in this corner, which should tell you whether you're connected or not. Hopefully I don't get disconnected again.) So here it's cycling through a bunch of different images and making a plot. There are a few different images I cherry-picked because they show different things. This is an image where the drone is just over the sea, and there's no "sea" category, so that goes into background, or "other". You see a lot of things like this: multi-class segmentation problems are actually very difficult to label by hand if you're trying to classify every single thing, so often, if your classification scheme doesn't have the thing you're looking at, it goes into a separate class called background or other. I'll talk a little tomorrow about what I think of that practice. But you can see that, for the most part, the scene is not "other": we've got large areas of sky, large areas of vegetation, big swaths of road, and then other little bits and pieces. "Construction" here means building. "Obstacle" is difficult to interpret; it's another class in this particular dataset that's a little ambiguous. For example, this is an obstacle: it's a door going into somebody's back garden, I think. If you look at these data more closely, you'll get a sense of how well each class is represented in the dataset overall. I can tell you that, for example, boats, drones, cars, bikes, and people are relatively rare in this dataset compared to sky, road, vegetation, construction, background, et cetera. So it's really those classes that we're going to focus on as we go through this workflow. Okay. So now we have the dataset loaded. In this particular case, we had a Google Drive link and the data was all available to us; you might have to get your data into some format that's amenable to these algorithms yourself. While the model is training later on, I'll talk a little about the size of imagery and how that relates to the size of features, and if I don't say enough about that, please chime in and ask me questions. Okay. For the purposes of this lesson, we're going to focus on just vegetation. But this is an opportunity for you: you could swap vegetation for any class you might be more interested in, and you might want to play around with some of these classes.
I've specifically chosen vegetation as the class of interest for this particular clinic because, A, this is a clinic attended by ecologists, and vegetation is a pretty generic ecological class. But you might be more interested in other things, so I'd strongly encourage you to play around with these workflows for different classes; you'll definitely see a different response in how well the model works for different classes. The second reason I used vegetation is that vegetation classes are also represented in my other dataset, the Semantic Drone Dataset, which we're going to transfer this model to later on, so I wanted to make sure I chose a class that's represented in both. What this piece of code here is really doing is just finding the number associated with that class. The classes are arranged in a list of strings, each string being an individual class, and they're ordered in a specific way. When the model uses a class, it uses an integer representation of that class: all of our labels are images composed of integers, and in our particular case, all of the number nines in our label dataset refer to pixels that are vegetation, and so forth. Okay, this is an important part here. Training neural networks (I know we haven't got to neural networks or training yet) often requires feeding the model images in small batches. There are a couple of reasons why it's done that way. Partly, most deep learning model training happens on GPUs, not CPUs, and GPUs don't have as much memory: your RAM is typically much larger than the onboard memory on a GPU. The reason GPUs are used is that they have many, many more cores, so you can distribute your training much more effectively and much more computationally efficiently. What that really translates to is model training times that are reasonable. If you were to train this particular model on a CPU, it might take days; on a GPU, it takes minutes to hours, and that's the difference. But a big downside of using GPUs is that you're limited by your image sizes and your batch sizes. If you provide the model large numbers of very large images, you'll very quickly fill up your GPU memory, get a resource error, and crash. So you're usually faced with two choices. You either make your images much smaller than they really are (and I'll talk in a minute about why that may not necessarily be a problem), or, often as well, you make the number of images presented to the model at once small. You can't just feed all 2,000 or 3,000 of your images to the model in one go, and you probably wouldn't want to anyway, because another big reason we train in batches is that it's a good way to regularize the model: it helps prevent the model from overfitting your data, essentially. When you feed the images to the model in batches, it learns from those batches alone; it goes through the process of setting its weights based on those batches. And there might be other things you do to those batches that help with model regularization as well.
For example, you might scale those batches in slightly different ways and present those different versions of your batches to the model, to give it much more variability, so it has to work harder to learn the general trends from the data rather than just memorizing the patterns within the data. That's essentially why we use batches. I'm going to stop here real quick and see if I have any comments in the chat, but feel free to ask questions if you have them. (Actually, I don't see my chat. Where's that? "It's next to Participants on the lower bar. Sometimes when you're sharing your screen, you have to press More on the right side of the bottom of the Zoom bar; that's where your chat will appear." Yeah, I'm not seeing either of those. I'm using Zoom on Linux, and I think everything's in a slightly different place, so unfortunately I don't see my chat. If anyone would like to re-ask their question out loud, now would be a good time.) Ah, these are not my questions; it looks like Robert posted: "What tools do you recommend for labeling images?" Okay, that's a good question. I currently recommend a website called makesense.ai. It's really cool for a couple of reasons. It's open source: you can download it from its GitHub page and run it locally, or you can run it through the URL made available here. And it doesn't store any of your data, which is important for a lot of folks who work for agencies. It just facilitates uploading your data so you can download your labels, and as soon as your session is over, you can be basically assured your data hasn't gone off to some weird place. So you essentially drop your images in here. (I don't know if I have one handy. No, I don't; sorry, I wasn't prepared for that. I do have imagery, obviously; let me throw in one of these Aeroscapes images just to make it relevant to what we're doing.) So I've uploaded an image and I want to create some labels: say vegetation, water, and sky. Or you can load labels from a file. Here it offers to try to estimate those things for you automatically; oftentimes that doesn't work unless you're working with things that are really common, like people, so usually you'll say "I'm going on my own". Then you use this tool. You've got a couple of ways to interact with it. You've got bounding boxes, which are just rectangles. You've got points, which are just what they sound like: you select your label and it marks those. But what we're talking about is segmentation, so what you need is the polygon tool, which lets you segment out your features. You can do this fairly efficiently, though I'm using a terrible mouse for it, and I don't really have the time to do it justice.
So let's say that's a good one there; you'd say, OK, that's vegetation. Once you've done that for a whole bunch of images, you can go to Export and export your polygons as a VGG-format file. So that's a good one; I'd recommend it for generating your dataset. Does that answer the question? ("Yeah, thank you very much.") OK. All right, and apologies for this chat business: I could see the chat when I wasn't sharing my screen, and as soon as I shared it, the chat went away. ("If you go up to the top of your screen, where it says you're sharing your screen, move the mouse up above that and all the Zoom options come down. The chat is in the three dots to the right, and that'll open chat." Oh, thank you so much; that's awesome.) All right, thanks; appreciate that so much. Okay: yes, the class is being recorded. Thank you for the website suggestions. There are lots of labeling tools out there, and a quick Google search will give you alternatives; I've researched a few of them, and makesense.ai is the one I like, so I'm going to type that into the chat so everyone has it. And yes, you can do segmentation polygons with makesense.ai. All right, keep the chat coming and I'll come back to your questions in a few minutes. So, going back to the batches: this is a custom batch generator. Keras has its own ways to generate batches, a number of different ways, but this is one I like because it's intuitive; you can see what it's doing under the hood. Also, there might be many things you need to do to your imagery, such as what we're doing here. F is inherited from this loop here: it's just a string of one file name, randomly selected from my list, that goes into this for loop. It's very convenient for this particular dataset. If you want to find the label associated with every image, it's really a question of finding the file, and in this case it's very convenient to just replace the file name's extension and folder, because the images are named as plain numbers: 0001.png in one folder is the label of 0001.jpg in the other. It also allows me (apologies, the view keeps jumping every time I open that file browser, so I'm probably not going to do that again) in this specific case to say: okay, this label contains the labels from all of my classes, but I'm only interested in vegetation. We generated that number nine earlier on specifically for the purpose of giving it to this function. So we can read in our mask, which is our label image, but in the end our mask is really only those pixels with that number, and everything else we call zero. So this mask is actually a binary mask consisting of two numbers: zero for all of the background-class pixels and one for all of the vegetation pixels. And that's why I've set it up this way.
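In outline, with the image size and a class list ordered the way Aeroscapes orders it both treated as illustrative, the class lookup and the generator go something like this:

```python
import numpy as np
from PIL import Image

# Class strings in the dataset's order; the integer we want is the position
CLASSES = ['background', 'person', 'bike', 'car', 'drone', 'boat', 'animal',
           'obstacle', 'construction', 'vegetation', 'road', 'sky']
class_id = CLASSES.index('vegetation')   # 9, in this ordering

def image_batch_generator(files, class_id, batch_size=8, sz=(512, 512)):
    # Loops forever; the model pulls one randomly drawn batch per next() call
    while True:
        batch = np.random.choice(files, size=batch_size, replace=False)
        X, Y = [], []
        for f in batch:
            img = Image.open(f).resize(sz)
            X.append(np.array(img) / 255.0)   # scale pixel values to 0-1
            # The matching label is the same file name, different folder/extension
            lfile = f.replace('JPEGImages', 'SegmentationClass').replace('.jpg', '.png')
            lab = Image.open(lfile).resize(sz, Image.NEAREST)  # never interpolate labels
            Y.append((np.array(lab) == class_id).astype(np.uint8))  # binarize: 1=veg, 0=rest
        yield np.array(X), np.expand_dims(np.array(Y), -1)

# 'train_files' here is the list of training images we'll split off shortly
train_gen = image_batch_generator(train_files, class_id)
x, y = next(train_gen)    # next() is how a yield-based generator is queried
print(x.shape, y.shape)   # (8, 512, 512, 3) and (8, 512, 512, 1)
```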
But I think you'll also find, for a generic workflow, that it's useful to be exposed to this type of thing, because you don't see it as much in online TensorFlow tutorials; they tend to go straight to the built-in functions, which don't work for every case. Okay, so that's what that function does, and now we're going to use it. We're going to use relatively small batches, a batch size of eight, which means during every training epoch we're going to give the model eight images and eight associated labels at a time. The size of the imagery we'll use is smaller than the original, which I think was 720 by 720. I made it slightly smaller so it fits in GPU memory, but you could use the original size if you had enough memory on your GPU. I think for the purposes of this we'll run with the slightly smaller size. The other reason I've used a smaller size is that it's more typical than not, I think, for a lot of the datasets you might be using. Even though you tend to go out and spend a lot of effort creating very high-resolution imagery, because you might be piping it into other workflows, such as photogrammetry, that require a lot of resolution, with neural network training you're typically making degraded copies of your images and feeding those to the model. Degraded, in our case, because we're making the spatial resolution slightly smaller; but the image also obviously undergoes a lot of transformations, and we'll see what that means. The image is essentially squished and squashed and its features are extracted from it, and there's not a lot of preservation of spatial relations in the ordinary sense, although there are specific ways in which convolutional neural networks do keep track of that spatial information, and we'll talk about that. In general you're giving it square images: most models out there take square rather than rectangular inputs, and obviously most images are rectangular, so you're not preserving the aspect ratio of every pixel, but in the end it probably doesn't matter that much. What we're hoping is that the model is able to extract features that relate to vegetation, and vegetation, as we probably all know, is somewhat self-similar, kind of fractal in its structure. You have a lot of repeating patterns; there's a lot of image stationarity involved with vegetation. So for this particular class, it's not going to matter a great deal if we make the images a bit smaller, or even change the shape of the pixels, because of the specific way the network extracts those features; it's not too sensitive to those things. It is possible you'll encounter situations where you do need to preserve the pixel aspect ratio, but I'd suggest those are relatively rare. Okay, so we're using this function now; we've just called it. This is a generator function, which is a very specific Python construct that allows you to query the function on the fly. Most functions are "x equals f(y)": x is returned as a function of y, and it's all done at one time.
TensorFlow is set up in such a way that all of the computations are done only when they're needed; that's sort of the whole point of TensorFlow, and why it's so useful and so computationally efficient. Generators work along the same principle. They're used a lot within machine learning because of memory limitations: you just want to provide the next batch of images to the algorithm when it's needed by the model, so internally, every time the model asks for a new batch of images, it's just calling next() on a generator. Next is the sister of yield. Going back to our generator: any function that ends with yield is a generator function that's interrogated using a next command. So we use next to get the next batch of images, and, if we expand this a bit, the length of x will be 8 and the length of y will be 8, because 8 is what we asked for. Okay, so if we plot those, this time I'm plotting just the image in the background and then the generated label, and it works in a similar way. I go through a number of these images, take the label, and color-code all of the ones in the label green, with all of the zeros becoming white. Very quickly you can see that it's working, because the green pixels overlay what your eyes tell you is vegetation. Okay, so we're building the model now. We're at the point where we can import TensorFlow layers and build the model, because we have this generator function and all of our imagery ready to go; the format we're feeding the model is just these batches of X's and Y's. I'll talk about these model layers a little during model training, since training will take a few minutes, but you'll see that it's all organized as callbacks, layers, or models: three major Keras workflow components. At this juncture I just remembered I hadn't actually mentioned Keras yet. TensorFlow is the library we're using to do the heavy lifting in our deep learning model. That is essentially C++ code that's very highly optimized for distributed computing, sending graphs off to different parts of your computer hardware, where each graph tells that piece of hardware what to do for an instant in time and what to return; it's a very efficient way to do distributed computing, because things happen on the fly. Keras is a higher-level wrapper to the TensorFlow API that makes TensorFlow functionality a lot easier for us to use. Keras is also really cool because it interfaces with a number of other low-level deep learning libraries out there. You may have heard of Theano, and there's the Microsoft one whose name I always forget, CNTK. I've never actually used that one, but Keras interfaces with all three. It doesn't currently interface with another deep learning framework you may have heard of, called PyTorch. PyTorch is very powerful and very cool, but we're not using it for the purposes of this tutorial. TensorFlow is more common within industry, and more common in general.
PyTorch is generally more common among computer science researchers, people actively engaged in machine vision research, things like that. That's not to say it isn't incredibly powerful and useful for everyone else, but you just have to make a choice, and Keras is essentially pretty good to use; TensorFlow comes with its own version of Keras, so that's the workflow I adopted. Now, I brushed over this little part here about actually making the model, and I did that for a reason. All of these commands here are just sub-functions called by this main command. This is a model that is not my own original design, but I have heavily modified it for the purposes of this particular workflow; there are a couple of modifications I made that I'll talk about in a minute. This is the entire block of code that builds the big model that's going to do all of these fancy things, and it's pretty cool that it's that accessible. Even though it may look like complicated Python code, it's really not that bad, especially considering the complexity of the thing the code is actually doing. Okay, so this is where I want to talk a little about class imbalance, but before I go there, does anyone have any questions while I consult the chat? Okay, one question from Julia... no, I skipped Carrie. Carrie asks: how many images do you need to label before you start? That's a really good question, and it doesn't have a single answer. The short answer is that it really depends on your data, and it depends on the model too. I'll talk about that question from the perspective of this particular dataset and from the perspective of the other dataset, and again tomorrow, so hopefully you'll pick up enough pointers on that topic as we go; please ask me more specific questions if you have them. Do you need to label the background, or just the class of interest? If you're using makesense.ai, you can label just the class of interest, because everything else becomes the background. Background can be dangerous territory, though, if that background class contains something very, very similar to the thing you're interested in. Is there a way to deal with class balance? Yes; class balance is what I'm about to talk about right now. And do images have to be fully classified, wall to wall? Not necessarily. I'm going to introduce a workflow tomorrow that might help with that. By wall-to-wall I think you mean labeling the entire image, and the answer is not necessarily, because anything you don't label will fall into a background class. David says: guessing this is model-dependent, but is it better to have more images labeled, or fewer images with more precise labels? I see the motivation behind that question, but it's pretty difficult to answer generically, because it really does depend on your data. My experience is that if you have precise labels, then you know your error is due to your model and not your imprecise labeling; I've also got ways to refine labels based on other workflows. Is it easy to extend to multispectral images? Absolutely, yes; we'll get to that tomorrow. The reason it's extensible is that we've essentially written the code ourselves, so we can add functionality to deal with that extra dimension, or multiple extra dimensions.
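Before I come back to the rest of the chat: to give a flavor of what that model-building block assembles, here is a toy two-level encoder-decoder with the same U-Net skeleton. This is much shallower than the clinic's actual model, and every layer width here is illustrative:

```python
from tensorflow.keras import layers, models

def conv_block(x, filters):
    # Two 3x3 convolutions: the basic repeating unit of a U-Net
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    return x

def toy_unet(shape=(512, 512, 3)):
    inputs = layers.Input(shape)
    # Encoder: extract features while shrinking the image
    c1 = conv_block(inputs, 16)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 32)
    p2 = layers.MaxPooling2D()(c2)
    # Bottleneck
    b = conv_block(p2, 64)
    # Decoder: grow back to full size; the skip connections concatenate
    # in the matching encoder features at each scale
    u2 = layers.concatenate([layers.UpSampling2D()(b), c2])
    c3 = conv_block(u2, 32)
    u1 = layers.concatenate([layers.UpSampling2D()(c3), c1])
    c4 = conv_block(u1, 16)
    # A single sigmoid-activated filter: per-pixel probability of 'vegetation'
    outputs = layers.Conv2D(1, 1, activation='sigmoid')(c4)
    return models.Model(inputs, outputs)

model = toy_unet()
```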
Coming back to that multispectral point: it's not particularly common, especially in the computer-science-flavored material you might be exposed to, such as popular-science articles or blog posts about image segmentation, to see anything beyond RGB. They almost always work with RGB images, and almost always with existing datasets that have been curated for the purposes of method development, not application development, especially in the natural sciences. So there are relatively few examples of those out there, but I can tell you they definitely exist: there are many, many remote sensors actively engaged in applying deep learning, and a lot of their data might be geophysical, multispectral, or hyperspectral. All right, I'm going to minimize that chat for a minute, but keep the questions coming; they're great, and hopefully I'm doing an okay job answering them. Some are quite difficult to answer, either because I'm going to cover the topic later or because I haven't yet exposed you to enough things to make the answer understandable, so I'll just ask for a little patience and then come back to a couple of those questions. But one question was about class imbalance. Yes, that's a big problem. Class imbalance basically means you have many, many more examples of one class than another. In our case, that means we would almost always have more background pixels, more zeros in our label image, than we have of the class of interest, the ones. If you were to look at a histogram of the frequency of zeros and ones, you'd see a lot of zeros and very few ones. That's often a very big problem for deep learning algorithms, but it's only a problem because of mis-specification of the loss function. The loss function is the thing actually doing the work: it's the thing the model keeps track of in order to adjust its weights and come up with an optimal solution. The loss itself is just a number you're essentially trying to minimize. This is an optimization problem, a minimization problem: you're trying to find the minimum of a function, and your neural network is the thing finding that minimum. It's set up to do so because of its architecture, and then there's the optimizer, the actual thing doing the optimization, which we'll talk a bit about. What the optimizer is doing is minimizing a loss, and the loss is the thing we need to pay most attention to for class imbalance problems, because the loss function is what reports back to the model about how the model is doing. If the loss function is biased towards any particular class, then obviously we have a big problem. I'll just footnote that by saying I've got a larger version of this tutorial, too big to fit in two days, that I'm going to post online; it uses a different dataset and explores some of this stuff in a bit more detail.
It looks at optimizing this particular model using different training strategies, step by step, and I've hopefully put it together in a way that convinces you that certain loss functions are really bad for class imbalance problems and that other loss functions actually get around that problem. I'm talking about loss functions here because of what I'm about to say, but another way to get around class imbalance is obviously by treating the labels: if you're labeling imagery in such a way that the classes have roughly equal representation across the dataset, then obviously it's not a problem anymore. We're only talking about cases where there's really a huge difference between the classes. So, one particular function that's really good for these binary segmentations in general is called the Dice coefficient, or Dice loss. It's a good way to deal with binary segmentations, because that's essentially what it was set up to do, but it's also a good way to deal with this class imbalance problem. I'm going to talk more about that tomorrow as well, but for now we'll take it as an article of faith that a Dice loss is going to be good for us. It's very similar to a metric called the Jaccard index, or intersection over union, which looks at the intersection of two polygons and their union, telling you to what degree they overlap. The problem with the IoU score is that it tends to be biased towards the class that's more dominant in your dataset; in our case, that would be the background class. The Dice loss is a subtle variation on the IoU score that weights things the same irrespective of their size, which is what we need in this particular case. And this is the IoU score here; I'm defining it just so you can see it, and we'll also keep track of it as we compile the model, because it's useful to see how the two relate. Okay, so we've defined our batch function, we've got a function that will make our model, and we've defined our loss function, so now we're in a position to actually make the model and then compile it. Compiling here doesn't quite mean compiling code into zeros and ones so it can be interpreted by a machine; you're sort of doing that too, but the main thing is that you have to specify certain things the model needs in order to function. You have to give it an optimizer: the algorithm that's essentially minimizing the function, working out the stochastic gradient descent. There are a number of different optimizers you might use in this situation. Another really good one for these types of problems is called Adam. Adam is basically similar to RMSprop, but it has this thing called momentum behind it, a numerical construct that lets it interrogate portions of your parameter space a little more effectively. You see Adam and RMSprop used somewhat interchangeably; they do different things, but they often produce similar results. And here I'm giving it my loss function, and it is literally my function I'm giving it there: not a variable holding a number, but a function itself. Similarly, this is a function.
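Written out, standard formulations of those functions, close in spirit if not in every detail to what's in the notebook, look like this:

```python
from tensorflow.keras import backend as K

def dice_coef(y_true, y_pred, smooth=1e-7):
    # 2*intersection / (size A + size B): insensitive to how big the
    # class is, which is what we want under class imbalance
    y_true_f = K.flatten(K.cast(y_true, 'float32'))
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_loss(y_true, y_pred):
    # A perfect score is 1, so the quantity to *minimize* is 1 - Dice
    return 1.0 - dice_coef(y_true, y_pred)

def mean_iou(y_true, y_pred, smooth=1e-7):
    # Jaccard / intersection-over-union, tracked as a metric only
    y_true_f = K.flatten(K.cast(y_true, 'float32'))
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    union = K.sum(y_true_f) + K.sum(y_pred_f) - intersection
    return (intersection + smooth) / (union + smooth)

model.compile(optimizer='rmsprop',            # Adam would be a reasonable swap
              loss=dice_coef_loss,            # the thing the optimizer minimizes
              metrics=[dice_coef, mean_iou])  # reported, but not trained on
```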
My dice coefficient is a function and my mean IoU is a function. You can specify as many metrics as you like. The metrics are things that the model computes at the end of every training epoch; they're just a way to report back how well the model is doing, typically using the validation set. We haven't got to the test and validation sets yet. The loss is the thing that's actually being used by the model to set the weights; it's the thing that defines how the backpropagation algorithm is going to go, how the weights are going to get propagated back through the network so the model can keep training. So these metrics don't actually contribute to the model training; they're just things you report back to see how well you're doing as you go. These models take a very long time to train, so if you're monitoring your metrics as they train, you can just stop it if you think it's not going to work out; you can develop a sense for that. I'm putting these next couple of things in here because, while they're not strictly necessary to our workflow, they are kind of useful. This is a way to save your model out as a JSON file. That's a metadata format, a human-readable ASCII format, in which you can see how your model is put together, but it also allows you to read it back in. All of these things up here are Python objects, and it takes a lot of work to unpack them into things that you can actually inspect, but this model-to-JSON is a pretty useful thing that Keras has built in. So if you need to give your model to someone else, that's a good way to do it. This function here is... oops, of course, I haven't actually been running any of these cells. What was the last one I ran? Okay, so I need to run that one and then run these two. And you'll see, once this compiles over here, or if it ever does... all right, it's just done it. So if you go over here and refresh, you'll see that you've got a new file, model.json; have a look at that. And you've got this model.png, which is this thing here that's also been printed to the screen. What you're looking at here is what the model actually looks like. These deep neural networks are just layers and layers and layers of things: the outputs of one layer are the inputs to the next, and on you go. A lot of modern neural networks have these different branches, these skip connections, they're called. They happen at various scales and they're there for various reasons; I'm not going to go too much into it. But the idea is that you take your inputs and you treat them one way and you treat them another way, and then you combine them, you add them together. So you've got one path along here that's extracting maybe one type of feature, and another path that's doing something different, and they get added together, and so forth. These models are large; they consist of many, many layers like this, and all of these skip connections can be quite confusing. There's a reason why these skip connections are there, and I'll talk about that while the model is training.
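As a hedged recap of what those cells are doing, here's a representative sketch of a Dice loss, an IoU metric, and the compile step; the notebook defines its own versions, so treat the details as assumptions rather than the lesson's exact code.

import tensorflow as tf

def dice_coef(y_true, y_pred, smooth=1e-6):
    # 2*intersection / (sum of both sets): weights classes the same regardless of size
    y_true_f = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred_f = tf.reshape(tf.cast(y_pred, tf.float32), [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (
        tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + smooth)

def dice_loss(y_true, y_pred):
    return 1.0 - dice_coef(y_true, y_pred)

def mean_iou(y_true, y_pred, smooth=1e-6):
    # intersection / union: tends to be biased towards the dominant background class
    y_true_f = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred_f = tf.reshape(tf.cast(y_pred, tf.float32), [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    union = tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) - intersection
    return (intersection + smooth) / (union + smooth)

# compile: an optimizer, the loss function itself (not a value), and any
# number of metric functions to report at the end of each epoch
model.compile(optimizer=tf.keras.optimizers.Adam(),  # RMSprop is the other common choice
              loss=dice_loss,
              metrics=[dice_coef, mean_iou])

# save the architecture out as human-readable JSON (a Keras built-in)
with open('model.json', 'w') as f:
    f.write(model.to_json())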
Okay, so this is the point where, if you go ahead and run these cells, you're actually downloading the data that I've got up on the web, but this time we're downloading just text files, because I've already gone through the process of randomly splitting the data into three different subsets: a training set, a validation set, and a testing set. The reason I've exported them out to files is that I want you to read them back in so that we see similar things. If we were to use different random subsets of the data, especially because we're training our models over very few epochs, then we would potentially see different things; so this is just for the purposes of this particular lesson. We're just reading that back in from our Google Drive. This bit of code here, the details don't matter too much, but essentially it's just opening a file and then reading it in line by line. And this last little bit here is just printing to screen how many images we have in each set. You'll see that I've got more training files than I have testing files and validation files, and that's pretty typical: you need to present neural networks with a lot of information in order for them to perform optimally. Going back to the earlier question about how much data we need: well, hundreds ideally, typically, but not necessarily, and it does depend on your data. This data set has a couple of thousand images. That's a pretty good number, but it obviously takes a very long time to manually label that many images. So you typically start small. You make a model, you see how it performs, you get a sense for how well it might perform with more data, and then you act accordingly, up to the point where you've fed it more data and the model hasn't performed any better, at which point you know you're done. That's the brute-force way of knowing, but there are more sophisticated ways too, which we're going to talk about as we go. Okay, so this bit of code here is just commented out; it's showing you how I made those different subsets. I took half of the image set for training, and then I split what remained in half again, so I have a 50/25/25 split; I'll sketch that below. And here is just a piece of code that I use to write those files out, if you find that useful. All right, so before we go on to training the model, I'll have a quick look here at the questions. Will the extensive tutorial be posted online, free to access? Yes, it will, when it's finished; I've almost finished it. That's my goal for the end of the week, and I'm going to announce it to the same email list as this one, and I'll give it to the workshop organizers to announce as well. Is that diagram for one part of the training? Yes, it is. That's the model, and the data goes through this model many, many times. The number of times is the number of epochs, the training epochs. So if you have 100 epochs, then all of your data will go through this entire model 100 times. It does so in batches: if you have 1,000 images and your batch size is 10, then you're passing 100 batches through this model on every epoch, and then you're doing that epoch-number-of-times, if that makes sense. We'll see that in action now as we train the model.
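As promised, a minimal sketch of that 50/25/25 split and the line-by-line file reading; the filenames and seed are assumptions, not the ones used in the lesson.

import random

random.seed(2021)                          # hypothetical seed, for a repeatable split
random.shuffle(files)                      # `files` = full list of image filenames
n = len(files)
train_files = files[: n // 2]              # half for training
val_files = files[n // 2 : 3 * n // 4]     # half of the remainder for validation
test_files = files[3 * n // 4 :]           # the rest held back for testing

with open('train_files.txt', 'w') as f:    # write one filename per line...
    f.write('\n'.join(train_files))
with open('train_files.txt') as f:         # ...and read them back the same way
    train_files = [line.strip() for line in f]
print(len(train_files), 'training files')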
Okay. This next piece is somewhat custom to this particular workflow; it isn't necessarily what you'll always see in tutorials that introduce these topics. But again, it's useful to have exposure to some of these relatively rare things, because it's something that you will probably encounter if you're doing this for real on your own data. This particular class that I've defined here has a couple of different functions within it. One of them, all it really does is that at the beginning of every training epoch it initializes a bunch of empty lists and a counter. And then, at the end of a training epoch, it uses the current state of the model to predict. By this point the model is being actively trained, but it's a Python object, so you can interrupt its training real quick to make a prediction, and we're predicting on one of the validation files. That reminds me, I didn't actually talk to you about what these different sets are, so let's go back to this. We've got these three different sets here. The training files are the images, randomly drawn from the catalog, that the model is going to see. The validation files are also given to the model, but to predict on; for example, in that plotting thing I just showed you, it's one of the validation files that it's drawing. But the validation set is also the set of files used to compute these metrics: if we go back up to the top, we have these two metrics, and it's going to be using the validation set to compute them. It's important that we use the validation set for this, because those are images that have not been seen by the model. They've been touched by the model in the sense that we're using them for prediction, and we're using them to compute the metrics we keep track of during training, but they're not the things actually used by the backpropagation algorithm to set the weights. These metrics here, and this is also why I've got two of them, because I can list as many metrics as I like there, just get printed to screen and recorded, so you can have a look at them later on. Then, finally, we've got the testing set, which is a set of files that I like to keep aside purely for testing. With the validation set, if your model is good enough, it's generalized: it's figured out what things look like in general and how to extract them from your imagery. For vegetation, for example, it's figured out the fuzziness, it's figured out the fractalness of it, it's figured out what features it actually needs to extract to make that call. Then, on the test set, you're checking whether it really did a good job. In many places you'll see test and validation used interchangeably, but specifically here, validation data is given to the model during training, not to train on, just for reporting metrics, and the test files we're not going to see until the end: once we have a model that we think is done, trained sufficiently, that's where the test set comes in. So this is called a callback function, and it's given this piece of code in here which basically just tells the model that this is a callback and that we're going to be using the Keras callback framework to execute this function. Then all of this piece of code here is just updating those things that got allocated at the start. We carry all of these variables through self, which refers to the object itself; it's basically the way to move variables around internally in a Python class.
It takes a random.choice, so it grabs a random file from our validation set. It opens the file and resizes it to what we need it to be, and it makes a prediction. This expand_dims is there because the image comes in as a matrix and we need to turn it into a tensor in order to use it with TensorFlow; in this case it's just adding another dimension. It then makes a prediction based on that form of the data, and at the end it squeezes the result to remove any singleton dimensions. If you're a MATLAB coder you'll already understand what squeeze does, and if you're a Python coder you probably know as well. Then this little part here just takes that prediction and turns it into an image that we can see during model training: anything that's greater than 0.5, we're going to call 1; I'll come back to that. Here, I'm turning this into an 8-bit image just by stacking it in three dimensions and then multiplying the whole thing by 255. The reason for that is that, in the end, I just want to show this combined image, which is going to be the image itself, the mask, and then the image masked by the mask, all in one image that then gets plotted by this command here. Probably more detail than you needed, but there you go. So, we're not going to train this model for very long, because obviously we don't have time to sit here and watch it spin, but we are going to train it a little bit: we're just going to do five epochs. Here I'm making a function that is going to allow me to save the current best state of the model as I go. This is what's called a checkpoint; I guess it's called a checkpoint because you can always go back to it. It's pretty crucial that you save your checkpoints as you go, because especially if you're working on Google Colab, anything can go wrong. You can get randomly disconnected; maybe a few of you already have. And obviously, if you're running this on your own computer, you can run into issues as well, so it's just a good idea to keep checkpoints as you go. If you set save_best_only equal to true, it will only save the best weights; that means it will only write the weights out to disk again if the model has made an actual improvement based on the validation loss. And that's what I'm telling it here: monitor the validation loss, because it's the validation set I'm more interested in. These models are extremely good at fitting to the training data set; that's not necessarily what you're paying attention to. You're paying attention to how well it does on data it has never seen, because that's a much better indication that it's going to work for your problem. So that's what that means. Then I'm adding that checkpoint object, along with this plot-learning class that I just defined above, and returning them; this is the thing I can then pass to the model, which is essentially just a list of callback functions. The file path here is the path that the model weights are going to be saved to. The model itself is the architecture plus the weights, and we already have a way to reproduce the creation of the architecture, but we don't yet have a way to store the weights, and this is what we're using for that. It's an H5 format file, HDF5, which is somewhat similar to things like NetCDF: a portable format, not human readable, but well structured. It's also the format that Keras uses.
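Pulling the last two cells together, here's a hedged sketch of what a callback like this plus the checkpoint can look like; the class name, the loader helper, and the file path are assumptions, but the Keras pieces (Callback, ModelCheckpoint) are real.

import numpy as np
import tensorflow as tf

class PlotLearning(tf.keras.callbacks.Callback):
    # sketch of the plotting callback described above; names are assumptions
    def __init__(self, val_files):
        super().__init__()
        self.val_files = val_files

    def on_epoch_begin(self, epoch, logs=None):
        self.overlays = []    # empty per-epoch storage
        self.counter = 0      # and a counter

    def on_epoch_end(self, epoch, logs=None):
        img = load_and_resize(np.random.choice(self.val_files))  # hypothetical loader
        pred = self.model.predict(np.expand_dims(img, 0))  # matrix -> batch tensor
        mask = np.squeeze(pred) > 0.5                      # drop singletons, threshold
        overlay = np.dstack([mask] * 3).astype(np.uint8) * 255  # 8-bit, 3-band mask
        self.overlays.append(overlay)

# write weights out to HDF5 only when the validation loss improves
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    'weights_best.h5', monitor='val_loss', mode='min',
    save_best_only=True, save_weights_only=True, verbose=1)

callbacks = [checkpoint, PlotLearning(val_files)]  # the list handed to model.fit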
This is our training generator. I'm using my image batch generator here to make a generator that's going to interrogate just the training files; those are just for training. I'm also going to pass my model a validation generator that's going to use the validation files. The class number is nine, which corresponds to vegetation, as we set earlier on. The size is 512 by 512 and the batch size is eight. As I said earlier, all of the data is presented to the model at each training epoch, but it does so in steps, and the number of steps is basically just the number of files divided by the size of the batch. So there will be 204 training steps, which means there will be 204 times that the model is fed eight images and eight labels during every training epoch. You can see that we're talking about massive computation here, things that really do need to harness better hardware: big GPUs, many cores, things like that. This isn't necessarily something you want to do on a laptop, though it is possible, especially for smaller data sets and relatively short training times. All right, so I've already defined the model; I've got everything I need to start training it. I'll explain what this means later on, but let's go ahead and get that going, because it's 1:25, and it's basically now training on this cloud computer. Hopefully maybe yours has already finished training, but I'm going to stop here and see if there are any questions before I start backfilling some of the details that I've skipped over so far. If anyone has any questions they'd like to use their voice to communicate, that would also be great. I'd also obviously want to know if I'm going too fast or being too complicated. Hi, Dan, this is Chris. In this line where you're generating the test data and the validation data, what keeps you from getting the same data in both sets? Oh, because they're interrogating two separate lists of files. The validation files here were given to my validation generator, and the training files are given to my training generator, and when I specified the model fit here, the train generator is always what it expects first; that's the data it's going to be training on. That could also just be a list of files: if I were to copy this and paste it in here, if you were doing this on an entire dataset without using batches, you could just give it the entire set, but that's not necessarily the optimal way to do it. And similarly, you could stick the validation set in there. You'll see that too in some tutorials; not every tutorial uses generators. We tend to use generators because we tend to have large datasets, and things like that. Okay. Good, I'm glad that you're finding the pace okay.
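While that trains, here's a hedged sketch of the generator-and-fit wiring just described, reusing names from the earlier sketches; the generator function and its argument names are assumptions.

batch_size = 8
train_gen = image_batch_generator(train_files, class_num=9,  # 9 = vegetation
                                  sz=(512, 512), batch_size=batch_size)
val_gen = image_batch_generator(val_files, class_num=9,
                                sz=(512, 512), batch_size=batch_size)

history = model.fit(
    train_gen,
    steps_per_epoch=len(train_files) // batch_size,  # e.g. 204 steps of 8 images
    validation_data=val_gen,
    validation_steps=len(val_files) // batch_size,
    epochs=5,
    callbacks=callbacks)  # checkpoint + plotting callbacks from above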
Next question: when you have a class like vegetation that exhibits a fractal nature, how do you go about choosing the resolution to use for training? I guess this all depends on the classes you have and the time, but in general, how would you choose? Yeah, in many respects, like with many things, you develop an intuition for what might work and what might not, just based on what you've seen work and what you haven't. Things like vegetation are just so different from the other classes in this particular dataset. There's a lot of self-similarity in our imagery in general, but the specific way that vegetation structures its shadows, the spatial scales associated with the alternation of bright and dark, is very different from the other classes. You can also use other techniques to query your data, to see how well different classes might fall out of an unsupervised classification. For example, you could use lots of different types of clustering techniques to see how well different classes within your dataset cluster out in space, and you can do that with your training dataset, with your set of images and the set of corresponding labels. That's a different workflow from what I'm presenting here, but it is a good way to do it, I think. Generally, though, a lot of this involves a lot of experimentation. When I approach a particular task, I'll take a model that I think might work okay, but really what I'm doing is playing with my data to see what works best with it. I will downsize my imagery to a point, and if I think it's too blurry or too fuzzy, then it's no good, and I have to chop it up instead. Another thing I didn't mention is that we had square imagery to work with from the start, so it wasn't too big of a deal to just downsize it; but if you have very large rectangular images, then you're probably better off chopping each image up into smaller square chunks and then doing the labels on those. It's going to be quicker and more efficient to label, perhaps, but more importantly it's going to be more amenable to a workflow like this; I'll sketch that tiling idea a little later. And if you have relatively coarsely resolved data, like Landsat, with those 10- or 20-meter-style footprints, then obviously you don't want to be downsizing your imagery even further; your model is already downsizing your image more than you're probably comfortable with, so you want to keep that input data as close to its native resolution as you can. Going back to the fractal nature of things: I just meant that in terms of vegetation, because I know vegetation has a fractal structure; I don't know about the fractal structure of the other classes, so I can't really comment on those. Okay: can you speak to the effect of kernel size in the convolution block, bottleneck block, and the res block functions? Yes, I'm about to do that. Does TensorFlow have functions that could... and could the model framework you built be adapted for panoptic segmentation?
Okay, so to back up: there are many, many different types of segmentation out there. There's instance segmentation, where you're not only segmenting a class but also giving each object a unique identifier. Panoptic segmentation, I'd have to be reminded of what that is; I've definitely come across the term but it's escaping me at the moment, so if you can let me know what it is, maybe I can answer that question better. But in general, yes, TensorFlow has functions that you could do panoptic segmentation with, for sure, and instance segmentation. Those models tend to be a lot more complicated, though, so they're not necessarily entry-level models; they might be things you progress to after you've discovered that a simpler workflow doesn't work well. A U-Net is a fairly standard way to do things. It's a very big, deep model; it has almost 700,000 parameters in this particular case, but there are other models out there with tens of millions of parameters that are generally more powerful, though not necessarily easier to work with. Okay, so it sounds like panoptic segmentation could be similar to instance segmentation; there will be Keras workflows for that, for sure. As Altan said, if you chopped up large images into small squares, you'd get edge effects when you predict classes on a large image. Quite possibly, but there are post-processing workflows, which I'm going to show you tomorrow, that explicitly deal with that; so if you can stay tuned tomorrow, we can talk about that then. All right, so if no one has any other questions, I'll talk a little bit about what this model is actually doing, what convolutional neural networks really are, and some of those details I skipped over before. All right, one more question: what about classifying features from elevation data? Absolutely, that's possible. It works in a similar way; you don't have as much information to work with, because you only have one band, but yes, this workflow could be adapted for elevation data. It currently expects three-band images, but it would work the same for two- and one-band images as well. The problem with elevation data, of course, is that it doesn't necessarily uniquely describe a class, so you'd have to be very careful about framing the problem: you'd want your high-elevation features to be unambiguously high elevation, and your low-elevation classes to be somewhat unambiguously low. And of course you could combine RGB images with elevation data into a four-band image, which might actually be the ultimate way of doing this. Okay, so I'm going to minimize the chat for a minute and go back to explaining what these neural networks are. If we go back to the training here, just to check in on it briefly, you can see that we're already on the fifth and final training epoch, which is good, and we can see a few outputs. You'll see different images from me, because we're using randomization to draw these batches, so you won't see the same images as me; that's not a problem. But hopefully you can see that, in general, some of the vegetation is being segmented out of the images in a fairly reliable, consistent way. We wouldn't expect a model to perform all that well after five training epochs, but we'll see that in this case it actually doesn't do too badly; we'll come back to that in a minute.
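Before getting into what the network is doing, here are two quick sketches for ideas from the last few answers, both hedged stand-ins rather than the lesson's code: chopping a large rectangular image into square chunks, and fusing RGB with elevation into a four-band input.

import numpy as np

def tile_image(im, tile=512):
    # non-overlapping square tiles; partial tiles at the edges are discarded
    tiles = []
    for i in range(0, im.shape[0] - tile + 1, tile):
        for j in range(0, im.shape[1] - tile + 1, tile):
            tiles.append(im[i:i + tile, j:j + tile])
    return tiles

big = np.zeros((2000, 3000, 3), dtype=np.uint8)   # stand-in for a large image
print(len(tile_image(big)))                       # 15 tiles of 512 x 512 x 3

rgb = np.zeros((512, 512, 3), dtype=np.float32)   # stand-in RGB image
dem = np.zeros((512, 512), dtype=np.float32)      # stand-in elevation grid
four_band = np.concatenate([rgb, dem[..., None]], axis=-1)
print(four_band.shape)  # (512, 512, 4); the model's input layer must then expect 4 bands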
But what the model is doing right now is essentially solving a giant numerical problem, where you put your inputs on one side of the problem and your outputs on the other side, and this is just an architecture that's been carefully constructed to extract features in such a way that it can generically predict those outputs. Convolutional neural networks in general are used for image classification problems because they do two things really well. They take advantage of multi-dimensionality, so going back to a couple of the questions about using other types of raster data: yes, these convolutional neural networks deal really well with that, because they take advantage of local spatial connectivity. The big sticking point with neural networks for the longest time, in the 80s and 90s and early 2000s, was that the main layer, the hidden layer, the dense layer, is a very, very inefficient way to solve this problem, because it connects every element of your data to every node of a dense layer, and so you have millions of connections between your data and your first hidden layer, but none of those connections have any spatial awareness. The big breakthrough with convolutional neural networks was that they provided an efficient way to extract features from high-dimensional data sets like imagery, because you have this concept of a receptive field: each neuron in the network is not connected to every other portion, but it is connected to a region, and those regions are connected to one another. So it exploits stationarity within the image, which is essentially repeating patterns throughout the image. Going back to vegetation, obviously there are all sorts of repeating patterns in vegetation; generally, plants are sort of self-similar, they look similar to one another, and they're very distinct from other things. So CNNs are really, really good at taking that into account, and that's essentially why we're using convolutional neural networks. There are obviously many more details that I could go into here, but if you followed the intro tutorial that was listed on the main course website, then you'll have a sense for how inefficient plain neural networks were: we were trying to classify tiny, tiny images and our model was really giant, whereas using convolutional neural networks we've got much bigger images and a much smaller model, and it works out better. That demonstrates the point specifically.
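Here's a tiny sketch of that efficiency point, counting weights for a toy 128 by 128 RGB input; the layer sizes are arbitrary, chosen just to make the comparison visible.

import tensorflow as tf
from tensorflow.keras import layers

# every pixel connected to every node: about 3.1 million weights
dense_net = tf.keras.Sequential([tf.keras.Input((128, 128, 3)),
                                 layers.Flatten(),
                                 layers.Dense(64)])

# a shared 3x3 receptive field slid across the image: 1,792 weights
conv_net = tf.keras.Sequential([tf.keras.Input((128, 128, 3)),
                                layers.Conv2D(64, 3, padding='same')])

print(dense_net.count_params())  # 3,145,792
print(conv_net.count_params())   # 1,792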
So I'll go on to the U-Net itself. I did provide this video as well, which was made by the authors of the original model. The U-Net came from a paper published in 2015 by a biomedical research group, folks who were also really heavy into image computation, and it was a sort of stroke of genius, in my opinion, because it was the first model out there that really did a good job of combining features at multiple scales. That's essentially why the U-Net is so popular; if you did a Google search for U-Net, I'm sure you would hit millions and millions of sites, and the reason they're so popular, in my opinion, is because they combine spatial scales. So if we go back to this: we've got our input image here, with our three bands, R, G, and B, and they get condensed down through pooling and convolution, progressively and progressively, until you get to this bottleneck feature here, a feature that's only 1,024 numbers long, and that essentially contains all of the information about your specific class. Then it gets upscaled from that. A couple of different details here. One is that the downscaling uses convolutions that spatially share weights, and on the way back up it uses a concept called transpose convolution to essentially reconstruct how each set of convolutions would scale back up spatially. So it's a bit more sophisticated than simple interpolation; it's actually spatially informed interpolation, using weights from the convolution layers. And then there are these gray arrows, which are those points within our model, if we go back to our model briefly, that I mentioned earlier: the skip connections. You'll see that some of these skip connections are really long; one, for example, goes all the way down to near the bottom. The intuition behind that is that anything that goes off down the main path is going to be pushed through all of the different layers: it's going to be progressively downsized, and it's going to have all of its information extracted from it in a very prescribed way, so that, in the end, what's encapsulated by that process are very low-level features, the very obvious differences between classes as represented by image features. But even though the convolution is retaining quite a bit of your spatial information, at the same time you are losing some of that spatial information, some of the very detailed features. What the skip connection does is take the output of a layer, which is still quite large compared to your input imagery, and essentially bypass all of that; at the end, it combines them together, concatenates them, as it's called, before the model actually makes the prediction. It's designed to preserve both low-level features and high-level features, which I think is also why it's so useful for these intermediate-scale problems, this sort of imagery where we're looking at maybe tens to hundreds of square meters. But it also works for pretty large imagery, and it works for microscopic imagery, which is what it was designed for in the first place; it was actually designed for separating different cells. That's all I'm going to say about the architecture for now, for the purposes of moving on, though I'm going to return again to what a U-Net really is and what it's doing for us.
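For a sense of how those pieces fit together in code, here's a toy, hedged Keras sketch of one down-step, a bottleneck, one up-step with a transpose convolution, and a skip connection; the real residual U-Net in the lesson is much deeper, so this is illustrative only.

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input((512, 512, 3))
c1 = layers.Conv2D(16, 3, activation='relu', padding='same')(inputs)
p1 = layers.MaxPooling2D()(c1)                                   # condense down
b = layers.Conv2D(32, 3, activation='relu', padding='same')(p1)  # "bottleneck"
u1 = layers.Conv2DTranspose(16, 2, strides=2,
                            padding='same')(b)                   # learned upscaling
m = layers.concatenate([u1, c1])                                 # skip connection
outputs = layers.Conv2D(1, 1, activation='sigmoid')(m)           # binary mask
toy_unet = tf.keras.Model(inputs, outputs)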
But I just wanted to show you the process of actually training this model. We did this over just five training epochs, and if you saw some of the scores down here, you'll notice that our dice coefficient, which goes between zero and one, is pretty high; our IoUs are a little lower; and our validation scores are not too bad, like 0.78, which would roughly translate to something like 78% accuracy, and 65% for the IoU, and that's for individual batches. So you can see that it's doing fairly well, but not as well as it might over many more training epochs. And if you go and have a look here, it's quite remarkable, actually, that the model could figure out where the vegetation was for the training data set, which is the blue line, to an accuracy of about 87% just after the first training epoch, which is amazing, really, and not typical; typically you'll see things start right at the bottom and then climb. They may get to that point eventually, but for this particular data set, vegetation is a very good class to work with, and it shows that you can train this model pretty well in just a few training epochs. You can also see, though, that the model hasn't necessarily generalized that well to the data; there's still some room for improvement, because while the training scores keep going up and up, the validation score is bouncing around quite a bit. It's doing that because it's seeing a new batch every time, and that batch might contain something unusual, something not seen by the model before. So really, in this case, you'd probably want to train for quite a few more epochs, because you want that validation score to start leveling out. You want two things, really: consistency in your scores, and of course high scores in the first place. Over here on the right-hand side, I've plotted the mean IoU as a function of the dice coefficient. In this example they track pretty well; they're correlated, though the mean IoU is significantly smaller, because it's biased towards the other class. The dice coefficient is the better metric for us in this case, because it enumerates the overlap of our predicted class with our ground-truth class in a way that isn't biased by the size of that class. Everyone is going to get slightly different curves from me, because everyone used slightly different batches during training, but the expectation is that everyone would converge to about the same number after a certain amount of time. So, here is a convenient way to quickly test that model. What we're doing here is creating a new generator, this time from our test files, the third set of files that we haven't yet used. I'm giving it the same arguments: same size and class number, same batch size. It's going to give the model eight new images at a time, and it's going to keep doing that until all of the images are used up. The reason to use steps in this case is that this is now a CPU-bound problem; it's using one of your cores to do this, and you could run out of RAM if you didn't have a lot of RAM. So I'm showing you a workflow here that involves this intermediate construct called steps, for the purposes of feeding it to this portion of the function. And you can see that my test score, which is now my average Dice coefficient for the entire test set, is almost 80%, which is pretty decent, and a mean IoU of 0.66 is also pretty decent for a mean IoU score. People tend to say that IoU scores are pretty good if they're more than 0.5, but that's not very satisfying either; the Dice metric is a bit better, a bit more sensitive. Here I'm just printing those two numbers out in a formatted way. So I'm saying, okay, my loss here is 0.2; my loss can go down to 0, and 0 is what I want, so obviously there's a little more room for improvement there, which hopefully I could get by training the model for a few more epochs; but already my Dice score for this particular class is almost 80%, which is pretty decent. And here I'm using the test generator to generate a batch of images, and then I'm going to feed each one of those images to my model, to the predict function here; it's going to generate a mask, and I'm just going to plot it.
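A hedged sketch of that evaluation and the predict-and-plot loop, again reusing the assumed generator names from above:

import numpy as np
import matplotlib.pyplot as plt

test_gen = image_batch_generator(test_files, class_num=9,
                                 sz=(512, 512), batch_size=batch_size)

# evaluate returns the loss followed by the compiled metrics, in order
loss, dice, miou = model.evaluate(test_gen, steps=len(test_files) // batch_size)
print(f'loss: {loss:.2f}, mean Dice: {dice:.2f}, mean IoU: {miou:.2f}')

# draw one batch and eyeball label versus prediction, side by side
images, labels = next(test_gen)
for img, lab in zip(images, labels):
    pred = np.squeeze(model.predict(np.expand_dims(img, 0))) > 0.5
    plt.figure()
    plt.imshow(np.hstack([np.squeeze(lab), pred]), cmap='gray')
    plt.show()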
So this is just a really quick way to see how well the model seems to be doing, in a more visually intuitive way. And you can see, oh, in this case it's pretty bad; that's the worst one I've seen so far, so this is good. It did really well for this particular image here; it pulled out all of these individual plots. It did okay pulling out the vegetation in this one, and all right in some of these, but it did pretty badly when it was confronted with a picture of open water with a boat: everything went into the everything-else class. You can see it's picking up on the foam from the boat, and it's picking up on the surface texture from the waves, which is what it wants to do, in a sense, because those are features similar to vegetation. Obviously, in this case it needs quite a few more training epochs to get to a point where it's not going to confuse water and vegetation, but in general that's actually pretty decent for a model that we just picked and trained for only five epochs. Just to show you what that could look like, I went ahead and did this for a few more. In this bit of code, you've already seen this function a few times when we've been downloading data from my Google Drive; this time I'm just pulling in an H5 file that contains weights trained on exactly the same data set, same workflow, same everything, except that this time I trained it for 100 epochs instead. And then I'm going to load those weights into my model. The destination in this case is this file, and you'll see it's been downloaded over here; it's the one that has 'vegetation' written out in full, not 'veg', and it has 100 epochs in the name. So I'm going to load those weights into my model. When you save the weights in that H5 format, it's already structured in such a way that the model knows exactly what to do with those weights and puts them in the right place. Then I'm going to run the same evaluate function as before, but this time using the model with the new weights loaded, and hopefully you'll see that the scores are a lot better; you'll see this mean IoU score in particular go significantly up. These next two cells are going to do the same thing as before: generate a new batch and plot it. The one different thing I've done this time is, of course, upscale the result back to its original size. My imagery was originally 720 by 720, but the size of the imagery that I gave the model was 512 by 512. So this time I'm going to do an interpolation: I'm going to resize the image, giving the model the size it expects, which is 512, and then resize the prediction back up. So here I've got the image that's been inherited here, it passes through here, the prediction happens here, the resizing happens there, and then it just plots. And you can see that it does much better this time. This time we have a picture of open water, similar to the boat example, but it hasn't picked up on those surface textures; over those 100 epochs it's learned how to deal with that situation better.
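A minimal sketch of those two steps, loading the 100-epoch weights and resizing a prediction back up; the filename and the stand-in image are assumptions, and tf.image.resize is just one way to do the interpolation.

import numpy as np
import tensorflow as tf

model.load_weights('vegetation_weights_100epochs.h5')  # hypothetical filename

img512 = np.zeros((512, 512, 3), dtype=np.float32)     # stand-in for a resized image
pred = np.squeeze(model.predict(np.expand_dims(img512, 0)))
mask720 = tf.image.resize(pred[..., None], (720, 720)).numpy().squeeze() > 0.5
print(mask720.shape)  # (720, 720), back at the original resolution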
I think that's a pretty good place to end; we have 10 minutes left. I do have a whole section on transfer learning, but that can wait until tomorrow, because there's quite a lot of detail there that I don't want to rush through, and we have quite a lot of time tomorrow, since tomorrow's lesson is a little shorter than today's. So I just want to end with this recap, and then I'm going to spend the last 10 minutes answering the questions you have so far. So, a recap: we took this data set, and we made a residual U-Net model. We didn't get as far as talking about what the residual part means, but we talked a bit about the U-Net and why it's useful. I would encourage you to play with this a bit, because we're looking at a curated example here, for vegetation, that worked really well: it ended up with a Dice score of 0.98, so roughly 98% accurate, and even its IoU score is 0.95, which is really, really good for this type of model, and I think a lot of you would find that acceptable if it translated to your problems or your data. But I'd encourage you to play around with this a little, and particularly to choose a class that's very small; there's a drone in there, there are some objects, there are some people. You'll see that the model takes a lot longer to train for these very small objects, because they're not represented in the training catalog nearly as much as vegetation is. We're giving it loads and loads of examples of vegetation all the time, but for smaller objects, maybe only 10% of the imagery even has that object in it. So that's a good case to play with, to see how badly you might do for those very rare objects or cases, and it might give you a sense for how to actually label your imagery and transfer these workflows over. In general, these workflows work pretty well for very distinct classes and very large classes. The small classes do work too, I'm not saying they don't; I've got a couple of different projects where they've worked really well, but I've also had to play around with the training and the loss function and various other things to get them to work. Those sorts of topics touch more on that second, more advanced tutorial that I linked to at the end of the course web page. It's not quite finished yet, apologies for that; just give me another couple of days and I'll get it up, and when it is up, I'll share it around, and you'll see how you might take this workflow and make it a little more complicated, or at least adapt it to a more complicated data set. We've got eight minutes left, so I'm going to go over to the questions and see what we've got. Danica says: from your diagram, it looks like the final resolution would be coarser than your original resolution. That's right, yep: we're using 512-pixel imagery throughout, and that's why we're upscaling at the end. It's going to be estimating something that's the same size as the inputs we gave it, but potentially those inputs are smaller than the originals. Is there a way to set a seed? Yes, of course; that's a good point, and I don't know if I did that here. So, in here, where we use random.choice, we can set a seed; we can give it a seed there. The seed is for the random number generator that we're using, and by setting a seed we're essentially ensuring that we're pulling the same random files as much as we possibly can.
Yeah, and there's a section that shows you how to do that. Can I clarify class versus label? Apologies, I've been using those terms interchangeably. They're nearly the same thing: the class is the thing that you're interested in, and the label is the representation that you've made of that class in your image. So I guess they are slightly separate things. It goes back to the point made earlier about how detailed you need to be in your labeling: ideally you need to be quite detailed, because you want to capture just the thing you're interested in. So that's where I'd make the distinction: the class is the thing that you're trying, imperfectly, to label. Okay, I don't see any more questions in the chat; does anyone have questions they'd like to use their voices to communicate? So, we didn't get all the way to the bottom, but I think that's not a problem, because almost all of you are signed up for both days, and tomorrow we're going to start here and move on to the next part of the workflow. I want to start here because it demonstrates something potentially very useful for folks, and that's transfer learning. What we did was create a model for a particular class, and we could actually take those weights and apply them to a different model that we're going to point at a different data set, for the same class. As a sneak peek of tomorrow: we'll be using this thing called the Semantic Drone Dataset. I chose this data set specifically because it has vegetation in it too, although it actually has a few different classes of vegetation: trees, grass, vegetation, bald trees. So what we're going to do tomorrow is combine those classes together and see how well the model weights that we transferred from our previous model, for the AeroScapes data, transfer to our Semantic Drone data. By initializing the model with those weights, we're giving it a hot start; if you're a numerical modeler, you might be familiar with the process of hot-starting a model, and essentially that's what we're doing here. We're giving it a leg up in terms of how it starts to tease out this big problem: by giving it a set of weights that worked well for another data set, and then training on top of those, we give it a much better chance of converging quickly towards what we want. That's what I'm going to start with tomorrow. Going back to the questions: would you start building your model with these types of stock classified images, and then use your own? Yeah, potentially, that's a good way to do it. That's not quite what I mean by transfer learning, but you could train a model on an existing data set like this, where the labels are already made for you, because labeling is a very time-consuming and difficult thing to do; in fact, in many ways it's the most difficult part of the whole thing. So yes, I think at least developing an understanding of how these things work using other people's data is definitely a good idea before you embark on a big labeling exercise, because maybe this particular model is not going to work for you, and you don't want to discover that too late. But also, if you do decide to use this particular model, or any model, then yes, transfer learning is your friend.
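A minimal sketch of that hot start, with hypothetical names throughout: the builder function, the weights file, and the seed are all stand-ins.

import numpy as np

np.random.seed(2021)       # set a seed so the random draws are repeatable, as above

new_model = make_model()   # assumed: the same architecture-building function as before
new_model.load_weights('vegetation_weights_100epochs.h5')  # weights from the old task
# then compile and fit on the new dataset's generators, exactly as earlier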
Take the weights for that same class, or a similar class, that have maybe been learned over a bigger data set or over many more training epochs, and transfer them over. If you don't have many thousands of images, that's obviously where you have to start: you have to start with some form of transfer learning, I think, to get you to some point. And the great thing about it, as you'll see in the workflow we go through tomorrow, is how well it transfers to a completely different data set, with a different vantage, different altitude, different number of pixels, different spatial resolution; everything's different. The only things that are really the same are that it's an RGB image and it has vegetation in it, and hopefully that will demonstrate my point. So, this is the last question I'm going to answer, because we're going to stop in two minutes: if you want to do image classification of Landsat images, would a CNN be a better option than what we did today, and how different is it to set up? Well, that is what we used today: we did use a CNN. A U-Net is just a type of CNN. A CNN is a convolutional neural network, which is a specific type of deep learning algorithm that's particularly useful for imagery and image classification. The U-Net we used is a type of CNN, and the specific kind of U-Net we used, which we didn't really get into the details of, is a type of CNN as well. So this same workflow would translate over to Landsat images; whether it would be as effective, I don't know, and I encourage you to have a look at that. But you would set it up in essentially the same way; the only thing you'd probably pay extra attention to would be choosing a size of image that works well with the model and fits inside your hardware. Okay, so that is two hours up.