Hi Evan, should we make a start? Yeah, that sounds great. Let me share my screen. Or I can share my screen — oh no, you can't. No, I think I got it. Awesome, everybody can see that? Nice.

Well, welcome back, everybody. It's great to see all of you and your boxes and your names. This is part two, continuing where we left off in part one. I'm Evan, and that's Dan. The outline for today: I just want to refocus ourselves on where we are in the Doodleverse, give a quick intro to Gym, and discuss a bunch of topics associated with Gym — how we're moving from Doodler to Gym, what models are embedded within Gym, the config files, and then the typical Gym workflow, which is making a config file, making a dataset, and training a model. Most of the class will be taken up with a live demo of Dan and me working through Gym and some aspects of Zoo, which will show how to use these models. And then we'll wrap up with a question-and-answer and talk about what's next in the Doodleverse. Feel free to drop questions in the chat, or unmute and ask them if you want. We'll try to get to them as best we can.

Okay, here we are. So just to reorient ourselves: this is the design pattern for an image segmentation pipeline. We talked about labeling data in Doodler, and at the end of this you have a trained model in Zoo. We're really talking about this middle part right here, where we're taking labeled data, augmenting it, and training a model, trying to come up with the best deep-learning-based image segmentation model. That's really the core work of Gym, this middle part — whoops, let me turn off that. So this is where we are today, talking about Gym. And we sent out this Earth and Space Science paper, which gives you a lot of background if you want to dive deeper into details we don't go through today, in a more academic way as opposed to the more code-focused, technocratic way we're going to discuss here.

We're hoping everybody got the instructions and was able to download Gym if you'd like to, but the general way you start Gym and move from Doodler to Gym is: clone Gym, go into the Gym repository, and make the correct directory structure that's listed in Gym. Then in Doodler — remember we talked about the utility files in the utils folder — you're just gonna want to run this one utility, gen_images_and_labels, take the results, the images and labels folders from that, and move them over into the Gym data folder, into the folder structure you have for Gym. You can move or copy with the Linux commands, or you can drag and drop if you want to. And that's the simplest and easiest way to get data ready for Gym. We've developed it so that it's seamless — just drag and drop and you're ready.

Just to back up a little on how the whole of Gym works: Doodler runs machine learning in the loop with your labels, but it's all using scikit-learn. Now we're transitioning over to something a little more high-powered: deep learning models. So instead we're using TensorFlow, and Keras, the high-level API for TensorFlow, which has this really nice feature of progressive disclosure of complexity.
So it's quite easy to learn, and then you can go deeper and deeper into Keras and into TensorFlow to create the type of model that you want. So we're using TensorFlow for this, and we just want to mention that for working with TensorFlow and doing deep learning, you can use your CPU, but it works much faster if you have an NVIDIA-brand GPU connected to your TensorFlow install. I've had luck with GPUs that are eight gigabytes and above, but you can probably adjust the model to use smaller GPUs. That's just from the standpoint of what hardware you need to get Gym running fast; for big models, this is what we're talking about — NVIDIA GPUs.

So when we're talking about these TensorFlow models, what are we really talking about as a model? Gym makes it easy for you to work with a family of deep learning image segmentation models called U-Nets. U-Nets are a classic architecture from deep learning; the Ronneberger paper that we cite in the Gym manuscript is the one that outlined how this fully convolutional network works. It's called "U" because it has a rough U shape: you have an encoder branch, where we're taking images and dropping them through convolutions and pools, convolutions and pools, shrinking them down to a small size — to what's referred to as the bottleneck between the encoder and the decoder — and then blowing them back up to maps that have the specific number of classes in your labeled imagery. So it shrinks through the encoder and expands through the decoder, and the encoder and the decoder are joined by these connections that you can see, so information is passing through the network down through the scales and also across the scales, from the encoder branch to the decoder branch. We're trying to make this easy, but you still need images and labels to work with to make models performant, and we'll go over that.

So in general, if you were going to work with TensorFlow from scratch to make a model, these are some of the high-level pieces you'd need, and I think explaining them now will help contextualize what's inside of Gym, what Gym is actually doing for you, and why the configurations matter. So I just want to run through this text-heavy slide to explain the components you'd need if you were developing a deep learning image segmentation project — actually, any deep learning project.

You need a data import pipeline: a way to convert your images and the associated labels into tensors, to adjust the size of the imagery, and to augment the imagery if you want — subtly adjust the images so the deep learning model is able to generalize better. You need to define the number of images in a batch, and you need to normalize the imagery — do standard-deviation scaling, or zero-to-255 scaling. You need to define the model architecture, which is setting up that U-Net, that encoder and decoder: you need to stack all the layers up and connect them together so it understands how to work. You need to define a loss function, which sets the error that's needed for the backward pass — a neural network does a prediction step and then adjusts all the weights through a backward pass, using the loss function to see what the error was on the images it predicted — and an optimizer, which actually adjusts those weights and biases in the backward pass. You need to compile the model so that it's ready to run very quickly on your device. You need to set the callbacks, which control when a model stops and when to adjust the rate at which the optimizer is operating on the weights. Then you need to call the step that actually fits the model: run all the batches of imagery through and adjust the weights — do the actual work of training. At the end, you need to visualize the training metrics — what the loss was, and any other metrics you defined; for a regression problem that would be something like mean squared error, and for a segmentation task the losses and metrics are things like Dice, or intersection over union, which is the Jaccard. And you need to visualize the outputs: look at some example outputs to see if the model has done a good job.
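To make those pieces concrete, here's a minimal generic Keras sketch of that whole sequence — not Gym's actual code; train_ds and val_ds are assumed placeholders for tf.data.Dataset pipelines, and the toy one-layer "architecture" just stands in for a real U-Net:

```python
import tensorflow as tf
from tensorflow.keras import layers

NCLASSES = 4  # number of classes in your labeled imagery

# 1. Model architecture (a toy stand-in for a real encoder-decoder U-Net)
inputs = layers.Input(shape=(512, 512, 3))
x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
outputs = layers.Conv2D(NCLASSES, 1, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# 2. Loss function + optimizer, compiled so the model is ready to run
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# 3. Callbacks: when to stop, and when to adjust the optimizer's rate
callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10),
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5),
]

# 4. Fit: run the batches through and do the actual work of training.
# train_ds / val_ds are placeholders for tf.data.Dataset objects that
# yield (image, one_hot_label) batches -- the "data import pipeline" above.
history = model.fit(train_ds, validation_data=val_ds,
                    epochs=100, callbacks=callbacks)

# 5. Visualize the training metrics afterwards, e.g. history.history["loss"]
```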
So all of these things you need to do, and you could build them from scratch, but Gym is all about building an opinionated tool set, an opinionated workflow, for you to be able to do this and experiment very quickly — because a lot of these things are hyperparameters that need to be adjusted. You might need to tweak the learning rate a little, play with which loss function you use, or make subtle changes to the architecture. So we allow you to experiment in Gym, but the scaffolding is set up for you. It's quite opinionated in how we set up all of these different pieces. And I'll pass it to Dan.

Thanks, Evan. So just before I begin here, I want to make a note — and I put it in the chat — that you should refer to the wiki as much as you can when you're starting to get going with Gym projects. There's a test dataset that you probably want to run in order to make sure you fully understand how the directory of files is put together, and whether or not your computer has sufficient power to train these models. You should note that the test dataset is fairly small, so it should run fairly quickly if you have a GPU; you can also run it on a CPU only, and it will take much longer, but it will still complete within a reasonable amount of time because it's a relatively small dataset — I think it's only about 70 images and labels. So yeah, please refer to the wiki as much as you can; we've tried to do a fairly complete job of stepping through the different parts of Gym, why they matter, and how you can get going.

And then you're gonna get a real crash course over the next hour or so, because I'm gonna do a live demo of a couple of different projects that I've put together. One is using the dataset that we labeled collaboratively last week — I added a lot more data to it to get a more powerful model, so you'll see a more production-scale model. And then I also wanted to show you an example of a scenario where you might already have labels and images from another project or another dataset, i.e., you haven't gone through the process of creating your own dataset using Doodler.
You've just found a dataset on the internet and you want to fit a model to it. Okay, so let's dive in. I'm gonna show a couple of slides here before I share my screen, and then we'll get going on those two different case studies.

Mastering the config file is basically 90% of the work of getting to grips with Gym. The config file is set up in JSON format, so it's very easily readable. A lot of you, I know, are numerical modelers, so you're probably quite familiar with these kinds of old-school config files where you set all of your parameters to make your model run, and I'm fairly sure a lot of you are already familiar with the process of iterating over models — experimenting with different parameters to make those models run optimally. So there's a lot of different things in here, and we break it up into a few different areas, but this is a single config file that runs all three of the scripts you see inside your Gym top-level directory. There are three scripts. There's one that does the creation of the dataset: it takes your images and labels and turns them into a format we can pass to TensorFlow for efficient throughput during model training — I'll talk about that. There's the training script itself, which trains the model. And then there's a basic implementation script that lets you take your trained model and point it at a folder of images, for actually using your model for real purposes, outside the scope of validating the model, which is all done within that training script. And this one config file has the parameters for all of them. We considered having separate config files, but there are a lot of shared parameters, so we opted for one config file for all of them.

I'm not gonna step through each of these just yet, because I'm gonna do that in the live demo, but just know that this is the config file for the test dataset. You'll see a couple of things in there that won't make sense to you yet, and that's fine — we're gonna step through this step by step by step. (Is that a little pop-up here? No, okay.) So I think that's probably all I want to say about the config file right now. And remember, you can always chime in here; I know this may be a little confusing until we really get going.

These are just screenshots from the wiki. How this works is that the first thing you'll do is run this make_nd_dataset script. "ND" stands for n-dimensional, because we want it to be clear to folks that — even though a lot of the examples we use are three-band images — this is actually set up to work with one-band imagery, like grayscale imagery from a sonar or radar, or a single layer from a model output or something like that. Or three-band, which is probably most common, because that's the RGB visible-spectrum imagery — personally, I've used a lot of RGB imagery. And then it's also set up to run with N bands, so you can make 4-D, 5-D, 6-D, 10-D models if you need to. That's the situation where you have, for example, multispectral or hyperspectral imagery: a big stack of coincident rasters that have exactly the same extent and exactly the same pixel size, all coincident in the stack.
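To make the config-file discussion a little more concrete before the demo: the real file is plain JSON, but here's a toy, heavily abridged sketch written as a Python dict so it can carry comments. The key names and values are approximate, from memory — check the wiki and the test-dataset config for the exact schema:

```python
# Hypothetical, abridged Gym-style config (key names approximate)
config = {
    "TARGET_SIZE": [512, 512],  # what every image/label gets resized to
    "NCLASSES": 4,              # e.g. water, whitewater, sand, other
    "N_DATA_BANDS": 3,          # 1 = grayscale, 3 = RGB, more for N-D stacks
    "BATCH_SIZE": 16,           # hardware-dependent: bigger needs more memory
    "MODEL": "resunet",         # which embedded architecture to use
    "LOSS": "dice",             # loss function for the backward pass
    "MAX_EPOCHS": 100,          # upper limit on training epochs
}
```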
So I'm gonna run through an RGB dataset creation today, but when I do, I'll talk about how you would add additional bands. The paper that we published actually used five bands: RGB from Sentinel-2 and Landsat satellite imagery, plus — just to demonstrate — the near-infrared and the shortwave infrared, which were pan-sharpened and put on the same grid. All right, next slide.

I think one of the most confusing parts of the config file is setting the learning rate scheduler — so instead of using a fixed learning rate... I should talk about what the learning rate is at this juncture. This is a model optimization problem, right? As Evan said, you train it iteratively: it sees a batch of images, passes them through the encoder and the decoder, then matches what it found as the label against what you know to be the label from the training dataset. And then, based on that mismatch and on the specifics of the loss function you're using, it uses backpropagation to set the weights through the model, and then it begins again. The expectation is that it's just gonna get better and better by increments. But a very crucial part of that is what's called the learning rate, and the learning rate is essentially how much it nudges. If you're thinking about an optimization problem where you're trying to minimize a function, a learning rate comes in with loss functions all the time, even if it's hidden by the implementation that you're using. With deep learning, the learning rate is usually exposed and specified quite clearly, because it's a very important hyperparameter. It's probably the most important hyperparameter, I would say personally, in my experience, after the batch size — the batch size is a really big lever for getting your model to work well, but it's very hardware-dependent, because larger batches consume more memory. And the next biggest lever, I think, is the learning rate.

We went through quite a lot of different iterations of Gym in the early stages to see whether, in general, models worked best with a fixed learning rate; or an adaptive learning rate, where the learning rate is adjusted based on how fast the model is converging to its solution; or this maybe newer concept, the learning rate scheduler. The idea behind the scheduler is that you actually specify what the learning rate is for every single training epoch. The models may not ever get to the end of this curve, but they certainly will usually get past the top of it. The intuition is that, initially, the model has absolutely no idea — you initialize the model weights with completely random numbers — so it takes quite a bit of time for it to figure out exactly how much to nudge the solution at every moment. So typically the guidance is to start with a fairly small learning rate and then ramp up quickly to a larger one. You'll see in the config file there's this start learning rate, which is where you're gonna start; you may decide to start maybe in the middle of the curve, or somewhere between the middle and the bottom. This example I'm showing on the screen is where we're starting right away at the bottom. Then there's a certain number of ramp-up epochs — that's the steepness of that first part of the curve. Then there are the sustain epochs, which keep that maximum learning rate for a specified number of epochs. And then there's an exponential decay in the curve, and that decay rate is set by a number; I tend to use 0.9, and that's what's specified in the config file. So that covers the basics of the learning rate scheduler.
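In code, a scheduler like that might look like the following — a minimal sketch of a common TensorFlow idiom, with illustrative values; Gym reads the equivalent parameters from the config file:

```python
import tensorflow as tf

# Illustrative values; Gym takes these from the config file
START_LR, MAX_LR, MIN_LR = 1e-7, 1e-4, 1e-7
RAMPUP_EPOCHS, SUSTAIN_EPOCHS, EXP_DECAY = 10, 5, 0.9

def lrfn(epoch):
    """Learning rate for a given epoch: ramp up, sustain, then decay."""
    if epoch < RAMPUP_EPOCHS:  # climb from START_LR up to MAX_LR
        return (MAX_LR - START_LR) / RAMPUP_EPOCHS * epoch + START_LR
    if epoch < RAMPUP_EPOCHS + SUSTAIN_EPOCHS:  # hold at the maximum
        return MAX_LR
    # then decay exponentially back down toward MIN_LR
    return (MAX_LR - MIN_LR) * EXP_DECAY ** (
        epoch - RAMPUP_EPOCHS - SUSTAIN_EPOCHS) + MIN_LR

# handed to model.fit() alongside the other callbacks
lr_callback = tf.keras.callbacks.LearningRateScheduler(lrfn, verbose=True)
```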
It's sometimes necessary to change that learning rate scheduler, and what we've noticed is that usually I'm just changing the minimum and the starting learning rates more than anything else — and sometimes the number of sustain epochs as well. Really? Yeah, sometimes, yeah. Okay — I only adjust ramp-up; I make ramp-up really slow. Okay, cool. Well, we learn things from each other all the time, and you develop your own intuition for what works for you and the problems that you work on. I should probably stop there. I'll talk a little more about the learning rate scheduler later on, but that's basically where you're at.

And those numbers that you see — this is just from a paper that I've written — those numbers are the value of the loss, the actual loss. It's really just pinpointing in time, based on the learning rate, what those numbers are. I find that a helpful trick, just plotting that up, because then you can see: okay, I went very quickly from a very high loss to a very low loss, and that might be where you decide, like Evan does, to change the number of ramp-up epochs — you can see what slope you really need. Then you can get a sense for whether the sustain is really necessary: how much is that loss really changing as a function of increasing the sustain? And then again, if you're really gonna go to town on this, you'd obviously have to do quite a lot of experimentation for all of these, but you can play with that exponential decay, and you can even set the maximum number of training epochs. I tend to just set that at 100; it very, very rarely goes out beyond 100, and that's because the models tend to converge quite quickly — because I've made good choices about my loss function and my batch size and things like that, because I'm experienced in doing this. You may set things up and things don't converge, and that could be for various reasons; we can talk about that in the Q&A afterwards.

Oh, one more thing about this. One thing I'd like to communicate is that the learning rate and the loss function go hand in hand. What we've noticed over time is that there are various loss functions coded into Gym that you can use. Dice loss is the general recommendation to start with: it's much more suitable for problems where you don't have equal numbers of pixels of every class in your training dataset. The canonical way to train these models, adopted early on, was the categorical cross-entropy function; that's also coded in here. But there's a big difference between those two loss functions with respect to the learning rate. Categorical cross-entropy tends to like a larger learning rate, so you're gonna set something like 1e-2 or 1e-3, rather than 1e-4 or 1e-5 for Dice — Dice likes a lower number.
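For reference, here's one common formulation of Dice loss for one-hot labels. Because the overlap is scored per class and then averaged, classes with few pixels still carry weight, which is why it suits imbalanced datasets. This is a generic sketch; Gym's exact implementation may differ in detail:

```python
import tensorflow as tf

def dice_loss(y_true, y_pred, smooth=1e-6):
    # y_true, y_pred: one-hot tensors, shape (batch, height, width, nclasses)
    intersection = tf.reduce_sum(y_true * y_pred, axis=(1, 2))
    totals = tf.reduce_sum(y_true, axis=(1, 2)) + tf.reduce_sum(y_pred, axis=(1, 2))
    dice = (2.0 * intersection + smooth) / (totals + smooth)  # per class, per image
    return 1.0 - tf.reduce_mean(dice)  # 0 means perfect overlap
```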
Okay, next slide. This is the part where you're training the model. Some of you may already be training models, which is awesome; if you are, then hopefully you see something like this on your screen. If you're not, you'll see it on my screen in a minute. We've got a really simple GUI-based thing where you're first prompted: okay, this is the dataset that I want to fit to. And then the next and last question you'll get is: this is the config file that I want to use.

If you're going through a period of experimentation with models — which you probably will be, because it's very rare to get it perfect the first time — it's very much recommended to take an experimental approach: just change one thing at a time. If you have the time to step through, say, different batch sizes, create a config file for every time you change something. Then it's your first form of documentation, right? You can hand it to anybody else and say, this was the sequence of things that I did. It's also crucial because when you come to actually use this model later on, it's gonna ask you for the model weights file, and it's then gonna check whether there's a config file that applies to those model weights. So those two things go hand in hand: you should never change the name of the config file, and you should never move it out of the directory that you've set. And I'll go over that again and show you what it looks like on my screen, because I went through this process yesterday — I trained a few models, and I'm gonna show you what the different outputs were. You'll see that I've got weights files and config files, and that's how it should be.

Okay, so without further ado, I guess. Yeah, you should stop sharing your screen. I'm gonna take a swig of coffee. Thank you to everyone who submitted data last week. As we said last week, we're gonna create this Zenodo release — actually, we're gonna do it by the end of the week — and we're gonna include everyone who's contributed data so far. So thank you for that. I think most of you have emailed us and consented to that; we've got a list of your names, and we'll be in touch.

We're gonna start with that dataset. Sorry, I'm skipping out my words — apologies. A lot of you will have labeled satellite imagery; we gave you a lot of satellite imagery. There are a couple of you who labeled other types of imagery — NAIP imagery. When we put this class together, we had a much larger class list, and we made folders and split them up equally between satellite and NAIP datasets; it just so happens that most of you ended up with satellite imagery. So that's the dataset that we're gonna fit to. There's gonna be a subsequent release: a couple of you contributed NAIP images, and you're gonna be contacted slightly later on — we're gonna make another dataset that includes your data. So you're not gonna miss out; it's just that those contributions aren't in this specific release yet, and then there'll be another release.
And that is the dataset that we're gonna fit to today. So if I share my first screen here — too many screens. So, for the people who labeled: this was a four-class dataset, right? Water, whitewater, sand, and everything else in an 'other' class. That's right, yeah.

So I'll just step through this. This is your top-level directory. When you see me on my screen over here, this is where I will be, at my command-line prompt, and I'm also going to have my conda environment activated — my conda environment is called gym. If you're not set up with this conda environment and you don't want to step through this with the test dataset, that's totally fine; you can just watch me do the same thing. So this is where you are, and it's very simple in how it's put together. These are the three scripts I was talking about before. This is just the install — the conda environment file and instructions. And then there are these utilities that do various things, like help you troubleshoot, and if there's time later on we'll step through a couple of them. One of the utilities is quite useful if you're just getting started, and that's called test_gpus. All that does is make sure that you have GPUs, that they work, and that your TensorFlow can see them. If you really run into trouble with GPUs, I suggest maybe that's your first port of call. And if that still doesn't work for you, then I guess we'll see you on GitHub.

Okay, this is the dataset we're talking about. It's actually gotten complicated here, because this is the data that we all contributed last week, and then there was one more late arrival last night — Kyle Wright, he emailed. So we've actually got two different datasets here that I've kept separate, because I went ahead yesterday and merged the first seven datasets. This is an ongoing project that you're contributing to, where we're trying to segment satellite imagery into those four classes that Evan mentioned. We're interested in where the water is; where the whitewater is, i.e., the surf zone, where the waves are breaking; where the sediment is; and then everything else is lumped into this 'other' class — no data, clouds, towns, vegetation, you name it. And that's just to really simplify the problem here. This is similar in scope to some of the data that we put into the Coast Train dataset — there are links to the Coast Train dataset from the paper, and we used a slightly different version of some of these data for that paper. One of the big differences between that paper and what I'm gonna show today is that we're just working with the RGB images today, because that's the most common scenario.

So, long story short, I have all of these other datasets that I've already made, or that other people have collaborated with Evan and me to make. There wasn't necessarily enough data from just the class — there were 113 images that we got yesterday — so I decided to merge that with all these other datasets. So what we have is a seven-part dataset with many, many more images.
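(As an aside, on that test_gpus utility from a moment ago: the heart of that kind of check, in plain TensorFlow terms, is just asking whether the framework can see a device. The actual utility presumably does more, but this is a quick sanity check you can run yourself:)

```python
import tensorflow as tf

print(tf.__version__)
# An empty list here means TensorFlow is CPU-only on this machine
print(tf.config.list_physical_devices("GPU"))
```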
Dan, I think this is a good time to just mention — I've made an okay model, enough to show that it would work, with 20 images. And I want to mention that for people who might want to doodle their own things for their own problems. But I think on the order of 100 to 1,000 is usually the number I would give for generating a model that has some generalizability, depending on the number of classes. There's no single heuristic we can provide — "for this number of classes and this size imagery, you'll need this many labeled images" — but that's sort of the number I shoot for. I don't know if you have any other comments to add to that.

No, absolutely. It really does depend on the problem. If you've got a fairly, quote-unquote, simpler problem, where you can get away with fewer labeled images, then Gym will totally make a model that works well for you. That might be as low as 20 images, and then you're off and running. It could be even lower than that, perhaps, but I'd say 20 is probably the absolute minimum. And then, depending on the scope of your problem, you're going to need more. This is a large-scope problem: we're trying to get a model that will find water, whitewater, sediment, and other for any image of any coastal zone in the world, right? So this is a fairly ambitious research project that I'm showing you.

What I plan to do here is show you the folder, the decisions that I make, and the outputs that I get from this larger problem. And then I'm going to go back and actually run a smaller problem, so we can see it complete quickly — we'll go through the dataset part and start the model training part while we have the Q&A. That's just to demonstrate that, yes, you can get good convergence on a model from a smaller dataset. You'll also note that the test dataset — which, if you're playing along, you should have downloaded and be looking at — is only about 70 images, and that's just a single location. Generally, if you're interested in a single environment, a single location, or a single specific kind of dataset, then yeah, absolutely: just doodle a few images, make a Gym model, and see how well it works. At that point you might be satisfied with what you see, or you could doodle a few more and go through that process again. So it really does depend on the nature and the scope of your problem. But as I said, I'm gonna go through this larger problem and show you what the outputs might be. I can show you some loss curves and compare them, and then I'm gonna step back and do the live demo part of this.

So this is the way that you typically would set it up: initially you just have these two folders, images and labels. You'll see there are loads of different sizes of images in here, from lots of different dates — I can just step through them. You'll see that they're small, because these are satellite images from Sentinel and Landsat, so the pixel sizes are between 10 and 15 meters — a very small image is actually quite a large area. If you're working with different-resolution data, then obviously you're gonna choose an appropriate size of image to work with.
Often — and ideally, I would say — you're gonna be working with situations where all of the images are the same size: you've chopped up a larger image, or you're running with images that are maybe slightly downsized from the original, or something like that. As we talked about last week with Doodler, there's definitely a sweet spot when it comes to image size. You have a set of images that are a certain size, and what Gym is gonna do is squish them into a new size, called the target size. So if we look at the config file — this is the config file from these jobs — the very first line I usually write out is the target size, because it's essentially the first decision you'll make. I'm running with a target size of 512 by 512 for this particular model. Because these pixels are 10 or 15 meters, that represents quite a large area; but if you're in a situation where you've got really tiny pixels — a drone image, or a numerical model output, or something like that — then that's on you: you have to determine what makes sense for this problem. What resolution can you get away with? If you've got a really large image and you need to make it smaller, what can you get away with for the purposes of labeling those pixels? You'll subsequently upscale the labels, obviously.

It will also depend crucially on your available hardware. These models are typically run on GPUs, and we're not Google — most of us only have one GPU, or maybe two at the most; we don't have the resources to buy lots of these expensive GPUs. So we're in a situation where we acknowledge that our imagery is perhaps over-resolved, and that we can totally get away with shrinking the images for the purposes of training the model. Remember that, within the architecture of the model, the feature extractor will reduce that image to a feature representation called the bottleneck, which is really, really tiny, and it then scales that back up, using a few computational tricks like skip connections and residual connections to add spatial resolution back into the solution. But you're almost always in a situation where you've got an image and it's being shrunk, and sometimes you're even changing the aspect ratio of the pixels too. In this situation I'm definitely changing the aspect ratio of the pixels, because all of my input images are different sizes; whether or not I'm shrinking an image actually depends on the dataset. It's a conscious decision that I've made, through experimentation, to use imagery of different sizes, because I want the model to generalize better to my problem. You'll notice that this image here, of Duck, from Sentinel-2, is much larger — that's 600 by 600 pixels, right? So that image will get shrunk down to 512. But this image here is much smaller — it's only 77 by 251 — so not only is that one gonna get blown up, it's also gonna change its aspect ratio as well.
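Conceptually, that resizing step amounts to something like this — an illustrative sketch rather than Gym's actual code, with a placeholder file name:

```python
import tensorflow as tf

TARGET_SIZE = (512, 512)

img = tf.io.decode_jpeg(tf.io.read_file("example.jpg"), channels=3)
# Stretches or shrinks to TARGET_SIZE; if the input isn't square
# (e.g. 77 x 251), the pixel aspect ratio changes in the process
img = tf.image.resize(img, TARGET_SIZE)
```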
Dan, Leah is asking in the chat — and I just want to make sure everybody's aware of this — how do you figure out the target size if the images are of varying size? And as I read this question, which is directly related to this topic, I think it's important to mention that if you're looking at a specific landform that changes in size based on the scale of the imagery you're using, hopefully you're developing a model that will be able to adjust and adapt to that sort of situation. For example, if you're using aerial images from very close flights and very far away flights, the size of the dunes in a coastal setting, or the size of the buildings or the fields or cars or roads, is gonna change; but if you have enough labeled imagery, that problem is soluble for the model, and it should be able to accurately detect the specific features of interest that you've labeled at all of those scales. So that's the first thing I want to mention. The second is that, just as Dan is saying that this target size is 512 by 512: Gym really only works with certain target sizes. Powers of two work really, really well, and if you need rectangles you can do something like 768 by 1024, but really, 256, 512, 1024 — these are the magic numbers for Gym, because of the cardinality of the model as you're going through the encoder and decoder branches.

Yeah, absolutely. We're laboring this point a little because it's a really, really crucial point, and that's a great question, Leah, thank you. It really is dependent on the nature of your imagery and your class set. We tend to use broad classes here, because we know that if we specify broad classes, those features are not going to disappear if we change the image resolution or downscale. It's really just that junction between how much hardware you have, or is available to you, and how much time you have to train a model — obviously, if you're using larger imagery, it's gonna take longer, because your computer's working harder; it's got more data to digest. So, at least initially, I would say choose broad classes that solve your scientific problem, and choose image sizes that fit within that. If you're working with larger datasets, like orthos or model outputs, and it means chopping your images up into smaller and smaller pieces so the model can actually see those classes, then that's what you should do. But if you can get away with it — like in this situation, where we're using super broad classes, and it's really quite obvious even at thumbnail size that that's water and that's sand — then you can do what I'm doing here. Later on, though, I'm gonna show you a completely different problem, from a dataset called FloodNet, where it's very different: there are many, many more classes, some of those are quite specific, the features those classes pick out are fairly small — like cars and things like that in big scenes — and the images are much larger.
So hopefully by the end of this you'll have a more complete sense of what we're talking about. But for now: if your target size is necessarily large and you don't have a very large GPU, then you've got two options. Try to get more compute power — move to the cloud or something like that, so you can access a larger GPU, larger in the sense of more GDDR6 memory — or chop the images up into smaller pieces and hope that the model does well at that scale of image and that you can piece it all together afterwards. These are all things that we're grappling with every time we start a new project. I've had something like 17 different projects over the past year that have used Gym to segment images, and that's 17 different sets of decisions and trials and experiments: okay, what target size do I need? What batch size can I get away with? And how good are the model outputs for the specific problem that I have? So if you're diving into this world, be prepared to do this experimentation a little bit in order to get what you need. That said, you can also run this on 20 images at a certain scale and you might be satisfied with the outputs — it really does depend on your problem.

Okay, so these are the images. I've talked about the target size; for the reasons I explained above, 512 by 512 is basically somewhere in the middle of the smallest images and the largest images I have. Most of the images I have are square, so for most of them the aspect ratio of the pixels is not gonna change. So I'm running with 512 by 512, but as Evan said, if you have typical-format imagery, you could use something like 768 by 1024, and you can then double these numbers — you could do 1536 and 2048. Powers of two work pretty well for the cardinality of the model. There are other really specific sizes that also work. Have we ever tried to find a formula? Yeah, I think I tried — I tried and failed to find a formula that predicts which sizes work within the cardinality of the model. Hopefully someone's done that somewhere and we can find it at some point. Usually, though, as I said, these numbers tend to work pretty well: 512 by 512 is a typical one, 768 by 1024, et cetera.

Here's the number of classes that I specified; here's the model that I want to run. There are currently quite a few different models built in, but the go-to one, the one we recommend trying first, is the Res-UNet, the residual U-Net. That differs from the vanilla U-Net talked about in that Ronneberger paper because it has these things called residual connections, which basically take the features coming into a convolutional block, let them bypass it, and add them back to that block's output — so the model gets to see what a specific convolutional layer computed as well as what went into it. It tends to make the models deeper, more robust, and more generalizable, and we've demonstrated that on a number of different datasets. So that's where you should start. There's a weird one in there called the Satellite UNet, which doesn't actually have any publication associated with it; it's just a completely flat U-Net that doesn't change the number of filters in every convolutional block.
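Here's the residual-connection idea in miniature — a generic Keras sketch of one residual block, not Gym's exact implementation:

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    # 1x1 convolution so the bypassed features match the channel count
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    # the residual connection: the block's input is added to its output
    y = layers.Add()([shortcut, y])
    return layers.Activation("relu")(y)
```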
The kernel size is quite important — that's the actual size of the convolution kernel being used to extract features. As you pass the image in, it's doing convolutions over the whole thing, and the size of that kernel dictates the size of what's called the receptive field, which is basically the window being passed over the image. One of the lessons from a deep-learning-101 class would be that one of the reasons convolutional neural networks are so good at spatial problems like images is that they share weights spatially, and there's definitely a sweet spot in the size of the kernel you use and how those weights are shared across the image. It's related to the concept of stationarity: if your image displays a high degree of stationarity, then convolutions are going to work really well, because they exploit the fact that there are similar features in different parts of the image.

That size is a pretty important lever. As I said, you generally want to use as large a number as you can fit on your GPU. I've got a single RTX 2080 Ti in my computer here. If you're a Linux user, you can do this — I think also in PowerShell on Windows, but I might be wrong about that — I've got this command running in the background all the time: the Linux command watch, which keeps an open pipe on a particular command, and the command is nvidia-smi, which shows me what GPU I have. It's also useful for troubleshooting, because as you train models, you'll notice the amount of memory the model is consuming updating in real time; it shows you temperature and things like that, and what processes are using your GPU. For example, I've got my browser open, and that's consuming a little bit of memory. If you're doing a model run and you're right at the edge of your memory consumption, you want to make sure you're closing everything you can, to give your model the best chance it has of utilizing the GPU. Oh, I'm doing a lot of talking here — I guess there's a lot to talk about. That's quite an important thing to know, though, if you're doing a bit of troubleshooting, and you're gonna be doing this a lot.

Yeah, just to tell you, it's 12:50 right now East Coast time — the 50-minute mark. But I just want to mention that all of these config settings have a backstory behind them. So if there's any one of them that you specifically want to know about, let us know, but all of them have this long history of literature behind them.
Yeah, and also a lot of trial and error in terms of what we expose and what we've decided not to expose. When we talk about the opinionated workflow, what we're really saying is that we've taken some of the decisions away from you. There are many more decisions that you could make; these are the ones that are gonna be most crucial to the outcome that you see.

So batch size is important — use a high batch size if you can get away with it. The batch size and the target size are obviously gonna combine: you're gonna have tensors that are 512 by 512 by the number of bands, so three here, and then you're gonna have labels that are what's called one-hot encoded. It takes the integer 1-D label image that you made, makes a stack of N rasters out of it, and passes that along with the image tensors. So you end up giving your GPU quite a lot of pixels to work with. Okay, and that's all I want to say about that. I've got four classes specified there; if you have many more classes, you're gonna consume much more memory — if you use eight classes, you'll double the size of the label tensors that you're passing to the model while it's training.

Here's the loss function. Mode is a strange one — just leave it unless you have a specific reason not to; it just means that you're gonna use both augmented and non-augmented imagery. One of the things the make-datasets script does is create two sets of files. One is just the files that you gave it, essentially pushed into the NPZ-format archive so they can be made into a TensorFlow dataset for more efficient throughput on your GPU. It's also gonna make an augmented set, and that augmentation is set by a whole host of parameters down here, which you should just leave unless you have a specific reason not to: the amount of rotation, the amount of zooming in and out, the shifts to the left or to the top, horizontal flipping, vertical flipping, et cetera. These are the basic augmentation parameters. There are many more kinds of augmentation, but these are the ones it does, and this is deliberately a limited scope of experimentation, because augmentation is really there as a regularization technique — to make sure your model generalizes better — and it's very important for the most part, but you could go to town on every type of augmentation. If you wanted to play with the augmentation further, you'd be in the realm of modifying the script to do so, or working with us and we could get that in there.

We've talked about the learning rate scheduler — I'm using quite a lot of ramp-up epochs here. I'm gonna have it ramp up over 40 epochs and sustain for five, so it's gonna really slowly get up to its top and then come back down.

And I'm using a validation split — so, this is important. Make-datasets will take all of your data, and it's set up to be reproducibly random: it sets a certain seed for the numpy and the TensorFlow operations, so you can run the models again and again and again and it will always use the same training and validation files. That's important because, if you're doing experimentation, you want to know that the change you specified in the config was the thing that changed the outcome — not that the model saw a slightly different set of files for training and validation. That matters less and less the more data you have; it's crucially important if you're working with a small dataset, which I imagine a lot of you will be, at least initially.
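That "reproducibly random" behavior boils down to fixing the random seeds before anything draws from them — in minimal form (the seed value here is arbitrary; Gym sets its own):

```python
import numpy as np
import tensorflow as tf

SEED = 42  # arbitrary; what matters is that it's fixed between runs
np.random.seed(SEED)      # numpy operations, e.g. shuffling the file list
tf.random.set_seed(SEED)  # TensorFlow operations, e.g. weight initialization
```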
So the validation split is really the only lever you have over how the data are divided between the training and validation samples the model sees. And to be clear about the roles: the training samples are the things the model actually sees during training — they're what the backpropagation uses to set the model weights. The validation samples are held out: when the model finishes an epoch, it looks at the discrepancy between the validation labels and the labels predicted for those images by the current state of the model, and that discrepancy tells you how well the model is generalizing — it's also what the callbacks watch to decide things like when to stop and when to adjust the learning rate. But the model never trains on those images; they don't set the weights. So both sets are important in defining the scope and the training of the model.

My recommendation is always to use a large validation split, for the purposes of generalization. You're usually in a situation where your training and validation images are quite similar. Ideally, you want a completely independent test set that is maybe larger or different in scope — a different place, or a different time, or something like that — if you really want to know how well your model is gonna generalize in production. But usually your training and validation sets are somewhat similar, because they're the things that you labeled, right? You can only label so much. Therefore, I think you want to keep your validation split as high as you can get away with while still seeing good convergence in your model. I typically use a split of 0.6 — or in this case I'm using 0.7; 0.5 would be equal amounts. You see a lot of papers that use validation splits of less than 0.5, and I always think that's quite dubious practice, but it does depend on the intended purpose: if you're just trying to segment a single dataset, that's fine, but if you're trying to create a model that you then want to give to someone else or apply more widely, then it's not necessarily good practice.

All right, that's all I want to say for now on the config file — there's quite a lot here; it's a lot to take in, a firehose of information. This is where you would set your GPU. If you're using a CPU — if you don't have an NVIDIA GPU — then you'd set that to -1; -1 just means "I don't have a GPU, use the CPU instead." If you have multiple GPUs, you can set them using commas. I've got one machine that has three GPUs, and that's how I typically run it: it will distribute my training across three different GPUs, which is quite cool — it's one of the things TensorFlow Keras gives you. But on this computer we've just got one.

All right. So I took my images and my labels, and the first thing I did, actually, before I went down this path, was use the Doodler utilities. Going back to last week: I've got a window here that has the dash_doodler conda environment activated, and I'm inside the Doodler utils. I used the function here that generates overlays from images and labels. It just runs through — asks you where the images are, asks you where the labels are — and then it generates a folder called overlays, showing you each image, so you can scroll through them and see: okay, yeah, that's worked pretty well, I'm happy with that, everything's lining up nicely, and I'm good to go. So that's another thing I would recommend doing: use that overlay script. Then I'm in a position where I've got what I need to run make-datasets — I've got my config file; I've got different config files in my Gym project.
I've made a new project here called gym project, and I created a folder called config, and I put my config files in it. I created a folder called npzprojim, or whatever you want to call it — these are the NPZ files that I'm gonna use for my Gym training. That's something you need to make: you just need to make an empty folder, because you're gonna specify it when you make your dataset. And the third one I made was this folder I called model out. It doesn't need to be called that, but it's an empty folder that is just gonna be where the outputs of the validation step — the evaluation at the end — are gonna go. And I will go through each of these things as we go, just so you're oriented a bit.

So I went through the make-datasets process. It first asked me, what folder do you want to put these in? And I said, this one, npzprojim. Then it asked me, what config file do you want to use? And I specified it. Then it said, where are your labels? And I said, here. Then it said, where are your images? And I said, here. And the last question it asked was, are there any more images? What that's doing is setting you up for the N-D problem where, as I said before, maybe you have other coincident bands. Remember that we're set up to use JPEGs, and JPEGs don't store more than three bands, so — it's just a little quirk of the program at the moment — you have to store your extra bands as separate files, and then you specify which folders contain those files. For example, if I had my near-infrared band, I would have the same number of near-infrared images in here, and I would just point it at them; and if I had other bands, like band four — oh, sorry, band five — then I would put them in there too. But I don't have them in this specific example. And that comes from a quirk of TensorFlow: it takes JPEGs, and TIFF support is still experimental, so it's always easier to work with JPEGs rather than TIFFs. Yeah — and that's just because, I think, TensorFlow and Keras were originally set up to work with pictures of dogs and cats — computer vision problems, which are typically just photographs of visible-band stuff — and not necessarily for geoscientists like us. But we're working on it; I think we've got plans to do something better in the spring, and you're always welcome to join us in that effort.

Okay, so then it stepped through and — you'll see it as we step through it in a minute, but I just want to show you the outputs in case something goes wrong — it creates a whole folder of NPZ files. As I said last week, you can actually open these files; you can't just double-click on them, but on Windows you can open them with 7-Zip or another archive viewer, anything that can open a zip folder. You'll see it's a little bit different from the Doodler ones: it's got array zero and array one, which are your image and your label. But it's specific: the image has been smushed into that target size, and the labels have been one-hot encoded. So instead of a one-band label image, because I've got four classes, I've got a four-band sparse array — zeros and ones, and those zeros and ones just refer to where those classes are in the scene. So it's like an address.
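If you want to peek inside one of those files yourself, it's just a NumPy archive — something like this, where the file name is a placeholder and the default arr_0/arr_1 keys are assumed for "array zero and array one" (print data.files to check):

```python
import numpy as np

with np.load("example.npz") as data:
    print(data.files)  # list the archive's keys
    img, label = data["arr_0"], data["arr_1"]

print(img.shape)    # e.g. (512, 512, 3): the image, smushed to the target size
print(label.shape)  # e.g. (512, 512, 4): one band of zeros and ones per class
```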
It also keeps track of the files, so you can always go in here and look: okay, that's that file — that's actually the full path of the image that you gave it. So you can always go back and see whether there are problems with the files, and you can always see exactly what files went into the creation of your model. And the number of bands is something it needs for internal reasons, but again, that's just propagated through. It does all of that for an augmented set, 'aug', and it also makes the non-augmented data, 'noaug' — it created augmented samples and it created non-augmented samples. And that's just because, a lot of the time, if things go wrong, it's at that make-dataset stage: you haven't got your file names exactly right, or they're not JPEGs, or, if you're using N-band imagery, they're not exactly coincident — any number of things could go wrong. These outputs are just ways to help you troubleshoot and verify that the model inputs are correct. I strongly recommend looking in this folder every single time you make a dataset. Every single time.

Okay, so it made the dataset, great. But what it did first, actually, was resize the images. I just want to note that the first thing it does, if your target size is different from the actual size of the imagery — which is often the case — is resize your imagery. It will make two new folders, called resized images and resized labels. I made two different sets because I wanted to experiment a bit: I made 512 by 512, and I also made a 768. So after it did that, I went in and renamed them, so I could keep track myself of what it was doing. And you can look at these — you can see how it's kind of strange, right? You've got these smaller images that are stretched wide, and you've got these larger images, for example here, which look about the same. And that's okay — you've determined through experimentation that that's okay. Really, the proof is in the pudding: whether the model outputs are good. You'll see in a minute.

I always, always, always — just for accountability purposes — keep that classes.txt file with my images as well. The samples, the NPZ files, encode the labels as integers, right? They don't know what zero means, or one, or two, or three. So it's always good practice to keep that classes.txt file, or make it if you need to.

Then it made my model. Yesterday I took this dataset and made three different models, because I had three different config files. I want to spend a second here explaining what I did, and then we'll do the proper demo and see it run. As I said before, you just want to change one thing at a time, and the only thing I did differently between the first model run and the second model run was change the batch size. I started with a batch size of 16; I looked at the model outputs, I looked over here at my nvidia-smi window, and I noticed that it wasn't consuming all of my GPU memory. So I thought: okay, great, I can make that batch size higher and I think my model is gonna do better. And it did. So the difference between these two is just that single line, the batch size.
Again, that's just for accounting purposes, and those weights are then used afterwards. And then the second thing I did, the difference between version two and version three, was again just one thing: I changed the maximum learning rate, from 1e-3 to 1e-4. I decided to reduce that maximum learning rate a little because I wanted the model to converge more slowly. Reducing the learning rate will always make your convergence a little slower, for a well-posed problem anyway, because it nudges the weights a little less on each back-propagation step, right? By doing that, I've also limited the range of learning rates it uses. And that's just something I've learned through experience that generally tends to work quite well. So I ran through this process: I ran make_nd_dataset once, created that one set of npz files, and then ran train_model three times. First with the version one config, second with the version two config, and third with the version three config. And I want to show you the different loss curves. These are something you're going to want to gain a lot of familiarity with; you definitely want to look at them every time you train a model. It's quite simple: on one side you've got the Jaccard index, the mean IoU score; on the other you've got your loss. The value of that loss is entirely dependent on the loss type you chose. We're using Dice here. If you had categorical cross-entropy, those values would typically start higher, like two or three or even four; similarly with hinge loss. So you'll get slightly different values, but all of these loss functions will ideally converge to zero, right? How close they get to zero in the end is essentially what you're interested in. The other thing you're interested in is how much difference there is between the blue line and the black line. Ideally you want the two lines to be on top of one another: if you have a really good model, those two lines will be on top of one another or very close. How similar is really part of the art of this: if there's a 0.05 difference between them, that's usually okay and pretty good; if there's a 0.1, 0.2, 0.3 difference, that's generally not so good. But the exact threshold for what's good and bad here depends, and you'll really only get a sense of what's a good model by looking at the outputs. But this is definitely where you should start. This one is okay; it's not particularly good. We can do better than this, or at least we think and hope we can, right? As for the validation accuracy here: in the literature, they say anything over 0.5 is generally quite good for a multi-class problem, because the mean IoU score is quite strict, and it penalizes bad pixels quite heavily because it's looking at the contribution of each individual pixel. So if you have a very small island of incorrectly labeled pixels, that will reduce your score quite heavily.
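To see that strictness concretely, here is a small numpy illustration (my own sketch, not Gym's implementation) of how one tiny missed class drags the mean down:

    import numpy as np

    def mean_iou(y_true, y_pred, n_classes):
        """Per-class intersection-over-union, averaged over classes."""
        ious = []
        for k in range(n_classes):
            t, p = (y_true == k), (y_pred == k)
            inter = np.logical_and(t, p).sum()
            union = np.logical_or(t, p).sum()
            if union > 0:
                ious.append(inter / union)
        return np.mean(ious)

    # A small island of wrong pixels in a rare class hurts badly:
    truth = np.zeros((100, 100), dtype=int)
    truth[:2, :2] = 1                       # class 1 is only 4 pixels
    pred = np.zeros((100, 100), dtype=int)  # model misses those 4 pixels entirely
    print(mean_iou(truth, pred, n_classes=2))  # 0.4998: class 1's IoU of 0 halves the mean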
You've probably read and been told that neural networks are universal function approximators, so they will fit to absolutely anything; whether or not they generalize well is really the art of this. Actually getting that validation accuracy high and that validation loss low, that's the work. It's always the work. Your training accuracies are usually quite high, and anything over 0.7 is usually really good; 0.8 is really, really good for these multi-class problems. If you've got a binary problem, for example water or no water, then your expectation shifts up a bit: anything over 0.8 or 0.9 is generally considered good. The more classes you have, the more you'll generally see your mean IoU score come down. So if you've got 30 classes, a mean IoU of 0.6 is quite good, and you'll notice when you see the outputs that it really does depend a little on the number of classes you have. So it's not a perfect metric for sure, and there are better metrics; I'll talk about metrics in a second, because we spit out many more than just this. But this is your very quick overview of what went down in your model. There are a number of things you can glean from this curve. You'll see the validation is jumping up and down quite a bit. It's not too bad, but it's generally much less smooth, and I think that speaks to my low batch size: I've got a fairly low batch size for the size of imagery that I have. 16 is a middle-of-the-road batch size; you can go higher, and that's what I ended up doing. And you'll see that it bounced around quite a bit, because when the batches are small, the variability within a batch is large relative to the variability between batches. You'll see that reflected in both curves; they often mirror one another. If you see really big spikes in your loss curves, it probably means you've got bad data: a couple of bad images in there where the model is thinking, huh, I've not seen anything like that before, and that causes a spike. If you have that spike, definitely look at your overlays, definitely look at those augmented and non-augmented samples, and weed out the bad data points, because it happens. We're not all perfect labelers all the time, and we make basic errors when we're wrangling our datasets together, converting between image formats and renaming images and all that kind of stuff. That's where errors come in. All right, so that was my first model run. This was my second. The difference between them was a slightly larger batch size. You can see the model converges more slowly, and there's slightly less variability in terms of big jumps from one epoch to the next, by virtue of that batch size. And you can see the mean IoU score is higher for both; note these aren't scaled the same. Look at this 0.6 here, for example: you can go over here and see, okay, we cleared that this time, and here we've cleared 0.8 on our training. So that's a better model; it's better for all of those reasons. And then the last thing I did, when I looked at that, I thought, okay, there were a number of things I could have changed here. I was tempted to change the ramp-up.
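The ramp-up being referred to is part of the learning-rate schedule: ramp up to the maximum over some epochs, hold (sustain), then decay. A sketch of that shape; the function and default values are illustrative, not necessarily Gym's exact code:

    def lr_schedule(epoch, start_lr=1e-7, max_lr=1e-4, min_lr=1e-7,
                    rampup_epochs=40, sustain_epochs=5, decay=0.9):
        """Ramp up linearly, sustain at max_lr, then decay exponentially."""
        if epoch < rampup_epochs:
            return (max_lr - start_lr) / rampup_epochs * epoch + start_lr
        if epoch < rampup_epochs + sustain_epochs:
            return max_lr
        return (max_lr - min_lr) * decay ** (epoch - rampup_epochs - sustain_epochs) + min_lr

    # Handed to Keras as a callback:
    # tf.keras.callbacks.LearningRateScheduler(lr_schedule, verbose=True)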
You know, I'd specified quite a large ramp-up, 40 epochs if you remember, with a sustain of five. If I were to do this again I'd probably lower those, but there were a number of different things I could have changed, and I just wanted to see the effect of doing one thing, and that was changing the maximum learning rate. And you can see that changing the maximum learning rate didn't do much at all, right? So here I'm in the situation where I'm like, okay, back to the drawing board. It actually made it worse. This one ended up being the model that worked best, because the stats are the best, and by changing that maximum learning rate I actually made the model worse. So I don't want to go in that direction anymore; I'd go in the opposite direction maybe, or change something else. I wanted to give you a really honest look at the process I go through to try to arrive at the best model, because honestly, this is similar to what you will go through too. But what you change, and when, and why, will differ depending on your reading of these curves and what you see in these outputs. Before we take a look at those, I want to make you aware of what it creates when it finally finishes model training: it takes the validation set, applies the model to each image in it, and you get a whole bunch of different metrics per image, which it stuffs into a CSV file. That's quite informative to look at, because you can make distributions of it and get a sense of, okay, do I have a really large spread of values here? And then it's up to you; these are different metrics you can use, and they quantify slightly different things about the nature of the problem. With the Matthews correlation coefficient, you'll generally get high scores. Frequency-weighted IoU can be more appropriate than mean IoU if you have a situation of very large class imbalance. I'm going to leave it to you, in the interest of time, to look up what these mean, but I want you to know they're available to you. And you can also look at each class: each one of the samples and each one of the classes gets precision, recall, and F1 scores, so there's quite a lot of information in here that you can use when you're reporting out your stats or when you're troubleshooting. And then it's often good to look at individual outputs. You'll see that it creates two different sets: it goes through and looks at how good the prediction was on the training samples. This is a nice one; this worked really well, with a high Dice score and a low Kullback-Leibler divergence. Those are talked about in the paper; I don't have time to go into them right now, but they're other metrics that are useful besides the mean IoU. This is a nice one, but note it's on the training dataset, so the expectation is that it's going to do better on training than on validation. So you should scroll down to a similar one; this is an augmented validation sample, and you can see it's done pretty well. This was a sample the model never saw during training, but it saw enough similar things to make a good call in this situation.
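The per-image metrics CSV mentioned above is easy to explore with pandas; a sketch, with a hypothetical filename and column name:

    import pandas as pd
    import matplotlib.pyplot as plt

    # Hypothetical filename: the per-image metrics CSV written after training
    df = pd.read_csv("gym_project/model_out/model_eval_metrics.csv")

    print(df.columns.tolist())  # e.g. mean IoU, Dice, Matthews corr., FW-IoU, per-class P/R/F1
    print(df.describe())        # spread of scores across the validation images

    # A wide or multi-modal distribution flags images the model handles poorly:
    df.hist(column="mean_iou", bins=20)  # column name is an assumption
    plt.show()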
So you can see that, in general, this model has done fairly well. Whether it's good enough for my purposes is really going to depend on what I do with it next. This is an intermediate step; this is as far as we can take you with Gym. But obviously you're all scientists, and you're trying to use this tool for testing hypotheses and generating new data layers and things like that. So in the end, whether this output, which looks quite good, is actually good enough will depend on your intended purpose. That's something we can't help you with, but it's worth mentioning. I'm fairly happy with this model, so I went ahead and decided it was the best one. But I've got three models, and the next and final part of the process was using that seg_images_in_folder script. If you remember, this is going to ask you first, where are the images you want to segment? So you tell it where those images are. I made a folder inside my Gym project and put a bunch of images in there that I was interested in segmenting; these are images that were never labeled. Obviously you're going to have way more unlabeled data than labeled data. So you point it at that, and then you tell it where the weights are. The way you've set this up, it creates a weights file for every model run you did. Inside that weights directory are a few different things; for each model run you'll actually get five files. The first is an npz file that contains all of the metrics: all of the things you see inside that CSV file and plotted in those loss curves. And you can do what you like with that; that's the loss curves, the learning-rate curve it used, everything. It's entirely reproducible; that's the entire point. We've tried to make it easy for you, as scientists, to report all of this out, to hand it to someone else for scrutiny, et cetera. So all of that information is inside the npz file. We haven't written utilities for it, because your intended use case is going to be completely different every time. But these are quite useful: these are the actual validation files it used. And again, remember each one of those npz files has the file name inside it, so you can easily reconstruct exactly which images went into the training and the validation. You can see that there are a lot of augmented files in there, and then similarly the training files. And then finally, we've got these two different sets of model weights. You can give the seg_images_in_folder script either set. But the full model weights are the ones you can give to someone who has a different computer architecture from you: they store not just the weights but also the architecture of the model. So that's something you can hand to someone else, and that's why there are two different file sizes. In general, I would recommend running with the full model weights.
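To illustrate the difference between the two: a sketch of loading each kind, with hypothetical paths and a stand-in Dice loss for the custom-objects mapping (the registered name in Gym may differ):

    import tensorflow as tf

    def dice_loss(y_true, y_pred, smooth=1e-7):
        # Stand-in for the custom Dice loss Gym trains with: 1 - Dice coefficient
        intersection = tf.reduce_sum(y_true * y_pred)
        denom = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred)
        return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

    # Full-model file: architecture + weights together, so anyone can load it,
    # but the custom loss must be mapped in (registered name is an assumption):
    model = tf.keras.models.load_model(
        "gym_project/weights/my_model_fullmodel.h5",   # hypothetical path
        custom_objects={"dice_coef_loss": dice_loss},
    )

    # Weights-only file: you must rebuild the identical architecture first,
    # then pour the weights into it:
    # model = build_unet(config)                       # hypothetical helper
    # model.load_weights("gym_project/weights/my_model.h5")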
Think of the weights-only file as the internal thing it was using while training, whereas the full model file contains more information that you can load up in different contexts outside of Gym. There is a third way as well that, oh, maybe I won't talk about, because we don't actually include it in Gym yet. There is a third way to create these models, and it would be a simple matter of taking the model and the architecture and saving them out to a different format that is even more portable between devices. That is in Gym, in the utilities. Oh, it is? Okay, good. Yeah, it's called gen_saved_model. Oh, that's right. And I'll just say that I don't use the H5 files at all; I use that gen_saved_model output, and that's the most portable option. The reason it exists is that the Dice loss is a custom loss function we wrote, and you can't just pass that along: if you make a model with Dice loss and hand it to somebody, they can't reconstitute it; they'll get an error saying the loss function needs to be defined. This saved-model version replaces it with a built-in loss function, which makes it ultra portable. Sweet. All right, so that's in the utilities. We don't have time to talk about all of these utilities, but they are documented, or at least in stages of being documented, on the wiki; I think we've done a fairly good job of documenting them there. Okay, so I then took the weights file, pointed the script at it, it ran through each image, and it creates a folder called, simply, out. It's up to you to rename it to whatever you think is appropriate. Even though the best model was V2, I wanted to look at the V3 model here. For each of the input images it creates three different files. One is the overlay file, which is easy to look at, right? You've got the image on the left and the output as a semi-transparent overlay on the right. It creates the color label too. That's not necessarily something you always need, but it could be useful for subsequent processes, like if you're going to use it for mapping and things like that, and we can talk about that right at the end. And then it creates an npz file, and this is everything it did: it stores all of the different information that it used or extracted during the model prediction step. For example, it gives you the softmax scores, which are pseudo-probabilities of every class for every pixel; they're the thing we take to determine what class each pixel is. Typically, we would just use the argmax function: it takes the maximum softmax score in the stack and says, okay, that was the maximum, so I'm going to call it this class. It also encodes things that might be useful downstream, like the grayscale label, for example, that you may want to use. These are all numpy arrays, right? So you can read them straight into a Python script really easily. And there's a whole bunch of other metadata in there, like the actual config file that you used, which is important too: if you ever lose that config file or modify it, you can come back to this, because it's written out. So don't ever lose the config file; you always need it.
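Here is a sketch of reading one of those prediction npz files back in; the filename and key names are assumptions, so list pred.files first to see what's actually stored:

    import numpy as np

    pred = np.load("gym_project/out/example_image_res.npz")  # hypothetical filename
    print(pred.files)  # list what's actually stored before assuming key names

    scores = pred["softmax_scores"]     # assumed key: (H, W, n_classes) pseudo-probabilities
    label = np.argmax(scores, axis=-1)  # the class with the highest score wins each pixel
    print(label.shape, np.unique(label))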
That config file should always stay with your weights, as I said before. But the npz has some other things in there too, like the number of classes and number of bands. There are a couple of decisions you need to make when using that script; they're all there at the bottom, and there are three of them. The simplest to explain is write model metadata: if it's true, it creates that npz file; if you say false, it won't. That's just for situations where you might have limited storage or something like that, but generally you want to keep it true. There's an option, instead of using the argmax function, which just takes the biggest softmax score for every pixel, to use an adaptive threshold, but that's only for binary problems, right? Binary problems, two-class, yes/no problems. With argmax your threshold is effectively, is it below or above 0.5? But sometimes your distribution of values is skewed one way or the other, and it might be more appropriate to use an adaptive threshold, which actually determines the threshold for every single image. It might not be 0.5; it might be 0.4 or 0.6. That's what the Otsu method gets you, and that's why it's called the Otsu threshold. And then finally, there's a thing that's a little harder to explain, called test-time augmentation. This applies to both binary and multi-class problems like we have. What it's really doing is, instead of providing one output per image, it provides multiple outputs per image, by literally doing augmentation on the input image. You give it a sample image, it predicts on that, then it flips it horizontally and predicts, flips it vertically and predicts, does different transformations on the image, and then un-transforms the resulting labels. You end up with a stack of labels that you can average. That's all it's doing, but it can be quite powerful; it just depends. I typically set it to false unless I know I need it, and whether I need it is determined experimentally. Evan's going to talk a little at the end about Segmentation Zoo and what we're doing there to make some of these decisions a little easier to follow, a little easier for constructing a custom solution for your own implementation. I will say that once the model is trained, that's not the end of the work: you still need to decide how best to use that model on your data, and that's what we're talking about here, these simple decisions about how to use the model and how to best optimize the model outputs. And then finally, I wanted to say that the seg_images_in_folder script allows you to give it multiple sets of weights as well. When you run the script, the first thing it asks is, where are the images? The second is, where are the weights? And then it will ask, are there any more weights? Do you want to add more? What it's asking there is whether you want to run in ensemble mode: provide multiple models and get an ensemble model output. And I wanted to do that too, and that's what I did here.
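Both test-time augmentation and ensemble mode boil down to the same trick: average softmax stacks, then take a single argmax. A sketch of each (my own illustration of the idea, not Gym's exact code):

    import numpy as np

    def avg_softmax(stacks):
        """Average a list of (H, W, n_classes) softmax stacks, then one argmax."""
        return np.argmax(np.mean(stacks, axis=0), axis=-1)

    def tta_predict(model, image):
        """Test-time augmentation: predict on flipped copies, un-flip, average."""
        variants = [
            (image,            lambda y: y),  # identity
            (np.fliplr(image), np.fliplr),    # horizontal flip and its inverse
            (np.flipud(image), np.flipud),    # vertical flip and its inverse
        ]
        stacks = [undo(model.predict(img[np.newaxis])[0]) for img, undo in variants]
        return avg_softmax(stacks)

    def ensemble_predict(models, image):
        """Ensemble mode: one softmax stack per model, averaged the same way."""
        return avg_softmax([m.predict(image[np.newaxis])[0] for m in models])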
So, with the ensemble, I looked at the outputs, and they're very similar for this particular example, but there are definitely situations where you get a much better prediction that way. Your models generally have to have the same target size, but other than that, they can be constructed in different ways: different loss functions, different batch sizes, whatever. Each is a slightly different realization of the model, and you're basically treating this as an oversampling exercise: using these models again and again and averaging the outputs in the hope of getting a more stable outcome. Oh my God, it's 10:30 already. I've been talking for a long time, haven't I? I guess this was a large thing. Do we want to go through the live demo, or have I conveyed sufficient information here? I think the helpful thing is for people to actually see you run the command line. Okay. So that people understand what it looks like and can feel comfortable with it. Okay. We don't need to fully train anything, but I think it's really helpful to see the windows. Okay, so maybe I should do that, just on a smaller dataset then, right? Because I'm going to want a few images. Maybe I'll just do that. Okay. So I'm here in Segmentation Gym. The first thing I'm going to do is run make_nd_dataset. It's going to ask me where I want to put my output files, and I'll navigate to the directory. I'm just going to run with this dataset six, which is a smaller version of the data. Oh, sorry, no, I'm not going to do that. I'm going to go to my gym project; this is my npz_for_gym folder, and I'll just dump them in there for now. I can tidy up afterwards if I need to. So I say okay. Then I select my config file, which is usually in the next directory level up, so you don't have to navigate too far. I'm going to run with the same one I used before, because it has all of the same settings, like the target size that I want, et cetera. Then I navigate to my labels. As you see here, I've got a folder that's named after my project, which contains all of my labels, and then a subfolder that has all of the Gym stuff inside it. That's just my personal preference, to make sure I can keep track of how all these things relate to one another. So the first thing I point it at is the label files; label files first, because they're the more important in many respects. And then I navigate to my images. And here's where it asks, do you have more directories of images? As I said earlier, that's for the situation where you might have other grayscale bands; in this case, I don't. Your first clue as to whether this is going to work is that it says, okay, you've found the same number of images and labels. Internally, it's sorting those files, right? It's making sure the images and the labels are related to one another. You should always make sure, of course, that your image and your label are named the same thing, because that's how you know for sure it's actually going to find them. Though the full name can be slightly different, as long as they have the same root.
So you'll notice here that the first image name is the date, the sensor, the band, the ID of the person who labeled it, and then "label". It doesn't have to have that "label" at the end; that's just my own convention. As long as the label has the same root, it will find it. The sorting algorithm it uses is called natsort, so it uses natural sorting of the images, which is important because we're all used to different conventions for naming files, and natsort is the most general way to sort files numerically. So, you know, here it's issued a bunch of messages, and that's quite common: it will basically say that TensorFlow can be compiled in different ways, or there are instructions and compiler flags it's not using, or whatever. For the most part, that doesn't matter. If you really wanted to troubleshoot, it gives you the information you need, but it's just a warning; nothing you need to pay too close attention to. One thing I always like to see written on screen is that it's actually using the correct GPU that I specified. GPUs are just specified by number: if you have more than one, you'll have zero, one, and so on, but typically you're just using zero; the device's PCI ID number is zero. It's going through: it resized the files, made the non-augmented version of the files, and what it's doing right now is making the augmented version. It has this little wait bar, and the reason it says one of five is that inside the config file I asked for five copies of the data. What that means is I'm going to get five versions of every single image-and-label pair: it's going to flip them horizontally at random, apply a random augmentation based on these parameters here, and make five copies. You could have just two if you wanted, or 50 if you needed it; that's up to you. Typically, five is good. But again, it's one of the little levers you can pull. And can we look at watch nvidia-smi? Oh yeah, good call. And top, or htop if you have it. Sure. So here, this is nvidia-smi. You can see that I'm not using all of my memory here, but it is using the GPU: this Python script I'm running is using close to 10 gigabytes of memory. Top, yeah, I can do top. Do you have htop installed? Not sure if I do. I can install it, but it's going to give me the same rough information, right? So I can see here that, oh look, Zoom is using most of my CPU; Python is using the second most. And you can see that it's quite well optimized: it's using very little CPU memory here. Anything else you want to talk about here, Evan? No, I just think those are the two things I often use when I'm running a model, just to see that everything's working appropriately: htop to look at the cores, how many are being used and their utilization, and the GPU utilization with nvidia-smi. Yeah, I'd have to do a little installation to get htop. Oh, I typed my password wrong. So while that's installing, I'll go back to where I specified I wanted my model output. You've seen this already; it's just making the files. We can have a look at some samples. You'll see that they're very similar.
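Backing up to the image-and-label pairing: if you want to replicate that check yourself, natsort is on PyPI. A sketch (the globs are assumptions about your layout):

    from glob import glob
    from natsort import natsorted

    images = natsorted(glob("gym_project/resized_images/*.jpg"))
    labels = natsorted(glob("gym_project/resized_labels/*.jpg"))

    assert len(images) == len(labels), "different number of images and labels!"
    for im, lb in zip(images, labels):
        print(im.split("/")[-1], "<->", lb.split("/")[-1])  # eyeball that the roots match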
These samples are just from a subset of the data. Here we can now do htop, and this is what Evan was saying: we've got each core at the top. I've got 24 cores on here, and you can see that they're barely being used, but all of them are being used; it's highly parallelized. And that's what you want to look at. All right, but you may not be a Linux user, so that might not be available to you. I'm not a Mac guy; I'm sure most of this is available on Mac too, I just don't know. nvidia-smi is available on every platform as far as I'm aware. And then you can see it's done, so now it's just using a baseline amount of memory, which is just keeping my Firefox and my Zoom open. Okay, so now I've made the datasets; it's done everything it needs to do. The next thing I do is run train_model, and it's going to ask me for that directory of data files, which I navigate to; that was gym_project/npz_for_gym. Sorry, I'm going a little faster, as we've only got 20 minutes of class left. And then here it asks for that config file, which is here, and off it goes. So it prints out a couple of numbers, which are the number of test samples and validation samples, and then it creates and compiles the model. Again, sometimes you get these messages, but they're just warnings. That one there is just saying, I loaded cuDNN, right? Great, we don't need to know that. But this thing here, for example, is really specific chipset stuff, like if you have a particular Intel chipset. You could dig into it if you really wanted to, but it doesn't seem to impact performance. Here it's stepping through: we're on the first training epoch. It has 165 training steps, and it will have 115 validation steps, I think. Each step is just a batch. In each epoch it will see all of the data, but it breaks it up into what are called mini-batches: it'll just feed in 16 or 24, whatever I've specified in my config file, one mini-batch at a time. Each time, it passes forward, passes backward, and off it goes: back and forth, back and forth. Then it compiles all of that information and does the validation step, accumulating all of that information. And this is why TensorFlow is sophisticated and awesome: it takes all of the work out of training models like this efficiently, because it's making sure there's always enough imagery getting passed to my GPU. If we go over to nvidia-smi, you can see that it's using a good amount of my GPU again. It's making sure I'm not using too much or too little; it's leaving a little bit of room for the operating system so I can actually make this call. You might notice that my video slowed down a bit, because it's using my GPU and I've just got one on this machine. It's communicating with the operating system continuously to make sure nothing breaks too badly. Don't worry if it gets hot: I think around 80°C is when it starts to throttle the GPU. But you could cook eggs in your computer if you wanted to. Right, so if the fan is going really, really high, it's working quite hard. What's more important is to look at this number here.
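As an aside, the step counts it prints are just this arithmetic (the sample counts here are hypothetical; Gym prints the real ones when training starts):

    import math

    n_train, n_val, batch_size = 2640, 1840, 16

    print(math.ceil(n_train / batch_size))  # 165 training steps per epoch
    print(math.ceil(n_val / batch_size))    # 115 validation steps per epoch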
And you'll notice that during the validation step, if I go back here, it's using slightly less of the GPU. You know it's on the validation step because it's paused, right? It's stopped updating the training metrics, and it's just stepping through the validation samples. And now it's on epoch three. So you'll go back here and see this number climb back up to 100. What that means is that we've done our job correctly, well, mostly correctly, in that it's making sure the utilization of your GPU is maximized. The only reason it's dipping below 100, actually, is that I'm on the Zoom call; if I wasn't, it would be pegged at 100. It's just periodically communicating with the operating system to make sure these other process IDs have enough memory. I want to mention that when you see these batches being passed through, there's some great information you can use to debug. First of all, the loss: in this case, if you've set it to Dice loss, that's just one minus the Dice coefficient. So you can add the Dice coefficient and the loss together and make sure they sum to one. If they don't, something is a little messed up, usually something incorrect with your labels and images, and it's a good indication that you should go back and look at those aug and noaug samples. The Dice should also be above the mean IoU; that's another check. And you should actually have numbers, not NaNs. If a NaN appears for Dice, or mean IoU, or the loss, you know something has gone awry, usually with the data. There are more sophisticated things that can go wrong even if your data is verified to be correct, and that has to do with some of the opinionated choices we've made. Specifically, the model trains with mixed precision: it's not using 32-bit floating-point numbers for all the weights and biases in the model; it's using reduced, 16-bit precision. So sometimes underflow or overflow problems occur, and that results in a NaN loss. If that occurs, just stop and let us know. Dan, can you open up the actual printed-out model, the Keras plot_model figure? Nice thing. Yeah, does the plot_model figure still get rendered, not just the summary? Oh, I haven't installed it; this is a new machine. But I've got an example of that; I made sure I had one. Where did I put it? Oh, God, now you're trialing me here. Something like this. Yeah, this is an example: if you need to look at the architecture of your model, the train script will also export this, typically, if you have the pieces installed that the conda environment should install. This is a great example of what the model actually looks like. So you can look at the model at the same time that it's operating, or use it if you need a figure for a paper or something like that. This is one of the outputs of the train script.
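A few of the things just mentioned map onto standard Keras utilities; a sketch (the Dice metric name is an assumption):

    import tensorflow as tf

    # Stop training immediately if the loss ever becomes NaN (e.g. from a
    # mixed-precision underflow/overflow), rather than wasting epochs:
    callbacks = [tf.keras.callbacks.TerminateOnNaN()]

    # With a Dice loss, loss + dice should sum to ~1 each epoch; a quick check
    # on a Keras History object after training (metric name is an assumption):
    def check_dice(history, tol=1e-2):
        for loss, dice in zip(history.history["loss"], history.history["dice_coef"]):
            assert abs(loss + dice - 1.0) < tol, "loss + dice != 1; inspect your data"

    # The architecture figure is Keras's plot_model; it needs pydot + graphviz:
    # tf.keras.utils.plot_model(model, to_file="model.png", show_shapes=True)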
Yeah, and in our Gym paper we actually modified this output: we flipped it on its side and color-coded the different layers, but it's really just a modification of this. And when we talk about residual connections, there's one here, and one here: these little inner loops are the residual connections, and these bigger arrows are what are called skip connections. The difference between the vanilla U-Net and the residual U-Net is that the vanilla U-Net has these skip connections, which preserve spatial resolution between the encoder and the decoder, but the residual connections are absent from the vanilla one. You can see the add functions; this is what I was talking about earlier, where it branches off in two different ways and then combines that information cleverly. These are the sorts of time-wasting activities I tend to do while the model is running, because you want to flip back after a few epochs, go down the list and say: okay, the mean IoU is increasing, the Dice is increasing, the loss is decreasing, and the same for the validation loss, the validation IoU, and the validation Dice. These are just the indicators. So it's nice to stick around for one to ten epochs to make sure everything is moving in the right direction. Yeah. You know, you can stare at these quite a lot; there's so much information on this screen that you can spend quite a lot of time when you're first running the model, before you step away and go play guitar or make a cup of tea or answer emails or whatever you've got to do. You can see here, I'm looking at my validation Dice and comparing it to my training Dice, and I can see that, as expected, my training Dice is a little higher, but it's not horrible. My validation loss is not horrible. The Dice coefficient is going up, the loss is going down, and the losses are comparable. These are all things you want to make sure are working; obviously, you also want to make sure they're not NaNs and things like that. As to how long this will train for, I couldn't tell you, because I haven't trained on this specific dataset, but I can tell you now it's not going to take 100 epochs; it will probably be done after a modest amount of time. We've got 12 minutes left of our class. Do you want to step through that seg_images_in_folder script, or should we transition to talking about the rest of the Doodleverse? I mean, as we mentioned before, we've got this other thing called Segmentation Zoo, and that has these implementation scripts. Evan, do you think it's worth taking a bit of time to talk about that, or should we open the floor for questions? I think we should do questions. I linked the most basic notebook in Zoo in the chat. It shows you what it's like to open up a model and send one image through. I think it's interesting just to understand how to send one image through, what the results are, and the steps you need to take. And that's for if you want to do something outside of seg_images_in_folder. So that's linked. I'm happy to go over it, but I think it'd be more helpful to open it up for questions. Okay, sounds good.
So I've lost my questions, because when I go into sharing mode, every other Zoom window disappears. Yeah, you can stop sharing if you want. Should I stop sharing? Okay. There are no questions right now, but if anybody has any... The wrap-up slide was really just to say thanks, everybody, for participating. Please reach out on GitHub and let us know if there are any issues at all. And if you'd like to participate in the spring, we'll probably do some sort of hackathon or sprint to get some of the things we want done in the Doodleverse, so it'd be awesome to have more participants. So, anything specific? Yeah. Please, if you have questions at this moment: I know there's a lot of information that we've imparted, and a lot of it is duplicated on the wiki; most of the things Evan and I have said today are reproduced there. So we understand if you don't have questions right now, but you might later, once you've had a bit of time to digest this and actually run through it. I'd be interested to hear if anyone's actually been able to run the test: if anyone's got to the point where they've made their conda environment and run on either their own dataset or the test dataset, and whether you've encountered any problems. Now would be a good time, since we've still got 10 minutes of class left, for those or any other questions you might have. So, Stephen asks: what might be the issue if the ensemble model appears to put out an almost identical output to a single-model prediction? The short answer would be that your models are very similar and are therefore providing very similar results. There may be some details in there, though, depending on exactly how you ensembled your model outputs. The way it's set up in the script, each model provides a separate set of softmax scores; I think it's a very simple implementation that takes all of those softmax scores, averages them, and then does the argmax. So if you get a very similar output, that probably just means the average softmax score over all of your models was very similar to any individual model's, which speaks to those models being very similar themselves. But it will depend. Yeah, of course. Anyone else got a question? And you can unmute if you don't feel like typing. Okay, good seeing you, Julie. Sniffer? Yeah, thanks. This was good, guys. All right, catch you later. Yep. Can we briefly explain Sniffer? So Sniffer is not directly related to Gym. Sniffer is a program being developed by Sharon Fitzpatrick, who's on the call here. I'll allow you, Sharon, to explain what Sniffer is if you like, but if you don't want to, then I will. All right, I'll explain. It's called Sniffer because it sniffs out the bad images in your dataset; that's the original intention. We wanted a program to basically categorize imagery into good or bad images. For example, with this particular model that we ran today, if you're familiar with satellite imagery, you'll probably have noticed that quite a lot of those satellite images were quite good, right? They were crisp and clean; they didn't have clouds or artifacts in them, which are quite common in satellite imagery.
So we've used Sniffer, for example, to do that manually: it's a web interface that you load your images into, and then you click a button that says yes, no, good, bad, good, bad. It's recently been modified to have a kind of Likert scale, where you can grade how good or bad an image is, one to five or something like that. But basically, it's just a small utility we've developed for the purpose of classifying whole images. Yeah, no worries. So, when tuning the models, do you typically have a workflow for that? Yes; we've briefly touched on quite a few of the decisions you can make in a config file. I'm sure my workflow is a little different from Evan's, so I'll give mine and then let Evan speak to that. The first thing I do, obviously, is decide what target size to use, because that's dependent on the amount of memory I have on my GPU, essentially. That's the very first decision I make, and it goes hand in hand with the batch size: I'll play around with different batch sizes until I've got the biggest batch size for that target size that I can fit on my GPU. If I really know I'm going to need a much larger batch size, then I'd have to use a different computer with multiple GPUs or whatever, but for the most part, that's not the reality. So I'll train a model. The next thing I'll probably do is look at those outputs and decide what configuration setting I can change easily that's going to make the biggest difference, and those two things are either the loss function or the learning rate. The loss function is the easiest, because you just specify something else, like KLD or hinge. We've documented some of these loss functions, and you can look them up; they're just standard Keras loss functions. But that can make a huge amount of difference. Then the learning rate: that's where we talked about modifying that curve, using different starts and different ramps. Then, if you're still in the process of troubleshooting the model, the next thing I'd tend to go to is the kernel size. We tend to use a fairly large kernel size; I'd typically use a kernel size of seven or nine, which is a little larger than you typically see written up in machine learning papers that use U-Nets. The intuition I have for it, and it's talked about in the paper a little, is that we've got spatial data, so Tobler's law applies: there's spatial autocorrelation in the imagery. We're also thinking about this concept of image stationarity, where different parts of the scene are similar to one another. For example, if you have a picture of a dog against a background of grass, you've got a specific object in the image that you're trying to capture, so there's less stationarity in that image than in a spatial scene of water or landscapes, where you could choose a corner of the image and it might be similar to another corner. Then, if you're still not getting good model results, I would turn to regularization things like dropout.
One of the things we didn't talk about was dropout, and there are many things you can do with it: you can change the amount of dropout, how much it changes per layer, the type of dropout, and whether there's dropout on the encoder or the decoder. Those are all the model things I'd tend to try, but what I really do is the first couple of things: I'll look at the batch size, I'll look at the learning rate and the loss function, and then, if I'm not seeing improvement, I go back to the data. Really, it's the data, in many cases, that is going to make your models better. You can quite easily get to the point, and this is the beauty of Gym, I think, where you can do this experimentation so quickly that in a single day you can have loads of different models and really reassure yourself that there aren't many more tweaks to make to the model. So my advice is to go back, make better data, make more data, include more data, and troubleshoot the data if it needs troubleshooting. Over to you, Evan. Yeah, I don't do any of that in that order. The first thing I do is make the batch as large as possible; I try to break the GPU. So 16 or 24 is the batch. Once I've done that, I might change the kernel and make it larger, though seven works really well; I like prime numbers. I change the patience to be larger, 20 or 30; that's how long it will keep running the model, even at a low learning rate, to make sure there's a little bit of decay so you can squeeze more success out of the optimization. Then I also set the ramp-up a little higher, so it takes longer to warm up. And then I switch immediately to augmentation, and I try to over-augment; again, I'm trying to break the model, this time by over-augmenting. There's a sweet spot with augmentation: if you're under-augmenting, your model could still get better; if you're over-augmenting, your model is getting worse. So I look for that sweet spot. And then I don't do anything else, and I just label more data. For me, I'm always at the limit of low-data problems. Yeah, me too. Always, always. For me it's biggest batch, biggest augmentation, and then go back to the data. The secret here, and why I said that Zoo notebook is helpful, is that with it you can actually look at the probabilities for each class, the softmax scores, or look at the outputs and ask: what is misclassified here? What was done incorrectly? And then you want to find more images like those, not those specific images, but ones with that hallmark color and pattern in the RGB imagery, and doodle those. So you're in this virtuous cycle of seeing where the model is failing, trying to correct that failure, and then going back to making models. That sort of virtuous, active-learning-style loop is what you quickly want to get into. So the goal with low data is always to try to break the model and see where the breakages are occurring. Yeah, good advice for sure. And I'm always in a low-data environment too. I just know, by intuition and experience, that I always need more data.
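As a sketch of that "over-augment until it breaks" idea, using Keras's ImageDataGenerator: the parameter values here are deliberately aggressive and illustrative only; Gym exposes its own augmentation settings in the config file:

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Deliberately aggressive augmentation: if validation scores get worse with
    # settings like these, back them off until you find the sweet spot.
    aug = ImageDataGenerator(
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        zoom_range=0.3,
        horizontal_flip=True,
        vertical_flip=True,
    )

    # Important for segmentation: the image and label generators must share a
    # seed so each image and its label receive the identical transform.
    # img_gen = aug.flow_from_directory("images", seed=42, class_mode=None)
    # lbl_gen = aug.flow_from_directory("labels", seed=42, class_mode=None)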
And that's why, if I'm in a boring Zoom call or stuck in front of the TV or something, I've got Doodler on and I'm making more data, because that's what I know I need to do. All right, we're past the hour here and people are signing off; we don't want to cut into your time too much. This recording will be available and will be emailed soon, as Lynn just said, so you can always run through it again. I know there was a lot of information in this, but again, go to the wiki and digest the wiki. I strongly urge you to go through the test dataset before you attempt this on your own data. Good luck to you all, and thanks for joining us. Thanks, everyone. Thank you, Dan. Thank you, Evan.