Hi, everybody. Thank you for joining us this evening and for letting us into your homes. We are still doing this NCAR Explorer Series from our own houses this evening. My name is Lorena Medina Luna. I am an education and outreach specialist at NCAR and the lead organizer for the NCAR Explorer Series. Today we're very excited to offer this lecture, "Seeing the Atmosphere Through Machine Learning," with Dr. David John Gagne.

Today we'll be using Slido, which allows you to join in the conversation by typing in your questions and voting up questions. If you see a question you are also interested in, use that little thumbs up to upvote it. You can also take part in some polls, and we do have an active poll right now, a word cloud that will show once I introduce the speaker, right before he gives his lecture. So go ahead and scroll down your page to participate through Slido. Helping us today, Aliyah and Dan will be using the Slido interface to take your questions, and I'll be hosting and introducing our speaker.

NCAR, the National Center for Atmospheric Research, is a world-leading organization dedicated to the study of the atmosphere, the Earth system, and the sun. We're located in Boulder, Colorado, but today we're again bringing this lecture to you from our homes.

Today's speaker is Dr. David John Gagne. He is a machine learning scientist in the Computational and Information Systems Laboratory, or CISL, and the Research Applications Laboratory, or RAL, at NCAR. His research focuses on developing machine learning systems to improve the prediction and understanding of high-impact weather and to enhance weather and climate models. During his time at NCAR, he has collaborated with interdisciplinary teams to produce machine learning systems to study hail, tornadoes, hurricanes, and renewable energy. He has also developed short courses and hackathons to give atmospheric scientists hands-on experience with machine learning. Dr. Gagne received his PhD in meteorology from the University of Oklahoma in 2016 and completed the Advanced Study Program postdoctoral fellowship at NCAR in 2018. In addition to his duties at NCAR, he serves as chair of the American Meteorological Society's artificial intelligence committee. So there's a lot going on, and I'm really excited for us to hear more about his work today.

Dan, if we have some participants for the word cloud, can you put up the results? We're interested in knowing: in your everyday life, what are some things that use machine learning? Google is the biggest one; I search a lot. We have satellites, modeling, cell phones, iPhone, weather prediction, which is some of what we'll be talking about today, maybe recommended news. So how is all of this machine learning interacting in our lives? Cars are also on there, and deep learning. With that, I'd like to welcome Dr. David John Gagne for his talk today. We will be taking questions at the end of the lecture, so hold on to those questions, because he might address them as we go through tonight. Dr. Gagne.

Thank you, Lorena. Good evening, everyone. I'm going to go ahead and share my slides here. Does everything look good? Awesome. So again, good evening, everyone. My name is Dr. David John Gagne. Tonight I'll be talking about seeing the atmosphere through machine learning.
As was evidenced by the word cloud, machine learning is everywhere; it's now embedded in our everyday lives. We saw a lot of responses about people seeing machine learning in Google and in Siri on their phones and smart speakers. One thing I didn't see as much is that machine learning is actually now in your car. We're in the greater Boulder area, so a lot of people have Subarus, and a lot of the Subarus now have this EyeSight camera that monitors the road; it uses machine learning to spot the lines on the road and make sure you don't drive off a cliff, for instance. Machine learning also helps drones fly, so they don't veer off in some crazy direction; it keeps them stable and allows them to do those formations you see at, say, the Super Bowl halftime show. We even see machine learning at the grocery store. Walmart, in this example, is using machine learning vision to keep track of all the potato chip bags and other items in their stores, so that when something runs out, someone can come in and immediately refill it. Behind the scenes, machine learning plays a huge role in our modern economy. A lot of companies use machine learning to predict when certain goods are needed so they can order more of them, and whenever you place an order on Amazon, robots in the Amazon distribution center bring the shelves to the employees, who pick the items off, box them up, and send them on your way. These are just a few examples of the myriad uses of machine learning.

In tonight's talk, I first want to focus on how machine learning works. We tend to assume it's just this big black box, but under the hood there are a lot of relatively straightforward concepts underlying it. Since I am in weather, I want to talk about how machine learning can help us forecast extreme weather, with examples focusing on hailstorms and hurricanes. Toward the end, we'll talk about some of the challenges associated with machine learning, as well as what machine learning can teach us about the atmosphere.

So first, what is machine learning? To provide some context, I wanted to define a few terms, because it's easy to get confused among AI and machine learning and deep learning and robots and all these different pieces. First, artificial intelligence, which encompasses a large part of the field, is now a catchall term for methods for computer systems to perform human tasks, whether that be playing chess, answering questions from someone on your phone, recommending a movie, or predicting the weather. There are a lot of ways to accomplish artificial intelligence. One that was popular, especially back in the 1980s, was something called expert systems. The idea is that you ask an expert a bunch of questions about how they make their decisions, write that up into a computer program, and then just run the program. This works pretty well in very focused situations, but because experts don't entirely know how they do the things they do, we couldn't write down all the rules. So a lot of people started working more on machine learning, where a computer system learns to perform tasks by reviewing large amounts of data.
Essentially, you have a large amount of data with examples of something the machine learning model would see, along with what the outcome should be. You show it the input, the view of the scene, you see what it does, and if it doesn't do the right thing, you penalize it. You repeat this process many times, and eventually the machine learning model learns what to do. And when I say model, a model is another term for a focused computer program. We have machine learning models, and we have weather prediction models based on physics and equations; they're both computer programs, but they're different kinds of constructs.

Within machine learning, we also have deep learning, which has gotten a lot of hype in the past five years or so. Deep learning is a subset of machine learning that focuses on neural networks with multiple specialized layers, and those specialized layers focus on learning spatial and temporal information and other kinds of structure in data. Deep learning has led to a lot of the big advancements in things like computer vision, recognizing faces and cats and dogs, as well as machine translation and lots of other applications we'll talk about in this talk.

What do you need to build one of these machine learning systems? First, the most important step is having a well-defined problem. It's easy to say "I want to do machine learning for X," but if you don't define the problem well, you will not get a good solution. The first question to ask is: what is the goal? What are you actually trying to do beyond just building a machine learning model? Are you trying to predict tornadoes well in advance? Are you trying to get your package from Amazon to your house in a day? What data are available? How big of a data set already exists, how far back in time does it go, and what is its coverage? What are the constraints? Do you need a prediction now, or in a week? How big of a computer do you have to run on; is it a phone or a supercomputer? How is performance judged? What error metric are you being compared against? Do you need to do really well on one particular case, or are you looking at how you do on average? And what is the current approach? If it's an important problem, someone's already solving it somehow, and there are probably some issues with that approach, so can we overcome those with machine learning?

Next, you actually need a data set. Most of our machine learning data sets look like some kind of data table. We have metadata, which is data that's not used in the prediction directly but describes it: things like a date, a location, a time, a person's name. We also have inputs; in the weather field, these would be things like temperature, pressure, wind, and so on, or it could be a picture of a person's car. Then the output is what we're actually trying to predict. If we're trying to predict hail, we have a yes or no on hail; it could be something else, like what color the car is. Finally, we need a machine learning model. There's usually more than one machine learning model that might fit the problem, so we pick a few that seem suitable, test them out, and pick the best one.
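To make that concrete, here's a minimal sketch, with made-up values, of what one of these machine learning data tables might look like in Python, separating the metadata, the inputs, and the output:

```python
# A minimal sketch of the kind of data table described above, with made-up values.
import pandas as pd

data = pd.DataFrame({
    # Metadata: describes each example but isn't fed to the model
    "date":        ["2016-05-09", "2016-05-10", "2016-06-01"],
    "station":     ["KOUN", "KDEN", "KFWD"],
    # Inputs: the predictors the model learns from
    "temperature": [28.5, 22.1, 31.0],       # degrees C
    "pressure":    [965.0, 840.2, 1001.3],   # hPa
    "wind_speed":  [18.0, 9.5, 22.4],        # m/s
    # Output: what we're trying to predict
    "hail":        [1, 0, 1],                # 1 = hail observed, 0 = no hail
})

inputs = data[["temperature", "pressure", "wind_speed"]]
output = data["hail"]
print(data)
```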
Before we get to the present of machine learning research, I wanted to dive back into a bit of machine learning history. Even though a lot of the hype for machine learning has built up in the past decade, the underlying concepts date back decades. The first neural network predecessor was Rosenblatt's perceptron; here's a picture of the computer underlying Rosenblatt's perceptron back in the 1950s. In the 1960s, we had further developments in the initial push for machine learning research, with things like the first tic-tac-toe machine and the development of the nearest neighbor algorithm. In the 70s, we had the first AI winter, when machine learning funding was drastically cut and a lot of the excitement from the initial wave of machine learning research died down. The research continued during this period, though, and things like the first multi-layer neural networks, the forerunner of our modern deep learning, happened back in the 70s. In the 80s, we had another explosion of machine learning research. In particular, the first decision tree systems were developed, as well as the expert systems I described earlier, and convolutional neural networks, which we'll discuss in more detail later; those are the systems now underlying our computer vision, recognizing images, looking for cats and dogs, so to speak. They were first developed in the 80s, but they really didn't reach their promise until more recently. In the 90s, we had the development of something called a random forest, which we'll also describe in more detail later; it's an ensemble of decision trees. We also had things like computer-aided cancer detection, and we had Deep Blue, the computer program that famously beat grandmaster Garry Kasparov at chess. In the 2000s, machine learning research continued. We had the first paper that actually called it deep learning, in 2006. We also had a number of competitions set up to test the potential of AI. The most important of these was the ImageNet competition, which created a data set of millions of images to serve as a benchmark for all the different teams around the world working on machine learning for identifying things in images, so they could compare their methods against each other. And we had the development of things like GPUs and the iPhone, which were really important for the deep learning revolution. So by the 2000s, a lot of the building blocks for machine learning had already been developed, but they had to come together, and in the 2010s we saw a lot of the big advances. In the ImageNet competition in 2012, we had the first convolutional neural network model that performed about 10% better than all the other machine learning methods tried before then, and that really kicked off the interest in machine learning and a lot more corporate investment. Just a few years later, Facebook unveiled their first facial recognition system. A couple of years after that, we had AlphaGo, a deep learning AI system that played the game Go and defeated the top Go player in the world, Lee Sedol, shown in this picture right here, in 2016.
If you look at interest curves, things like Google search trends, the hype really started skyrocketing right around this time, with AlphaGo and some of the machine translation work Google was releasing at that point. Since then we've also had an explosion of digital assistants, autonomous vehicles, AI everywhere.

In addition to the broader history of machine learning, I also wanted to talk about some of my personal history. I first got into machine learning in 2007, through my first machine learning internship at the University of Oklahoma. After my freshman year, I had applied for a Research Experience for Undergraduates. I initially got rejected, but then, because I was the only person with computer programming experience, my application was selected by Dr. Amy McGovern, shown right here, for a summer internship doing machine learning for storm type classification. We were trying to predict whether a storm would be an isolated pulse storm or a linear storm, so I spent a summer doing clustering and decision trees on a bunch of storm data. Here's my computer setup at the time. This experience really got me hooked on machine learning, and I kept working at it, and Dr. McGovern kept me around for quite some time after that. I ended up becoming her master's student and then her PhD student, and I'm still collaborating with her even now that I'm at NCAR. It was happenstance, but it led to a very fruitful collaboration and changed the direction of everything; I wasn't really thinking about machine learning before, but that summer internship changed everything.

The other major catalyst of my machine learning interest was in 2008, when I attended my first AMS AI conference. At the time it was a fairly small group; we had about 32 presentations that year. Thanks to support from Dr. McGovern, I was able to keep returning year after year, and there I met a lot of other mentors in a really tight-knit group in the machine learning community, especially the people in this picture, like John Williams, Sue Ellen Haupt, Philippe Tissot, and Vladimir Krasnopolsky. A lot of these people I ended up working with in one form or another: John and Sue both brought me to NCAR for my first NCAR visit, and then Sue became one of my postdoc advisors. So the networking aspect was really crucial, along with seeing the broader array of machine learning research. For a long time, we were discussing among ourselves, "we think machine learning is really awesome, why isn't anyone listening to us?" But then around 2017, shortly after I got my PhD in 2016, suddenly people started listening, and people started showing up and submitting talks, and we quickly grew from about 30 talks to almost 200 this past year. The interest in machine learning in the atmospheric sciences shows no sign of abating. It's not just this conference; many other venues have sprung up. In 2011 we had the first Climate Informatics workshop, and a couple of years ago we had the first NOAA AI workshop, and now there are so many AI workshops I have been losing count.

My NCAR connection started in 2014 through 2015, when I spent a year at NCAR through the ASP graduate visitor program. Here's a picture of me, not working at NCAR, but at Rocky Mountain National Park.
I do in fact have fun in between doing all the machine learning. But that year really broadened my horizons beyond just machine learning for severe weather forecasting. During my time here, I worked on machine learning for solar energy forecasting. NCAR has a really strong track record in machine learning for a lot of different applications, and wind and solar energy were among the real big success stories of the 2010s. We worked with Xcel Energy to develop a machine learning forecast system that saved their customers a lot of money and helped Xcel justify building a lot more wind power in the state of Colorado. We've been developing more solar systems as well; this is an example showing how we could do solar energy prediction in Oklahoma, and now we're doing solar energy prediction in places like Kuwait, where there's a big ongoing project.

So that was a bit of history, both the broader machine learning journey and my personal one. Now I want to talk about machine learning in the context of weather forecasting. Traditionally, weather forecasting is done by capturing a lot of observations, from satellites, ground stations, balloons, airplanes, boats, and feeding them into our big computer models. These are all physical equations that we run forward in time, out days and weeks, and we run multiple versions to see all the different weather scenarios that might occur. Then we send all this data to our weather forecast offices, where our human forecasters have to process all of it in order to make their forecasts. The challenge they're running into is that there is so much data now that it can be hard to find all the relevant patterns in the short time they have to make a forecast. This is where AI can step in. Machine learning can also look at all the observations and all the model output, summarize it, correct errors in it, and both feed that to the forecasters and directly output its own forecasts through things like the Weather Channel app or other application systems. Ideally, the forecasters can still add a lot of value on top of the AI systems; I see the AI systems as most importantly providing guidance that enhances the forecasters' ability to find patterns. But there's also room for AI to produce its own forecasts and serve people in cases where they want quicker updates, where you can lean more on the AI and less on the human forecasting. In general, though, I think we get our best results when all these systems work together to provide a strong, calibrated forecast that's targeted to the needs of our users.

For our first weather example, I wanted to focus on hail prediction. For a lot of our Colorado viewers, hail has been a major hazard in your lives over the past few years, most notably in May 2017, when a $2.3 billion hailstorm hit Golden and west Denver; it shut down a mall out there for about six months and destroyed a lot of cars and houses. I believe it is Colorado's costliest weather event in history. An example closer to home for NCAR: this is the parking lot of the NCAR Foothills Lab. This was a lot of small hail, but there was so much of it that it flooded the parking lot. So hail can cause hazards in numerous ways. If you want to get large hail, how does this happen?
First you need a thunderstorm, preferably a supercell thunderstorm. If you want to grow hail, you need some hail embryos, basically really small proto-hailstones. You send them into the storm's strong, wide, rotating updraft, and along the way they pick up supercooled liquid water and grow and grow until they become very large. Once they become so large that they can't be held up anymore, they fall. If you want the hail to still be large when it reaches the surface, for meteorological purposes of course, you want the melting layer to be close to the ground: the cooler it is, the less the hail will melt before it reaches the surface, and if it's drier, that can also help. So if everything comes together, you can get really large hail. The challenge in forecasting hail is that a lot of things have to come together at once in the right place, and that makes it very uncertain and very challenging. This is where machine learning can come in: it can look at all these factors, weigh them together across different situations, and find where hail might be more prevalent.

At this point I also want to acknowledge that this work has been funded by the National Science Foundation and NOAA, in collaboration with my colleagues at the University of Oklahoma and NCAR, who are all listed down here. You'll see this throughout: if I haven't emphasized it already, building these kinds of machine learning systems is a team effort that requires input from a really diverse array of experts.

The first machine learning technique I want to talk about is something called a decision tree. It's one of the simplest machine learning techniques, but also very powerful. A decision tree is essentially a kind of flow chart. You start with a decision node, which asks a question; in this case, the question is, is the wind difference with height greater than 15? If yes, you go down the yes branch; if no, you go down the no branch. That takes you to another decision node, and you follow that one way or the other, and this can go on for quite a ways, but eventually you reach a leaf node, where the actual outcome is determined. In this case, if you go down these two yes branches, we get a 76% chance of large hail. You calculate that by sending a lot of data through the decision tree and, based on the data that ends up at that leaf, seeing how many of those examples produced hail versus how many didn't, which lets you calculate a probability. You could also calculate something like the size of the hail by averaging the sizes of the hail examples that made it to that particular node. So decision trees are very flexible. They're also very interpretable, because you can basically read the entire model off the screen; it's just all these questions, so you can make rules out of it. So the question is, how do you actually get all these questions; how do you grow the decision tree? First, you need a bunch of data. Our data example here is from the Storm Prediction Center's Sounding Analog Retrieval System, which has a slightly unfortunate acronym these days: SARS. But it's a very useful data set for distinguishing small hail from really big hail, two inches being about the size that will damage your roof pretty effectively.
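Here's a minimal sketch of what fitting and reading out a decision tree looks like in code. To keep it self-contained, it uses synthetic stand-ins for the kinds of inputs I've been describing, not the actual SARS data:

```python
# A minimal decision tree sketch on synthetic hail-like data (not real SARS data).
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
n = 500
cape = rng.uniform(0, 4000, n)    # storm potential energy (J/kg), synthetic
shear = rng.uniform(0, 40, n)     # wind difference with height (m/s), synthetic
# Toy rule: large hail is more likely when both ingredients are big
large_hail = ((cape > 1500) & (shear > 15)).astype(int)

X = np.column_stack([cape, shear])
tree = DecisionTreeClassifier(max_depth=3).fit(X, large_hail)

# The whole model reads out as a flow chart of questions
print(export_text(tree, feature_names=["cape", "shear"]))
print(tree.predict_proba([[2500.0, 20.0]]))  # [P(small hail), P(large hail)]
```

The printout is exactly the flow chart of questions I described, which is what makes these models so interpretable.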
The first step is to look through all the possible questions you can ask about your data. This involves going through your input variables, such as storm potential energy and wind difference with height, and trying every single threshold: is the wind difference with height greater than 10? Greater than 20? Greater than 30? Greater than 40? With each question, you check how good that question is at splitting the data into more similar subsets. Eventually you pick the threshold, in this case 15, that best splits the data, and that gives you the first nodes of your tree. Once you have the question that causes the largest reduction in prediction error, you repeat the process; it's a kind of brute-force search that you repeat on each subset of the data. As you do so, you find more splitting thresholds that break your big area into smaller and smaller subsets that are more concentrated and more exact. We start seeing a pattern here: with small storm potential energy and small wind difference with height, we generally have small hail, and when both are large, you tend to get large hail, but in between it's a little more uncertain.

Before I go on: one of the challenges with decision trees is that, while they are very simple and very interpretable, they are often not very accurate, because they can only make these simple splits. You can do a lot of them, but you run into problems where the tree makes noisy subsets with very few examples in them, and it's often not the ideal model for everything. But decision trees as a group can be very powerful if you treat them in a special way. The approach that often gets used in place of a single decision tree is something called a random forest. A random forest is an ensemble of randomized decision trees. The reason they're randomized is that if you took the same data and put it through the tree-growing algorithm, you would get the same tree every time; if you want diversity in your decision trees, you need to randomize the training process. The first step is resampling the training data. This means you put all your training examples in a hat and draw them out again, but every time you draw, you also put the example back in; it's called resampling with replacement. In this process, some examples get repeated multiple times and some don't appear at all, but doing this gives you a more varied training data set. You also grow your trees by searching a random subset of the inputs at each branch instead of all the inputs. This speeds up the training process and makes the trees more varied. If you have a lot of input variables, this is also useful for making the forest more robust: it searches not just the strongest features but also, say, the second strongest, and picks up on those, which adds to the diversity. Once you have grown a bunch of decision trees, you average them together, and you get a prediction that on average is guaranteed to be better than the predictions from the individual trees, and better than the single decision tree's prediction.
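In scikit-learn, going from the single tree to a random forest is a small swap. Here's a sketch, reusing the synthetic X and large_hail from the decision tree example above, where the bootstrap and max_features arguments correspond to the two randomization steps just described:

```python
# A random forest sketch; arguments map onto the two randomization steps above.
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier(
    n_estimators=200,     # number of randomized trees to average together
    bootstrap=True,       # resample the training data with replacement per tree
    max_features="sqrt",  # search a random subset of inputs at each branch
    random_state=0,
).fit(X, large_hail)      # X and large_hail from the decision tree sketch

# Averaging the trees gives smoother probabilities than any single tree
print(forest.predict_proba([[2500.0, 20.0]]))
```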
These models are what I often call the second-best model for most machine learning problems: they don't require a lot of tuning, they're very robust to noise, they're fairly compact, and they're still fairly interpretable. They have a lot of great properties, so they're commonly used for a lot of more traditional machine learning problems. What does it look like when you go from a decision tree to a random forest? We take our decision tree here and replace it with an average over a bunch of trees, and we get something that looks a bit smoother. It still has some of the jaggedness of the single decision tree, and we can see some areas where there's a little more uncertainty, but it has more of a smooth gradient in certain directions instead of a sharp jump in probability at a certain point.

How have I used decision trees in practice? In this example, we're using machine learning for hail prediction. We start off with a numerical weather prediction model, one that solves the physical equations of the atmosphere. In particular, we use something called a convection-allowing model, which has high enough resolution, meaning the grid cells are small enough, to capture individual storms. We run an ensemble of these, a whole bunch of them with different initial conditions. We extract all the storms from the updraft speed field, looking for the strong updrafts, because those are the best candidates for hail. We combine that with information about their environment, like the temperature, how the wind changes with height, how the temperature changes with height, and what the moisture looks like: all the ingredients one might need for hail. We match them with radar-estimated hail sizes, so we know where hail actually occurred, and now we have the inputs and the outputs for a machine learning model. We train our random forest to predict hail occurrence, whether or not a hail swath will occur with one of these updraft swaths, and then we also predict how big the hail is going to be, which is called a size distribution. We combine all this information together and use it to generate both swaths of hail size and probabilities. On this day, the green areas show where greater-than-two-inch hail was predicted, and the contours show the probability of hail over a 40-kilometer (25-mile) area. We can see that this lines up pretty well with where the large hail actually occurred, so we consider this a pretty good forecast. Obviously there are some false alarms: the storm model predicted hail in Texas where no hail was observed, but the environment was still favorable in that area, so there was some reason to believe hail might happen. The algorithm is also good at filtering out areas where the model may have had storms in environments only slightly favorable for hail where no hail occurred, and by reducing these false alarms, the machine learning model can help forecasters zero in on the main threat areas well in advance. This forecast was run about a day before the hail actually occurred, so we can help nail down those areas, and there's even some predictability signal well in advance, and that's really exciting.
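To sketch that pipeline schematically: one model predicts whether hail occurs with a storm, and a second predicts how big it gets. Everything here, the feature names and labels, is made up for illustration; the real system works on storms extracted from the convection-allowing ensemble:

```python
# A schematic sketch of the two-part setup: occurrence model plus size model.
# All features and labels are hypothetical stand-ins, not the real storm data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
n = 1000
storms = np.column_stack([
    rng.uniform(0, 60, n),     # max updraft speed (m/s), hypothetical
    rng.uniform(0, 4000, n),   # environment CAPE (J/kg), hypothetical
    rng.uniform(0, 40, n),     # wind difference with height (m/s), hypothetical
])
hail_occurred = (storms[:, 0] > 25).astype(int)                 # toy label
hail_size = np.where(hail_occurred == 1, storms[:, 1] / 80.0, 0.0)  # toy sizes (mm)

occurrence_model = RandomForestClassifier(n_estimators=100, random_state=0)
occurrence_model.fit(storms, hail_occurred)
size_model = RandomForestRegressor(n_estimators=100, random_state=0)
size_model.fit(storms[hail_occurred == 1], hail_size[hail_occurred == 1])

print(occurrence_model.predict_proba(storms[:5])[:, 1])  # P(hail) per storm
print(size_model.predict(storms[:5]))                    # predicted size (mm)
```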
We've been testing this out, running it in real time, for the past couple of years. One thing we found is that by adding an additional calibration step, and by calibrating we mean trying to make the probabilities more in line with what a forecaster would expect, we can get performance, in terms of something called the Brier skill score, which is a measure of how good a probability forecast is, that is close to that of the human forecasters. So if the human forecasters use our product, they could be even better. That is our hope in deploying these kinds of systems.

So that was an introduction to hailstorms. Now I'm going to shift focus a little to deep learning. Random forests are good and all, but they don't necessarily work great on every problem, and they can only take in certain kinds of data. There are a lot of other machine learning techniques with other advantages. One of them is something called a neural network, or an artificial neural network. The "artificial" is in contrast to a biological neural network, aka your brain, but the connection beyond that is fairly loose, I should say. An artificial neural network basically consists of a series of layers. You have an input layer, where you feed in all of your data. You have a series of hidden layers, which transform your data from its original form into a form that's more amenable to solving the problem posed at the output layer. So if you're taking, say, temperatures, and you want to get hail or the intensity of a hurricane, you would stack a bunch of hidden layers together, and they would mix and match different features and make something that could more easily tell the large hail and the small hail apart. There are a bunch of circles in these layers, and each circle represents an artificial neuron, sometimes called a perceptron. Each neuron consists of a series of weights, updateable numbers indicating the importance of each input: you multiply each weight by its input value and sum them together, so you have what's essentially a linear regression inside the body of your artificial neuron. What makes this special is that we add some nonlinearity to it: we apply another function, called the activation function. This can act as a switch to turn the value on and off: if the sum is negative, it can set it to zero, and if it's positive, it lets it through. In doing so, we can weight different neurons together and come up with representations of all kinds of functions, everything from a single number to an image to a time series. It's a very flexible framework.
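Here's a toy sketch of that single artificial neuron: a weighted sum (the linear regression part) followed by a ReLU activation that acts as the on/off switch. The weights and inputs are made up:

```python
# A toy artificial neuron: weighted sum plus a ReLU activation switch.
import numpy as np

def neuron(inputs, weights, bias):
    total = np.dot(weights, inputs) + bias  # the linear-regression part
    return max(0.0, total)                  # ReLU activation: off if negative

x = np.array([28.5, 965.0, 18.0])  # e.g., temperature, pressure, wind (made up)
w = np.array([0.4, -0.01, 0.2])    # learned importance of each input (made up)
print(neuron(x, w, bias=1.0))
```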
What traditional neural networks do is connect everything: a neuron in one layer has weighted connections to all the neurons in the previous layer. This adds a lot of flexibility, but it also makes it possible for the neural network to do something called overfitting. This is where it fits the data it has seen too well; it gets too close, too perfect, and it does so by fitting to noise, assuming everything should look exactly like the examples it has seen. But really, in life there's a certain amount of smoothness to the things we can explain, so we don't want our models to overfit; we want them to fit well enough while capturing the main structure. We can also see how the structure a neural network builds differs from a random forest: because it's adding together a set of continuous functions, you get a smoother surface, compared with the random forest or the decision tree, which are much more jagged. The random forest also does not extrapolate; it basically doesn't assume any values beyond what it has seen before, whereas the neural network can extrapolate patterns beyond where the data originally lie.

This more traditional neural network setup, where you connect all the nodes together, works really well for tabular data, but it works less well for things like images, where basically each pixel of the image is an input to the neural network. In our images, in our weather data, there are certain patterns that we know have some kind of spatial structure, that occur in a certain location but not all over the map. If we assume that the thing we're looking for is localized, but that it can occur anywhere in the image, we can adjust how our weights are set up so that they form a little box, which we call a convolution filter. The idea is that we run the filter over the image, multiplying it by the image; where the multiplied-and-summed values are high, the filter lines up with whatever's in the image, saying, "hey, there's something here you should look at," and where it's low, it's saying, "no, you shouldn't look here for this thing." Another key part of our neural networks for images is something called pooling. If we want to run the same-size filter over a larger part of the image, we can either make the filter bigger, or we can make the image smaller, just by taking the maximum in a given area. In doing so, you can make a smaller model, with fewer weights to learn and less computational complexity, so fewer computer cycles, by making your images smaller, and you still retain a lot of the important information.
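Here's a toy sketch of those two mechanical pieces, sliding a convolution filter over an image and then max pooling the result, with a random array standing in for a small radar image:

```python
# A toy sketch of convolution and max pooling; random data stands in for radar.
import numpy as np
from scipy.signal import convolve2d

image = np.random.rand(8, 8)            # stand-in for a small radar image
filt = np.array([[1.0, 0.0, -1.0],      # a filter that responds to
                 [1.0, 0.0, -1.0],      # vertical edges in the image
                 [1.0, 0.0, -1.0]])

feature_map = convolve2d(image, filt, mode="valid")  # high where filter aligns

# 2x2 max pooling: keep only the maximum of each 2x2 block, shrinking the map
h, w = feature_map.shape
h, w = h - h % 2, w - w % 2
pooled = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
print(feature_map.shape, pooled.shape)  # (6, 6) -> (3, 3)
```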
Putting these two pieces together, you get something called a convolutional neural network. In our weather example, we can take a radar image, something like you'd see on TV when there's a tornado or hailstorm coming, and feed it through a set of convolutional filters. We have multiple filters because they're all looking for different features: some look for where the storm is, some look for where the storm is not, and some will be somewhat redundant, but they look for different aspects of the storm. We can then stack these: we take that first set of convolutional filters, do a set of pooling on it, and then apply a second set of convolutional filters, which combine all of the previous feature maps and look for combinations of features. We get higher-level, more complex features. In an object recognition context, you look for things like circles or diagonals at the early levels, but when you get to the deeper layers, you're looking for faces and wheels and airplanes and other high-level concepts. Here, we may be looking for a supercell thunderstorm versus a squall line, for instance. Then, for a given problem, at the end we have the part that actually predicts the hail. In this case, it finds that features at the edge of the storm are positively associated with the storm producing hail, and it gave this storm a 90% chance of hail.

From this, we can also do hurricane intensity forecasting, another application of deep learning. Why do we care about hurricane intensity forecasting? Intensity is defined as the maximum wind speed of a hurricane, and intensity depends more on small-scale forces, so it is harder to predict than the track of the hurricane, which is which way the hurricane is going. Our hurricane intensity forecast errors have decreased over time, but they have not decreased as fast as our track errors. So it's an open problem how best to improve our intensity forecasts, and because there are so many potential features to look at, machine learning offers a way to help narrow down that search. For this particular problem, we used data from the Hurricane Weather Research and Forecasting (HWRF) model. This is a numerical weather model designed specifically to forecast hurricanes: it's centered on the storm and follows the storm as it moves across the ocean and onto land. Here are the tracks of all the storms in our model. We used three years of model forecasts to train the model, and then we have a separate testing set, because we want to evaluate performance on storms the model hasn't seen; we tested on 2018, with 18 storms from this data set. We're trying to predict the change in the wind speed over 24 hours, and we're really interested in what's called rapid intensification, which is when the wind speed goes up really fast. This matters when you're deciding whether or not to evacuate: if the storm looks weak and you decide to stay home, and then suddenly it's a category five monster the next day, you have lost 24 hours of preparation. So this is a big problem. We used two methods: a random forest, which we give a bunch of features that describe the storm, and convolutional neural networks, which we give only images, basically images of different weather fields at different levels, so it has a 3D picture of the storm to look for patterns in. The result is that both machine learning models do really well, especially at longer lead times. In this plot, lower error is better; this is the mean absolute error of our intensity forecast, and this is the lead time in hours ahead of when the forecast is valid. The red line is the original HWRF prediction of the change in wind speed. So both machine learning models seem to find useful features; the convolutional neural network does a little better at some lead times, and the random forest is better at others.
So in this case, deep learning wasn't really better than the random forest at all, but it can look at different kinds of features and potentially help us identify things in space that may be important, things we need to look at more as researchers.

Next I want to talk about some machine learning challenges. We've shown a couple of examples where machine learning can be very effective, but there are also a lot of outstanding issues facing the machine learning community that we still need to resolve. One is extrapolation. Machine learning works really well when you have a lot of data points in a given area: it's easy to interpolate and see the changes in behavior when you have data around that constrains things. But when you don't have any data, the assumptions of your machine learning model become a lot more important, because it hasn't seen that area before, and it can perform well if the assumptions are good but very badly if they're not.

There's also the problem of distribution shift. This is the more technical term for when everything about your world suddenly changes, and the data you had no longer apply. We've all seen this in the past few months with the COVID pandemic. Here's an example with toilet paper: stores predicted that we'd only need so much toilet paper based on pre-COVID data, but after COVID everyone stayed home and needed a lot more toilet paper. All the machine learning models were suddenly invalid, and that kind of systemic failure led to some of the supply shortages we've seen.

Machine learning can also run into the problem of over-optimization. Say you have one metric, like error, and you want the most accurate machine learning model possible, or you want to maximize one kind of metric, like engagement on YouTube videos. The problem is, we often need to care about a lot of different factors, and if we only focus on one thing, we can be led into some very deviant behavior by the machine learning models. A model could recommend videos more extreme than the one you're initially watching but on the same topic, and that can lead to people getting radicalized by watching a series of YouTube videos; there's an article about this in the New York Times from 2018, and a number of other books and articles since then have brought this issue to light. There are ways to train with multiple goals in mind, and that can help address some of these issues of over-optimization.

There's also the problem of underrepresentation bias. This has been talked about a lot more, especially in the past couple of months. Machine learning models learn from the data they're given, and if the data contains, say, not enough women or underrepresented minorities, then the model will perform worse for those people in comparison with the majority of people it has been trained on. It can result in outcomes like this image here, where the model is trying to take a blurry face and enhance it, sort of like what you'd see on a TV show. In this case, it took a picture of Barack Obama and turned it into a white guy, and that's because this model was trained primarily on pictures of white celebrities.
These kinds of issues, how we build our training data sets and how we weight different groups in our training and evaluation process, make a big impact on how good our machine learning model is. This comes back to that well-defined problem I mentioned at the beginning: if you don't have a well-defined problem and good ways to evaluate and check for these kinds of issues, you'll end up with some really bad outcomes. This is also a problem the weather and climate communities need to worry about, because our data sets also contain different kinds of biases, due to how we collect the data. For hail, we often collect hail reports by asking people to compare the hailstone with common objects like baseballs and teacups and softballs and grapefruits and marbles and half dollars. As a result, certain objects are more common, or overrepresented, in our hail size distributions, and where we don't have a corresponding object, we don't have hail sizes. This affects our view of what the hail size distribution looks like. We also don't collect hail reports from everywhere; we only collect them where there are people. So in the hail databases you can see things like cities and roads and farms and other patterns that aren't randomly distributed the way you might expect storms to be.

We also see this in other weather instruments. This is an example from Weather Underground personal weather stations; here's an example of one. These are something like $500 to $1,000 pieces of equipment that you put on your house, and you can send the data to Weather Underground and they will broadcast it to the world and archive it for you. It's a nice feature, but it requires a lot of initial investment to set up. The advantage is that we get a good local view wherever we have these stations. The problem is that these stations are not equitably distributed. If we look at Longmont, which is where I live, this is a map from the US Census and the Washington Post where the red dots show where white people live and the yellow dots show where Latinx people live. Looking at the relative distribution, we see that there are far fewer weather stations in the areas with a larger population of underrepresented minorities, and this is due to some historical issues. You see this pattern in other cities as well. Recognizing this, if we want more equitable representations of our weather data, one thing we could do is work on making lower-cost weather instruments that we can distribute or make available in these areas, so that people can provide weather data and get better weather forecasts calibrated to their particular areas. So there's a problem, but there are potential solutions we can build from.

To further analyze and evaluate some of these issues, we can also use techniques from the area of explainable machine learning. This is an area I've been working in for a little while now; it's an area of great interest with a lot of potential for development. The idea is that our machine learning models generate predictions, but we often don't want just a prediction; we also want to know why the machine learning model made that prediction. So we can use techniques like partial dependence plots.
The idea behind a partial dependence plot is that we take all of our data and set one input, say temperature, to the same value for every example. We send this data through the machine learning model and take the mean of its predictions, and we repeat this process for a bunch of different temperatures. Where the line in the partial dependence plot is flat, there is little sensitivity to temperature for predicting, say, hailstorms; where there's a sharp change in the line, the prediction is changing in response. For hurricane intensity, we find that if the maximum wind speed from the weather model is increasing, the predicted change in intensity increases, so that's a sanity check. For minimum pressure, which is highly related to intensity, we see that if the pressure is increasing, the storm is going to weaken, and if the pressure is decreasing, the storm is going to strengthen, so again, another sanity check. We can also look at things like wind shear: if the wind shear is decreasing, the storm is strengthening, and if the wind shear is increasing, the storm is more likely to weaken, but only over a certain range of values. So this is information that can help us check whether the machine learning model is learning the right thing, but it can also reveal other variables that might be important and might have other sensitivities. It's a way to reduce the complexity down quite a bit.

We can also interpret our deep learning models, which are a lot more complex, and we can simplify the process by looking at what they're learning in the input space. We do this by taking our image, feeding it through our trained neural network, and comparing what it predicts with a desired label. If you want to see what it should look at to make a big hailstorm, you send the data through, then send the signal backwards to the input, and you come up with a heat map that says, for example, you should increase the reflectivity here, or decrease it there, to make a stronger hailstorm. This can identify regions of interest, and if you run it multiple times, you can actually build a synthetic storm from just what the neural network learned; here's an example of one. We can see things like high temperature and moisture near the surface indicating an increased chance of hail, we see directional wind shear with height, which is the change in wind direction with height, and we see that the change in temperature with height is also really important. We can also do this for different parts of the neural network, and in the process reveal that the neural network learns different kinds of storms, like supercells, bow echoes, and pulse storms, each of which has a different chance of hail and occurs in different parts of the country. So we can use this to analyze our large data sets and reveal climatological patterns: hail-producing supercells tend to occur in the southern Plains, whereas the bow echo storms tend to occur more in the Midwest or the northern Plains, and the pulse storms that produce hail tend to occur in, say, Florida or Arkansas. So we can use this for climatological analysis and for individual predictions; it has a lot of potential that we're examining in more detail with other scientists at NCAR and at other universities.
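Coming back to the partial dependence plot for a moment, the procedure is simple enough to sketch in a few lines; model and X here are hypothetical stand-ins for any trained model and its input table:

```python
# A minimal partial dependence sketch, exactly as described: set one input to
# the same value for every example, average the predictions, and repeat.
# `model` and `X` are hypothetical stand-ins for a trained model and its inputs.
import numpy as np

def partial_dependence(model, X, feature_index, values):
    means = []
    for v in values:
        X_mod = X.copy()
        X_mod[:, feature_index] = v               # set the feature everywhere to v
        means.append(model.predict(X_mod).mean()) # mean prediction at that value
    return np.array(means)

# Usage: flat stretches of the resulting curve mean low sensitivity to that
# feature; sharp changes mean the prediction responds strongly there.
# curve = partial_dependence(forest, X, feature_index=0,
#                            values=np.linspace(X[:, 0].min(), X[:, 0].max(), 20))
```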
Finally, I want to show another example, from Hurricane Michael, which hit the Gulf Coast on the Florida Panhandle as a category five back in 2018. The contours show where the neural network is focusing its attention as it tries to predict increasing or decreasing intensity, looking at the wind field as it evolves. You'll notice most of the contours are in the core of the storm, so it's focusing where scientists would expect the most important information to be, but it occasionally also looks at the outer regions of the storm, so maybe that's providing a hint about some other factor we should further investigate.

So, in summary: machine learning sorts through large, complex data to find key patterns. Combining it with physical weather models, we can improve our weather forecasts and provide better guidance to our forecasters. And we can use explainable machine learning and deep learning to point our forecasters and researchers to potentially important features. Thank you for listening, and now we'll take questions from the Slido. If you have any other questions after this that we didn't get to on the Slido, please feel free to reach out to me by email or Twitter. I'll turn it back over to Lorena now.

Awesome. That was amazing. I know you just did a one-week workshop all about machine learning; it was five days, with many people giving presentations. So it's really great that you were able to do a one-hour summary of it all. Thank you so much, because there's a lot out there, and it's really amazing to see how everything continues to evolve from 1950 until now.

Thank you. Yeah, there have been a lot of developments, and it feels like things are accelerating, with more and more people getting into machine learning and deploying it on a broader array of problems. This is just a small sampling of what machine learning can do, and I hope it excites people to learn more. For those who are interested, the workshop is called the AI for Earth System Science Summer School; if you Google "AI for ESS," you'll find our website. We've also posted all the videos on the NCAR CISL YouTube page if you want to learn a lot more. We even have a hackathon with data sets out there, so you can try the machine learning yourself.

Awesome. And it seems like there are a lot of people interested in this. Dan, can you put up the first question, about a general path and advice? The question is from David P., and it'll pop up on the screen: "Could you draw some general path or advice for starting out in machine learning applied to weather prediction? And what are some common mistakes or misunderstandings?"

Okay, thank you for this question, David. Let me think through this really fast. I would say my biggest advice is: start simple. With all the really complex machine learning models being hyped out there (recently there was a machine learning model released that has 600 billion weights inside it, for natural language processing), you might think that you need a massive supercomputer, or multiple massive supercomputers, to get started, but really, you can do machine learning on your laptop. There are a lot of smaller data sets available.
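To make the "start simple on your laptop" advice concrete, here's a minimal sketch using scikit-learn's built-in example data; it's not weather data, but it shows the whole loop of splitting off a test set, trying a simple linear model, and then trying a random forest:

```python
# A minimal laptop-scale starting point: small built-in data set, a proper
# train/test split, and two models to compare. No supercomputer needed.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in [LogisticRegression(max_iter=5000), RandomForestClassifier()]:
    model.fit(X_train, y_train)  # learn from the training data only
    # judge on held-out data the model has never seen
    print(type(model).__name__, model.score(X_test, y_test))
```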
At the start of the AI for Earth System Science Summer School, we have some hackathon problems available. I've also put together hackathons and short courses on AI for the American Meteorological Society annual meeting; we've done an AI short course the past two years, and it's been sold out each year. We're planning to do another one, which will probably be virtual, and that might mean you have a chance to do it. I encourage you to check out those data sets; they're publicly available, and we have Jupyter notebooks. For those who aren't familiar with them, Jupyter is a kind of web-based tool where you can write Python code, run it, and make plots, and it's very interactive. Some of those are a good place to start. A lot of people also use Kaggle competitions; Kaggle is a website for machine learning competitions, and they have some tutorial machine learning problems on there to get you started. For general machine learning, it's a good place to see what people are doing, download some real data, and try it out. In general, you might want to find a problem you're really interested in and work at it: download some basic machine learning libraries like scikit-learn and try out a linear regression, then try a random forest, then maybe a neural network, and see how they work.

As for common mistakes and misunderstandings: one is that some people don't understand the difference between a training set and a test set. The training set is the data you actually use to update your machine learning model and have it learn, whereas the test set should be independent: data the machine learning model hasn't seen before, or isn't like anything the machine learning model has seen before. When you're evaluating your model, you want to use the test set to make sure you're not overestimating your performance. That's a mistake people make. Let me think; there are a lot of other ones, but I think those are some resources to get started and some things you'll run into. If you also look at things like the AMS AI conference, we have all our talks online; it's a good place to see lots of examples of how people are doing AI for weather and climate, so I encourage you to look those up as well.

Thank you, and it's great to hear that there are so many resources out there for everybody to try things out. Should we go ahead to the next question, Dan? It was upvoted, and it will pop up on the screen for everybody to see. It asks: "How can we use machine learning to categorize extreme weather events?"

That's a good question. There are a lot of different ways to do categorization. One way is to use machine learning to look at, say, images from satellite data or weather models or climate models, and identify things like weather fronts, hurricanes and other tropical cyclones, atmospheric rivers, anything with more of a blobby shape. You can use more traditional expert system approaches or fancy deep learning approaches to do this, but the general idea is that you can look through this complex grid of data and find the main features you're looking for, and then you can do lots of statistics on those features and see where they're occurring.
Should we go ahead and go to the next question, Dan? It was upvoted, and it will pop up on the screen for everybody to see. It asks: how can we use machine learning to categorize extreme weather events? That's a good question. There are a lot of different ways to do categorization. One way is to use machine learning to look at images, say of satellite data or weather and climate model output, and identify things like weather fronts, hurricanes and other tropical cyclones, and atmospheric rivers: anything with more of a blobby shape. You can use more traditional expert-system approaches or fancy deep learning approaches to do this, but the general idea is that you look through this complex grid of data and find the main features you're looking for. Then you can do lots of statistics on those features: where are they occurring, and how are their properties changing with time? Climate change is the thing we're really concerned about here, and how extreme weather will change with climate change is still an area of active research with a lot of uncertainties. One thing we can do to estimate some of those uncertainties is apply these extreme weather segmentation algorithms to our weather and climate model output; then we can see, for example, whether the number of hurricanes in our climate model is increasing or decreasing, or how snowstorms are changing. There is some research by Colin Zarzycki that worked on that. There's a group at Lawrence Berkeley Lab, with Prabhat and Karthik Kashinath among others, that has been working on deep learning for extreme weather for a while, as well as a group at NOAA, with Jebb Stewart and Christina Kumler, that has been working on this with satellite data. So there are a lot of groups working in this area who are really interested in it and doing some great research, and I encourage you to check out what they've been doing.
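The deep learning segmentation approaches those groups use are much more involved, but the traditional end of the spectrum, pulling blobby features out of a gridded field by threshold-and-label, fits in a few lines. A minimal sketch with an entirely synthetic field (the field and threshold are illustrative stand-ins):

```python
import numpy as np
from scipy import ndimage

# Synthetic stand-in for a gridded field such as integrated water vapor.
rng = np.random.default_rng(0)
field = ndimage.gaussian_filter(rng.random((180, 360)), sigma=5)

# Threshold-and-label: connected regions above the threshold become features.
mask = field > np.percentile(field, 95)     # threshold is problem-specific
labels, n_features = ndimage.label(mask)    # each blob gets a unique integer id

# Once features are identified, statistics are straightforward.
sizes = np.bincount(labels.ravel())[1:]     # grid cells per feature
centers = ndimage.center_of_mass(mask, labels, range(1, n_features + 1))
print(f"{n_features} candidate features; mean size {sizes.mean():.1f} cells")
```

Running the same labeling over decades of model output is what lets you count hurricanes or snowstorms per year and look for trends.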
It's awesome, because there had been a question about how this relates to climate, so it's great to see that work using severe weather and big storm impacts and carrying it forward. Dan, can you show us the next question, please? Before that, I should also acknowledge my postdoc, Maria Molina, who is currently working on extreme weather, how it changes with climate change, and how deep learning can help predict that. I encourage you to check out some of her presentations; she should be publishing more on this soon, too. That's awesome; thank you for letting us know about Maria Molina and her work with you on this. The question that we have is: what is the most surprising thing that you've learned working with machine learning? That's a really tough one. I think the most surprising thing, one that you don't expect, is that it's really hard to get a slight improvement in your machine learning model's performance, and it's really easy to make things a lot worse. There are a lot of different settings you can change in your models, and it's always tempting to keep going: one more tweak, one more run. Sometimes, as everyone knows, you do find the right thing and it makes a pretty big difference. But often the things that make the biggest difference are cleaning your data, doing good exploratory data analysis, and working with experts in the field who know a lot about the problem and can help define it in a way that helps them the most and makes the solution more obvious. Those are some of the hardest-learned lessons: going through the machine learning process takes a lot of work, and it takes a lot of human effort. While we talk a lot about the computers and the automation, there are a lot of people who have to work to build this infrastructure. And that's why, although people think machine learning can take people's jobs in some ways, because it automates certain kinds of jobs away, the whole machine learning infrastructure requires so many people that it's also creating a lot of new jobs, including my current position, in the process. Yeah, and that definitely leads to our next question; Dan, if you can post the next question, please. The question is about model complexity versus accuracy: can you discuss your thinking on the model complexity versus accuracy trade-off in your machine learning work? Yes. This is again where having a good test set and good evaluation systems is pretty crucial. Model complexity can be useful in that it allows a richer representation of your data inside the model, especially if you're using image data or anything really complex; if you use a less complex model, you won't be able to take advantage of all the information there. But with more complex inputs also comes a lot of noise, and that's where extra complexity can sometimes be a problem, especially if you have a limited data set, which is common with extreme weather or climate data: we often don't have a lot of unique examples. So I think it's important to start with a less complex model and then test additional complexity in an iterative process, making sure you're actually adding value over your simpler model. Sometimes you are, sometimes you aren't, and it's hard to say a priori. That's why we do the research, and why good evaluation, testing, and problem definition are so important for this. Thank you.
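A minimal sketch of that iterative complexity test, on deliberately small synthetic data (all details here are illustrative): the same model family is swept from simple to complex, and cross-validation shows where extra complexity stops adding value and starts hurting.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# A small, noisy data set, like many extreme weather samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=60)

# Same model family, increasing complexity via polynomial degree.
for degree in (1, 3, 6, 12):
    model = make_pipeline(
        PolynomialFeatures(degree), StandardScaler(), LinearRegression())
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"degree {degree:2d}: mean cross-validated R^2 = {score:.2f}")
```

On data like this, a degree-1 model underfits, moderate degrees capture the signal, and the highest degree typically scores worse on held-out folds: complexity only paid off up to a point.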
And our next question is about CS, which here means computer science. Both computer science and atmospheric science are important to your work, but which field do you need a more in-depth understanding of? That's a tricky one. By training I'm a meteorologist, an atmospheric scientist, but along the way I took a lot of computer science courses, especially in machine learning, AI, and visual analytics, some of the different areas that are important for building good machine learning systems. I would say that to be effective, it helps to have a fair bit of understanding of both. It's hard to become a top expert in both areas, but if you can learn at least enough to speak the language of the domain you're working in and understand its problems, then you're more effective at solving them than if you were purely an atmospheric science expert or purely a computer science expert. So a strong foothold on one side or the other, plus enough of the language and interdisciplinary communication experience to be able to talk to the experts on the other side, can really get you a long way. Some of it also depends on what aspect of the machine learning process you're most interested in working on: if you're doing theoretical algorithm design, yes, you're going to need a lot of computer science knowledge and experience in mathematics to work in that area. If you're doing more applied work, say applying existing machine learning models to problems, it helps to know how the models work, but you don't necessarily need all the proofs and fundamental math behind them; you should have some idea of how they work, where they work well, and where they don't, so that you can apply them well within your atmospheric science problem. That's awesome, and I'm sure that also comes into play with collaborations, where you might not be the expert on one part but you have a collaborator who focuses more on that other aspect. Exactly. I work with a lot of different collaborators, because even within atmospheric science there are so many different sub-areas with their own specialized languages. The severe storms experts and the hurricane experts are two very different communities: they're both meteorologists, but they use different models and different terminology and have different priorities. So even that relatively small jump is like walking into a different world, much less working with atmospheric chemists when I haven't had a chemistry class since high school; there I definitely had to rely much more on my atmospheric chemistry collaborators to help guide the way. Awesome. Let's go ahead and take the next question: what is the first thing that you suggest to students who want to learn and apply machine learning to atmospheric science? That's a good question. I would say try to do some reading about machine learning and see what's already been done out there; I've made a lot of recommendations already, like the AI summer school, the short courses, and the AMS AI conference. I'm also a co-author on a Bulletin of the American Meteorological Society paper, led by Dr. Amy McGovern, that provides a lot of summary examples of machine learning for weather prediction. It's been well read, and I think it's another good starting point to see what machine learning can do; it explains some of the algorithms and how they work. We also have a follow-up paper, which just came out last December, that talks about some of the interpretability techniques. Both of those, I think, are good resources to look at as a starting place. Then, yeah, get your hands dirty: actually get some data and try out some basic machine learning models, whether in scikit-learn or something else. The first machine learning program I used was called Weka, which is still around; it's a Java GUI, a point-and-click machine learning program, so you can load in some data, click "random forest," click the train button, and it will run and spit out some numbers. You don't have to know any programming to get it to work, so you could use something like that if you really want to start with something basic, and it has a whole bunch of different machine learning algorithms in it to try out. There are also a lot of other, more user-friendly interfaces that different people have built for machine learning now; TensorFlow.js has some nice demos, for instance. And you also mentioned that as a student you did an internship, the Research Experience for Undergraduates.
So I think that might also be a nice way to get hands-on and see whether or not you like machine learning in the atmospheric sciences. NCAR does have internship programs; they're currently midway through the program for the Computational and Information Systems Laboratory, the SIParCS program, and Unidata and UCAR also have opportunities. The applications open in the fall, so if you are a student, or know any students interested in this type of work, definitely apply in the fall. Yes, I highly encourage you to apply for those programs. I did an REU and the NOAA Hollings scholarship program, and both of those had a big impact on my research career. I'm currently mentoring a SIParCS student and plan to mentor more in the future, so you might have a chance to work with me. We also have the SOARS program, which is targeted at underrepresented students in the atmospheric sciences and provides a lot of really good mentorship; I've served as a SOARS co-mentor before, so I encourage you to apply for those too. There are REU programs all over the country in all kinds of different areas, and probably quite a few involve machine learning now, not necessarily for weather, though some definitely are for weather. If you want to do machine learning in some fashion, you can probably find an internship program for that right now. There are also lots of boot camps and other kinds of short courses, if you want to go that route. It's a good skill to have, but I should also say that machine learning isn't everything; there are a lot of other really important skills out there. Communication skills, for example: being able to translate machine learning models into useful decisions is a really important task, and we need people who are not just programmers but who also understand the social dimensions of these problems. We have colleagues at NCAR, like Rebecca Morss and Julie Demuth, who are really interested in these kinds of challenges in communicating forecasts of extreme weather. So there are a lot of ways you could, say, work with machine learning without working on the algorithms directly: working on how they impact people. I think that's a really important area; it's already a hot topic and a growing area of importance, and to make headway on these problems we need social scientists and business people and a diverse array of expertise. That's awesome. And we are getting close to 8:30 p.m. Mountain time, so we have a couple of other questions that we might not get to, but one thing that keeps coming up is uncertainty in the inputs you're putting in. One question I had seen was about determining which aspects matter; for example, there was a question about convolution filters: how do you determine which filters you're going to use, and how does that affect the results? So there are a lot of questions about how the inputs affect the output, and you briefly mentioned in your talk the human choices that go into this, but can you speak a little more to that? Yes.
So, how do you pick your inputs, and how does the machine learning model pick among the inputs? Some of it is on the end of the human designer of the system. You could pick a kitchen sink's worth of inputs and try everything under the sun, but you're likely to run into inputs that are correlated with your target without being causal, like vocabulary and shoe size, or shark attacks and ice cream sales, which go together only because they both happen in the summer. Just because there's a correlation doesn't mean it's causal, so working with experts to figure out a reasonable set of inputs is the starting point. Then, in the machine learning training process, the numbers in each convolution filter are updated automatically by the training process. They start out random: we just give them random values, and as you show the network more and more examples, it changes those weights a little bit every time it sees a different example. As it sees more and more examples pushing the weights in one direction or another, those initial random perturbations lead to one filter doing one thing and another filter doing another thing. Sometimes you end up with some redundant filters, where multiple filters find similar things, but if you have enough of them, that's less of a problem. Decision trees do more of a brute-force search, but they will still pick what they think are the most important features and ignore the ones that are not important, so a decision tree or a random forest can be more robust to noisy or bad features; a neural network, less so. But there are also ways you can pre-test the quality of your features or do other filtering beforehand to help with that. Yeah, and in your talk you also mentioned that you still go back to the observations; you had a graph of maximum wind, pressure, and wind shear, and if your model doesn't really match what you're expecting, then you know something went wrong. So it's good to have those checks. Yeah, that's where the explainable machine learning techniques are really helpful: they can act as a sanity check. Otherwise you can get something called the Clever Hans effect, named after a horse that seemed like it had learned language but really was just responding to how its trainer was reacting and what the trainer was doing. There are examples of this in machine learning, too: a cat-and-dog detector was really accurate at telling cats from dogs in one data set, but it turned out that was only because all the pictures of dogs had a blue blanket in them and all the pictures of cats didn't. Using explainable machine learning can help identify those kinds of situations.
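To make the filter-training point in that answer concrete, here is a minimal sketch (synthetic data, hypothetical architecture, not the model from the talk) showing that a convolution layer's filters start as random numbers and are nudged a little by every batch of training examples:

```python
import numpy as np
import tensorflow as tf

# A tiny CNN whose four 3x3 convolution filters start as random numbers.
conv = tf.keras.layers.Conv2D(4, 3, activation="relu")
model = tf.keras.Sequential([
    conv,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),
])
model.build(input_shape=(None, 32, 32, 1))   # creates the random weights
model.compile(optimizer="adam", loss="mse")

before = conv.get_weights()[0].copy()        # shape (3, 3, 1, 4), random

# Synthetic stand-in data; real inputs might be wind or radar fields.
X = np.random.rand(256, 32, 32, 1).astype("float32")
y = X.mean(axis=(1, 2, 3))                   # an easy, learnable target
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

after = conv.get_weights()[0]
print("mean absolute change in filter weights:",
      np.abs(after - before).mean())
```

Each filter drifts away from its random start as the examples push the weights, which is why different filters end up detecting different patterns.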
Great. Well, thank you so much, David John, for all the work that you've been sharing with us. Maybe in the future we can do an encore to this lecture, where we sit down for half an hour and continue to take more questions with you, so we'll talk about more opportunities like that. And for everybody who has been watching: thank you so much for participating in our polls and asking your own questions. Our next event will be posted on our Explorer Series website, where you can also sign up for our mailing list if you'd like to get email updates on what's happening next. Well, thank you, everybody on the call, and thank you, David John, once again for being our speaker tonight. I look forward to that, and I'd be happy to answer more questions in the future. Thank you very much, Lorena, thank you, Brett and Daniel and Aliyah, and thank you, everyone, for watching. Hope you have a great night. Thank you, everybody.