I think I'll get started. Welcome to the last day of the ASP summer colloquium. You have made it, and it is my pleasure to start the last day by introducing Libby Barnes. She's a professor at Colorado State University. Her research is focused on climate variability and change, but also on data analysis tools to understand them. Her topics of interest include earth system predictability, jet stream dynamics, Arctic mid-latitude connections, and sub-seasonal to seasonal prediction. She is especially known for applying machine learning algorithms to geoscientific data and was one of the first to really be interested in interpretable machine learning, using these tools not just for prediction but for understanding processes. She has many sponsors and many distinct honors and awards; for example, she received the CAREER award in 2018. Libby, we're looking so forward to your talk.

Thanks so much. Thank you, Judith. Okay, everybody should be able to see what they're supposed to see. Can I get a thumbs up? Perfect. Thank you. Okay, so I titled it "lecture" to remind myself that this is supposed to be a lecture where I get to tell you about some of the beginnings and some of the basics. I say basics here, but I want to be clear: nothing about this really is basic. Machine learning is complicated, and so what I really want to do is give you a flavor for sort of the fundamentals, at least where we all start in my group. And then at the end, since this colloquium is about S2S, I will talk about some applications to S2S prediction, but really I think my contribution today is going to be laying some groundwork for explainable AI in geoscience. So that's the plan.

All right, so let's go. First, it's really important to make it clear that this is a group effort. This is my current group right now, and they are all helping me think about these topics. So a lot of what I will talk about today is things I've learned from them and things that we've learned working together. So keep in mind, I'm merely sort of the vessel to communicate a lot of the science that we're figuring out every day. And I will say, the people here, I didn't just force them to smile; I actually do think we're all having a lot of fun.

So here we go. Geoscientists, we have a big toolbox. You might look at my toolbox here and see some words that you use or recognize: spectral analysis, correlations, regression. We love to draw lines through things, right? We do this all the time. So trend detection, and of course dynamical models really are a tool, another one of these tools in our toolbox to try to understand the earth system. So right off the bat, I want to say that we think about machine learning as just yet another tool. I don't think it's the solution to everything, and I think a lot of the tools in the toolbox that we already have are absolutely essential. And honestly, you'll see today that sometimes we couple those existing tools with machine learning to do the science that we need to do. So in a broad sense, one of our jobs as scientists, at least this is how I view myself as a scientist, is to sift through piles of data, and we all have numbers coming out our ears, and extract useful relationships that apply elsewhere. This is called "out of sample" in the machine learning world, but we often just think of it as learning physics so we can understand other things that haven't happened yet.
And in many ways, looking for relationships in data is really exactly what machine learning methods are designed to do. So in a way, these tools, I think, really fit well within the toolbox that we already have. So why would you want to use machine learning for science in the first place? There may be others, but the way I think about it, most of the work that I've seen can fall into one of three categories. One is: I want to do it better. For example, Judith was just talking about parameterizations. Many of the parameterizations we have right now are not perfect, and maybe we can use machine learning to make them more accurate. That would be a good thing, cool. But sometimes we already know what we're doing as scientists, and maybe it's just slow. So the second is: we want to do it faster or cheaper. An example here is the radiation code in models, which is very slow but very good. Can we use machine learning just to speed it up? That's quite distinct from getting something more right. And then finally: learning something new. Judith mentioned this, and I think as scientists we have a really special place to do this third bullet, that is, to be looking for new relationships we didn't know were there. That's really what we do as scientists. So I see this third bullet as really relevant for research, and I'll point out, and this is important, that it may be slower, and it might even be worse than other methods, but it's still possible to learn something new. Okay, and this matters, because sometimes people will stop and say, well, you didn't make the prediction as well as this other dynamical model. And I'll say, no, but look what I was able to learn from the data that I wasn't able to learn from this dynamical model. So it's important to remember, I think, that your work in machine learning doesn't need to fall in all three of these categories at the same time.

Okay, so machine learning 101. I understand that Nisha's already talked a little bit about machine learning, and some of you I think have even been doing machine learning tutorials, but I decided to take the approach that maybe not all of you are familiar with it. So let's do some 101, some brief concepts. Types of machine learning: there are more types, I just want to be clear, but these are two main flavors. The first is unsupervised learning, that is, looking for previously undetected patterns in a data set. There's no right answer or wrong answer; you're looking for ways to group your data together. You may be familiar with cluster analysis; that's a type of unsupervised learning. I'm going to pretend I'm actually lecturing here, guys. So, for those of you with video on, has anybody ever done EOF analysis or principal component analysis? You can raise your hand. Okay, look, you've done unsupervised machine learning. Oh, there we go, we got a virtual hand. So that's another form: you're taking your data set and breaking it into these modes, these orthogonal modes. That's machine learning. But another one that you may be more familiar with, or hear about in the news, is supervised learning. That's the idea where you have a right answer, and you're trying to teach the machine to get the right answer, not the wrong one. So, some examples of this: again, anybody ever fit a line, ever?
Okay, yes, regression is a form of supervised learning. Now, most of you fitting lines, you probably used the known analytical solution for that. But if you had iteratively tried to find that best fit line, that would often be considered machine learning. Decision trees and random forests are a type of supervised learning, and then there are artificial neural networks, which is a lot of what my group does thus far. So I'm going to be putting most things in terms of these artificial neural networks, but I just want to point out it is one of many supervised learning tools out there that fall within machine learning.

So what is an artificial neural network, or a neural network? Okay, here's the basic idea, and it actually can get you pretty far. Imagine you have two inputs, an x1 and an x2. Now, I've put them in orange circles, but this is actually how computer scientists tend to show these neural networks. So this isn't just a cute schematic that I made up; this is actually sort of the picture used in the community. So we have these x1s and x2s. These are our predictors, or our inputs. They're connected to what's called a hidden node, shown in gray here, and they're connected via these arrows. You see the arrows, and I have some w's on those arrows. Those w's are weights, so weight one and weight two. And the way you read this is that you take x1 and multiply it by w1, you take x2 and multiply it by w2, and when you get to the hidden node, you just add them up. It's just a sum. So far so good: what we're doing is linear regression. Then we're going to add a y-intercept, or what is often called a bias, a b. So now we just add a b, and that's sort of implied after you do this sum. So far we have linear regression. Nothing fancy has happened; we've just shown it in this picture way. Now, the next part of this hidden node is that we push the value that came out of that linear regression problem through a nonlinear activation function. And this is where machine learning is so powerful: that nonlinear function takes a linear problem and allows it to become very complicated. So you push it through this f, this nonlinear activation function. There are lots of choices; I won't talk about any of them here, but if you have questions, I'm happy to discuss them. Now, the whole idea of training the network is actually just figuring out the w's and the b's that make the output of this the answer you want. That's it. Just the w's and the b's. Okay.

Now you might look at this and say, wait a second, how can this find cats on the internet? Cats and dogs, right? It looks so simple. Well, this is just one little piece. Now we're going to take these puzzle pieces and put them together, or string them together. So the output of one of these hidden nodes becomes the input of the next one, with new weights and new biases. And then that feeds into another one, another one, another one. You can imagine this gets pretty complicated pretty fast. All right. And ultimately, because this is supervised learning, we're going to output a prediction, a value, that we're then going to assess. How did we do? Was it right? Was it wrong? So the complexity and these nonlinear activation functions allow the neural network to learn many different pathways of predictable behavior. All right. So once trained, all you have is a bunch of w's and b's. Yes, they are actually just a matrix. It sounds fancy, but it's just a matrix of w's and b's that allows you to predict new data. Okay.
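To make that concrete, here is a minimal sketch in Python of what a single hidden node computes, and how nodes string together into a layer. This is my own illustration of the idea, not code from any particular library; the shapes and variable names are made up for the example.

```python
import numpy as np

def hidden_node(x1, x2, w1, w2, b):
    # The linear regression part: a weighted sum of the inputs plus a bias.
    z = w1 * x1 + w2 * x2 + b
    # Push through a nonlinear activation function f; ReLU is one common choice.
    return max(0.0, z)

def tiny_network(x, W1, b1, w2, b2):
    # One hidden layer feeding a single output node. x is the input vector;
    # W1, b1 are the hidden layer's weights and biases; w2, b2 the output's.
    hidden = np.maximum(0.0, W1 @ x + b1)   # every hidden node at once
    return float(w2 @ hidden + b2)          # the output is again a weighted sum

x = np.array([0.5, -1.2])                   # our x1 and x2
W1 = np.array([[0.3, -0.7], [1.1, 0.4]])    # two hidden nodes' weights
print(tiny_network(x, W1, np.zeros(2), np.array([0.5, -0.2]), 0.1))
```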
So I like this little video. This is actually using one of these algorithms to fit a line. Again, you wouldn't do this in practice; we have an analytical solution for it, but the idea is the same. We are fitting our w, our slope, and we are fitting our b, our y-intercept, to get the best fit line. And the way this is done, if I had more time, since this is a lecture: in my class I actually force all my students to work through what's called backpropagation and gradient descent, to show how this process unfolds. We don't have time to do that here, and I've heard from students that they find it unbelievably boring, but it is important to know what the machine learning algorithm is actually doing, right? And ultimately, what it's doing is adjusting the w's and the b's to try and minimize a loss function, for example, a root mean squared error. Each increment adjusts accordingly, and slowly, slowly, it hopefully finds the optimal solution. And "hopefully" is a pretty important word here, because it can fail as well. So this is where the tricky parts come in. Okay.
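As a concrete sketch of that loop, here is gradient descent fitting a line by hand, on made-up data. Real networks adjust many more w's and b's via backpropagation, but the update rule is the same idea; the learning rate and step count here are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 2.0 * x + 0.5 + 0.1 * rng.standard_normal(200)  # "truth": slope 2.0, intercept 0.5

w, b = 0.0, 0.0        # initial guesses for the weight and the bias
lr = 0.1               # learning rate: how big each adjustment is
for step in range(500):
    err = (w * x + b) - y
    loss = np.mean(err**2)          # mean squared error (same minimum as RMSE)
    # Gradients of the loss with respect to w and b, then step downhill:
    w -= lr * np.mean(2.0 * err * x)
    b -= lr * np.mean(2.0 * err)

print(w, b)  # should end up close to 2.0 and 0.5
```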
So: we've trained a neural network, our goal was to find the optimal weights and biases, and when we're done, we have what has often been referred to in the past as a black box, right? Our data, our inputs, go into this black box, magical things with w's and b's happen, and out comes a prediction. And if you've done a good job, it's more right than wrong. But ultimately the concern, especially for scientists, is: wait, hold on a second. I want to know why. What's happening? What is this black box? And why is it getting the right answer? So this is where this phrase that's used all the time now comes in: opening the black box. We want to peer inside and do more than just write down w's and b's in an array. We actually want to understand what those w's and b's are representing, and what it is that they are learning. So this is when we go to leveraging advances in explainable AI. It's also sometimes called interpretable AI, although the two are slightly different; I can talk about that during questions if you'd like. It's often referred to as XAI, or visualization. All of these generally fall in the same realm of opening the black box and trying to understand what's going on.

So first of all, in the past few years there have been quite a few papers out demonstrating the use of XAI for geoscience. Here are four of them, and I believe we'll be sharing my slides, so please don't try and jot these down now. The point is that people are actually talking about these methods for the geosciences. I will say this is something I'm really passionate about, because the computer science world is working hard on XAI and outputting amazing things, but those methods aren't quite right for geoscience yet. And one of the things I'm so excited about is that as scientists, it's our job, and our exciting job, to take those methods and think about how they fit into the science that we do. And that's what some of these papers are attempting to do. With that said, there are lots and lots of methods out there at this point. One particular one, which my group has used a lot: it's not the best, it's not the worst, it is a good method that we have found useful, and it's layer-wise relevance propagation. Sometimes I feel like it sounds like I get money or something every time someone uses LRP. I do not. It's just a method we have found to be useful and to work the way our brains work.

So here's how this works. By the way, at the bottom here it explains where you can get this code; it's all available on GitHub from the authors, although it does have some requirements with Python packages, etc., that can make you want to bang your head against the wall. But how does it work? Imagine that we want to predict if there's a cat in an image. We've trained our network to do this, and it does a great job. Now we want to know: how did it know there was a cat? Imagine we take one prediction, push it through our network, and output the probability there's a cat. Great. What LRP does is take that probability and propagate it back through the network to produce a heat map showing the regions of that input picture that were most, if you will, important for its prediction. So in this case, if you look, you won't be surprised to see that the most important or relevant parts of the picture are where there is a cat. That is, the network used where there is a cat to determine there was a cat. This is not terribly insightful or exciting, but bear with me. I think it's good to start simple. All right.
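For a flavor of what that backward propagation looks like, here is a minimal sketch of one simple LRP rule (the epsilon rule) for dense layers, ignoring biases. This is an illustration I've written of the general idea, not the authors' GitHub implementation, which handles convolutions, biases, and several LRP variants.

```python
import numpy as np

def lrp_dense(a, W, R_out, eps=1e-6):
    # Propagate relevance R_out back through one dense layer, given the
    # activations a entering the layer and its weight matrix W (in x out).
    z = a @ W                        # pre-activations (biases omitted here)
    z = z + eps * np.sign(z)         # epsilon term stabilizes small z
    s = R_out / z                    # each output's relevance per unit of z
    return a * (W @ s)               # redistribute relevance to the inputs

# Sketch for a two-layer network: input -> hidden (ReLU) -> single output.
rng = np.random.default_rng(1)
x = rng.standard_normal(10)          # stand-in input (e.g., pixels)
W1 = rng.standard_normal((10, 6))
W2 = rng.standard_normal((6, 1))

h = np.maximum(0.0, x @ W1)          # hidden activations
out = h @ W2                         # the network's prediction

R_hidden = lrp_dense(h, W2, out.ravel())   # relevance at the hidden layer
R_input = lrp_dense(x, W1, R_hidden)       # the "heat map" over the inputs
print(R_input)
```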
So these tools are not a perfect view, but they are better than the black box if they are wielded intelligently. I really love this image. This is from a collaborator, Imme Ebert-Uphoff, and here's what she said: imagine you have a black backpack with a bunch of stuff in it, and you push it through an x-ray scanner at an airport. They do not see everything perfectly inside. And this is where my analogy gets really corny, but the more layers of stuff you have, the harder it is to see everything. You can imagine it's harder to see inside more complex machine learning algorithms, but it's certainly a better view than the black backpack you started with. Even so, these methods are not perfect, and we need to start to understand where they have issues.

So this is some work being done by our joint postdoc, Antonios Mamalakis, who said, well, hold on. These methods like LRP, looking for where the cat is, are useful, but they've really been tested in the world of ImageNet. That is a world where Google has a lot of images on the internet and is trying to figure out what's in every image, so that when you Google search a coffee cup, it can correctly pop those up. But have these methods ever been robustly tested for scientists who want to use them for science? So what he's done is create synthetic data. That is, he made data that looks like climate data, but he knows exactly what it is; he made it up. On the left, he has made synthetic sea surface temperature data, and if you look at it, you could sort of convince yourself that maybe it could be SSTs. All right. Then he has pushed it through a function, where he knows exactly what that function is, to get a Y, a value, associated with each map. You could think of it as global mean temperature, some single value. And what he's done is train a neural network to try to learn this relationship, where he knows the right answer, and then use many different XAI methods to figure out which ones give him the right answer. Does this make sense? So in essence, he knows the regions most important for the prediction because he decided them; he defined them. And now he wants to see which methods actually give him the correct answer. So he can do this, and it looks something like this.

We're not going to go into all the detail, but the reason I'm showing you this is to point out that there are more methods out there than just LRP. The top is the true contribution of every pixel to the answer, so a perfect method would give him the top plot. All right. The bottom rows are all different XAI methods operating on, and looking at, the same neural network. That is, they should all agree on the most important regions, and they don't. And so part of what he's doing is trying to understand why some of these methods give different answers, and what their pluses and minuses are. You'll see that different LRP variants at the bottom give different answers as well, and he's trying to understand why this happens, and whether, as scientists, we can see it coming. Because what we don't want to do is think we understand what the neural network is doing and be wrong. That's also not a good thing. Cool.

All right. So you might pause and say, wait, okay, so there's XAI, but why should I care about it? My neural network is amazing. I can predict the stock market. I'm going to be a billionaire. Why do I care about explaining how I made all my money? Right. Okay. As scientists, I would argue our ultimate goal is to understand why. But just for the sake of argument, and I'm getting ready for my debate in a few hours: what if you didn't even care about why? Fine. I'm going to argue there are still reasons you should care about XAI, even if you don't care about why. And they are these four reasons. One is identifying problematic strategies. Two is building trust. Three is choosing the approach. And four is learning something new; and in that case, maybe the why sort of fits in there. So what I want to do is go through each of these with an example. And I think in my lecture notes I had to write some takeaways from today, and to me this is the most important part: why do we need these XAI methods in the first place? Why can't we just do the black box approach? To me, these four reasons are vital.

All right, let's talk about problematic strategies: ensuring the right answers for the right reasons. If you've seen me give talks on this, you can take a nap, because I love this example so much that I show it in all my talks. So here we go. This is not an example from my group; I wish it was, I would be so proud. This is from Lapuschkin et al., showing the importance of XAI. So imagine we've trained a neural network to predict if there's a horse in an image. And we did, and the neural network is amazing, by the way. It's really good. It keeps getting it right every time. But now we want to understand how it's getting it right. So let's imagine we have these two images and we push them through our neural network, and it says, yes, indeed, there is a horse in each of these images. Everybody agree? Hopefully there are horses. I hope. Good. The question is, how did it come up with this answer? So we use LRP to create a heat map of where the network looked, the most relevant regions for its correct prediction. And these are terribly ugly plots, but if you squint, the red regions of the heat map generally line up with where there's a horse, saying the network looked where there's a horse to determine there was a horse, sort of like the cat. So, okay, so far so good. That's great. Let's do another example. Here are three more horse images. Hopefully you agree there are horses in all of them. So where did the network look to know there was a horse? And again, the network got it right.
So it was looking at something. And if we make the heat map this time, you will notice that the most relevant regions for the correct prediction are not where the horses are. See that? It's in the bottom left-hand corner. And you may not have noticed it before, but in the bottom left-hand corner of each of those images is a copyright symbol. It so happened that this neural network was trained on horse images that mostly came from the same German horse website, which put copyright stamps on every image. And the neural network learned: hey, I can get it right all the time by just looking for the copyright symbol. I don't need to actually look where there's a horse. Now you might say, well, that's fine with me, it got the right answer. But in terms of physics and science, this is not good. We want it to learn actual relationships in our data, not spurious ones that later won't apply. For example, if I now go out and take a picture of a horse and push it through this network, it's going to say there's no horse in the image, because I didn't have the correct copyright symbol. That is not what we want to be doing with machine learning. That is dangerous stuff, right? So XAI helps us ensure we have the right answers for the right reasons.

Next, we want to build trust. That is, our neural network does well, but we still want some good feeling about it, and this has to do, by the way, with number one: are we getting the right answers for the right reasons? We want to know how it came to the decision that it did. So this is, quickly, some work that we've done looking at the human impact on the land surface. I can't go into the science here, but we take Landsat imagery and push it through a convolutional neural network, which is a little bit different from what I've been talking about, but the same general idea. And we predict the human footprint index, that is, how much is this region impacted by humans? In this case, we have Landsat imagery from the years 2000 and 2019, and here are two examples over this particular region in Sumatra. The neural network says, hey, it used to be impacted by humans at 0.38, where one is the most impacted, but now, in the year 2019, it's at 0.66. So we go, oh wow, the network thinks that humans are impacting this landscape more. But why? So we can use LRP to ask: where did the network look in this image to predict that humans were more present? And when we make that heat map, we see that a large contribution comes from the fact that it picked up a big new highway that actually just went in in 2019. So it gives us confidence that it's predicting the human impact in this image in a way that makes sense.

Another example, using the same data set, is choosing the approach. There are lots of machine learning tools and algorithms out there, and you have to make a lot of choices, called hyperparameters. How many hidden nodes do I have? How are they laid out in my network? And sometimes you don't know how to choose. XAI methods can help you do that as well. For our case, we were using Landsat imagery, which has many different channels you can look at, and we didn't know which channels would be best for our network. So we input multiple channels. On the top is Landsat imagery over a certain region, and on the bottom, after we made our predictions, is the heat map, the most relevant regions for the network's prediction. And what we saw is that channel three really appeared to be used the most of the three channels, and we were able to throw out the other two and save on data space. So this is another way that XAI can be useful.
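As a hedged sketch of that channel comparison, assuming you already have a relevance heat map over a multi-channel input (the array shapes and names below are made up for illustration), the bookkeeping can be as simple as summing relevance per channel:

```python
import numpy as np

rng = np.random.default_rng(3)
# Stand-in for a real LRP heat map over a (height, width, channels) input:
relevance = rng.standard_normal((64, 64, 3))

# Total absolute relevance attributed to each input channel:
per_channel = np.abs(relevance).sum(axis=(0, 1))
share = per_channel / per_channel.sum()
print(share)  # if one channel dominates, the others may be droppable
```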
And finally, learning something new. The idea here is that the prediction itself might not be terribly interesting, but the science can be what the network has actually learned. So to end, I'm going to go through some S2S examples of how we are using explainable AI to learn something new about S2S predictability. Okay, I think you've probably already heard this phrase a few times, but just as a reminder: forecasts of opportunity are the idea that certain conditions in our climate system may lead to more predictable behavior than others. Beyond the weather timescale, we have to look for the specific states of the earth system, the opportunities, that lead to this enhanced predictable behavior. And in our group we use machine learning to help us find these opportunities amidst a sea of noise.

So one example is postdoc Zane Martin, who's thinking about the MJO. He's using simple neural networks to predict the MJO and to help us explore MJO dynamics and predictability. In this case, he inputs fields of outgoing longwave radiation in the tropics, zonal wind at 850 hectopascals, and zonal wind at 200 hectopascals, and sets up a classification problem where he predicts whether the MJO is in phases one through eight or whether it's weak. And since I wanted to turn this a little bit more into a lecture than just research, I want to point out how this is done: the network actually predicts values for each of these classes, and then they're pushed through a function called the softmax, which is pretty magical. It's just the equation at the bottom. What it does is rescale all the values so that they add to one, so you can think about them sort of as probabilities. It takes the values output by the network and turns them into something where you can say, in this case, a weak condition is more likely than, say, phase one, because it's 0.86 instead of, say, 0.034. All right, so the softmax is something that's used quite a lot in classification problems. And then these confidences can be related to the accuracy, if the network is trained well. In essence, we have, to a certain extent, uncertainty quantification of our predictions. And I'll show that for Zane's case, where he's trying to predict the MJO in, say, 10 days. What you can see here on the x-axis, ignoring this bottom panel, is the confidence of the network, and here is the accuracy of the network. And do you see how, as the network becomes more and more confident, the accuracy goes up and up and up? See that? That's a good thing. That means when our network's like, yeah, I know the right answer, it actually knows the right answer. You don't want a network that thinks it knows the right answer and gets it wrong all the time; that is not a good situation to be in. So this is just showing how we can use this uncertainty quantification from the softmax output to try and understand our accuracy.
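For reference, here is a minimal sketch of the softmax itself and the confidence you can read off it. The raw output values are made up, but the function is the standard one used in classification:

```python
import numpy as np

def softmax(z):
    # Rescale raw network outputs to positive values that sum to one.
    z = z - z.max()          # subtract the max first for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Made-up raw outputs for eight MJO phases plus a "weak" class:
raw = np.array([0.2, -1.3, 0.5, -0.4, 0.0, 1.1, -2.0, 0.7, 3.1])
p = softmax(raw)
print(p.round(3), p.sum())   # pseudo-probabilities; they add to 1.0
confidence = p.max()         # the network's confidence in its favored class
predicted_class = p.argmax()
```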
Now, another thing Zane's doing is training many networks with many different variables, to understand which variables are most useful for the prediction of the MJO. The left-hand side shows the accuracy of networks when he only uses one variable, and the right-hand side when he uses combinations of three variables, and there are certain combinations that lead to more accurate predictions. Zane then goes on to use LRP, but since I don't have the time to show you that, I thought I would jump to another S2S example that uses LRP as well.

This one is by Kirsten Mayer, who's thinking about teleconnections to the mid-latitudes. So, once again focused on the MJO and the tropics, her input into her network is daily maps of outgoing longwave radiation, but now she's trying to predict the circulation over the North Atlantic 10 days later. Well, actually, in this case, sorry, it's not 10, it's 22 days. There we go, S2S timescales. Okay. So she's asking: if I know what the MJO, or the tropics, is doing today, does this help me know what's going on over the North Atlantic in 22 days? And after she's trained her network, she can use LRP to go back and look at the particular features in the tropics that were most important for predicting the North Atlantic circulation.

But before she does that, we want to return to this forecast of opportunity idea. Again, she looks at her network's most confident predictions and sees that the accuracy goes up as the network becomes more confident. This is where, to me, the science of S2S comes in. We could analyze all samples, but at least to me, I don't think all samples are predictable. I don't think that on all days in the observational record we should be able to predict the North Atlantic circulation based on, say, the tropics. It seems to me that there should be some days, maybe when the MJO is active, where this makes sense, and other days, maybe when the MJO is inactive, where it's just not going to help. And so, instead of analyzing all of the days, she analyzes the 10% most confident, because this is when the network says, I see a signal. Okay. And it is indeed more accurate, too. So when she takes that top 10%, she then looks at the LRP heat maps of those samples and asks: where is the network looking when it's so confident that it can use the tropics to understand the North Atlantic 22 days later? In the heat map here, if you for a second ignore the white lines, the blue and white shading is the LRP heat map. What it's saying is that her network is really focused on this region, and this region over the Pacific, and indeed part of this is associated with the MJO. And part of the fun of the science now is to go in and say, okay, what about these regions led to this particular behavior? Can we understand the physical processes that the network is picking up on? Okay.
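That confidence-thresholding step is easy to sketch, assuming you have softmax outputs and true labels in hand. The arrays below are random stand-ins, so unlike a real, well-trained network, the top-confidence accuracy here won't actually be higher:

```python
import numpy as np

rng = np.random.default_rng(7)
probs = rng.dirichlet(np.ones(9), size=1000)  # stand-in softmax outputs
truth = rng.integers(0, 9, size=1000)         # stand-in true class labels

confidence = probs.max(axis=1)                # network confidence per sample
predicted = probs.argmax(axis=1)

threshold = np.quantile(confidence, 0.90)     # keep the 10% most confident
keep = confidence >= threshold

print("accuracy, all samples:  ", np.mean(predicted == truth))
print("accuracy, top 10% conf.:", np.mean(predicted[keep] == truth[keep]))
```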
So, my wrap-up, my big points. First, these explainable and interpretable neural networks, I think, fit well within the toolbox of climate scientists. A lot of people feel like they've never done this before, but I personally have the mindset that if you're already working in this space with data, you are almost all of the way there; a lot of the hard part is actually the thinking about the science. Second, artificial neural networks are not black boxes, and tools exist to help visualize their decisions. I should say that in machine learning more generally, many methods, like random forests as well, have XAI methods, and so these tools really are not, in most cases, black boxes anymore. And I do believe this is a game changer for how we do scientific research, because we can use these tools to do science. And then finally, these neural networks, even though this talk is about S2S prediction and predictability, can be used for more than just prediction. The science that we do can actually be what the network learned and how it learned it, rather than the prediction itself. All right. I think I left some time for questions. Thank you very much.