different points. But today I'm going to talk about work we've been doing at Colorado State University, myself, Libby Barnes, and Eric (you've seen both Eric and Libby in this workshop), using a simple, explainable machine learning model, a neural network, to try to predict the Madden-Julian Oscillation. I won't dwell on the MJO itself; we've seen this GIF many times already in this workshop and in the colloquium. It's a major source of S2S variability in the tropics and of predictability worldwide. But despite the rise of machine learning across many parts of the weather and climate enterprise, we're really only now starting to see studies that apply machine learning specifically to the MJO. Here I've highlighted some of the ways machine learning has become more and more integrated into weather and climate questions. I'm going to focus mostly on prediction, that little section there, but people are also using it for parameterization, for downscaling, and for classifying and identifying weather and climate phenomena. In particular, we're going to apply these tools to the MJO. You heard Jaime talk a little about work she's done in that regard, and I'll contrast the approach we're taking here: a pure machine learning framework, not one used in conjunction with a dynamical model to bias correct, just thinking hard about how these tools on their own can be used to predict the MJO. So the first question is how to frame the problem: what do these models look like? The second thing I'll address is how skillful these models are at predicting the MJO, that is, how many days of useful skill, and so on. But I want to emphasize a third point, because it's something we've been thinking hard about and want to keep thinking hard about in the future: beyond being tools to predict the MJO, how are, or how might, these types of models be used to study and understand the MJO? I like the analogy with a dynamical forecast model. You can run it operationally to make predictions, but, as we've seen in this workshop, you can also perturb the model's stratosphere or its ocean state and use it to understand the physics. That's a component of this project, and you'll see a couple of points where I've tried to tie it in, but I think it's a direction for ML modeling that should be stressed and that we should keep thinking hard about going forward. To address the first point, framing the problem, I'll give an overview of the machine learning setup we've used. I want to emphasize that this is one way of setting up the problem; the approach is incredibly flexible, and there are many other ways you could tackle it, some of which we've explored a little and some we really haven't thought about at all. On the left, you see what the model takes as input. You can think of the machine learning model as a nonlinear transformation of that input. On the right, we predict future aspects of the MJO, and I'll talk about that specifically in a moment.
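To make that framing concrete before the details: below is a minimal sketch of the supervised-learning setup in Python. All array names, shapes, and the random stand-in data are illustrative assumptions, not the exact pipeline from the talk.

```python
import numpy as np

# Illustrative stand-ins: daily anomaly maps over the tropics with axes
# (time, lat, lon, variable), e.g. OLR, u850, u200 on a 2.5-degree grid
# from 20S to 20N, and the daily (RMM1, RMM2) index.
n_days, n_lat, n_lon, n_var = 11688, 17, 144, 3   # ~1979-2010 of daily data
maps = np.random.randn(n_days, n_lat, n_lon, n_var)  # stand-in for anomalies
rmm = np.random.randn(n_days, 2)                      # stand-in for RMM index

lead = 10  # days ahead; a separate model is trained for each lead time

# Flatten each day's maps into one feature vector and pair it with the
# RMM index `lead` days later: that pairing is the (X, y) of the problem.
X = maps[:-lead].reshape(n_days - lead, -1)
y = rmm[lead:]
print(X.shape, y.shape)  # (11678, 7344) and (11678, 2)
```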
On the left, you see an example of what we're inputting. In this case, we input daily maps of variables over the tropics, all longitudes and from 20 North to 20 South. They've been processed to remove the climatology and longer-term variability, and, importantly, we've set the processing up so it can be done in real time; there's no filtering that requires future data. In this particular example there are three variables, each on a single day: OLR, the zonal wind at 850 hPa, and the zonal wind at 200 hPa. You can give this model any variables you want, and any number of variables. I'm going to focus on these three unless I say otherwise; we found this configuration works pretty well. And unlike the presentation Will just gave, we're only using observed data. We wanted to start in observation space, though you could imagine extending this to climate models and leveraging that; we can talk about that in the questions, maybe. For this talk, we're using observations and reanalysis products, training over the period 1979 to 2010. That's the data the model iterates over as we do supervised learning to find the weights and biases the machine learning model uses. We then validate with daily forecasts over the period 2010 to 2019. To give a little more context on the right-hand side of this diagram: I'm not going to get into the details of the neural network. My impression is that the students at least have a basic sense, and we can discuss the details of the model in the questions or offline. Suffice it to say it's a very simple neural network: a single hidden layer with 16 nodes. Compare that to the model Will talked about, which I think was two layers, one of which had something like 100 nodes. This is a simple framework, and we found it works about as well as more complex frameworks. Our philosophy was that whenever we had a choice between a simpler model and a more complex one, we would stay with the simple model unless the complex one did much better. What we're predicting, and this is funny given the conversation we just had, is, for now, an MJO index. We like that because it's concise and makes comparisons easy. But the points about the limitations of indices are well made, and again, I want to emphasize that you can put whatever you want on the right-hand side; it's a flexible framework. For this part of the talk, we predict RMM1 and RMM2 at given lead times into the future, and we train a different model for each lead time: a lead-zero model, a separate model for one day into the future, and so on. So let me give the overall picture of the skill to start. This is skill on the y-axis as a function of lead time, measured as the bivariate correlation coefficient, the standard way of measuring MJO skill, and we're using a threshold of 0.5. Where the skill is above this gray line, the model has useful, though maybe not high, skill.
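For concreteness, here is a minimal sketch of both the network just described and the skill metric. The activation and optimizer choices are illustrative assumptions; only the single 16-node hidden layer and the two-component RMM output come from the talk.

```python
import numpy as np
import tensorflow as tf

# A minimal stand-in for the regression network described: one hidden layer
# with 16 nodes, mapping a flattened anomaly map to (RMM1, RMM2) at one lead.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(17 * 144 * 3,)),
    tf.keras.layers.Dense(2),  # linear output: RMM1 and RMM2
])
model.compile(optimizer="adam", loss="mse")
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=50)

def bivariate_correlation(obs, fcst):
    """Standard bivariate MJO skill metric; obs and fcst are (n, 2) RMM
    arrays. Values above 0.5 are conventionally taken as 'useful skill'."""
    num = np.sum(obs[:, 0] * fcst[:, 0] + obs[:, 1] * fcst[:, 1])
    den = np.sqrt(np.sum(obs**2)) * np.sqrt(np.sum(fcst**2))
    return num / den
```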
What you see in the black line, which is the machine learning model, the neural network, in winter on the left and summer on the right, is skill out to around 17 days in winter and around 12 days in summer. The other colored lines in both panels are simpler statistical models. I won't get into the details, but we have a persistence model, the dotted line, which just persists the initial condition, and two alphabet-soup models, MLR and VAR. Suffice it to say those are two simple linear models often used in statistical MJO forecasting: they use linear regression to predict the future of the RMM index given the current RMM index. You see that the ANN has modestly more skill than these traditional approaches. But I'll emphasize that we don't have skill out to 25 or 30 days, which you'd see in a dynamical model. So unlike situations like ENSO, where relatively straightforward machine learning models showed really high skill right out of the gate, here I think we're not yet competitive with the state-of-the-art dynamical models. At least to me, that doesn't invalidate these models as useful. This next slide is more for the experts in this domain, so I'll go through it quickly, but I think the point is illustrative. We've done a lot of slicing and dicing to understand the prediction skill, and in this panel we show the change in MJO prediction skill, the same style of plot as before, in different phases of the stratospheric quasi-biennial oscillation in winter. Bear in mind that we're only validating over a decade, and the QBO is a roughly two-year oscillation, so we don't have a ton of samples. But what you see is that this model has higher MJO skill when the QBO is easterly than when it is westerly. And I want to emphasize: we don't train the model differently. This is one model trained on all winters, with no information about the stratosphere, and yet it reproduces this observed result, which has been a puzzling aspect of dynamical models. So we're hopeful that what we might have here is a very efficient, fast, and potentially useful tool for digging into some of these classical questions in MJO predictability, and for trying to understand them in a framework that is simple, flexible, and, as you'll see in a moment, has explainability techniques we can leverage.
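Before moving on, to make the earlier comparison concrete, here is a sketch of the flavor of linear baseline the ANN was compared against. The function names are mine, and this is only the general idea: an MLR-style model regresses the RMM index some days ahead on today's RMM, and a VAR-style model would also append lagged copies of RMM as predictors.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_mlr_baseline(rmm, lead):
    """Regress (RMM1, RMM2) at `lead` days ahead on today's (RMM1, RMM2)."""
    X, y = rmm[:-lead], rmm[lead:]      # both have shape (n - lead, 2)
    return LinearRegression().fit(X, y)

def persistence_forecast(rmm_today):
    # Persistence: the forecast at every lead is just the initial condition.
    return rmm_today
```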
Before getting to explainability, though, I want to highlight another approach we took, to give a sense of the different ways machine learning models can be used. In this second modeling framework the input is identical and the model is very similar, again one hidden layer, with the parameters tweaked to optimize it. But instead of a deterministic prediction, "the MJO will have this index value on this day," here we predict the probability that the MJO will be in a certain category. We've chosen a weak-MJO category, meaning the amplitude of RMM is less than one, plus the eight RMM phases, which describe where the MJO is in its life cycle. Those nine categories are the output, and I want to stress that this is still a simple model, now producing probabilistic forecasts. To give a sense of what these probabilities look like in action: on the left you see an observed RMM phase diagram for a particular November-December in 2017. Lead zero is the start of the forecast date; focus on this green dot, which is the lead-10 MJO in observations. The right side shows what the lead-10 model predicts for this particular forecast on this given day: a probability for each of the nine categories. There are things here we were glad the model learned. For example, it doesn't think there's any real chance the MJO will be in phases three through five. And the model is right in the sense that it assigns the highest probability to phase eight, but it genuinely isn't sure whether it will be phase eight or phase one, and you can see why: the observed point sits right on the boundary between the two. So the model in this case has learned not only to make skillful probabilistic forecasts, but also, in effect, which phases are near each other, and I think it provides useful information. To give a sense of overall model performance, we assessed the classification in part using the ranked probability skill score. I won't get into the details; basically, it's a metric that accounts for the fact that this model produces probabilistic forecasts across multiple categories rather than issuing deterministic predictions. The diagram compares the model's probabilities to climatology, so where the skill is greater than zero, the model does better than a climatological guess, which in this case is something like 30% of MJO days being weak and roughly 8% falling in each of the eight phases. You see that this classification ANN has skill over climatology out to around 15 days. This is in winter, and it's pretty impressive that it's comparable to the regression model even though the framework and the metric are a bit different. I'm still comparing it here to the two linear models, bearing in mind those are deterministic, so this maybe isn't an apples-to-apples comparison. But it makes the point that you get additional information by issuing probabilistic forecasts. And it gives us an output that maybe isn't as well studied, where we can start to ask what large-scale states control these probabilities: when is the model more or less confident? I think that could be interesting and illuminating going forward.
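Here is a sketch of that classification variant and of a ranked probability skill score, under the same caveats as before. Note one assumption on my part: the standard RPS treats the nine categories in a fixed order, which is an approximation for cyclic MJO phases.

```python
import numpy as np
import tensorflow as tf

# Classification variant: same inputs, but the output layer is a softmax
# over 9 categories (weak MJO plus phases 1-8).
clf = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(17 * 144 * 3,)),
    tf.keras.layers.Dense(9, activation="softmax"),
])
clf.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

def rps(prob, obs_cat):
    """Ranked probability score for one forecast: squared distance between
    the forecast and observed cumulative category distributions."""
    obs = np.zeros_like(prob)
    obs[obs_cat] = 1.0
    return np.sum((np.cumsum(prob) - np.cumsum(obs)) ** 2)

def rpss(probs, obs_cats, clim):
    """RPSS vs. climatology: positive values beat a climatological forecast.
    probs: (n, 9); obs_cats: (n,) ints; clim: (9,) climatological freqs."""
    rps_model = np.mean([rps(p, o) for p, o in zip(probs, obs_cats)])
    rps_clim = np.mean([rps(clim, o) for o in obs_cats])
    return 1.0 - rps_model / rps_clim
```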
So in the last two to three minutes, I want to come back to that last point. We've set up two frameworks whose skill is decent, but if you're just interested in making the best possible MJO forecast, I would say give Frederick a call and get some time on the ECMWF supercomputer. The question, then, is: are there other uses for these models? What are their limitations? What could they teach us? I'm not going to present new science here, just flavors of the direction we see this work going and things we're continuing to investigate. One thing I want to highlight: because we're using maps of variables in this framework, we can ask which input variables give us the best skill. Is it always the case that inputting the RMM variables gives high skill? What if we give the model SST, or moisture at different levels? This plot shows the accuracy of the classification model at predicting the active MJO at three lead times: zero, five, and ten days. The different bars are different combinations of variables, and the legend lists the variables. There's a lot in this plot, and I'm not going to belabor too many points, but first off, there is spread: you can choose bad variables and you can choose good variables. For example, the model I've been talking about is the blue bar, the OLR and the zonal winds. Compare that to the pink bar, a model that ingests low-level specific humidity, upper-level temperature, and SST: that model really struggles at these shorter lead times to identify the MJO. And if you go down to two- and one-variable models, the skill gets even worse. Interestingly, if you go up to four, five, or six variables, we don't see much range in skill. The other big-picture takeaway for me is that some models that don't do well at short leads do reasonably well at longer leads. This gray bar, for example, is a model that uses total column water vapor, upper-level wind, and upper-level temperature. At lead zero it's not as effective at identifying the MJO, but at longer leads it has basically comparable skill to the other models. What that tells me is that at short leads the task is essentially reconstructing the index itself, so it matters whether your variables are the ones the index is built from, whereas at longer leads, where you're actually trying to make predictions, the model learns, I think, to leverage different things from different variables. So this type of work, continuing to think about what kinds of input we could give these models, say moist static energy variables that combine multiple inputs, is one direction we could go in. Another big-picture, concluding thought is that you can use a whole suite of tools from the field of data science aimed at explaining how these models work. Will talked about LRP, which is awesome. I'll just say it's a tool that shows you, for a given prediction, where the model looked to make that prediction. In the classic example, a model takes pictures and tells you what's in the picture; for an image of a ram, you can see in the right-hand panel that it identifies the eye and the horns of the ram as being important. So we've started doing this for the MJO. This is an example of composites from all of the lead-five predictions at phase zero. You can see in the OLR composite this canonical signal, with active convection over the Maritime Continent and suppressed convection over the Indian Ocean, and the relevance plot picks up on that.
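LRP comes in several variants, and in practice you would likely use a library (iNNvestigate, for example, implements it for Keras models). Purely to show the idea, here is a numpy sketch of the epsilon rule for a one-hidden-layer network of the kind described; the weight-shape convention and the simplifications are my assumptions.

```python
import numpy as np

def lrp_epsilon(x, W1, b1, W2, b2, target, eps=1e-6):
    """LRP epsilon-rule sketch for a one-hidden-layer ReLU network.

    x: (n_in,) input; W1: (n_in, n_hid); W2: (n_hid, n_out).
    `target` is the output node whose prediction is being explained.
    Returns one relevance value per input feature (same shape as x)."""
    # Forward pass, keeping the hidden activations.
    h = np.maximum(x @ W1 + b1, 0.0)
    out = h @ W2 + b2

    # Relevance starts as the target node's (pre-softmax) output value.
    R_out = out[target]

    # Output -> hidden: share relevance in proportion to each hidden
    # node's contribution z_j = h_j * w_j to the target node.
    z = h * W2[:, target]
    R_h = z / (z.sum() + eps) * R_out

    # Hidden -> input: the same proportional rule, per hidden node.
    # (A full implementation adds eps with the sign of each denominator.)
    z_in = x[:, None] * W1          # (n_in, n_hid) contributions
    R_x = (z_in / (z_in.sum(axis=0) + eps)) @ R_h
    return R_x                      # can be reshaped back to (lat, lon, var)
```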
Composites like these confirm for us that the model is looking in the right place. And as you go to longer lead times, you can see the model adjusting where it focuses. This is the same plot but at lead 10, on the bottom, and you can see the model has learned to look at the active convection over the Indian Ocean. Looking at these types of plots, again, this is an explainability method that I can see being used as we look at changes in MJO predictability, to try to understand, for example, whether the model changes where it looks under different states of the QBO or under different large-scale climate patterns. So I'll briefly conclude. To answer the first question: the machine learning models we've developed give good skill out to around two weeks, a little longer than that in winter, shorter in summer. They beat most traditional statistical models, but they're not yet competitive with dynamical models, and that emphasizes that I see this work as continuing to grow. Hopefully I've at least given you a sense that we may have a new tool, or suite of tools, and that we can take advantage of how efficient, how flexible, and ultimately how explainable these models are. Thinking hard about creative ways to leverage this could prove illuminating in the future. I'll stop there and thank everyone again so much for this opportunity to speak. I'd love to take any questions now or offline. Thanks.

Thanks a lot. Yeah, fascinating, novel work. Any questions for Zayn? Jaime, go ahead.

Zayn, thank you very much for the nice talk. It's nice to see that the QBO-MJO relationship is well captured in this machine learning model. Yeah, surprisingly so. And it reminds me of a question about your input domain, this one. You use the whole domain, right? Have you considered using only the MJO-active area, like the Indo-Pacific warm pool area? Because it seems to me there could be a lot of noise in this input that may not be necessary. Oh, you lost the screen. Sure, let me share it again. In general, we haven't looked at limiting the longitudinal domain, because we thought RMM is a global perspective. We have looked at limiting different latitude bands, taking narrower or wider bands, and that didn't make too much difference. And I'll point out, you can see my screen again, right? I know it's not in full-screen mode. This plot here is another one of these LRP composites, showing more variables than just OLR. If you look at panels E and F, this is where the model is looking to get the U200 information, and you see that, at least in this case, it actually looks over the central and eastern Pacific, which kind of surprised us. So there may be advantages to limiting the domain, but because RMM is a global index, my guess is that it wouldn't make too much of a difference. We could explore just giving it the warm pool to confirm that. It's a good suggestion, thank you. Yeah, thanks, Jaime. Judith?

Yeah, thanks very much for this talk. I have a question about exactly this slide you're showing. I don't have a lot of experience looking at these relevance maps, but if I look at them, I would say the relevance just looks high where the amplitude is high.
My question is: can you filter that out and see if there's more in there? Or maybe look at the spatiotemporal structure of these relevance maps, to see how big an area is relevant beyond what appears at a quick view? Yes, I think the answer to that question is probably, and we have been thinking hard about exactly how to do that. You're right that these composite plots basically show that the model looks where the composite is large. But because you have a relevance map for each prediction, you do have latitude, longitude, and time data for your relevance. One example: if we subset, say, strong MJOs in the same phase but under different large-scale states, so that you control for things like the amplitude difference, then looking at the difference between those relevance maps might be illuminating. I also think there are examples, though we haven't found this yet in our own work, where the relevance doesn't just highlight the maxima of the composites. Kirsten Mayer has done some work on sub-seasonal forecasting and found that the relevance sometimes does give you more information. So I think it's an area where, if you're creative in how you set up the problem and use this output, yes, you probably can leverage it more. In these composite views, though, it makes sense to me that this is what pops out first. And just as a comment, I wonder if it would be helpful to coarse-grain what you have and really bring down the spatial dimensionality. That's just a gut feeling. Yeah, we're at two and a half degrees right now for this, so we could think about that; it's similar to what Jaime was suggesting. Yeah, thanks. Great. Thanks, Judith. Thanks, Zayn. Jacqueline, you have a question?

Yeah. Hi, Zayn. I have a little more background in machine learning, so I'd like to ask about this: the confidence being above the 60th percentile. Does that mean 40 percent of this relevance is not significant? Is that a good interpretation? And can you explain what would happen if it were above the 90th percentile, for instance, in the same map? Yes, I guess I could have explained that more. All the confidence means here is that we're taking the 60th percentile of the most confident predictions. So it's not the confidence of the relevance; it's the confidence the model has in making that prediction. And if you use no confidence threshold, the map looks basically the same, except that the colors are a bit muted. The reason we did it here is that in the paper we wanted to tie relevance to confidence, but at least for these studies we haven't found that it makes a huge difference. And if you go up to, say, the 90th percentile, as you're suggesting, for our work you get down to very, very small sample sizes at that point, because you're subsetting to correct predictions, above a certain confidence, in a certain MJO phase, and we don't want to go too narrow. But if we had something like a climate model run, or longer statistics, and that's a direction we're maybe going in, then I think you could look at that upper echelon and see how the confidence changes across predictions. All right, thank you. Yeah. Thanks, Jacqueline.
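As an aside, here is a small sketch of the confidence subsetting described in that answer. The function and argument names are mine, and whether the percentile is taken over all forecasts or within the subset is a detail I'm guessing at; the idea is that "confidence" is the winning softmax probability.

```python
import numpy as np

def confident_composite(relevance, probs, obs_cats, phase, pct=60):
    """Composite relevance over confident, correct forecasts of one phase.

    relevance: (n, n_lat, n_lon) LRP maps; probs: (n, 9) softmax output;
    obs_cats: (n,) observed categories; phase: category to composite."""
    pred = probs.argmax(axis=1)          # predicted category per forecast
    conf = probs.max(axis=1)             # model confidence per forecast
    keep = (pred == obs_cats) & (obs_cats == phase) & \
           (conf >= np.percentile(conf, pct))
    # Return the composite map and the (possibly small) sample size.
    return relevance[keep].mean(axis=0), int(keep.sum())
```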
We're eight minutes over, so feel free to leave, those of you who have to. Thanks for staying on. I had one other question, Zayn, if you're okay to stay on. Yeah, yeah, I'll stay. So, in Hai Lin's talk yesterday, he showed how the NAO can influence MJO initiation. Have you thought about including extratropical variables that could potentially improve this prediction further? We have not. I have thought about it, and that's as far as I've gotten. You could imagine a different model that used only the extratropics, say; that would probably do pretty poorly, but if it did well at all, that would be interesting. And you could also think about combining the tropics and the extratropics. What we've shown is all 20 North to South; we've gone out to 30, but we haven't looked at giving it something like the NAO pattern. I think you could do that, and to me that just underscores that these models are flexible: if you're creative and you ask the right question, it's a good direction. But no, we haven't done it, and we're not planning on doing it in the near future. Great. Yeah, really nice work. Thanks. Thanks again, Zayn. And thanks to all the speakers today, and all the student groups as well.