We move to our next talk, by Dr. Sam Shen. Sam is a distinguished professor of mathematics and statistics at San Diego State University and a visiting researcher at Scripps in San Diego. Sam also gave a really nice talk in our colloquium a couple of weeks back. Thanks for that, Sam, and we look forward to your talk today.

Well, thank you, Anish. Thank you, Judith, for putting together such a nice program. We have enjoyed excellent presentations. So my presentation today is about the super-ensemble CCA method and the super-ensemble joint EOF method for S2S prediction of US precipitation. This is joint work with Tom Smith and Ralph Ferraro of NOAA STAR. It is basically a paper by Smith et al., 2016, but with some new development. So our goal is to predict the US precipitation. For monthly, we have a one-month lead. For weekly, we have a one-week lead. And so what is the main idea? You can see this figure. We have colored different domains, so that each domain serves as a predictor.

So there is a bit of history to this. In the 1980s at Scripps, Tim Barnett and Rudolph Preisendorfer developed this CCA method with EOFs. They made very substantial mathematical progress. That is, in classic CCA, you need to invert a matrix. But in this weather forecasting business, that matrix is not invertible. They worked in the EOF spectral space, and so their method avoids the inversion of a matrix, which makes it fast and also makes it possible. And then in the early 1990s, Tony Barnston and Tom Smith introduced this method at CPC and made it an operational model for seasonal forecasting. What they did is use SST as the predictor and US temperature and precipitation as the predictand. And with that kind of prediction, there is a problem. The problem is that you use the global SST, and what happens is that the tropical SST always dominates everything. Just like in a room, we have so many people talking.
Say we have 10 people talking. One person is so loud, and that is the tropical SST, that you cannot hear anybody else. But for US temperature or precipitation, sometimes the North Atlantic becomes important. Sometimes the North Pacific becomes important. Around the year 2000, I was on sabbatical at Goddard, in Bill Lau's group, and Bill gave me this problem: let's look into this, and how can we do better? So we decided to divide the ocean into different regions, let each region predict, and then put them together. Just like in the room, we let each person speak, and then assemble them together according to their error. If they have a larger error, they get a smaller weight. If they have a smaller error, they get a larger weight. So that is the basic idea here. And then what Tom did in 2016 is show that we can actually use the oceanic precipitation as a predictor. We can even use the previous month's US precipitation as a predictor, so we can increase the number of predictors.

So let's get to the definitions and a math review. Again, I'm a math professor. So I always like to analyze what the original mathematics problem is. If we do a forecast, do we understand the mathematics? What is the essential idea of the mathematics and the method? What are the assumptions? Is there anything wrong? So the first is CCA. We know correlation: that's the correlation of two time series. CCA is where you correlate two fields. One field would be, say, tropical SST. One field would be US precipitation, for instance. So there are two fields, and you try to maximize the correlation. You put a weight on each grid box for each field, create a time series for each field, and then maximize the correlation between them. When you use this method for prediction, and you're using the Tim Barnett method, this becomes a simple regression problem.
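A minimal sketch of CCA carried out in EOF space, in the spirit of the Barnett method mentioned above, may help make this concrete. It is an illustration under stated assumptions, not the speaker's code: the function name `cca_in_eof_space`, the truncation numbers `kx` and `ky`, and the use of NumPy are all hypothetical choices. What it tries to show is that once both fields are represented by a few orthonormal principal-component time series, the canonical correlations come from a single SVD and no covariance matrix has to be inverted.

```python
import numpy as np

def cca_in_eof_space(X, Y, kx, ky):
    """Hypothetical helper: CCA between two fields after EOF truncation.

    X: (time, space_x) predictor field, e.g. SST in one ocean region.
    Y: (time, space_y) predictand field, e.g. US precipitation.
    kx, ky: number of EOF modes retained for each field (assumed values).
    """
    # Work with anomalies (remove the time mean at every grid box).
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # EOF decomposition via SVD; columns of U are orthonormal PC time series.
    Ux, _, _ = np.linalg.svd(X, full_matrices=False)
    Uy, _, _ = np.linalg.svd(Y, full_matrices=False)
    Px, Py = Ux[:, :kx], Uy[:, :ky]
    # The retained PCs are orthonormal, so their own covariance matrices are
    # identities and CCA reduces to an SVD of the cross-product matrix.
    A, rho, Bt = np.linalg.svd(Px.T @ Py, full_matrices=False)
    u = Px @ A        # canonical time series of the predictor field
    v = Py @ Bt.T     # canonical time series of the predictand field
    return rho, u, v
```

Each pair of columns `u[:, i]`, `v[:, i]` has correlation `rho[i]`, so a forecast of `v` from `u` is a one-variable regression per canonical mode.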
So the mathematical computing at the prediction stage is very short, very easy. The second method is called joint EOF. You can compute EOFs for temperature and EOFs for precipitation separately. But you can also take your data, say temperature data or precipitation data or any kind of space-time data, and put space in the rows and time in the columns. Then you stack the fields on top of each other. The top is, say, precipitation. The bottom is, say, temperature. And then you compute the EOFs of this tall, stacked-together matrix. When you use that to do prediction, think of it this way. Instead of having each column hold both fields at the same time, you shift one field by one week or one month. Then at the last column, for the part to be predicted, you have a blank. That is missing data. So with that idea, your prediction problem becomes a missing data problem, and we use multivariate regression to fill in the missing data. Basically, you change the temporal problem into a spatial problem. That kind of mathematics is incorporated in this paper.

And what do we mean by super-ensemble? That means we assemble things together with consideration of error. A member with a larger error gets a smaller weight. A member with a smaller error gets a larger weight. Super-ensemble CCA means each member is predicted by CCA. Super-ensemble joint EOF means each member is predicted by joint EOF.

Let me give this flow chart. In my last talk two weeks ago, I emphasized flow charts and reproducible results. I feel that we all should follow clear logic, clear mathematics, and clear assumptions, so that we know what we're talking about. So this illustrates the SE-CCA method. You have predictors, for instance SST from different regions. You have a predictand, for instance the US precipitation. This is only an example.
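The joint EOF forecast described above, where the column to be predicted is treated as missing data, can be sketched as follows. This is a hedged illustration, assuming anomaly fields, a lag of one time step, and a least-squares fit of the EOF coefficients to the known part of the last column. The function name `joint_eof_forecast` and the truncation `k` are hypothetical, and the regression used in the paper may differ in detail.

```python
import numpy as np

def joint_eof_forecast(P, Q, k):
    """Hypothetical helper: one-step forecast by filling a missing block.

    P: (n_p, T) predictand anomalies, e.g. precipitation; columns are times.
    Q: (n_q, T) predictor anomalies, e.g. temperature; columns are times.
    k: number of joint EOF modes retained (assumed value).
    Returns the forecast of the predictand one step after the last column.
    """
    n_p = P.shape[0]
    # Lagged stack: predictand at time t+1 on top, predictor at time t below.
    Z = np.vstack([P[:, 1:], Q[:, :-1]])
    # Joint EOFs are the left singular vectors of the tall stacked matrix.
    E, _, _ = np.linalg.svd(Z, full_matrices=False)
    E = E[:, :k]
    E_top, E_bot = E[:n_p], E[n_p:]
    # New column: the bottom block is known (latest predictor), the top is blank.
    # Fit EOF coefficients to the known block, then read off the blank block.
    coef, *_ = np.linalg.lstsq(E_bot, Q[:, -1], rcond=None)
    return E_top @ coef
```

In this form the time lag lives entirely in how the matrix is stacked, which is one way to read the remark that the temporal problem becomes a spatial one.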
Then say you are at time t. You have several predictors, like the tropical Pacific, the North Pacific, et cetera. For each of them, you decompose the field by EOFs so that you represent the field in the spectral way. Then you have your US precipitation as the predictand, and you do the same EOF decomposition. After that, you formulate a simple regression, just like y = a + bx. You do the regression and then you estimate errors. We have also produced an error formula. Then you produce these five predictions, and according to their errors, you put them together as your ensemble prediction. So this is the flow chart of the idea and the mathematics.

We developed this method around 2001 and 2002 when I was on sabbatical at Goddard working with Bill Lau. We produced a long NASA technical memo that documented all the mathematics, and then we published a paper. The target was to try to overcome the spring barrier. Statistical forecasting, if it is just one shot, is linear forecasting. But because of the error weights, it becomes nonlinear, so we call it quasi-linear. That means you try to find dynamic connections, which spot is connected with which spot, like, for instance, El Niño with the Southeast US, et cetera. Then you can try all kinds of combinations and see if you can overcome certain problems. In 2003, Kingtse Mo introduced this to CPC, and since then CPC has been using it as an operational model. So CCA is used there, and the ensemble is also used there now. A further improvement came from our joint work with Tom Smith. That is, we introduced the joint EOF and tested precipitation as a predictor. And then we are trying to make it simple enough for classroom teaching. This is still ongoing. Here is an example of the skill.
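The regress, estimate error, and combine steps of the flow chart can be sketched in a few lines. This is a simplified illustration, assuming one scalar predictor series per member, a fit of y = a + bx by least squares, and weights proportional to the inverse of each member's mean-squared residual. The talk mentions an error formula but does not state it, so the inverse-MSE weighting and the name `super_ensemble_forecast` are assumptions.

```python
import numpy as np

def super_ensemble_forecast(x_train, y_train, x_new):
    """Hypothetical helper: error-weighted combination of simple regressions.

    x_train: (n_members, T) one predictor time series per member.
    y_train: (T,) predictand time series.
    x_new:   (n_members,) latest predictor value for each member.
    """
    preds, mses = [], []
    for x, xn in zip(x_train, x_new):
        b, a = np.polyfit(x, y_train, 1)       # slope b, intercept a
        resid = y_train - (a + b * x)
        mses.append(np.mean(resid ** 2))       # member error estimate
        preds.append(a + b * xn)               # member forecast
    preds, mses = np.array(preds), np.array(mses)
    # Assumed weighting: larger error gives smaller weight; weights sum to 1.
    w = (1.0 / mses) / np.sum(1.0 / mses)
    return float(w @ preds), w
```

Each member forecast is linear in its predictor, but the weights depend on the fitted errors, which is consistent with the description of the combined forecast as quasi-linear.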
Say we use GPCP, the 1-degree GPCP, as validation, the ground truth, against the forecast. And then we use correlation. For 1997 to 2014, we do a loop, and we compute a temporal correlation for each corresponding grid box. You can see that joint EOF in June has good skill in the middle part of the US. The CCA has some skill that overlaps with joint EOF, but they are actually in different places. So that is good. They are not the same. When we put them together, joint EOF and CCA, that's better. That is the third one. We don't see as many of those purple and gray regions. The gray and purple regions are getting smaller. And what about the whole US average? We see that it's about 0.3, 0.4, and when you put CCA and joint EOF together, it's better.

Then, how well can you get the total US precipitation right? We did the forecast, and you can see we got something good. Of course, this is the El Niño event, and we got it really right. But we got this big bump, and the spatial correlation goes up and down. We also see that El Niño helps. Since El Niño helps, and the southern part of the US is where we do better, we looked at the south. We tested the south, and we have a better correlation. This is better skill compared to the whole US. Of course, we tested many other regions. Some regions are not as good.

Another test of the skill is the hit map, the tercile map: above normal, normal, below normal. This is the hit map, and this is the miss map. A miss is a bad hit: the observation is supposed to be in the top third, but the forecast gives the bottom third. So this is the bad part, and this is the good part.
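The two skill measures described above, the temporal correlation at each grid box and the tercile hit and miss rates, can be sketched as follows. This is an illustration only, assuming forecasts and observations are stored as (time, grid box) arrays and that terciles are taken from each sample's own distribution; the function names are hypothetical and the verification code behind the maps in the talk may define the categories differently.

```python
import numpy as np

def gridbox_correlation(fcst, obs):
    """Pearson correlation over time at every grid box; inputs are (time, n_grid)."""
    fa = fcst - fcst.mean(axis=0)
    oa = obs - obs.mean(axis=0)
    return (fa * oa).sum(axis=0) / np.sqrt((fa ** 2).sum(axis=0) * (oa ** 2).sum(axis=0))

def tercile_category(x):
    """Label each value -1 (below normal), 0 (normal) or +1 (above normal) per grid box."""
    lo, hi = np.percentile(x, [100 / 3, 200 / 3], axis=0)
    return np.where(x > hi, 1, np.where(x < lo, -1, 0))

def hit_and_bad_miss_rates(fcst, obs):
    """Fraction of times the categories agree, and fraction in opposite outer terciles."""
    cf, co = tercile_category(fcst), tercile_category(obs)
    hit = (cf == co).mean(axis=0)
    bad_miss = (cf * co == -1).mean(axis=0)   # top third forecast as bottom third, or reverse
    return hit, bad_miss
```

With three equally likely categories, a forecast with no skill would be expected to hit about one third of the time, which gives a reference point for reading a hit map.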
We have probably not much detail to analyze here, but I see there is a difference between the blue region here and the blue region here. We have more misses in June when there is no ocean part. So the ocean precipitation is helpful. That's what we want to conclude here. It might not be a conclusion; it's a kind of hint. We also tested the average hits and misses. The average hit rate is about 30% to 40%. We probably want more, but we don't have it. Then we also compared with model predictions. This is the North American Multi-Model Ensemble. It looks like we're doing slightly better. Here the white region is 0.3, and our 0.3 is the blue and sky-blue region. This seems slightly smaller, but comparable.

Then we tested weekly. What about the weekly forecast? For every month, we just choose seven days, the second week, and run the same tests. The black one is the monthly correlation skill, and the red one is the weekly skill. The weekly skill, as we expect, is lower. But somehow in January, it's higher. That could be a coincidence. There's no theory to support it. And what we're doing now: this method is not hard. Mathematically, it used to be really hard, but now we have found a new way to simplify the mathematical theory. We're coding it in R using SVD. And we want to introduce this method for classroom teaching.

So, summary. We have analyzed the NOAA CPC seasonal forecasting tool, which is what is used there now. It's called ECCA. We have tested SE-CCA and SE-JEOF for monthly and weekly US precipitation forecasting. We're developing optimized SE-CCA and SE-JEOF. We also want to couple that with our 4DVD tool, and we want this tool to be used in the classroom for teaching. Thank you very much for your attention.

Great.
Thanks a lot, Sam, for a great talk on these tools. Any questions for Sam?

I had one, Sam. You've tested these largely for the US, the continental US region. Are there efforts to test this in other regions around the globe, and are there other challenges in terms of different regimes or different forcing? For instance, for the US, ENSO may be a big part. But if we go to Europe, it's not playing that big a role. How well can we adapt these methods for different regions?

Yeah. At that time, Kim, a postdoc from Bill Lau's group, and I tested many regions in the US. That's how we tried to overcome the spring barrier: the Great Plains area, the Southeast area, the Southwest area. And then we drew on the papers, you know, like the Ropelewski and Halpert papers, which say which one hits which one, this kind of thing. We tried to incorporate all the published information we knew into this model. That is possible because of the model's flexibility. But of course, it is still a lot of manpower; it has been labor intensive. That's why we're now trying to optimize this, so that would allow us to do more tests. It's like mass production. So the answer to your question is that we did it for the US, but it has not been done for other parts of the world.

OK. Great. I don't see other questions. Sam, thanks again for a great talk. Nice work.