Okay, so welcome. This is the second questions-and-answers session that we've had at CSDMS on artificial intelligence and machine learning, which I prefer to call machine intelligence so we can cut straight through that distinction. We only have a few papers in the session dealing directly with machine learning, but I think we can expect a real wave of papers in coming years. And in any case, even if you're not directly involved with machine intelligence, you're going to have to assess work in your careers that has been done using those tools. You might have to assess it, say, as a reviewer, or you'll have to work out whether you really want to use products that are generated by machine learning. So this is going to touch everybody. I was really impressed when Google said a couple of weeks ago that artificial intelligence is more important than electricity. That's pretty hard to beat, isn't it? So there you go; I hope you can live up to that.

On my part, I have to speak very modestly about my involvement with machine learning. Some years ago I used the random forest technique to produce mappings of the seabed around the Falkland Islands to help protect the Patagonian toothfish. That was my introduction to machine learning, and it worked very well, but I had to really wrangle with the problems of getting the data into good order, and enough of it, and making sure, for instance, that the different features, the parameters, had enough variance and that there weren't too many gaps, and so on. So: very base-level experience with machine learning.

So let me introduce the three panelists. Here we have Richard Barnes, who is associated with UC Berkeley, and he's involved more on the hardware side, on the scalability and efficiency issues of machine learning, so he deals with very large tasks. Then we have Dan Buscombe, who uses machine learning in geospatial applications. He's a co-author on a poster here today, and he's associated with a company, Marda Science LLC, I believe. But his real trade name is Doodles, so you can check out the little posters and badges on the table outside. And then we have Ty Tuff, associated with CU's Earth Lab, and he tends to tackle land and city management issues using machine learning. His LinkedIn says that he solves hard problems. So there we are; that's the introduction.

Very good. So to start the ball rolling, I have a couple of questions. The format is that we will have five minutes per exchange, per question. So we'll ask anybody with a question to craft the question well, so that it doesn't ramble and it gets right to the point, one sentence, and then we're earmarking five minutes for the answer from the panel. So it's a clipping pace, but last time we ran out of time completely and the meeting bubbled over past the clock. Okay, so here's my first question: how are physics-based numerical and process models being joined with AI models? What's the method?

I don't mind starting on that one. I'm not an expert in this field, but from the things that I've seen and read, there are a couple of different techniques I can point to. One is called physics-informed neural networks, for example, where you're actually baking the physical process into the machine learning model. And how does that work?
Let's say you have a partial differential equation that governs your system. You're trying to essentially capture the central parts of that: discretize it and make it a differentiable loss function that you can then minimize. So it's literally baking the physics straight into your machine learning model. Another way is just to use machine learning for parameterization, or for data discovery: you're actually using machine learning tools on large, high-dimensional data sets that you've collected, or on model outputs, to distill that information so you can pull out the kinds of patterns that you can't see with your eyes. Another way, which is a little bit separate from the methodological improvements that are necessary to solve physical problems with machine learning, is forcing models with outputs that have been generated using machine learning. That's the kind of thing that I do: I'm using machine learning on gridded data sets such as imagery to generate maps that can then elucidate physical processes, and numerical models can be used in validation of what I've done. But mostly I would say it's a mindset thing. I don't think the machine learning community is going to solve this problem for the numerical modeling community necessarily. I think a lot more numerical modelers will have to adopt data science concepts and techniques in order to distill machine learning into their models. There's a lot of overlap between what it takes to produce a machine learning model and a numerical model: you spend a lot of time with code, you spend a lot of time with data, you do a lot of data wrangling, you have to spend a lot of time with research infrastructure, and you have to do a lot of methods development before you get anywhere in terms of your science. Those things are the same with machine learning. So I think it's largely a mindset thing.

So I don't know that I can speak to the methods of how machine learning is being joined with physical and numerical process models, but I want to draw a distinction between those parts of science in which we intend to do prediction and those parts in which we intend to do explanation, and I think ML touches on those differently. When we're thinking about prediction, the kind of question that comes to my mind is: are our models already doing machine learning? I come from an ecological background, and in that world, for say fisheries management, we construct models with hundreds of parameters, and each one of them has some kind of physical manifestation that you can in principle measure. But when you then tune those models so that they match reality, that's a lot of flexibility. Carl Boettiger at UC Berkeley had a paper about a year ago on deep reinforcement learning for conservation decisions, and he said: what if we just throw this out and instead try to learn what nature is allowing us to predict from the data we have? And this outperforms the parameterized models by far. So switching to a paradigm where we're using ML to replace process-driven models in order to do prediction seems really fruitful to me. When we look at large language models, there's this idea of foundation models: models that have been trained on so much data that they've come to recognize something fundamental about the world.
And then you fine-tune those models to work on particular data sets. When I think about problems like time series analysis, maybe there's something fundamental going on there. Maybe models that learn on enough time series pick something up in a way that they then become foundational. And if that's the case, this isn't a one-off problem that each of us has; there's a kind of underlying unity, and perhaps we can find models that we can all leverage. Thinking about explanation as the goal of science: Google had a paper on evolving symbolic density functionals recently. In DFT, density functional theory, when you're trying to predict the surface properties of materials, you can't calculate all of the forces, so you use these density functionals to approximate them, and people have suggested various functionals over time. Google fed this to a machine learning program and asked it to try to evolve symbolic equations that we would be able to recognize as people, and it found both equations that people had been using for decades as well as entirely new equations. Greg Tucker's West Valley Demonstration Project work is one of those things where we dig in and ask: well, if we have a suite of equations, which ones actually end up predicting the world best? But those equations are drawn out of the literature; maybe there are things out there that we haven't yet thought of, and maybe we can find those using ML approaches. You can imagine a situation where you have a flume system in a lab and you just run it continuously for a long time, gathering data, and then feed that into an ML model and ask it: well, what do you observe that's conserved in the system? And for both of these cases, both prediction and explanation, the language in which AI and ML tend to be written is Python, the major frameworks are in Python, and I think that has a kind of effect for legacy models built in Fortran, built in MATLAB, built in languages that might not interplay so well. And that has effects for how we should think about designing our modeling as well. Thank you.

I think I'm on. Can you guys hear me? Hello? All right. So I am an ecologist by training and I work at a national synthesis center, where we sort of think about how to squeeze more questions into the same project and how to squeeze more and more data sets into the same project. So from that lens we're trying to think of things really broadly, and about applications of AI really broadly, and so when I was posed this question about physics, I came up with a couple of different ways that people are attacking this right now. One is hybrid models. These are what you just heard a brief explanation of: something where you're baking known physics into the model itself. Maybe this is an entire independent layer in your network, maybe there's some type of training bias that you put in, maybe there's some type of meta-parameter that you put on top that drives everything towards this hybrid solution where you're constraining the possible outputs. The second type is the exact opposite of that, where we're saying: how can we improve our existing physics knowledge with more data? We can take a ton of data and say, how good is F = ma? You can go and really test existing models. So you can see how those two different ways of thinking about the problem are both driving the problem forward in equal ways. The third way is model emulation.
So you think about something like your computational fluid dynamics model that takes a very long time to run, doing everything. If you can build an emulator that just emulates that process by learning what's going on and doing it quickly, this can speed things up. For example, there's a process called MESMA in remote sensing, where you decompose a cell into however many parts you want; it's really computationally intensive and a really good case for an emulator, and people have been working on an emulator that speeds this thing up maybe a thousand times. So it's something where you know the process, you just want it faster. So as you're thinking about the opportunities going forward for driving your science, I think the idea here is that this should be a tool that you have next to you. You know, there's this joke going around that scientists are not going to be replaced by AI; they're going to be replaced by other scientists who use AI. So it gives you superhuman strengths in a lot of ways if you know how to use it. I'm not saying it doesn't have problems, but the draw here is that it amplifies your ability to do your science. It's not doing your science for you.

Would anybody from the floor like to add something? It's a pretty central question for this group. Hold on, I'm going to bring a mic over because we do have people online. I didn't dare to ask Mark to do this again.

I wondered if you had a view on the differences between the AI and ML you're talking about and what I guess I'd call classical inverse techniques, which do a lot of the same stuff: they make predictions, they extract structure from data that you might not know about. Those are central to fields like geophysics but maybe don't feature so much in this discussion. The language is slightly different, but quite often the goals, and many of the processes, are the same: searching parameter space efficiently, finding the simplest possible model you can get away with, how you regularize, all that kind of business has been in geophysics since the 60s. I just wondered whether you think there is a very big difference.

Yeah, I'll start; maybe I can interpret it as: is there a significant difference between AI and classical inverse methods? I have zero experience with inverse modeling. I personally think that there is a lot more opportunity to use machine learning for those processes that are very, very poorly understood and not parameterized at all, where we don't have any first-principles understanding; examples of that might be social and biotic interactions with the physical landscape. So I could probably construct an answer around that, but I don't think I'm qualified to talk about inverse models. In terms of straight optimization versus ML, I would say they're intersecting circles. There are optimization problems that you solve directly, and there are optimization problems where you're doing a lot of modeling and searching for minima, such as with Bayesian and hyperparameter optimization. ML intersects that space, but extends into regions where you don't even know what you're optimizing. And that looks very different, I think, than a traditional optimization framework.
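To make the hyperparameter-search end of that spectrum concrete, here is a minimal sketch using scikit-learn's randomized search with cross-validation. The data, the model choice, and the parameter ranges are purely illustrative assumptions, not anything the panelists specified:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for a real tabular data set.
X, y = make_regression(n_samples=400, n_features=12, noise=0.3, random_state=0)

# Search over a small, illustrative hyperparameter space: the "searching for
# minima" part is handled by cross-validated random sampling of candidates.
param_distributions = {
    "n_estimators": [100, 200, 400],
    "max_depth": [None, 5, 10, 20],
    "min_samples_leaf": [1, 2, 5],
}
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,            # number of candidate settings to try
    cv=5,                 # 5-fold cross-validation for each candidate
    random_state=0,
)
search.fit(X, y)
print("best hyperparameters:", search.best_params_)
print("best cross-validated score:", search.best_score_)
```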
Yeah, I think the question here is about big data and how big of a machine you get. If we think back to the industrial revolution, the big thing there was that we made machines that amplified our physical ability. The big thing with AI is that we can now amplify our intellectual ability. If we're just using it to do current things smarter, it's pretty much an optimization problem, but I think where it comes in is that it can actually generate new ideas; it can actually predict things in a better way than an optimization problem can. So it's this future-looking, generative part where it's becoming this amplifying machine, not just doing some old-school thing a little bit better.

Just one more thing to follow up on that. What we're talking about with machine learning is sometimes called software 2.0, right? We're not baking our logic into the software; the logic is coming out of the software. In this scope, with inverse models I think you have your logic kind of worked out: you have an idea of what's causing what. That's called feature engineering in machine learning; you've engineered your data to a point where you've extracted certain features from your data that you can then plug into an equation that gives you what you need. Whereas machine learning is fundamentally different in that you're just providing the data, you're telling it that this data point relates to that point, and it's going to make that mapping for you, without doing any feature extraction whatsoever in lots of cases.

Thanks, very good. So the next one on my list here is: would you like to tell us about some tricky issues with assembling training data? What sorts of issues there are, what's on your mind with that.

I can start this one if you want. I would say they're all of the standard tricky issues that you have trying to automate anything or pump big data through any type of algorithm. Right off the bat, a major problem is your missing data: how do you impute data, how do you fill in those gaps? Your second big challenge is: do I have enough data? And likely the answer is no. I think the big thing that we've learned from this ChatGPT craziness in the last year is that they're now working with on the order of a trillion parameters. To get optimal performance out of these things, you want a ton of data and you want a ton of parameters, and that's the struggle that people run into. And then I think the third big challenge that piles on top of that is the different styles of data going into the thing. You always had to normalize your data and standardize your data, but machine learning can be so sensitive to those input distributions that you want to be really careful about that, and you have to iterate and make sure that you've done it properly. So I would say it's the same struggles we've always dealt with; they're just at such a bigger scale that they're a lot bigger headache and they pop out in new, annoying places.

For sure, out-of-distribution problems are personally my biggest headache. Those are situations where you've trained your model on insufficient data: it's overconfident on the data that you trained it on and does not generalize very well to other places, other seasons, other situations.
There are a lot of strategies in place to prevent models from becoming overconfident, things like data augmentation and active learning and lots of different training strategies, but they all have their limits. And we're noticing that we can't model around those limits; in many cases you just have to acquire more data. And then you're in a space of knowing when to stop acquiring new data. Machine learning projects tend to be quite iterative; you have to have a lot of patience to get them right. Specifically with the work that I do, because models are trained using what are called mini-batches, the model only ever gets to see a small amount of your data set at once. It's therefore very bad at learning things like spatial correlation and temporal correlation, because it doesn't get to see all of your data all at once. Those are big problems for us: using snippets of data, not baking in autocorrelation, not using time as a variable, and we need to get better at that. And finally, I would just say that a more general problem is that the model work is a lot more glamorous than the data work. Data labeling is quite important in machine learning, and I think its importance is being understood and recognized, but there is always more unlabeled data than labeled data, and unsupervised models are still not as good as supervised models. We're still in a landscape where we need to label data, to do it efficiently, and to track the errors in that data, and we're still working out a lot of the details, at least in the geosciences.

Well, when I think about data, I think about the quality and the quantity of the data that we have. In terms of quantity, when you look at the recent history of language models, it turns out that there are predictable scaling laws that allow us to say ahead of time how big of a model we would need in order to get some sort of performance, and also how much we need to train that model. And I think it's interesting that Ty mentioned ChatGPT uses something like a trillion parameters; that follows out of the scaling laws: we want better performance, so we build a bigger model. But this is just the paradigm in which we're currently operating. Google released a model called Chinchilla, which performs similarly to ChatGPT, and they find that most large models are in fact undertrained: they don't have sufficient data. So for a constant compute budget you should spend more time training a smaller model rather than more resources training a bigger model. Chinchilla in fact comes in at one half to one seventh the size of comparable models, uses four times more data, and gets better performance. So when I think about quantity of data: global satellite missions are getting all kinds of data that we didn't have before, and I think there's a wonderful opportunity to think about what we would do if we had a large amount of data, what kind of ML that would make possible, or conversely, if we're thinking about solving an ML problem, what kind of data inputs we would need in order to effectively address it. At least for these kinds of models, the scaling laws allow us to make predictions about that in a way that could potentially guide the acquisition of data or the deployment of sensors.
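As a rough illustration of the kind of back-of-the-envelope calculation a scaling law permits, here is a sketch using the widely cited Chinchilla-style rules of thumb: roughly 20 training tokens per parameter for a compute-optimal model, and training compute of roughly 6 times parameters times tokens in FLOPs. Those constants are assumptions drawn from the language-model literature, not numbers given by the panel:

```python
# Back-of-the-envelope, Chinchilla-style scaling arithmetic (illustrative only).

TOKENS_PER_PARAM = 20       # assumed compute-optimal ratio of tokens to parameters
FLOPS_PER_PARAM_TOKEN = 6   # assumed training cost: ~6 FLOPs per parameter per token

def compute_optimal_tokens(n_params):
    """Approximate number of training tokens for a compute-optimal model."""
    return TOKENS_PER_PARAM * n_params

def training_flops(n_params, n_tokens):
    """Very rough estimate of total training compute in FLOPs."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens

for n_params in (70e9, 280e9):  # a smaller vs. a larger model, for comparison
    tokens = compute_optimal_tokens(n_params)
    flops = training_flops(n_params, tokens)
    print(f"{n_params / 1e9:.0f}B parameters -> ~{tokens / 1e12:.1f}T tokens, "
          f"~{flops:.1e} training FLOPs")
```

The point being made above falls straight out of arithmetic like this: for a fixed compute budget, a smaller model trained on more data can be preferable to a larger, undertrained one.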
There are also projects like Zooniverse, which hosts Snapshot Serengeti, and this is a way of using citizen science to crowdsource labeled data. In the case of Snapshot Serengeti they label something like a million images a year. You can identify animals in them, and then you can train ML to recognize those animals and get a kind of virtuous feedback loop where the ML is able to tell you, "I'm uncertain about this one, could you please get someone to label it, but I'm very certain this is a giraffe." In terms of quality of data, one of the things that emerges from the medical fields is that when you try to train something to recognize, say, breast cancer, oftentimes what you find is that it learns something about the type of imagery that you're doing, and that allows it to learn things that it shouldn't. You wanted it to learn that this was a tumor, but instead it learns that people that go to this hospital, that take this kind of image, tend to have more cancer. So there's a real trick in trying to make sure that you're not leaking that kind of data; I won't go down that route. In ImageNet, this competition to build better and better classifiers of images, there are about 500 to 1,000 concepts and each concept has about 500 to 1,000 images associated with it. But when you dig into ImageNet, what you find is that there's a lot of ambiguity: there will be a picture of an apple, but a boy will be holding it. Is it a boy or is it an apple? ImageNet tells us it's an apple, but a machine learning network is going to identify both concepts. It turns out you can dig into these networks a little bit and find that at the lowest level it knows both things are present, and then the upper levels get a little bit murky as it tries to sort out what you, the user, actually intended it to say. And one other note I would make is that maybe there are privacy issues associated with gathering the large data sets that we might need in order to train models; the US Census for the first time is looking at using differential privacy. Looking into those kinds of things might be useful in terms of thinking about acquisition of large data sets.

So are there any questions from the floor? It's a good time to take a break from the set questions and open it right up.

The statistical community has raised concerns for years about the way many scientific fields misuse statistical methods. As these models become more complex and more black-boxy and more easily accessible and usable by people with no experience in the underlying guts, what kind of guidance do you have for how we avoid inadvertently misusing or overextending them when we don't have the ability to really dig in and understand the innards of the models?

Yeah, so there are a couple of things in there that are really meaningful. One, I think, is that you ended with the interpretability of the models, which I think is at the base of a lot of the problem you're talking about. You get something like ChatGPT with a trillion of these parameters; it's absolutely impossible for the human brain to understand what's going on there. Maybe if you make a small one with 100 nodes, you could spend the next year figuring out what all the nodes do by turning them on and off.
But when you get to a trillion, you're not going to know, so that lack of understandability of the models, I don't see going away anytime soon. And now the statistical misuse problem. I may be an optimist here, but I think that it's going to help and not hurt the cause, because the AI can help people stop making silly mistakes. Right now the barrier of entry to statistics is so high, even if you just want to be a frequentist. It's really high, and then if you want to be a Bayesian you're going even further, and then you want to do some crazy modification of either one of those. So how do we make it so an elementary school kid can do a reasonable analysis? Well, I think it's something like a big AI model that's going to safeguard against a bunch of stuff. I'm not saying that doesn't have problems that come with it, but I think that it's going to lower barriers of entry for people and make things more usable and more understandable for the user, even if the networks start to get so big that we have no idea why they're making the choices.

That's a fantastic answer and I can follow up on that. One thing I would say is: don't trust your validation metrics. I think it's just a simple thing. A lot of people really trust their validation metrics: they look at them, the metrics tell them the model trained okay, and then they run with it. That is a very bad idea, and because of the reasons that we were talking about earlier on, it's actually quite likely that it's going to fail. You have to have independent metrics, and you have to have multiple metrics. You have to understand all of those metrics, and you have to be careful in how you report those metrics. And ideally, you have to get some independent verification of at least a couple of those metrics in order to be really confident. So I guess what I'm saying is that machine learning should be treated as just one tool in a suite of tools to solve any problem.

And I guess I also have an optimistic take on this. I was reading some lovely papers about the efficacy of the keto diet earlier this year. They had something like 25 people in the study, and then five people dropped out, and then they had to split the study, and they had calculated the number of people that they needed to achieve 80% power at 0.05 statistical significance, and their study dropped below that. But they had a whole paper full of stats, t-tests, p-values. I don't understand the purpose of all of those, but it seems to me that part of the purpose is that if I do this particular set of tests and numbers come out, then the paper somehow becomes acceptable science. Where ML helps with this a little bit, I think, is that it changes the onus: the ML has learned something, and now it seems like maybe I can concoct a reasonable set of things I would want to see in a paper in order to believe the ML, whereas I have a tendency to just believe the stats. Does that kind of make sense? Like, I would want to see model cross-validation, very important; I want to see an out-of-sample test; and I can just ask, are these present? That doesn't solve the problem of having done it wrong, but it solves the problem of looking at the stats and thinking they're somehow authoritative; I look at the ML and I know it's not. What I'm looking for is people that are trying to correct for that problem, and it's easy to see whether those corrections are present or not.
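As a concrete illustration of "cross-validation plus an independent, out-of-sample check with multiple metrics," here is a minimal sketch on synthetic data. The estimator and the metrics are illustrative choices, not a recommendation from the panel:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import mean_absolute_error, r2_score

# Synthetic stand-in for a real data set.
X, y = make_regression(n_samples=600, n_features=8, noise=1.0, random_state=0)

# Hold out an independent test set that is never touched during training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

model = GradientBoostingRegressor(random_state=0)

# Cross-validation on the training portion only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")
print("cross-validated R^2 scores:", cv_scores)

# Then report several independent metrics on the untouched holdout.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("holdout R^2:", r2_score(y_test, y_pred))
print("holdout MAE:", mean_absolute_error(y_test, y_pred))
```

Note that even this only checks in-distribution performance; the out-of-distribution concerns raised earlier still require genuinely independent data (other places, other seasons).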
You know, there's an expression that we sometimes hear in science, a "fishing expedition," and it's usually used in a kind of derogatory way. But hearing you talk about what science is increasingly maybe going to look like, I'm wondering: do you envision a time in the near future when a common approach to a PhD, let's say in some Earth surface science field, will be: there are lots of data, satellite images or whatever, for some area; clean them up carefully to avoid the problems of missing data and so on that you're referring to; run some kind of algorithm on it to see what it sees; and then spend the next three years figuring out why it sees what it sees?

I don't see why someone couldn't do that. But I think the question is: is there a broader narrative that comes out of that? If we figure out why it sees what it sees, does that allow us to do something new with the world? And I think that's what probably differentiates a good dissertation from a not-so-good dissertation. That seems entirely plausible to me.

I hope that doesn't come about, if only because I don't want each individual person to have to clean the data. Let's start from a position where everyone has access to all of that data and all of that data has already been cleaned, and then maybe they don't have to spend three years in that discovery phase. But I think increasingly machine learning is being taught at universities, and if students have been exposed to it, I think they're naturally going to want to use it. The onus is on professors and individual schools to try to rein that in a little bit, but also to facilitate their exploration, because I think machine learning could be important for them even if it ends up not working for their individual project. It's something that they're certainly going to need training and understanding in for the wider world.

I would go back to my earlier statement that the successful scientists are going to use AI as a tool in a way that we currently don't. So I'm imagining something more like: the graduate student sits down and has ChatGPT open every day, and they're working together. "Chat, I have this idea, what do you think?" "Oh, that's right and wrong because of this." "Okay, find me some data." "Okay, here's some data." "Okay, I'm pulling that in." In the academic culture of humans, we still have to get through committees and we still have to show a new, novel idea, and it's going to be really hard to get an AI to generate a novel idea that actually gets you past that. But as a tool, every step of the way, it's just making them go faster and have a better final product than the rest of us who probably aren't using it.

Back one more time to your particular question: I think this question of what's going on inside the network is worth a PhD right now; that's definitely the case. The interpretability is huge, and when you look in neural networks you can find neurons that correspond to the concept of something like a spider, both visually and textually. I was talking with Laurel Larsen yesterday and she was telling me that in the time series prediction models that she does, when you dig into the network you can find neurons whose values track something like the time series of soil moisture, even though the soil moisture is not provided to the network.
So the network is learning something about the world even though it doesn't have direct access to those observations, and that seems like kind of a cool thing to learn.

Thank you. Do you want to follow up on that, Greg? Why isn't ML just like building a new type of telescope? You gather results from the telescope and then you interpret them. Why isn't it just like a new piece of the toolkit?

Good question. There have been experiments in which ML directly inferred the existence of something like the F = ma equation, and I don't know if that's been done for planetary motion. Certainly this idea of building epicycles off of what at first seems like a simple theory is something that could come up, but I think we're doing that already a little bit with our physical theories: they work very well in some regimes, then you approach critical thresholds and you start to have to add a lot of corrections, and then you pass the threshold and you have a different theory. And maybe there's something that unifies those and we haven't found it yet. So I don't know that we're doing much worse than an AI would in terms of producing weird theories, but we know that they can produce reasonably good theories for phenomena that we already recognize.

Just one more thing along those lines: there's the Ptolemaic view of how planetary motion works and there's the Copernican view. The Ptolemaic view is completely wrong in terms of putting the Earth at the center, and it has epicycles and whatever, but it makes very accurate predictions. And the Copernican view at first was more correct but still wrong, because it assumed circular motions when they were elliptical. Do you think that AI could come up with both? Could it come up with the one that's right for the wrong reasons as well as, maybe, the other one?

Definitely another interesting example. One of the things you observe when you're training a network is that it has a loss function, and that loss function, if you're training well, drops over time as the network learns the data better. And once in a while what you'll see is that the loss function drops pretty dramatically as the network comes upon some kind of new concept that explains the data better. So if you were to freeze it before you reach that kind of drop, what you would see, if you understood the network, is one theory of the world, and what you would see afterwards is probably a pretty different theory, and that shift between them happens pretty suddenly.

I was going to say exactly the same thing as you, Richard: you have the loss landscape, right, and you're finding multiple solutions. And it's all to do with the training of the model: you know what your ultimate solution is, but you have every opportunity to go back and look at all of the intermediate solutions that the model found. Because it's done in such a structured way, you can examine it, you can freeze it; you have the weights of the model at every single point. It becomes a trivial thing to look back: you play that model's learning experience back, and you can then interpret whatever loss plateau you found within that loss landscape. Having models that are correct for the wrong reasons is a very common thing in machine learning; it happens all the time. And that's why we try to develop model explainability techniques that try to interrogate the black-box nature of these models.
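A minimal sketch of what "freezing the weights at every single point" can look like in practice, here in PyTorch with a toy model and made-up data (all names and sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Toy data and model, purely for illustration.
X = torch.randn(256, 8)
y = torch.randn(256, 1)
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

loss_history = []
for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    loss_history.append(loss.item())
    # Save a checkpoint each epoch so every intermediate "solution" can be
    # reloaded and examined later, e.g. on either side of a sudden loss drop.
    torch.save({"epoch": epoch, "state_dict": model.state_dict(),
                "loss": loss.item()}, f"checkpoint_{epoch:03d}.pt")

# Later: reload the model as it was at any point in its learning history.
ckpt = torch.load("checkpoint_010.pt")
model.load_state_dict(ckpt["state_dict"])
```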
But as a non-theoretician and a non-physicist, that's all I have to say on this. I keep thinking about the parable of chess: chess was the unbeatable game, and it was the pinnacle of human expertise to be able to beat somebody else at chess. And then AI comes along in '97 and beats the world champion at chess, and people didn't stop playing chess; it has actually increased exponentially since then, because more people are able to learn chess and play chess and do chess things. Related to that is the Go story. For the Chinese government, game three of the Go match was a complete existential crisis, because, again, this was national heritage, everybody plays this game, and the AI makes a new move, a move that had never been recorded in any historical game of Go, ever. And it won. So for this entire culture it went from "we're not doing AI, we're not doing AI, we're not doing AI" to existential crisis: let's double down, we now realize how big of a tool this is. And then there's the F = ma one; this is another famous story where they put in a double swinging pendulum, just fed in the coordinates, and within 24 hours the thing spit out F = ma. The AI can just work through these processes quickly.

I think it's a lot more about honing in on the question. In the double pendulum case, they only gave the AI the coordinates of the double pendulum through time, and it learned from that information. So the scope of the learning that the AI does determines what sorts of processes it can learn, what new conclusions it can come to. But certainly I think if you gave it a couple thousand years of orbital trajectories, it would pretty quickly figure out what was driving them.

From the story I've heard, they set up a physical double pendulum in the lab with a little computer vision tracker, and it was just putting the real-time coordinates of this real thing in there and seeing what came out.

My question is whether you can suggest some first steps for how we can get more ML methods incorporated in the types of modeling that we do, because it seems like it's been a bit of a slow uptake. What do you suggest as ways to accelerate that?

So I put all my notes for today in a blog post that has a bunch of citations, probably 50 citations in there, to just get you started if you want to read. I'll give you the link; it's our Earth Lab blog, from upstairs, so at least that should give you some starting places. But I think the best place, if you are coders, is just to start playing around. Find some example code online, just run through a vignette, and boom, you realize pretty much all of this stuff has been abstracted down into a function. So it's not that hard to set these things up: you call a function that feels familiar and you start playing around with it. Usually the syntax is set up to look like something older that you're used to; in R they mostly use the linear model syntax. So it's a syntax you're accustomed to, you just throw a new function name in there, it gives you some results that have some different nuance, you spend some time figuring out that nuance, and I think you'll pretty quickly get comfortable with the process and move forward. Did you have a perspective?
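For readers who want that "run through a vignette" starting point in Python rather than R, here is a minimal scikit-learn sketch on synthetic data. The fit/predict pattern stays the same whatever estimator you swap in; the data and model here are illustrative assumptions only:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in for a labeled data set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The technique is "abstracted down into a function": construct, fit, predict.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```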
Um, so when I think about how to get started using ML in my models, the big thing that concerns me about models is how to make them efficient. When I'm running a computational model, what I want to do is smooth out the stochasticity of the world so I can get at some kind of underlying truth that a case study in nature isn't going to provide me. So I'm thinking: where can I substitute something cheap for something expensive? And I look at differential equations, especially when I have implicit methods where there might be some kind of iterative process but the input data is maybe bounded, and I think about a little machine learning model that, given the input, just predicts the output without having to do that kind of iteration on a potentially complex equation. And, I don't know, there are so many ways to answer your question, so if you have a way to scope it down, I'd love to talk.

Yeah, and the only thing I was going to add is that it really takes a village and I think we just need more community. You know, we have a community for numerical modeling, with open-source codes; it's fantastic. We don't have the same thing for machine learning. Maybe we need that, and it could be inside this group or it could be outside this group, but I think there just need to be more people doing it, more people connected to one another, and the benefits of that are just exponential.

How good is scikit-learn? scikit-learn is fantastic for simple problems on small data sets, and it's getting better and better all the time. Intel has some hardware acceleration now that links specifically with scikit-learn. There are more and more data sets available, more models. And not just simple data sets; scikit-learn will do okay with even larger and more complex data sets. Beyond that range there are of course neural networks, but at least on tabular data something like XGBoost often outperforms neural networks.

Yeah, if you're working with tabular data then XGBoost is really your go-to model, and if it doesn't work for you then you're an outlier.

I was wondering if we could forget about data for a second and go back to numerical models. Imagine you have a very costly numerical model you want to run, and you want to be doing parameter estimation using ML techniques. Could you discuss briefly the different techniques that would be relevant for that, and how you would do it?

So by parameter estimation, do you mean I'm going to be running this model a large number of times in order to impute the correct values?

That's what I want to avoid. I want to feed it, let's say, this topography here, and that's real topography in that image, and I want to understand what parameters in a landscape evolution model could produce that. I want to use an ML technique to get me there faster.

So for that I would probably start with something like Bayesian optimization, where I'm trying to choose very wisely the model runs that I do. And then when I'm running those models, perhaps I'm training an ML model in the background to predict the outcome of those runs, and I'm able to use that to better refine the surface that the Bayesian optimization technique is using to choose what full model runs it's going to do.
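A minimal sketch of that surrogate-assisted idea, assuming a toy one-parameter "expensive model" and a simple lower-confidence-bound rule for picking the next run. This illustrates the general pattern only, not the panelist's actual workflow, and all the names and numbers are made up:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expensive_model(k):
    """Stand-in for a costly run: misfit between modeled and observed topography."""
    return (k - 0.37) ** 2 + 0.01 * np.random.randn()

# A handful of initial runs spread across the parameter range.
K = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
misfit = np.array([expensive_model(k) for k in K.ravel()])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)

for _ in range(15):  # each iteration buys exactly one more expensive run
    gp.fit(K, misfit)  # cheap surrogate trained on the runs done so far
    mean, std = gp.predict(candidates, return_std=True)
    # Lower-confidence bound: favor low predicted misfit, but keep exploring
    # where the surrogate is still uncertain.
    k_next = candidates[np.argmin(mean - std)]
    K = np.vstack([K, [k_next]])
    misfit = np.append(misfit, expensive_model(k_next[0]))

print("best parameter found:", float(K[np.argmin(misfit)][0]))
```

Packages such as scikit-optimize wrap this kind of loop up for you; the sketch is only meant to show where the surrogate sits relative to the expensive model.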
That's my response to that particular problem. Or maybe it's digging into the guts of that particular expensive model and trying to find, again, places that I can replace with small models that accelerate otherwise complex equations.

The point I was going to focus on is really just breaking your model down into all of its different components and evaluating their complexity from a computational standpoint, and then trying that. But I have no personal experience with it.

Yeah, it makes me think of the old-school fitting of a maximum likelihood model, where you're plotting the log likelihood, looking at these huge surfaces, trying to find the minimum, coming up with different ranges of things. I just think of it as an old-school sensitivity analysis, where you're iterating through all of the possible versions of a parameter, and then you put a neural network on top of that to make that thing a lot faster and hone in on the answer a lot quicker. So you're doing an MCMC type of bounce around the parameter space, but the machine learning is doing it, and if it learns that one part of the space is no good, it's going to go focus on another part and do what you would maybe do manually in that process. I would just think of it as a sensitivity analysis where the ML is doing the sensitivity runs.

But here's another idea. For the domain of that particular image you pointed out, as you increase the dimensions the amount of computation blows up, because it rises as n squared, right? But in ML there's a class of techniques called super-resolution that takes a low-resolution thing and makes it look like a high-resolution thing. So perhaps you could use a much smaller model domain that you can iterate on much more quickly, and then use super-resolution to try to capture the element that you're interested in.

There's also a thing called a graph neural network, where you actually turn the space into a graph instead of a surface. You have edges and nodes, and the ML can collapse the parameter space down a little bit and get an idea in graph space. So again, for that particular thing the graph domain might work, but for a landscape evolution problem where you need the 2D surface, you're probably doing something else.

So let me insert one of the set-piece questions, because I want to bring the discussion to general issues that concern CSDMS, not just the geosciences. The next question is: what developments in AI should surface dynamics modelers be taking notice of right now? What's coming down the pipe?

I think it's my turn to go first. I mean, we all know about human-AI interfaces at this point. There are chatbots now based on large language models; there are many of them, and they are growing very, very quickly, very scarily. There's a whole community of open-source large language models being developed by people like Hugging Face, and then there's a bunch of large language models that have been developed by commercial companies. It's exciting: we have the ability to use these interfaces to help us code, to help us label data, and to do lots of things. I'm also personally quite excited about a lot of the hardware things that I'm seeing. For me, it's not a big problem to train models; it doesn't take me a very long time to train models compared to actually applying them, because I'm applying them to extremely large data sets.
So I'm actually in the space now where I train models on GPUs, but I want to use GPUs for all of my inference as well so I can scale up. I want to use more sophisticated computing architectures and things like that, and there are just a lot of good hardware projects out there that enable you to do that: serverless GPUs and things like that, where you spin up a cloud instance very quickly, drop your model in it, and run it on GPUs. I'm really excited about edge computing and quantizing models: you take a large multi-billion-parameter model that's been trained at float64 precision and you can boil that down into a smaller model that runs on your camera or on your cell phone or on a tiny chip. The potential of that for environmental monitoring I think is enormous, and obviously that will feed back into the numerical modeling community eventually. I could talk a lot about the excitement I have about all the different data labeling interfaces I now see. These days we don't have to just manually label everything. There have been a couple of examples shared about large-scale things, like the Zooniverse for example, that actually build in object detector models, image segmentation models, and things like that. So you're in a space where you're very quickly labeling things, because oftentimes it's just verifying whether the model output was correct or not. Data labeling is so important to machine learning for so many different reasons. Even if you're concerned with unsupervised models, you might think they wouldn't need a lot of labeled data, but of course you still need labeled data to verify that the unsupervised approach actually worked. And then finally, what I'm really excited about is multimodal models. What that means is that traditionally we've had machine learning models that take only one form of input and give you only one form of output: for example, you feed it a grid and you get out a grid, or you feed it a grid and you get out some text. These days we're starting to see models where you can actually provide the inputs as a grid, or some text, or an audio file. It's incredible; pay attention to it.

Yeah, so I think of AI's impact on Earth surface dynamics modelers as breaking down into affecting the cadence and the scope of science. The cadence, in my mind, is all the ChatGPT stuff, really. Shockley wrote about scientific productivity as following a kind of log-normal distribution, where some people are just so much more productive. And the way that you get there is by imagining science as a series of hurdles that must be passed in order to produce a research artifact; becoming a real expert at just one of those hurdles does not get science done, it's being good at many of them. And I think ChatGPT is a tool that allows us to have efficiency rise across all of those barriers, whether it's getting started on papers, writing grants, ideation, exploring ideas. It's nice to have a tool that I can talk to and ask any question; it never gives me the right answer, but it always gives me, directionally, a feeling for what the answer might be.
And it's very hard to find, among my colleagues, among the people that I sit with, people that can speak to the deep expert questions I might have about my own domain in that kind of way at short notice. And when you apply this to understanding science: there are so many papers that come out, and there are tools on the horizon that are able to ingest, summarize, and bullet-point those for you in a way that allows you to really see more of the scientific literature and more of that landscape. And then you can ask something that has ingested all of that: well, has anyone ever thought of combining these two ideas I just had? And maybe no one has. So this gives you a way to really understand the literature in a way that wasn't possible before. So in my personal practice, I try to use tools like this a little bit each day, even if it's initially a productivity hit, because I think the long-term outlook of this kind of tool is that it will really change the way our workflows exist.

In terms of the scope of science: the foundation models that I mentioned earlier, these models that have learned something so deep about the world that they then apply across many domains, those I think are going to be really key in the future, because they reduce the data requirements that we need in order to benefit from models. Another example is that just earlier this year Facebook released a model called Segment Anything. You can pass it satellite images and it draws boundaries around the tops of trees; you can pass it images from biology and it recognizes cells; you can pass it images from cartoons and it will segment out the characters. So they bill this as being a foundation model for segmentation. And this was formerly a task that we might have thought of as something you would need to do special training on your particular domain-specific data set to achieve. I think that will be true for other things: time series analysis, image analysis. The other thing coming down the pipeline is that when you look at the design of new supercomputers, 90% of the flops are hidden inside of the GPUs, and that sucks if you're not in a space where you are using GPUs. It's not appropriate for everyone to do so, but thinking about how to use that computational power is kind of important, because when you go to schedule on a supercomputer, you're typically doing so a node at a time. And so if the only thing you're using is the CPU, and of that CPU you're only using a single thread, then you're burning an enormous amount of computational resources to do your task. Being able to make better use of multi-threaded CPUs and being able to offload to the GPUs gives you much better access to the hardware that's coming to define the computational landscape. And that's driven not just by GPUs being able to provide advantages to things like CFD or what we would think of as traditional GPU domains; it's our supercomputing centers, the organizations like the DOE that build these, looking to the future and seeing machine learning and artificial intelligence as a major use of the machines. So, yeah.

I think, yeah, things that you should know about that are happening right now: five years ago GANs, generative adversarial networks, were the hot thing and everybody was talking about them, and now transformer models are the hot thing that everybody's talking about. I think the general message that comes out of that is that the science being done here is on architectures.
So people are really thinking deeply about: should these neural networks be wide and shallow, should they be narrow and deep, is there some optimum between those two? The generative adversarial network was two neural networks facing each other: one generated something and one critiqued that thing. And this is part of a broader process of people going more unsupervised. The paper in 2017 that drove ChatGPT was the one called "Attention Is All You Need," and it showed that if you build a big network and give it a mechanism to pay attention to the data it brings in, it will effectively make its own architecture, and that's really nice. The reason that beat out all these GANs is that the GAN is pretty complicated and doesn't usually have very much data in it, so you wind up with scientists trying to use these things and finding them really hard to get to stabilize, really hard to get to converge, really hard to get to do the things they want them to do. So I don't know where we're going to land on this, but we're seeing advancements all the time, and those advancements are coming from different architectures. So I would say, if you're going to dive in and make this your new science, or dive in and try to do something new with this, that's where I would start: what architectures are out there, where are the architectures going, what architectures are coming out that might serve my purpose better, because tools are getting old pretty quickly.

Okay, the last part is that I think you're going to really notice, you're already starting to notice, a lot more data sets coming out quickly, because they were either AI-generated or AI quality-controlled or AI-munged in some way before they got to you. So you're going to find it easier and easier to find things like MODIS daily products that are coming out minutes later instead of days later, or all these new satellite products that are now quality-controlled by machine learning algorithms. You're just going to get them quicker, you're going to find that you have to do less quality control yourself; you can just take these things and they're ready to rock and roll. So as you're going forward from today, I think those are the big things to pay attention to: what architectures are out there, can you adopt the benefits of the new architectures, can you ride this wave of unsupervised learning (and that is largely about data quantity, so if you're dealing with small data sets that's probably not going to work for you), and, last, try to take advantage of some of these new data sets that are coming online.

That's fun. That's really fun. Yeah, thank you. Absolutely.