So we are coming back from lunch and we are about to start panel number three, which is called machine learning lessons learned in related fields: remote sensing, physics, and astrophysics. We have two speakers in the session, and then we will have, like we did before, some panel discussion after that. Our first speaker will be Hannah Kerner from the University of Maryland. She'll talk for about 25 minutes or so, then we'll have a couple of questions for her, and then we'll go on to our next speaker. So thank you, Hannah. Hi everybody, thank you for having me. As Matt said, I'm an assistant research professor at the University of Maryland. I primarily work on machine learning applications for remote sensing in earth science, agriculture monitoring, and food security, now under the NASA Harvest program. Previously I worked mostly on problems in planetary science and planetary exploration. I'll have examples throughout this talk that include both Earth and other planets. There are many reasons why remote sensing is really a perfect storm for machine learning. One major reason is that remote sensing has huge data volumes coming down from Earth-observing satellites and other sources of remote sensing data such as drones. For example, Planet Labs, a commercial company that operates the largest constellation of Earth-observing satellites, is getting 11 terabytes of data every day, which is an enormous amount of data to process. Additionally, these data are extremely high-dimensional: they're multispectral or even hyperspectral, with several to hundreds of bands. They also have high temporal resolution, with frequent revisit times where satellites are imaging the same point on Earth every day in some cases. In the case of Planet Labs, this is daily or even sub-daily sometimes.
The Landsat and Sentinel-2 satellites, which are widely used in remote sensing, have 16-day and five-day revisit times respectively. You can see here I added this example on the bottom of the earthquake in Indonesia that Diego mentioned; we're able to capture the aftermath of that earthquake with Sentinel-2. And above, these are a time lapse of images taken over three days of fires burning in the Amazon. So we're able to capture events as they're happening with remote sensing data. Additionally, these data have complex relationships within the spatial, spectral, and temporal dimensions, but also between these different dimensions. One example I'm showing in this GIF here are recurring slope lineae on Mars, which are these dark features that you can see growing out, that look like shadows in many images. But when we look at how they vary over a temporal scale, we see they recur seasonally, and that can help us differentiate these features from things like shadows or slope streaks. So it's a perfect storm: there are so many opportunities for machine learning in remote sensing. But there are many challenges that are, I think, not unique to remote sensing, but maybe to scientific applications of machine learning, that require much more care and engineering than the typical data sets we're used to working with in machine learning. One of these challenges is that remote sensing data sets are not independently and identically distributed, or what's called IID, which is a requirement for most machine learning algorithms and an assumption of many methods that are developed. In remote sensing, we have a lot of class imbalance, for example, where we might have a ton of examples of corn growing, but not very many examples of illegal mining. And we want to still be able to detect these things that are rare.
We also have lots of correlation between pixels that might be part of the same object on the ground. For example, here these different colors are representing different fields in agriculture. If you're treating these pixels as independent, that's really not a correct assumption, because these pixels might be from the same field and growing the same crop. Another major challenge is that the images we work with in remote sensing are much, much larger than typical machine learning image sizes. For comparison, one very common benchmark data set in machine learning is CIFAR-10, and the images in that data set are 32 by 32 pixels; they're really thumbnails, and they're also RGB, so three spectral channels. Whereas, as one example, Landsat tiles might be on the order of thousands by thousands of pixels and have many bands, in this case 11 bands for Landsat. So this is a much different problem to deal with, particularly in an engineering sense. Another issue is that there's non-trivial pre-processing and cleaning required to work with remote sensing data. Here is a time lapse where, in places, there is no data: there are clouds and NaNs. So for a single pixel in this image, there's going to be huge variability depending on clouds, and data that's not there because of the orbital track, and there's interpolation that needs to be done to account for this. There are also a lot of questions about how you represent your time series, or how you represent your data to account for all this variance across pixels in the same image. And co-registration of images, or aligning images from two different sensors or taken at two different times, is a major challenge as well. Another major challenge for machine learning is that labeling remote sensing images often requires domain expertise.
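The gap-filling step just mentioned can be sketched very simply: for one pixel's time series, linearly interpolate across the dates that are missing because of clouds or orbital gaps. Here is a minimal illustration in numpy; the NDVI-like values are invented for the example:

```python
import numpy as np

def interpolate_gaps(series):
    """Linearly interpolate NaN gaps (clouds / missing orbits) in a
    1-D per-pixel time series. Assumes at least one valid observation."""
    series = np.asarray(series, dtype=float)
    t = np.arange(len(series))
    valid = ~np.isnan(series)
    # np.interp fills the NaN time steps from the surrounding valid ones
    return np.interp(t, t[valid], series[valid])

# A toy NDVI-like series with two cloudy (NaN) acquisitions
ndvi = [0.2, np.nan, 0.4, 0.5, np.nan, 0.7]
filled = interpolate_gaps(ndvi)
```

In practice people use more sophisticated schemes (smoothing splines, harmonic fits), but linear interpolation over masked observations is the basic idea.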
While it might be very obvious for somebody on Mechanical Turk to label cats versus dogs for one cent per ten images, this is really different for labeling things like impact craters or geologic features on Mars that might not be so obvious or so easy to describe for non-experts. Another major challenge is that from these Earth-observing satellites, we're getting inputs that have different physical units. For example, the Sentinel-1 and Sentinel-2 satellites are providing radar in one case and multispectral optical images in the other. These can look very different and are in different units, and we have to synthesize them for the model. And finally, we have differences in illumination conditions, which is really more of a problem on other planets: on Earth, we can mostly get regular crossing times by choosing our orbits and imaging regularly, though it's still a problem on Earth too. And it's certainly a problem on other planets, where you can see in these images showing the same site, the Apollo 11 landing site, from lunar dusk to dawn, that these features look dramatically different depending on the sun angle. So in this talk, I'm going to first talk about some common applications of machine learning in remote sensing right now, and then about some emerging applications that we're seeing explored much more right now. The most common application of machine learning in remote sensing is land use and land cover classification. The idea behind this is just multi-class classification at the pixel level: for every pixel, we want to assign one of N classes. These pixels might be represented either as a spectral reflectance curve, or as a temporal time series curve, or we often see images represented as a patch where we want to predict the central pixel. And we saw this with the sliding window in seismograms also.
So here's one example where you can see the reflectance spectrum for a few different classes of land cover (soybeans, corn, and deciduous forest) compared to the time series of those same points, and you can see that the data look quite different. So we have lots of different dimensions of information in which we can look at these patterns. For land use, land cover classification, we're largely getting the labels for these models from national databases. For example, the USGS publishes the National Land Cover Database, the NLCD, every five or ten years or so, and the USDA produces the Cropland Data Layer every year. These are kind of used as the ground truth, but in many places, especially those that don't have national programs for collecting these data, we have researchers conducting field campaigns to acquire that data, or doing photo interpretation from very high resolution images. The most common types of machine learning methods being used for this right now are decision trees and random forests for supervised classification, but increasingly we're seeing deep learning methods employed for land cover, land use classification. Typically these are convolutional neural networks, both in 1D and 2D, especially when we're looking at the spectral patterns. And we see a lot of recurrent neural networks, specifically long short-term memory networks (LSTMs), like we've heard about earlier today, being used for classifying time series. So here are a couple of examples of where this is being used. Up here, this is an example where the authors used both 1D CNNs and LSTMs for classifying a number of different crop types in Yolo County, which doesn't have a lot of clouds.
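The standard supervised setup here, whether with random forests or neural networks, boils down to fitting a classifier on per-pixel features. As a toy sketch, here is scikit-learn's random forest trained on synthetic spectral reflectance curves; the band values and class means are all invented for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic "spectral reflectance curves": 6 bands per pixel, three
# classes (think soybeans, corn, deciduous forest) with different
# mean spectra plus a little noise.
means = np.array([[0.10, 0.20, 0.30, 0.60, 0.50, 0.40],
                  [0.20, 0.30, 0.20, 0.70, 0.60, 0.30],
                  [0.05, 0.10, 0.10, 0.50, 0.40, 0.20]])
X = np.vstack([m + 0.02 * rng.standard_normal((100, 6)) for m in means])
y = np.repeat([0, 1, 2], 100)  # one class label per pixel

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
acc = clf.score(X, y)
```

The same interface works whether the per-pixel feature vector is a spectrum, a time series, or a flattened patch; the hard parts in practice are the labels and the non-IID structure discussed earlier, not the classifier call.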
Another example down here I wanted to show is from Greg Asner's group, where they're actually doing this classification in an unsupervised manner: they first cluster the pixels to find similar classes of pixels, and then they label them based on their interpretation of those clusters. This is done for coral reefs in this example, so they can find all of the pixels that might correspond to sand versus one type of coral or another type of sea grass, for example. Change detection is another common application of machine learning in remote sensing right now, and this is typically done either at the pixel or the image level. So you might be looking at whether one pixel changed between a pair of images or a number of images, or whether some feature changed in an image. Typically, people are using difference-based methods, such as here, where you subtract the images and set some threshold that determines whether there was a meaningful change in that pixel or not. Or, very commonly, people are using post-classification comparison, which is simply doing the land cover, land use classification for two different time periods and comparing the predicted classes. For image-level change detection, where we want to see whether there was a meaningful change in surface features, we're often using more object-based or deep learning approaches, where we're comparing multi-temporal images at the feature level rather than the pixel level. In this example, which is from a paper I worked on, we used an autoencoder neural network to extract the latent representation, or the most salient features, from two different images, and we compared them in the feature space to see whether something meaningful changed, in order to ignore irrelevant variations such as lighting conditions or misregistration.
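The difference-then-threshold idea described above fits in a few lines. This toy version flags pixels whose difference from one date to the next exceeds a multiple of the difference image's standard deviation; the "before" and "after" images are synthetic:

```python
import numpy as np

def change_map(img_t1, img_t2, k=3.0):
    """Pixel-level change detection by image differencing: flag pixels
    whose difference deviates from the mean difference by more than
    k standard deviations of the difference image."""
    diff = img_t2.astype(float) - img_t1.astype(float)
    return np.abs(diff - diff.mean()) > k * diff.std()

rng = np.random.default_rng(0)
before = rng.normal(0.3, 0.01, (64, 64))
after = before + rng.normal(0.0, 0.01, (64, 64))
after[20:25, 20:25] += 0.5          # a small patch that actually changed
changed = change_map(before, after)
```

Post-classification comparison is even simpler at this level: classify both dates and flag pixels where the predicted class labels differ. The hard part in either case is the misregistration and illumination variation that the feature-space methods above are designed to ignore.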
Scene classification, or what's called image classification in machine learning, is also becoming more popular as the images we're collecting from these remote sensing satellites become higher resolution. Where a pixel may once have represented an entire field or an entire class type, we now have much more detailed images where we might be more interested in finding more nuanced classes. These methods are really enabled by deep learning today, and in particular convolutional neural networks. So here are just a couple of examples. One on Earth, where the goal is to classify image patches collected from Google Earth using a convolutional neural network, or actually I think they used a random forest there. But in this example on Mars, I wanted to include this because it's being used not just as a classification exercise; it's being used to enable a search function in the Planetary Data System, so that you could search for, for example, images that contain craters or dunes or something else you might be interested in. So this is really enabling others to streamline their scientific analysis. One challenge in scene classification or image classification for remote sensing is that there is a lot less labeled training data available than for pixel-level land cover classification, because at the pixel level we might have thousands of examples just in one image, but for image classification we have many fewer. For that reason, a lot of researchers are doing things like transfer learning, as we heard about a lot today: pre-training models on larger image data sets like ImageNet, or even on benchmark remote sensing data sets, and transferring those models to fine-tune them for a different task. And actually, this work here fine-tuned an ImageNet pre-trained model for a totally different Mars application, showing how well these can transfer.
Another common area of machine learning for remote sensing is novelty or anomaly detection. This is the task of detecting rare or unseen patterns, or outliers, defined as spatial, maybe morphological, spectral, or temporal outliers. These are the things that I think Carrie Ann mentioned earlier as being really hard to do with machine learning, typically because we're looking for large-scale patterns and not so much the things we only see a few of. And that's what anomaly detection is really good for. Typically these are either unsupervised methods or one-class supervised methods, where we're using machine learning to characterize the normal class, and then we look at deviations from that normal representation to identify anomalies. The majority of applications of novelty detection are using the RX detector method, which computes pixel-wise anomaly scores. That is the Mahalanobis distance between a single pixel and a background distribution, where you could define the background to be a window around the pixel, the rest of the image, or even an entire data set of typical images. This is an example of where this could be used to detect a building in maybe a desert landscape in remote sensing. But one of the most common types of methods for anomaly detection in the machine learning literature are reconstruction-based methods, where we minimize a model's reconstruction error on normal examples, such that this reconstruction error will be large for novel examples. So here's an example of a spot where the Mars Science Laboratory rover has cleared away some dust. This is a novel feature with respect to typical Mars geology, and so when the model reconstructs what it thinks this image should look like, it misses this novel feature.
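The RX detector described above can be written in a few lines of numpy: score every pixel by its Mahalanobis distance from the background distribution, here taken to be the whole image. The 4-band image is synthetic noise with one planted anomaly:

```python
import numpy as np

def rx_scores(image):
    """RX anomaly detector: squared Mahalanobis distance of each pixel's
    spectrum from the background (here, the whole image) distribution."""
    h, w, b = image.shape
    pixels = image.reshape(-1, b)
    mu = pixels.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(pixels, rowvar=False))
    centered = pixels - mu
    # d^2_i = (x_i - mu)^T  Cov^-1  (x_i - mu), computed for all pixels
    d2 = np.einsum('ij,jk,ik->i', centered, cov_inv, centered)
    return d2.reshape(h, w)

rng = np.random.default_rng(0)
img = rng.normal(0.0, 1.0, (32, 32, 4))   # 4-band "typical" background
img[10, 10] += 8.0                        # one spectrally anomalous pixel
scores = rx_scores(img)
```

Swapping in a local window or a reference data set as the background, as mentioned above, only changes how `mu` and the covariance are estimated.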
We can see, if we compare them, that the novel feature is revealed, and we can use this for finding things like meteorites or other things on Mars. But of course, there are many applications for this on Earth as well. Another common application is regression, or estimating physical quantities from remote sensing data. One example is in agriculture, where it's very commonly used for estimating yields for different types of crops from remote sensing data, such as maize yields being estimated in the corn belt here in the US. Another example is again from Greg Asner's lab, where they're estimating above-ground carbon density directly from remote sensing data. These types of models are often combining multiple data sources in order to predict these variables. Most commonly, people are using regression trees, feed-forward neural networks, and process models like Gaussian processes, but increasingly, again, we're seeing convolutional neural networks or LSTM recurrent networks being used for this. So now I'll talk about some more emerging applications of machine learning in remote sensing. One of these is object detection and mapping, which is very similar to scene classification or image classification, except now we want to predict where in the image the feature we're interested in is, and we'll draw a bounding box around it. Very commonly, people are not engineering their own networks for this; they're using off-the-shelf networks such as YOLO ("you only look once" is what that model is called), the whole string of variants on the region-based convolutional neural network (R-CNN, Fast R-CNN, Faster R-CNN), and region-based fully convolutional networks, which are similar but use more of a kind of U-Net-looking structure than the others. Here are some examples: ship detection is kind of a classical example of this, car counting is another, and crater counting as well.
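Detectors like YOLO and the R-CNN family output many overlapping candidate boxes for the same object, and duplicates are conventionally filtered with greedy non-maximum suppression. A minimal sketch (the boxes and scores below are made up):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop boxes that overlap a kept box too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

# Two overlapping detections of one ship, plus one separate ship
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Much of the object-detection research mentioned next is about making this duplicate-filtering step better or faster, since over dense scenes like harbors the naive version both misses and double-counts.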
These methods are really useful if you're trying to get some aggregate statistics about a feature, like if you want to count the craters and look at the distribution of their sizes, for example. But there is still a major challenge of redundancy with these. Here you can see a lot of these ships are being counted multiple times, and so a lot of the research in object detection methods is about eliminating redundancy, or speeding up that process of filtering out duplicates. Semantic segmentation is another emerging area for remote sensing, though it's extremely common in machine learning, especially for things like self-driving cars, where we want to link every pixel in the image to a class label. This is typically supervised: in semantic segmentation we have some class labels, and we want to group the pixels into similar clusters while also assigning a label at the same time. In unsupervised segmentation, by contrast, we're more interested in finding which objects are similar and then interpreting those, like in the coral reef example. Here you can see two examples where this is done for some building types, trees, and roads. And then I think this is a really interesting example, because it may be hard to see with the contrast of the screen, but not only are whales being identified and segmented in this drone imagery, they're also identifying different parts of the whale. That's really useful for the kinds of studies people are doing: not just getting superficial classifications, but really getting insights that will help advance these individual sciences. And again, people are commonly using off-the-shelf networks for this rather than hand-engineering their own.
Commonly, these are fully convolutional networks; U-Net, which has come up today; and Mask R-CNN, which is probably the most common one I see today. It's in that family of region-based CNNs, but also includes this mask prediction. Another emerging application of machine learning for remote sensing is super-resolution, or what people usually call pan-sharpening, where you want to fuse multiple images into a higher-resolution image, or you want to take a lower-resolution image and predict its higher-resolution counterpart. This is an example here where the first column is MISR 275-meter-per-pixel resolution images. The third column is the higher-resolution 30-meter-per-pixel Landsat image. And in the middle is the higher-resolution image that was predicted by a machine learning model, in this case a generative adversarial network, and you can see that it looks pretty similar. Commonly people are using, as I said, GANs for this, but also autoencoder neural networks, convolutional neural networks, and some other generative models as well. One issue with this is that while it may be useful for enabling downstream analysis, like doing object detection here, where it might be hard to detect what looks like center-pivot irrigation in the lower-resolution image and easier in the higher-resolution one, there is some concern about what the interpretation of that predicted data is. You may predict that there was a particular feature there, like a fault maybe, but how do you know that it actually is there? And I think remote sensing or satellite images are typically seen as kind of a trusted medium that is not really thought to be altered.
So if we're doing a lot of super-resolution, or kind of predicted images, falsely generated images, there's a lot of concern that this will lead to mistrust or misinterpretation of remote sensing data. People are really worried about deepfakes; some of you might have heard of this, where images of fake people that look extremely realistic, or fake scenes, are being generated and interpreted as truth. Finally, one area where machine learning is being used in remote sensing increasingly is image registration. Image registration is the process of taking two images that were acquired by different sensors, maybe with different resolutions or at different times, and aligning them at the pixel level so that we can use them for some downstream analysis like change detection. You can see this example here, where these images taken by the Lunar Reconnaissance Orbiter Camera are misaligned. There is actually a new crater appearing there, but it's really hard to tell whether something is a new crater or something that just was in the adjacent frame. So this is a really important step for many downstream analyses. People are looking at using neural network models, as well as some other machine learning methods, to estimate the mapping between a pair of images in order to co-register them. This is often done using Siamese networks, which are just identical networks joined by a distance metric or some distance or similarity layer, as well as by directly learning these mapping functions using regression methods. You can see here how this was used to take these images that were a little bit misaligned; these are Sentinel-2 and Landsat-8 images, with some misalignment at the pixel level, but after applying the transformation learned from machine learning, they are aligned at the pixel level.
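Learned registration methods like the Siamese networks just mentioned are usually compared against classical baselines. One such baseline, for the simple case of a pure translational misalignment, is phase correlation; here is a numpy sketch on synthetic images (a real pair would also need resampling to a common grid first):

```python
import numpy as np

def phase_correlation_shift(ref, moving):
    """Estimate the integer (dy, dx) translation that aligns `moving`
    to `ref` by phase correlation: the normalized cross-power spectrum
    of the two images has a delta-function peak at the shift."""
    F1, F2 = np.fft.fft2(ref), np.fft.fft2(moving)
    cross = F1 * np.conj(F2)
    cross /= np.abs(cross) + 1e-12          # keep only phase information
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(corr.argmax(), corr.shape)
    # Wrap shifts larger than half the image into negative offsets
    h, w = ref.shape
    if dy > h // 2: dy -= h
    if dx > w // 2: dx -= w
    return int(dy), int(dx)

rng = np.random.default_rng(0)
ref = rng.normal(size=(64, 64))
moving = np.roll(ref, shift=(3, -5), axis=(0, 1))   # a misaligned copy
shift = phase_correlation_shift(ref, moving)        # correction to apply
```

The learned approaches aim at exactly the cases this baseline cannot handle: different sensors, rotations, local distortions, and sub-pixel offsets.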
I'll end by talking about some limitations and directions for future work that maybe can feed our panel discussion. Even though land cover classification is the most common application, really universally using machine learning, most studies are focused on "I made this map of this crop for this year in this region," and there's not a lot of discussion of generalization. For example, maybe I train this model on the US corn belt, but how well is it going to do in a country where there are much smaller field sizes or different types of crops being grown? And if you've crafted some representation based on the time series, how is this going to generalize to future years that might have delayed planting, for example? Global inference, and using these methods operationally, is still a major challenge. It's one thing to build a model for one county, but it's a completely different problem to deploy it at a regional or national, or certainly global, scale. We also have very limited ground truth, particularly for countries that do not have national programs to produce these data sets, and there's not a great culture of publicly sharing these ground data sets. Another issue that we've talked a lot about today is interpretability and explainability, but also reproducibility. I think Qin Kai commented on this: there are ways to inspect what's happening in the network, but many people are not going this kind of extra step after they've gotten their results and they're ready to publish, in order to open up that black box. It does require extensive engineering and experiments to do, but you can do it, and so I think maybe we can develop more tools that make that easier, things like that.
But one example I wanted to share of doing this is this paper, where the authors looked at, for different crop types, what the features were that were being activated in the different feature maps learned by a convolutional neural network. They were able to find out that, for example, some of these were activated by local peaks in the time series, and some were activated by the decreasing slope of the time series. That can help interpret the model, but also inform future research. So I will end there. Thank you for your attention, and if we have time, I'm happy to take any questions. Great, we have time for a couple of questions, if anybody has one. All right. So if I understood, you're obviously working with massive data sets, and as you pointed out there's the problem of generalization, but certainly within areas, particularly for the crop situation: are we at a stage now where we could potentially use machine learning to come up with a whole range of climate change scenarios for crops and crop failures, or is this already being done? Is it something that people are working on? Yeah. So one thing, going back to fusing inputs from multiple sources: a lot of what we're looking at doing is incorporating our crop models with weather predictions in order to do much longer forecasts. So instead of looking at the end of the current season, which is really a big challenge right now, doing in-season crop classification and yield prediction, can we go even further and predict what will happen in the next season, or the season after that, especially as climate change is making our food system more vulnerable? Yeah, I guess it was because we talked before about interpolation versus extrapolation, and then ways to perhaps consider other data sets as well. Yeah, I think that's a great point.
Like, one of the things I'm working on: in this current corn season for the United States, we have these huge floods, which are becoming increasingly common extreme weather events, that really break our prior models based on an assumption of a regular growing season. So we need to develop models that are robust to changes that might occur that we might not anticipate. Thank you. All right. We'll have some more questions during the discussion, I think, so we'll move on to our second speaker: Brice Ménard from Johns Hopkins University will talk about statistics, machine learning, and astrophysics. Hello, everyone. So I'm Brice Ménard from Johns Hopkins University. I'm an astrophysicist. I'm not going to tell you all about machine learning and astrophysics today; I will try to give you an overview of different techniques that can be used when we have complex data, and I will use astronomy as an example. All right. So to start, this is the typical data we have, if I can move to the next slide. Yes. Okay. So what you see here are lots of galaxies. This is an image taken by the Japanese Subaru Telescope. We can accumulate large amounts of data and make maps of the universe. This is real data: every dot that you see here is a galaxy. We are here on the left, and what you see is the large-scale distribution of galaxies in the universe, and there is this beautiful filamentary structure that we study. For each dot in this visualization, we have an image and we have a spectrum. Those are the two fundamental types of data we have in astronomy: 2D and 1D. And for the seismology room: all the 1D applications that we have have a lot in common with what you do when you study seismograms. So we have millions of those, and we can map out the universe over a pretty large fraction of its accessible volume. This is the map going all the way out to the cosmic microwave background.
It took 15 years to accumulate all the data that you see here. I'm presenting this because I think there are lots of similarities between us in astrophysics and you in geophysics. We both do geometry on a sphere. We both study things we cannot touch; we cannot conduct experiments. All we can do is record data from a distance and analyze it. So we should talk to each other more. All right. So in science in general, I'm going to quote Gibbs here: what we are trying to do is to find the point of view from which the subject appears in its greatest simplicity. We have a complex world around us, and we get to know it more and more through data rather than our own senses, and what we want is to find simplicity in this complex world. That's what science does; it's been going on for centuries, but recently things have changed a bit. For a long time, visual insight was a key and important guide in doing this, and when we do this, it's very important to keep in mind how we do it. What we do at the end of the day is try to use the language of physics and statistics to describe what we see. So what are the limitations? Recently, visual insight has been receding, because the amount of data that we have gets greater and greater, much greater than us in size and dimensionality. But something I would also like to point out during this talk is the second point: the limitation due to language. When we have complexity, this itself can be a limitation, irrespective of the data size and dimensionality. All right. So I'm going to show you examples of things we do well in astronomy and things we cannot do well at this point. So first, another image. This is the DECaLS survey. What you are seeing here is the actual data; again, lots of galaxies. And now what I'm showing you here is a model of this image. Okay. So if you did not really pay attention, I'm going to swap between the two.
This is the data and this is the model. This is the data and this is the model. And to show you that we have a pretty good model of what we see, we can look at the residuals, and most of the residuals are close to zero. You see a little bit left around some of the brightest objects, but that's not much, and we could actually increase the complexity of the model to better characterize those brighter objects and get even better residuals. So what is happening here is that when we see an image like this, we can think of it in terms of objects. We can detect the positions of the objects, measure their shapes, their brightness, et cetera. We can turn that into a catalog. Once we have that catalog, we can regenerate an image, which is what is called the model here. So this is showing that we can understand, that we have the right language to describe what we see, and we can reproduce 99% of the variance in this data set. When we do this, again, we start from the data: we are in pixel space, billions of pixels. We can do dimensionality reduction in this case because we can talk about objects. We can detect them, and we can measure their parameters, typically using low-order moments to extract them. The zeroth-order moment, the total intensity, is going to be the brightness. The first-order moment is going to be the position. Second-order moments are going to be size, ellipticity, et cetera. So we are basically using Gaussians and Taylor expansions here to describe what we see, and these are all very low-order descriptions of the data. But that's enough. It's enough because I can use this to turn my pixels into a catalog. Dakros talked about the importance of having catalogs earlier; this is exactly the key point here. And we like this because this is a very nice space. This is the environment we like to work in, because we can do searches and we can play with databases.
We can look for correlations directly. And that's much easier than working in pixel space because this is too much to manipulate. And it's not convenient at all. All right. So this is an example where things work really well. And we can regenerate the original data. So if we have been able to go through that pipeline, basically we can say we understand the data. And this means that we have an efficient language that allows us to extract the information efficiently in a concise way and regenerate the data meaningfully. It works with these galaxies. This is relatively low resolution, so they are basically blobs. And blobs are typically Gaussian objects. It works also in the case of continuous fields. This is a Gaussian random field. We can also describe this with a limited set of numbers. And as you might know, this is fully described by a power spectrum, which is measuring variance as a function of scale. So in this example, we can extract a few numbers and regenerate the field. In this case, the phases have changed. So it's not exactly the same. But it has the same statistical properties. So this is the regime where we can do everything we want. But now what about this? How do we describe these objects? Can I turn this into catalogs? Not really. So now we have a challenge here, which is the fact that we don't have the adequate language to extract meaningful information from these images. OK, so the complexity is much higher here. And as a result of this, we have a lot of data like this sitting on disks. But this is not searchable. We cannot use queries to try to find objects that are different or objects carrying particular shapes, et cetera, because we don't have the ability to describe those shapes or those properties. They are too complex. So coming back to Gibbs' view and what is happening in science today. So yes, we have more and more data with more dimensions to it, et cetera.
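[Editor's note: the Gaussian-random-field point above — "fully described by a power spectrum", and "the phases have changed... but it has the same statistical properties" — can be demonstrated in a few lines. An editorial sketch with an assumed power-law spectrum, not the speaker's code.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128

# A Gaussian random field: filter white noise with sqrt(P(k)), here P(k) ~ k^-3
ky, kx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
k = np.hypot(ky, kx)
k[0, 0] = 1.0                                    # avoid dividing by zero at k = 0
sqrt_power = k ** -1.5
field = np.fft.ifft2(sqrt_power * np.fft.fft2(rng.standard_normal((n, n)))).real

# "Same statistics, different phases": keep the measured amplitudes |F(k)|,
# replace the phases with fresh ones (drawn from a real field so the
# phase map is antisymmetric and the regenerated field stays real)
amplitudes = np.abs(np.fft.fft2(field))
new_phases = np.angle(np.fft.fft2(rng.standard_normal((n, n))))
twin = np.fft.ifft2(amplitudes * np.exp(1j * new_phases)).real
```

The `twin` field looks nothing like `field` pixel by pixel, yet it has the identical power spectrum, and hence the same variance on every scale.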
But there is one challenge that we are facing, which is just a question of language and complexity. So this is to show that in astronomy, as is the case in seismology, data is increasing at an exponential rate. The x-axis here is time. Those are different surveys that happened, are ongoing, or are planned. And this is how many sources each of these surveys is going to record. And as I mentioned at the beginning, we take pictures. We do imaging. And we do spectroscopy as well. So yes, we have always more data. This is Moore's law for data. But that's not really the challenge. We can have a very, very small data set and still have these complexity challenges. And I'm going to give you an example now. All right, this is a very simple black and white image. Imagine you are on the phone talking to a friend. And you need to describe what you see. OK, if you spend a couple of minutes, you would be able to describe this image. And your friend on the other side of the line could draw this image relatively well. Now what if I show you this image? You're on the phone and you have to describe what you see. What would you say? Again, this is not a lot of data. It's just black and white. It's about 100 pixels on a side. It's very small. Very simple in some sense. But at the same time, we cannot describe what we see. So the key point here is that language is the limitation. And this has to do with complexity. The image on the right is more complex than the one on the left. So what I like to do is to put data sets in this space. So what I have here are two axes. The x-axis is complexity. And I'm going to explain this a bit more in a second. And the y-axis is noise, stochasticity, or temperature. So first, complexity. There is no generic definition of complexity. So I'm going to use several proxies to explain what complexity is. So complexity is only apparent to the observer. Something can be complex for you, but not for someone else.
And if we want to try to describe complexity, we can think of it as the number of scales that are involved in the data. The description length: can I describe what I see in a concise way? Or do I need a very long description, lots of parameters? Another way is also to think about the computational cost. If I want to generate some data, how much computation will I need? I can start from a very simple equation. And from that equation, I can generate images and curves and all sorts of things. But that has some computational cost. So that's also a potential measure of complexity. So let me put a few things now in this space. We can start with the simplest possible thing. Let's go to the origin here. I just have a point. So let me consider objects in just one or two dimensions. So this is an object for which there is no noise, and there's just one scale. I can add more scales to it. I get a segment. I get a set of segments. And I can make it more and more complex. And I can add more and more scales to it. And if I take the limit all the way to infinity, I enter the world of fractals. Now let's try the y-axis. I can add some noise here, some stochasticity, or I can increase my temperature. And from the point, I get a Gaussian distribution. If I take the limit, I get a completely random distribution. And so my point now is that any data set is going to live somewhere in this space. And depending on where you are in this space, you will need to use a certain language or certain tools in order to do your data analysis. So a couple of comments. Some of the corners here have names attached to them. This is Euclid, a long time ago. This is Gauss. And over there, this is Mandelbrot. OK, so now let's try to go vertically while having lots of scales. So as you might know, fractals are good models to describe, for example, coastlines. So what you see here is a stochastic fractal. It's no longer completely deterministic in the sense that there's some randomness to it.
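[Editor's note: a stochastic fractal of the kind being shown — a curve whose randomness is a left-or-right turn at each step — can be generated with a minimal sketch. This is an editorial illustration; the step count and seed are arbitrary.]

```python
import numpy as np

rng = np.random.default_rng(42)
steps = 10_000

# Unit steps in the plane; at each step the heading turns 90 degrees
# left or right at random
turns = rng.choice([-1, 1], size=steps)
headings = np.cumsum(turns) * (np.pi / 2)
path = np.cumsum(np.exp(1j * headings))          # positions in the complex plane

# Diffusive, not ballistic: after N steps the walk wanders roughly sqrt(N),
# far less than the N it would cover going straight -- structure on many scales
spread = np.ptp(path.real) + np.ptp(path.imag)
```

Zooming into any portion of `path` reveals the same kind of wiggly structure, which is the "many scales" property being placed at the top of the complexity axis.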
The randomness here is whether I turn left or right at each step. OK, and so by adding this stochasticity, this is the type of object one can create. And if I continue, and there could be lots of examples, this is basically the natural world around us. And this also includes our favorite machine learning cats. And the reason for it here, I mean, a cat typically has one scale, a cat is big like this. But cats in images can be in so many different positions or configurations that the space of possible configurations is absolutely huge and has many, many scales to it. So how to navigate in that space? That's the key question. We talked about courses earlier. Very often, undergrads only learn how to navigate this axis these days. Those are Gaussian statistics and the world of features. But in too many places, there's very little done to go in this direction. That's how I studied statistics in my time. So I'm going to focus on the middle part of this because what is completely deterministic or completely random is not that interesting usually. And so let's explore this axis a bit more. So in astronomy, we have 1D or 2D data, typically. These are the types of objects we have to deal with. And by increasing the number of scales or the complexity, we have to analyze these types of objects. I'm going to make this a little bit more generic here and talk about all sorts of data sets. So this is what I showed before, the distribution of galaxies, or blobs, Gaussian blobs. And this is the Gaussian random field. And if we add scales to this, then we enter the world of textures. And if we push it all the way, we have all the natural images and data sets like ImageNet. It's also interesting to comment on how we collect this data. Okay, so along this axis, we have different ways of collecting data. If we are on the very left side of it, okay, we basically do controlled experiments.
We want to design an experiment that will give us the shortest possible description length. An ideal experiment would just give us one bit of information, yes or no, zero or one. But usually there is some noise associated with this. In astronomy, and I guess in geophysics as well, we do surveys. So here we are going to collect data in a much more open way, but still it's going to be confined within certain limits. And if we go all the way, we have this unrestricted data collection. We can give people buckets and they can put anything they want in those buckets. And this creates all sorts of data like Facebook, Twitter and so on that people are mining as well. So how to navigate the space in the horizontal direction? We have a number of techniques there. And as I said, if we are on the left, typically we use Gaussian statistics. We have at our disposal a number of other techniques: manifold learning, dictionary learning and neural nets. And what I would like to emphasize here is that we need to pick the right one depending on where we are here. Okay, so neural nets are not the answer to everything, as we tend to hear sometimes. It's important to realize the trade-offs between what we are gaining, what we are losing and what they can do. As we move to the right, as you know, there is less and less control on what's going on. Okay, so interpretability becomes challenging. It's not impossible, but it's more challenging because the number of parameters one tends to work with explodes. And when it comes to real data, there are always questions about systematics that might not be present in training sets. So a quick overview of a couple of techniques. Talking about manifold learning, I'm illustrating a few of them here: principal component analysis, t-SNE, Isomap; you might have heard of UMAP, of self-organizing maps, et cetera. The idea is to have some data and to try to represent them in a lower dimensional space such that we will start to see interesting structure.
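[Editor's note: a minimal numpy-only sketch of the simplest of these techniques, principal component analysis, applied to toy data that secretly lives near a 1D line inside a 10-dimensional space. Editorial illustration; the toy data is invented.]

```python
import numpy as np

def pca(X, n_components=2):
    """Project data onto its top principal components (numpy-only sketch)."""
    Xc = X - X.mean(axis=0)                        # center each feature
    # SVD of the centered data: rows of Vt are the principal directions
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T                # coordinates in the reduced space

# Toy data: 200 points along one hidden direction in 10-D, plus small noise
rng = np.random.default_rng(1)
t = rng.uniform(-1, 1, 200)
direction = rng.standard_normal(10)
X = np.outer(t, direction) + 0.01 * rng.standard_normal((200, 10))
Z = pca(X, n_components=2)
```

Nearly all the variance lands on the first component, which is the "interesting structure" a lower-dimensional embedding is supposed to reveal.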
In this case, when we work with handwritten digits, the goal is to be able to separate the different values. And this slide shows that in this case, t-SNE does much better than the two other techniques that are shown. So this is good. So we are using geometric properties of the embedding of the manifold to be able to make inferences at the end. So a lot can be done in this regime already. What is nice is that we can visualize what's going on and we can make decisions along the way, which is not the case with neural nets, as we will see. I'm going to advertise a technique that I developed myself. The idea is to try to order data automatically. And this is done in such a way that we are looking for a one-dimensional manifold. So if we have data with lots of dimensions, but at the end of the day it is distributed along an elongated, one-dimensional type of manifold, which can be as complex as you want, this is going to try to find it. And so the idea is that, so here what I'm showing is a collection of spectra. Those could be seismograms, for example. There are about 1,000 on top of each other. And they are randomly ordered here. If I ask an astronomer to tell me what kind of data this is, usually that person will not be able to answer. But now if I order them, on the right, this is the same data. I have just shuffled the rows. Then I start to see structure, and what I'm seeing here is a redshift sequence. So those are emission lines of distant quasars that are redshifted as a function of distance. So the question is how to go from the left to the right. We can think of this data in a number of ways. I can take all the pairs of points here and I can create an adjacency matrix or a distance matrix. It would look like this. And on the right, once the data has been ordered properly, what I notice is that this adjacency matrix is concentrated along the diagonal.
So the question is, can I find a way to go from this to that automatically, so that I can observe my trends directly in the data? So there are some techniques on the market. Some of them were on the slide shown before. We came up with one which is different. It's called the Sequencer. The paper is not out yet. We are writing it, but you can already use the algorithm online. And so the idea is that from this adjacency matrix, we can think of it as a graph and we can find ways to connect all the points so that the total distance is minimized. And when we do this, we can look at the shape of the graph. And if we find something that is very elongated, we know that there is a one-dimensional sequence in the data. And so the algorithm can scan different scales and different metrics. It's going to scan the data for all of them. And in each case, it's going to look at the geometry of this graph. And if it finds a graph that is very elongated, then it has found a sequence in the data. Again, what is nice is that this is parameter-free. You can just upload your data set, it runs on it, and it gives you the best possible ordering it has found in the data. So, examples. This is the one you just saw. We have lots of spectra. We can order them, we find the redshift sequence. This is another data set. We order them. Now we find that we have stellar spectra. And here the signal that has been picked up is based on the large-scale distribution of flux, a gradient that goes from top left to bottom right. What we are seeing here is the peak of the blackbody spectrum of these stars that's changing. So we have hot stars on the left, cool stars at the bottom. This is another data set where the algorithm finds two interesting things. First, there is another redshift sequence and we are now seeing spectra of galaxies. But what you can see as well is that at the top, you find a different population of objects that are not like everything else.
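[Editor's note: the actual Sequencer scans many scales and metrics; the following is only a minimal sketch of the core graph idea just described — distance matrix, minimum spanning tree, and if the tree is an elongated chain, walking it from an endpoint reads off the hidden sequence. Editorial illustration on invented synthetic spectra.]

```python
import numpy as np
from scipy.sparse.csgraph import breadth_first_order, minimum_spanning_tree

rng = np.random.default_rng(0)

# Synthetic "spectra": an emission line whose position drifts with a hidden
# 1D parameter, then shuffled so the ordering is lost
x = np.linspace(0, 1, 200)
centers = 0.2 + 0.012 * np.arange(50)
spectra = np.exp(-(x[None, :] - centers[:, None]) ** 2 / 0.002)
shuffle = rng.permutation(50)
spectra = spectra[shuffle]

# Pairwise distance matrix between rows (the matrix shown in the talk)
D = np.linalg.norm(spectra[:, None, :] - spectra[None, :, :], axis=-1)

# Minimum spanning tree of the complete distance graph; for sequence-like
# data it collapses to an elongated chain
mst = minimum_spanning_tree(D)
sym = mst + mst.T
degree = (sym.toarray() > 0).sum(axis=1)
endpoint = int(np.argmin(degree))               # a chain endpoint has degree 1
order, _ = breadth_first_order(sym, endpoint, directed=False)
recovered = shuffle[order]                      # hidden ordering, possibly reversed
```

On this toy data the walk along the tree recovers the original drift sequence exactly (up to direction), which is what reordering the rows of the adjacency matrix onto the diagonal means in practice.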
And the algorithm can, on its own, understand that two things are going on. And to show you that it's completely generic, there is no physics in it at all and no free parameter. We can take these data sets, we can order them the same way and find this. So we've played with this in astronomy and outside of it. And I'm going to show you one example in seismology; this is work done with Ved Lekić. You know better than me the network of seismometers across the US. From these, people make all sorts of maps. Because from this network, one gets complex data, multidimensional data, and the question is what should one plot? It depends on the question one has. So what we did is that we ran these data through the Sequencer. So from all these seismograms, we looked at surface wave velocity as a function of frequency. And so this is the raw data, velocity as a function of frequency. We send it to the Sequencer, we order it automatically. And once we have this ordering, what is nice is that the one-dimensional sequence can be mapped onto a color bar. Okay, so this is just the index across the sequence. Once I have this color bar, I can just put this color on the map. And this is what we find. So this is now a map showing properties of the crust from all these seismograms. You can think of this as some sort of hyper-histogram equalization of the data. So this is trying to extract all the relevant information from the data as much as possible and put that on a one-dimensional manifold or a color bar. Okay, so I invite you to use this tool. It can be quite powerful. Moving on along the axis of complexity, there's dictionary learning. What I'm showing here is an example that was provided in the review given to us.
So I'm not going to describe all the steps, but the key idea here is that if we want to describe the data that we have, we can do that with some requirements on sparsity and we can create a dictionary, a set of atoms that is shown on the right there, which can be used to efficiently, optimally represent the data. Okay, so once we know which vocabulary to use, if we have found an efficient vocabulary, we are in a very good position to then interact with this data. So that's often the key to the challenge. So this is a very simple example. Now let's look at images from the natural world. So let's try to generalize this. So we can take a lot of images. We can look at subsets of these images and we can do a gigantic principal component analysis of all these images from the natural world, and what one finds, this was done in the 90s, is a basis set, which is illustrated here. And it's basically doing a Fourier decomposition of the data. Okay, so this is not really informative because we know how to do Fourier decompositions. We don't need a PCA for that. Now, if we do the same, but if we require sparsity instead of orthogonality of the vectors, this is what we find. This is very interesting. Okay, so what you might recognize here, those are the Gabor filters that are in the eye. So those are the filters that we have in order to do edge detection and to try to understand the world around us. Okay, so people have known about these cells in the eye for decades, but it's only in the 90s that people understood how to obtain those filters naturally, from images, through this basis set. And this is just a requirement of sparsity. Similarly, if you use neural nets today and you look at the first layer, you look at the filters that the network came up with, again, you find these Gabor filters. Okay, so there is something really intrinsic about this. And I think another sign that there is something very important here is that this is learned from the data by the neural nets.
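[Editor's note: the sparsity requirement discussed above can be made concrete with the coding half of the problem — given a dictionary of atoms, represent a signal with as few of them as possible. This is a greedy matching-pursuit sketch, an editorial illustration with an invented dictionary of Gaussian bumps, not the full dictionary-learning loop from the slide.]

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=3):
    """Greedy sparse coding: repeatedly pick the dictionary atom most
    correlated with the residual, subtract its contribution."""
    residual = signal.astype(float).copy()
    code = np.zeros(dictionary.shape[0])
    for _ in range(n_atoms):
        corr = dictionary @ residual             # correlation with each atom
        best = np.argmax(np.abs(corr))
        code[best] += corr[best]                 # atoms are unit-norm
        residual -= corr[best] * dictionary[best]
    return code, residual

# Dictionary of unit-norm atoms: shifted Gaussian bumps on a 128-point grid
x = np.linspace(0, 1, 128)
atoms = np.exp(-(x[None, :] - np.linspace(0.1, 0.9, 20)[:, None]) ** 2 / 0.002)
atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)

# A signal built from two atoms is recovered with a 2-sparse code
signal = 3.0 * atoms[4] + 1.5 * atoms[15]
code, residual = matching_pursuit(signal, atoms, n_atoms=2)
```

Two coefficients out of twenty describe the signal almost perfectly: that is the "efficient vocabulary" point, in miniature.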
But for us humans, and mammals in general, this is not something we learn in the brain; this is hardwired for us. This is directly at the back of the eye. Okay, so there must be something very fundamental about this. And if so, then should we try to learn them and spend GPU time on that, or should we try to inject them directly into our representation? Okay, so this is something that can be done. So typically those filters are wavelets as a function of scale. And I'm going to show now something called a scattering transform, which is a very interesting piece of work done by mathematicians; the papers are from 2012. The idea is to try to represent data using a cascade of filters like this, together with just one nonlinearity. And what these mathematicians have shown is that by applying these filters to any data and by applying them in a layered way, okay, hierarchically, we can come up with an extremely powerful estimator, which is invariant to translations, rotations and small deformations. That's extremely powerful, which means that if we have images, okay, we have some data X, which is in very high dimensions, the number of pixels here. We have two types of data, red and blue. We can apply this transform to the data and it's going to basically put together objects that belong to the same classes, okay? And once you have an image like this and you apply the transform to it, you can do translation, rotation, or you can distort, you can warp your image. The coefficients that are produced by this estimator are going to be stable to these deformations. Okay, so this is extremely powerful. And it's only recently that people are starting to use this in physics, okay? So my message: I strongly encourage you to have a look at this. For example, this is fantastic for texture classification. Okay, so the idea here is that if we extract the coefficients from images from the same column, we will have similar coefficients.
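[Editor's note: the translation invariance claimed above can be checked numerically with a drastically simplified, first-order, 1D sketch — bandpass filter, complex modulus, global average. This is an editorial illustration of the filter-modulus-average structure, not Mallat's full construction; the filter bank is invented.]

```python
import numpy as np

def scattering_1d(signal, n_scales=4):
    """First-order scattering-style coefficients (a sketch):
    wavelet-like bandpass, complex modulus, then a global average."""
    n = len(signal)
    freqs = np.fft.fftfreq(n) * n                # integer frequency bins
    F = np.fft.fft(signal)
    coeffs = []
    for j in range(n_scales):
        center = n / 2 ** (j + 2)                # one octave per scale
        width = n / 2 ** (j + 3)
        psi_hat = np.exp(-((freqs - center) ** 2) / (2 * width ** 2))
        modulus = np.abs(np.fft.ifft(F * psi_hat))   # the one nonlinearity
        coeffs.append(modulus.mean())            # averaging gives the invariance
    return np.array(coeffs)

rng = np.random.default_rng(0)
x = rng.standard_normal(512)
```

Shifting the signal leaves the coefficients unchanged, because the modulus of a shifted bandpassed signal is just a shifted modulus, and the average forgets the shift entirely.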
But if we look at images in different rows, that will not be the case. Now, on the right, you can see a visualization of these coefficients, okay? So this transform can be done at different orders. Typically it's done up to second order in the applications that I know. Now the trade-off is that you have to start to work with hundreds of coefficients, okay? This is no longer summarizing everything into just a small set of, you know, let's say a dozen coefficients. Now it's hundreds of them. But the gain is that those coefficients are stable to the transformations that are listed here, all right? One can do all sorts of interesting things with this. One can take images and extract the coefficients, because those are statistical representations. The coefficients do contain the statistical knowledge about this field. One can use these coefficients to regenerate new realizations. And as you can see, it works extremely well. All right, so now let's push things all the way to the end of the complexity axis and let's talk about neural nets, or convolutional neural nets. I'm not going to introduce them again. We saw them in previous talks. But this is a typical example in astrophysics or cosmology. So what you see here is a simulated distribution of matter in the universe. That's the filamentary network that I showed with data earlier. This is a lot of data. And in order to generate this data, we basically need something like six numbers. Okay, that's all we need to describe the universe on large scales. This is an example where they focus only on two of these numbers, the cosmological parameters sigma-8 and omega-matter, but the meaning is not too important here. But basically the challenge is that we want, from these billions of particles, to extract information that's going to be compressed and condensed into only two numbers. People spend their entire career trying to measure these two numbers.
Okay, those are the cosmological parameters that define many important things in the universe, like the density of matter for omega-m. So this is a complex task. And until recently, people were using only Gaussian statistics, correlation functions, et cetera, to extract that information, knowing that a lot of information was not used because this is not a Gaussian distribution. But we didn't have the right language to go beyond that. People tried to use higher-order statistics, but that did not really produce the results they were hoping for because it's too difficult. So now people are skipping all the steps and using a CNN. They try to go directly: they can learn from simulations how to go from a 3D distribution to the parameters. So it's a huge compression of information here. It works well with simulations, but now how well will it work with real data, when we have systematics and all sorts of differences? I don't know. And the challenge here for people doing this is to have the community believe the results they get for the parameters and trust them. The problem is that we are jumping directly from this to two numbers. Okay, so one issue is that sometimes people say that this is a black box, we don't know what it's doing. The situation is changing. We can take images. We can learn classes or categories from these images. That's what people typically do with those CNNs. And what is interesting is that within the network, the network itself is building some sort of a dictionary that allows it to describe the data. Again, that dictionary is such that it will have the invariances for translation, rotation, deformation, scaling, et cetera. So until recently, it was very hard to understand what's going on inside, but this is changing. There is now this so-called feature visualization technique which can be applied.
So once the network has learned, one can go back, one can query individual neurons, individual layers or combinations of them, and visualize what the network has learned. For a network that has been trained on ImageNet, this is work done in collaboration with OpenAI. What you are seeing here are the different layers. And it's basically a principal component analysis of this feature visualization. So we are going deeper and deeper into this network. What I want to show you here is that this is not a black box; we can actually see all the filters, how they are getting more and more complex to describe high-level features of the data. And so it is possible to explore this box and see what it has learned, see what the dictionary looks like. And this is done in the context of images, but the same could be done in the context of geophysics. All right, so I'm going to summarize and say that what we all want to do as scientists is to go from complexity to simplicity. But in order to do so, we need to use an efficient language. And depending on where we are along this axis, we want to do different things. As I mentioned, as we move to the right, we have to accept the fact that we are going to work with more and more coefficients. They don't have to blow up to millions; there is an intermediate step there where we can work with hundreds of coefficients. And I find that that's very interesting. And I gave you the example of the scattering transform, which does that. So we have all these techniques at our disposal. Unfortunately, yes, too many of our students do not learn them early enough. They learn how to navigate along the Gaussian axis, but not so much along the complexity axis. In all cases, as I said, what we want is simplicity. And we find simplicity in different forms. When we do Gaussian statistics, it's in the limited set of parameters that we have. If we do manifold learning or another such technique, it's going to be in the geometry of the manifold that we find.
And if we do dictionary learning or convolutional neural nets, it's going to be in the limited set of classes that we impose onto the network. The message that I have here is that a lot of us scientists have been stuck in this corner for a long time. And recently, we can now finally go and explore this space. I think there is a huge discovery space with existing data. And the challenge is that we now have new languages that we can use to basically go and explore this complexity. Thank you. Great, thank you very much. We'll take a couple of questions and then we'll move on to our panel discussion. Go ahead, Richard. So the obvious question about the one piece of tomography in there. So that map that you showed where you were looking at the ANSS data: did it simply order the seismograms into a one-dimensional sequence, and then you showed the map? So was that just telling us about the velocity? I think you said it was showing phase velocity, but was that the only thing that it was telling us? So was that map just telling us the crustal velocity or was it telling us something else as well? The data that was used by the algorithm is only the velocity. Bryce is correct: it was the surface wave dispersion that was used, not the seismograms themselves. We could run it from the seismograms. It's just more complex. So you mentioned at the end the potential to make new discoveries if you go into this more complex space. In astrophysics, are you starting to actually make those discoveries? Are there examples you can give? There are examples at different locations along that complexity axis, but we are doing things that we could not do 10 years ago. And we are using data that's been sitting around for 20 years. So again, the data is not the limitation. In the examples I showed, the challenge is not the size of the data. It's not computational power.
It's really just coming up with the right language. And that's what all these techniques really do: they provide us with the adequate language that can extract the relevant features in the data, that can build an efficient dictionary, and express the information. So it's a classification algorithm, which is probably oversimplifying it. Going back to the dispersion curve example: now, if I were to compare this with what a K-means sort of thing would give me, or some other clustering, what's the added sort of information that I get out of it? So the goal is not to look for clusters, but to look for continuous sequences. So it's very different. K-means would not give you that because this is basically one cluster. You are living on one single manifold. Okay, so there are clustering algorithms. When you are looking for n clusters, those are discrete objects. But very often we have data, we have populations, that exist due to the variation of one parameter, something that can vary continuously: temperature, pressure, anything. And in this case, we find these sequences. And this is what the algorithm is designed to find. With K-means, you would have to first say how many clusters you are looking for. Right, but I guess what I'm trying to think of is an example where the clusters wouldn't be points on your color scale, right? Is it possible that you would have some sort of non-linear way of going between the clusters? Because if the clusters exist, then they should map along that color scale. But your method also works if there are no clusters. Is that right? That's right. It works with continuously distributed data. So you don't need clusters. That's right, yeah. And what it does as well is that it's capable of finding whether the information is on small scales or large scales. Okay. Yeah, I have a question about the scattering transform. I guess now I can't remember whose talk.
I think in Hannah's talk, we saw these beautiful examples of how illumination can really dramatically change an image, and how understanding the information that's in that image, well, how that process can be made difficult by those changes. Do you think the scattering transform would be able to deal with those kinds of changes? Like, imagine pictures of the moon at different angles. So at some level, yes. Because in the limit of small changes, this is going to be similar to warping locally. And so it will give you an estimator that's more stable to these kinds of changes. Why don't we move to the panel discussion? Why don't you, Bryce and Hannah, just come sit at the front table and we'll move to a more general discussion. Can I ask a question while you're moving? You've been working with very disparate data sets as well. I think at the beginning of your talk, certainly, you talked about multiple missions and multiple types of data and even potentially legacy data too. Obviously with some of the more sophisticated techniques for rotation and scattering issues, what are the challenges in working with multiple, multi-sensor data sets and different time series? How have you overcome the challenges? That's a very general question. I know, it's too general. I should have focused it more. Sorry, can I focus it a little bit more? What I meant is: are there ways to preserve, have you found ways to actually utilize more of the legacy data or the older data sets and incorporate them into these? Yeah, so what's happening in astronomy is that we've had a lot of data for a long time, but during that long time, it was analyzed only using Gaussian statistics, and we were on the left side of this axis.
And so now, using the same data, we have a new discovery space that is in front of us because we can finally start to manipulate this data that we've had with these new tools, and we can start to extract information, very important astrophysical information, that we didn't have access to before. So it's changing the field. All the young people are seeing this very clearly and they are learning these ML techniques on their own if they are not being taught at universities. Unfortunately, universities tend to be slow at changing. My recommendation, as we discussed before, is that undergrads should be exposed to this very early on, and they should not only learn how to navigate the Gaussian axis but also the complexity axis very early on, to understand the potential and also to understand that these neural nets are not the go-to tool for every possible problem. They should understand what's the right language. That's needed. All right, I'm gonna ask a question that we sort of had this whole panel discussion about. So I wanted to get both of your takes on this question: from seeing the talks this morning, what is the cross-fertilization that could exist? What are the commonalities you see? And I guess both of you have sort of worked across disciplines already. I guess Hannah has a background in planetary science, and you've been working with Ved and probably other people too. And what are the lessons learned on the commonalities that we could benefit from in the geosciences, from things that you have learned in your discipline, or vice versa? I think definitely we've talked a lot about capacity building and education. We could have programs that include geoscientists, remote sensing experts as well as machine learning experts, but we can also learn from programs that are being done in these individual fields and try to duplicate them as well.
I think one example where this is being done, in planetary science mostly but increasingly in remote sensing, is NASA Ames's Frontier Development Lab program, a 10-week summer program with half machine learning graduate students and half planetary science graduate students. They work on specific problems that are informed by the scientific community; there was an RFI from NASA just a few days ago to solicit ideas for projects for them to work on. So they're working on really meaningful things. And maybe also tools: we all have this challenge of interpretability, and maybe sharing tools, making tools like the one that you developed actually deployable, so that others can use these tools for inspecting the nets. We know that these exist, but you still have to duplicate the experiments and the engineering effort yourself. So having more of these tools be publicly available across disciplines would, I think, be really helpful. Yes, I very much agree. I think it would be a good idea to organize more summer schools for students coming not only from one field but from a collection of fields, because I think we all have the same needs. This is all about a new type of mathematics; we all have the same training for basic mathematics, and this is the modern version of statistics, in a way. We all need that. The first layers of these neural nets are doing the same thing for whatever you want to do: whatever you need, whatever data sets you have, the first layers will be these very basic filters, whether you work with one type of data or another. So I think it would be extremely valuable for students, and for faculty and senior researchers, to learn and to collaborate on this much more. To add to that, you mentioned summer schools; I've also seen summer schools organized around conferences, in both machine learning and remote sensing.
For example, at ECML, the European Conference on Machine Learning, which happened recently in September, they had a one- or two-week summer school around geospatial analysis using machine learning for remote sensing, which is important for training machine learning scientists as well to use that data, which is really challenging to access and analyze in many ways. I wanted to ask you something about this legacy data. I'm assuming in astronomy, like in any other field, this legacy data probably has lower resolution, or is a little bit different from, say, modern data sets, okay? So how valuable are those? Is it just having more data, or how do they inform your results? In astronomy we have these surveys of the sky, where a dedicated facility, a dedicated telescope, is going to scan the sky for five, 10, 15 years. And that data is publicly available to anyone. These have been the most successful data collections: they have triggered more publications than any other facility, and they have initiated all sorts of international collaborations. What these data sets have that others do not is that they span a very wide range of scales, okay? The ability to measure from a time scale of one second to several years, you can only do that in this survey mode, okay? And having access to so many scales, that is the essence of big data. I think big data is not about the three V's, velocity, volume, and I forget what the last one is. I think big data is about having a data set in which you have many different scales altogether, which allows you to put things in context, allows you to measure all sorts of correlations between scales, understand the context, the environment, and understand the big picture as well as the microscopic details. So there's something extremely unique and valuable about this that no other data sets will provide you.
I think that's a really important point, one we said earlier in the day, but maybe it's worthwhile repeating: the number of discoveries and advances that have been made through open access, data that was open access from the outset. And I think open access applies not just to the software but also to the data sets themselves; I just want to reiterate that. Yes, if you look at the number of publications linked to a specific data set as a function of time, many, I'm going to call them experiments, or targeted observations, will show you a curve where you have a lot of results and then it decays exponentially soon after that. But for these large surveys with open access, the curve is very different: it basically continues and plateaus for many, many years. And so the scientific output is really great. I think that's also true for data sets like the Landsat and Sentinel programs: as soon as they became publicly available, we saw huge increases in the number of studies using them, and we continue to see that growth. Yeah, and I would like to add also that very often we are limited not by the data but by our own ideas. As I also mentioned during my talk, we are limited by the language that we have at our disposal to manipulate the data. And so once those things change, when people come up with new ideas or new ways to look at the data, these data sets, even if they are 10 years old, are extremely valuable, and lots of discoveries are being made. Well, I was just going to read the remote participation question, which goes after open data but also reproducibility. The question says: what about publishing source code and making workflows available? It's my understanding that the machine learning field is actually quite good about that, and a lot of papers come along with source code. I was just wondering if you wanted to comment on that, on making sure people can reproduce things. Yeah, that's correct.
It is really standard practice in machine learning, but like I mentioned at the end of my talk too, there's not really a culture of sharing data sets. Often at conferences I'll see people give talks where they're like, yeah, this tool is publicly available, the data set is open, you can go use it, and then you go to search for it and you can't find it, and you ask them about it and they're like, oh yeah, you can email me, call me. It's not really open. So I think really pushing for links to these things, like the link you included in your talk. This question mentioned Docker containers as well as Anaconda environments; you can containerize these things to make them executable. I don't know if it's something that should be required for publication, but definitely some way of incentivizing that in the community would be great, even if that's just a community effort to uplift, through sharing things on Twitter or something, really highlighting the efforts of researchers who make their code executable and their data sets publicly available. And I guess what I was going to say is that that also makes it easier, when you go to publish a new method, to actually test your method against one that somebody else has published, because right now it's kind of an unreasonable effort to expect somebody to reimplement somebody else's published method. But you still might have the very valid question of, well, how does this compare to such-and-such a paper and that method, and it's an undue burden to force people to engineer that themselves, in my opinion. I mean, I guess Sydney maybe wants to ask the question, but we had this issue about benchmark data sets, right? Do the machine learning people have a standard data set, and then you go like, wow, my method is 1.7 on a so-and-so scale? Yeah, exactly. I mean, that is the way it works for ImageNet.
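On the containerization point above, the lowest-effort version of this is often just shipping a pinned environment specification alongside the code. A minimal sketch of what that could look like with a conda `environment.yml` is below; the project name and the specific packages and versions are purely illustrative assumptions, not anything from the talks.

```yaml
# environment.yml -- a hypothetical pinned environment published with a paper,
# so that readers can recreate it with:  conda env create -f environment.yml
name: crop-mapping-paper        # illustrative project name
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy=1.24
  - rasterio=1.3                # reading satellite imagery
  - scikit-learn=1.2
```

The same file can be copied into a Docker image so the whole workflow runs in a container, and pairing it with an archived data set (for example, one deposited with a Zenodo DOI, as mentioned later in the discussion) gets close to end-to-end reproducibility.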
People publish like, oh, we got a 0.2% increase in accuracy compared to this, and there's a lot of debate about that too. I don't think that's really what we should move to. I think that's resulted in a lot of problems, where people are creating precisely tuned models to get a small improvement on a data set that has no real relevance to our daily lives. So there's maybe some in-between there: we do need data sets that we can all test things on to benchmark performance, but I think focusing too much on that can be detrimental. I guess just to briefly follow up on that, just for background: it seems to me that, at least in geodynamics, we had that conversation about 20 years ago, and there were lots of concerns, such as attributing credit, right? What about the PhD student investing time? And then it seemed like we had moved to a place where this was all worked out, and we agreed open-access shared codes is the way to go. But now our professional society has started enforcing these things in terms of publications. And right now, in the last couple of weeks, we are faced with a situation where full reproducibility is, in theory, required if you want to publish in AGU journals. And some of you might not have gotten that. And now it turns out a big part of the community wasn't really listening 20 years ago when we first discussed this, and we're having the same arguments over and over: well, of course I'm not going to share the code. And we're like, well, we agreed 10 years ago we should do that. So I was just wondering, do you have a feeling for where your communities are? You had indicated, right, that it's like, oh, send me an email, and you never get an answer. It seems like it's not perfect, you know? So do you have a feeling for where your community is? Maybe I can say, so in my field, astrophysics, people are very happy to share their code. It has become part of the culture.
So one important thing is to reward the young people who write codes and to understand that the code is basically as good as a paper; it's the same type of achievement. Some people will spend a lot of time writing code and not necessarily making discoveries, but we do need those people, those are talents, and we have to reward them. Yeah, in terms of tracking and citing things like data sets or code that may or may not be linked to a paper: all of the papers that I've published, I've published the data set with, with a Zenodo DOI link. And maybe people will use it and not cite it, but that's the risk you take. I think that's one way: by linking these to DOIs, you can encourage people to cite the data set, even if it's not the paper, or cite the code, and things like that. I think you need to switch yours on. Excuse me. Mine always works. I'm just saying that for the people listening online. Torsten provided the perspective of an editor, and I guess that's another challenge that I feel perhaps exists in astrophysics and astronomy as well: as new techniques develop, there's always a lag in terms of reviewers and standards. Do you have recommendations for our community as we start to handle these types of papers in our regular workflow? I guess, you know, if the data is accessible and the codes are available, I mean, it's full transparency. That's what I would recommend. Yeah. I'll add to that.
I was discussing this a lot with Carrie Ann during the break: there is a lag in reviewers available to review these machine learning type papers, and I think that puts a lot of pressure on people who are working on these topics to review these papers, because we see a lot of mistakes being made, and a lot of the same mistakes being made, and it's important to catch those in the literature. The risk is that the people reviewing may have more of the application background and think, oh, cool application, interesting results, but may not be able to recognize that a model is being misused or that generalization is not being assessed in a correct manner. So I think that's just another reason why it's important to educate people in geoscience or remote sensing or astronomy about machine learning, but it's also important to educate people in machine learning about the science application, so that they can also be part of the reviewing pool for publications related to their field. Great, all right, well, thank you very much, Hannah and Bryce. I think we'll end our panel here and take our break. We're going to reconvene at 3:30. All right, great, thank you.