with today's speaker. Our guest today is David Spergel. David is an accomplished astrophysicist and an emeritus professor at Princeton. In 2016 he became the founding director of the Center for Computational Astrophysics at the Flatiron Institute in New York, and he will take over as the president of the Simons Foundation later this year. I think it is fair to say that David knows the cosmic microwave background better than almost any other person. He was one of the lead analyzers of the original data delivered by the WMAP satellite, work which earned him the Breakthrough Prize in Fundamental Physics in 2018, which is only one of the many prizes and honors that have been awarded to him. Over the past years cosmology has increasingly become a precision science, but to fully exploit the large amounts of high-quality data that are available today, new analysis techniques are necessary. In his talk today, David will ask maybe the ultimate question, the question about the initial conditions of the universe, and he will tell us how machine learning can help to find the answer. David.

Thank you, and it's a pleasure to be in Oxford, even if only virtually. You know, my trajectory in many ways started at Oxford. I spent my first year of graduate school working with James Binney, I was at Merton College, and I learned a lot about dynamics that year. I have had the pleasure of returning a number of times, including spending part of my sabbatical there a number of years ago. Today's talk is, in a sense, a continuation of what was mentioned in that introduction. Much of my middle career was spent thinking about the microwave background and developing analysis techniques for it. What I want to talk to you about today is in a sense very much work in progress: a program of thinking about how we can develop the tools we need to fully exploit the information we have on large-scale structure, and whether we can develop the tools to do the kind of statistical analysis that the data, I think, ultimately deserves, to extract the maximum amount of information about the initial conditions of the universe and its basic properties.

To give the punchline: we already have tools for doing the forward modeling from initial conditions to observations, with some uncertainties. These tools are numerical simulations of various kinds that are incredibly demanding computationally. In order to apply them to data, we will need to apply techniques of machine learning to develop ways of accurately approximating this forward model. And I believe that with that, we will be able to extract much more from the data than people currently plan, and get enormous value.

To tell that story, I want to begin, since there's a broad audience, by introducing some of the basics of cosmology. To start with, to give a sense of where the field is, we have found that a remarkably simple model in some ways fits all of our data. We now have, and I'll show you some of the data, measurements at millions of points on the sky in the microwave background, observations of large-scale structure, and measurements of the expansion rate of the universe, and we fit them with a model that assumes that the same laws of physics that are valid locally are valid throughout the universe in space and time, and that the total energy of the universe is zero or, equivalently, that the geometry is flat.
And the model has five basic parameters: the age of the universe, the density of atoms, the density of matter, how lumpy the universe is (the amplitude of the power spectrum), and how that lumpiness varies with scale (the slope of the primordial power spectrum). With those five numbers, we can fit pretty much all of our observations. There are some interesting tensions, but that's another talk. So it's a simple and successful model, but it's a strange model, because it says that atoms make up only 5% of the universe. Most of the universe is in the form of dark matter, which makes up about the next 25%, and in dark energy, energy associated with empty space. We don't know what the dark matter is. We don't understand the dark energy. We have a basic theory to explain how these fluctuations are generated in the early universe, but that's a very incomplete theory. So we know we're missing lots of physics, and we know the physics we're missing goes beyond the standard model of particle physics. So we know we need new fundamental physics to address this. We're in a very interesting moment.

To explain that, I need to give a basic introduction, first to special relativity, to remind those of you who are not used to thinking about cosmology that, because of the finite speed of light, as we look out in space we look back in time. So for our observations of nearby stars a few light years away: something that's 10 light years away we see as it was 10 years ago; we see a nearby galaxy as it was a million years ago, a more distant galaxy as it was perhaps a billion years ago. And when we look out and back in time to the microwave background, we see the universe as it was roughly 13.8 billion years ago. There's my Brexit joke, so we'll move on.

The other key idea to think about is that we live in an expanding universe. The picture I like to have of the expanding universe is to suppress one dimension and think about the universe as a sphere: we live on the surface of the sphere, and the radius of the sphere is time. So the universe is expanding and expanding, getting less and less dense with time. If we run things backwards, as we go back in time, the universe gets denser and denser, hotter and hotter. You can see in this picture, as we collapse towards a point, that the big bang is a moment in time, not a moment in space; it is the moment at which everything collapses down to a point. And if you think about this expanding picture, what the universe is expanding into is the future. It's not expanding into some other region of space; it's that as time goes forward, the universe gets bigger and less dense.

In that basic picture, as we go back in time, the universe gets hotter and hotter, denser and denser. When you get back to about 380,000 years after the big bang, the universe is hot enough that the hydrogen, which is the dominant component of the atoms, is fully ionized. Back when the universe was about one one-thousandth of its present size, as measured by its redshift, it underwent a transition from being ionized to being neutral. So here's another version of this quick history of the universe. And when we look back 13.8 billion years at the microwave background, this leftover heat from the big bang, we see small fluctuations in the temperature of the universe.
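As an aside, to put the look-back and redshift statements above in standard notation (this is textbook material, not something from the talk's slides), the redshift of light emitted at time $t_{\rm emit}$ measures how much the universe has grown since then:

```latex
1 + z = \frac{a(t_0)}{a(t_{\rm emit})}
```

so "one one-thousandth of its present size" corresponds to roughly $1+z \approx 1100$, the epoch, about 380,000 years after the big bang, when the universe went from ionized to neutral.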
And over the past 30 years, we've been measuring those fluctuations with increasing resolution. They were first detected with the COBE satellite. I played a role in the Wilkinson Microwave Anisotropy Probe (WMAP), a NASA-built satellite that mapped them at higher resolution. And this past decade, the best results have come from the ESA-led Planck satellite, which has mapped the sky at even higher resolution. This is a version of the Planck data that shows, with the zoom-in, how high a resolution Planck has achieved in mapping the fluctuations across the whole sky. And I'll just quickly flash up, because we have some new results, that we continue to make these measurements at higher and higher resolution. We've been operating a telescope in Chile that's been remapping the same sky, now at five times the resolution and more than two times the sensitivity of Planck. And we see the same sky; it's really reassuring to see all these experiments get consistent data. But now, with increasing resolution, we also see dusty galaxies and clusters: those cold spots on the map are actually clusters of galaxies that we detect as shadows against the background microwave sky.

So what do we do with this data? The approach we've taken is basically summarized as follows: we take the temperature maps, these maps across the sky, and we expand the map in spherical harmonics. In this case, and we'll contrast this with what we will talk about in the rest of the talk, things are simple. The fluctuations are very well characterized by a Gaussian distribution on these scales. The fluctuations are isotropic and seemingly homogeneous; we check that. We take the fluctuations we see, measure the two-point function of the map, and fit our theory, which we can compute from those five basic parameters. We ask what the likelihood is of the temperature pattern we see given the parameters in our model, and we get a very good fit. This is sort of the current state of the art, with the different data sets shown as different colored points: the Planck experiment in blue, and then a series of recent ground-based experiments shown in red, yellow, green, and purple. We measure not just the temperature two-point function, but also the two-point function of the polarization maps, which we decompose into what we call E-modes and B-modes, E-modes being the pattern of polarization that's symmetric under mirror reflection and B-modes the pattern that's antisymmetric. The high-level takeaway is, first, that the experiments agree, and these are very sensitive measurements, fluctuations at the microkelvin level; and second, that we have a theory that fits all the data. A nice check on all this is that you can fit your theory to, say, the temperature data alone and then predict the polarization data. One of the exciting things about having been part of this is that you fit your model one year, a few points seem to be off at the two-sigma level, or sometimes even the three-sigma level, and the next year your experimental colleagues have improved the results and the measurements keep moving back towards the theory. So it's been fun to see that all work out.

And with that, we actually have a basic theory of the initial conditions, which gives us a very well-motivated prior that our initial conditions are Gaussian with random phases. We can take the density of the universe, expand it out in Fourier modes, and each of these modes is independent, with an amplitude drawn from the power spectrum.
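To make that Gaussian random-phase prior concrete, here is a minimal sketch, entirely my own toy illustration rather than anything from the talk's analysis pipeline, of drawing a realization of a density field whose Fourier modes are independent, with amplitudes drawn from an assumed power spectrum (a made-up power law here, and I'm ignoring normalization conventions):

```python
import numpy as np

def gaussian_random_field(n=128, boxsize=500.0, power=lambda k: k**-2.5):
    """Draw a 3D Gaussian random-phase density field from an assumed P(k).

    n       : grid points per side
    boxsize : box side length (say, in Mpc)
    power   : toy isotropic power spectrum P(k); a made-up power law here
    """
    kfreq = 2 * np.pi * np.fft.fftfreq(n, d=boxsize / n)
    kx, ky, kz = np.meshgrid(kfreq, kfreq, kfreq, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2)
    kmag[0, 0, 0] = 1.0                       # avoid dividing by zero at k = 0

    # White Gaussian noise in real space gives random phases in Fourier space
    noise = np.fft.fftn(np.random.normal(size=(n, n, n)))

    # Scale each independent mode by sqrt(P(k)); remove the mean (k = 0) mode
    delta_k = noise * np.sqrt(power(kmag))
    delta_k[0, 0, 0] = 0.0

    return np.real(np.fft.ifftn(delta_k))     # Gaussian density contrast field

delta = gaussian_random_field()
```

Essentially this recipe, with a physical P(k) and proper normalization, is also how initial conditions for the kinds of simulations discussed later are typically set up.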
And what's shown below is the data. This is actually pretty old data at this point, from WMAP, but it's a plot I like. It shows the number of fluctuations of a given temperature, normalized by the variance on that scale. The data are shown as the black histogram; the Gaussian curve, which in a sense has no free parameters once you've normalized by the variance, is the dashed line. You're looking at the data smoothed on the four-degree scale, the one-degree scale, and the quarter-degree scale, and you can see that the fluctuations we observe are very well fit by a Gaussian distribution.

So once we've specified those initial conditions, we can run them forward in time to make predictions about what the universe should look like today. And this is where, shall I say, things get interesting, because those Gaussian initial fluctuations grow by gravity to become highly nonlinear and highly non-Gaussian. Gravity is the dominant force on these large scales, and as we'll see, when we get to smaller scales we have to start worrying about hydrodynamics and galaxy formation. So the basic picture here: first, let's begin with that little picture in the corner. When we look at the microwave background, we're looking back in time; we see this early, nearly homogeneous universe. Gravity makes it evolve to a much richer, nonlinear universe, with stuff like us in it. And then, in the local volume, looking at things over just the last four or five billion years, we can trace that with the large-scale distribution of galaxies. This shows a slice of the distribution of galaxies, measured by surveys like the Sloan survey. While on the largest scales the approximation that the fluctuations are small and well described by a Gaussian random field seems to remain good, on small scales the perturbations are nonlinear. These nonlinear perturbations on very small scales collapse to form what we call dark matter halos, gravitationally bound clumps of dark matter. And in these dark matter halos (remember, the universe is mostly made of dark matter; baryons, the atoms, electrons, and protons, represent only about a sixth of the gravitating matter) the baryons that cluster cool, they form stars, they have feedback from things like supermassive black holes. When we actually go to observe galaxies, we're observing the stars in the galaxies; we're observing, in some sense, the froth on top of the ocean waves. And we're trying to figure out how to use our observations of that froth to understand what the underlying distribution of matter is.

Our goals in all this, which I'll characterize on the next few slides, are to measure the basic parameters in the model: the density of matter, the density of atoms, the expansion rate of the universe, the properties of the initial fluctuations, and also other parameters in the model. What is the mass of the neutrino? What are the properties of dark energy? Do we see new particles? The reason we want to measure these parameters is that this lets us get at some really fundamental questions about the basic properties of the universe. So for me, this is my ultimate motivation in thinking about these problems and applying, as we'll see soon, the tools of machine learning: we want to be better able to answer these questions.
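As an aside on the Gaussianity test shown at the start of this section, here is a minimal sketch, using a made-up map rather than the actual WMAP maps: smooth the map on a given scale, normalize by the standard deviation on that scale, and compare the histogram to a unit Gaussian, which then has no free parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def normalized_pdf(temperature_map, smoothing_pixels, bins=50):
    """One-point PDF of a smoothed map, in units of its standard deviation."""
    smoothed = gaussian_filter(temperature_map, smoothing_pixels)
    nu = (smoothed - smoothed.mean()) / smoothed.std()    # normalize by the variance
    hist, edges = np.histogram(nu, bins=bins, range=(-5, 5), density=True)
    centers = 0.5 * (edges[1:] + edges[:-1])
    gaussian = np.exp(-0.5 * centers**2) / np.sqrt(2 * np.pi)   # the dashed curve
    return centers, hist, gaussian

# Toy example: a Gaussian map should trace the Gaussian curve at every smoothing scale
toy_map = np.random.normal(size=(512, 512))
nu, data_pdf, gauss_pdf = normalized_pdf(toy_map, smoothing_pixels=4)
```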
And again, to give you a sense of where we are right now, the standard in the field is to do the same thing with large-scale structure: take a survey like the Sloan survey. Here is a paper from 2021 measuring the two-point correlation function. So we just measure the two-point statistic and take most of what we can out of that. In this case we do some corrections for how structure grows; here they're applying the reconstruction technique that Dan Eisenstein, Hee-Jong Seo, Eric Sirko, and I developed many years ago to this data, to improve our ability to learn a bit more from the information. But it's all linear theory being applied; we're not doing anything very sophisticated right now. And the question we want to ask is: if we go from the observations to the theory, what's the tool we want to use? For the microwave background, we knew the right thing to do. We knew that essentially all the information was encoded in the two-point function, so we just measured that and fit it. We've felt for a long time that there's much more information than what's in the two-point function, and the challenge is how to extract it. The first step is understanding just what summary statistics we want to use to determine these basic parameters with the smallest errors.

And now, I think, is the time to address this question, because we're in the midst of an incredible explosion of data that's going to happen in the next few years. A new dark energy spectroscopic survey is underway, a much bigger survey than the Sloan galaxy survey, so those spectroscopic surveys of galaxy positions I showed will be improved enormously. The Vera Rubin Observatory is going to begin surveying the sky next year; we will have much deeper images, which will give us measurements of the distribution of matter through gravitational lensing and the like. The European Space Agency is launching the Euclid satellite, which will map much of the sky from space, with improved image quality. And NASA, a couple of years later, is going to launch the Roman Space Telescope, a mission I've been deeply involved with, which will eventually map the entire sky with the resolution of the Hubble telescope; it will be operating in the infrared, which will let us peer deeper into the universe. In addition to these optical and infrared surveys, a German-Russian collaboration has launched eROSITA, an X-ray satellite that is mapping the whole sky as we speak. And we're continuing to improve our microwave background maps: one of the big projects we're involved with at the Simons Foundation is the construction of the Simons Observatory, which will be mapping the microwave sky at significantly higher resolution and sensitivity.

So we're about to measure all this wonderful data. My experimental and observational colleagues are doing terrific work. And I feel that, as theorists, what people have been doing is basically applying the tools that Jim Peebles and others developed in the 1960s to measure the two-point correlation function to this incredibly rich data. We'd like to know, first, is there more information there? One of the things we want to do is take a series of observables and simply ask: can we extract more information from them? An approach we took was to compute the Fisher matrix of the parameters given the observables. And the way we computed it was to throw a lot of computer time at it; it's good to be director of a computational astrophysics institute. So we ran a massive suite of simulations. This is work led by Francisco Villaescusa-Navarro, who goes by Paco.
And what Paco did, with a larger group, was to launch 43,000 full N-body simulations, like the one I showed, sampling over 7,000 cosmologies. This produced a data set of about 50 trillion particles over a volume larger than the entire observable universe. In these simulations, we've identified billions of halos and voids. It took 35 million CPU hours and generated a petabyte of data. And I'm advertising this because all of this data is available. So if anyone wants to play with and train on cosmological N-body data, which is what we want to do ourselves, all of it is available for people to download and interact with. We have a team of people, at this point spread out throughout the world, who have been looking at different pieces of this data and have extracted lots of interesting information from it. I'm going to highlight a few of the results.

One of the things we looked at was: is there more information there? First, what information can we get on these basic parameters, like the density of atoms, the expansion rate (the Hubble constant), and so on, from the power spectrum? We extract from the two-point function the information that we expect. What we discover is that as we start adding additional statistics, like the halo mass function, the numbers of voids, and other summary statistics we try, the amount of information we get improves. One way to quantify the value of this is to say: we've built these big experiments, we're spending billions of dollars on these satellites, motivated partially by measuring these parameters. Improving the parameters by factors of two or five, as we sometimes do, is equivalent to getting four times or 25 times more data, which would be another multi-billion dollar investment. So there is, I would argue, almost certainly significant scientific value, and you can even quantify, in a sense, the return on investment one gets from applying these new statistical techniques. We've looked at a number of statistics, and a particularly promising direction for improving parameters is taking the density field and using more density-field statistics. The real takeaway from these different approaches, and this is just a number of papers showing different ways of cutting the data and different techniques, is that there seems to be an incredible richness in the data that we've not extracted.
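To make the information-content argument concrete, here is a minimal sketch of a simulation-based Fisher forecast of the kind described above; it is my own illustration, with hypothetical arrays standing in for the measured statistics and their simulation-estimated covariance, not the actual Quijote products.

```python
import numpy as np

def fisher_matrix(derivs, cov):
    """Fisher matrix F_ij = (d mu / d theta_i) C^-1 (d mu / d theta_j).

    derivs : (n_params, n_stats) numerical derivatives of the summary-statistic
             vector with respect to each cosmological parameter
    cov    : (n_stats, n_stats) covariance of the statistics, estimated from
             many simulations at the fiducial cosmology
    """
    cinv = np.linalg.inv(cov)
    return derivs @ cinv @ derivs.T

def marginalized_errors(fisher):
    """1-sigma error on each parameter, marginalized over the others."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

# Adding a statistic (say, the halo mass function or void counts) just means
# concatenating it onto the data vector: derivs and cov grow, and the
# marginalized errors typically shrink, which is the effect described above.
```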
Now why haven't we done that so far? One of the reasons is that while these dark matter simulations are great, we know there's much more underlying physics. To give a sense of that, let me show you this simulation from the IllustrisTNG team, which includes not only dark matter physics, but gas cooling, star formation, feedback, the physics of the baryons. You can see, when you look at the distribution now on the scale of galaxies (the bar in the corner shows 100 kiloparsecs, so we're looking on the scale of about a megaparsec, roughly the distance between us and Andromeda), that when you get down to the galaxy scale there's much richer physics than gravity. And included in all these simulations, and this is I would say the current state of the art in many ways, are lots of approximations. You can see in the right-hand corner, for example, the gas radial velocity as structure forms, and you can see lots of turbulent motions. You can see winds from explosions (I guess you just saw an explosion in the right-hand corner) from supernovae and from active galactic nuclei, jets driven out from supermassive black holes having big effects on the environment. These are all things we treat with very approximate subgrid physics that we know does not fully capture what's going on. So we know that when we get to these small scales, and small scales here are millions of light years, we're only getting approximate descriptions of the physics. If we want to use statistics that go beyond applying the two-point function on the largest scales, where gravity is the dominant force, we need ways to marginalize over those uncertainties.

A way to get a sense of those uncertainties is this pair of simulations: on the right-hand side, the current best version of the code; on the left-hand side, an earlier version with a different treatment of the underlying physics. You can see that initially, on the largest scales, things broadly agree as these simulations evolve forward in time. But as we get closer and closer to the present time and nonlinear structures become more apparent, there start to be differences in detail, depending on what's happening on small scales. And if the approach I'm using is to trace this structure with galaxies, you'll notice that the number of galaxies on the left and the number of galaxies on the right are different, so things like void statistics and two-point statistics will be a bit different in one box than in the other. This really represents our uncertainties in the underlying physics, and any approach we use is going to have to marginalize over these uncertainties if we want to understand things.

So to me, the broad question we're trying to address (I'll go through some of the results we have, but this is very much work in progress as part of a big program) is: can we extract all the information from the fields we actually observe? We observe the galaxy field; we observe, through lensing, the large-scale distribution of matter; we can observe the large-scale distribution of pressure through our microwave background observations; and our X-ray observations tell us more about the distribution of hot gas. So we have lots of observations. Can we extract more information from them, and how can we do it? The work I'm talking about today has really been led by two of my colleagues at the Flatiron Institute: Shirley Ho, who is the group leader for our machine learning and cosmology group, and Francisco Villaescusa-Navarro, who's now at Princeton working on this. What I want to talk about is a few pieces of this problem of how we try to make it more tractable to treat statistically, by improving our ability to forward model quickly and to marginalize over the baryonic physics.

I must admit I am a convert to applying machine learning to this problem. To me, initially a complete outsider, it seemed magical and I didn't quite understand what was coming out, and in the last several years I have learned more and more. What got me interested was work that Siyu He, a student of Shirley Ho at Carnegie Mellon, where Shirley was at the time, did on making faster predictions for these numerical simulations. And early on I showed these movies showing how structure grew under gravity.
The question they asked was: could they train a network to reproduce the behavior of these gravitational simulations much more quickly? The first step was to start with an analytical approximation that gives a pretty good description on the largest scales, and then train the network on the difference between the analytical approximation, which is fast, and the numerical simulation, which is slow. Siyu trained the network, and this shows the errors of the analytical approximation compared with the errors of the U-Net. Here the errors are quantified in terms of displacements in megaparsecs. With the best analytical approximations we have at the moment, on the scales of galaxies and clusters, particularly in the dense regions, which are those red regions here, the errors in position will often be of order five megaparsecs, while the U-Net was able to reduce the errors by nearly an order of magnitude. We can quantify that, or they quantified that, by looking at the power spectrum of the simulations and the correlation between the exact solution, the numerical gravitational simulation, and the predictions of the U-Net. (I had the labels switched for a moment, sorry.) What's shown in green is the truth, what comes out of the N-body simulation; what's shown in orange is what comes out of the U-Net; and blue is what was the state of the art. You can see that the U-Net does a much better job at reconstructing the power spectrum; it can reconstruct things down to wave numbers of about 0.5 quite accurately. And the beauty of running things with the U-Net is that once you've trained the network, it is 60 million times faster at forward modeling than running the N-body code. So it lets us start to think about evaluating likelihoods of non-Gaussian statistics that we couldn't imagine doing if we had to forward model each piece with an N-body code.

We continue to improve on this, and this is some recent work done by Renan Alves de Oliveira and Yin Li. Renan was a visiting student from Brazil here at Flatiron, and Yin Li is a postdoc here. They have improved the performance of the networks, made improvements in the architecture, and have now pushed the scale from wave numbers of 0.5 to wave numbers of one in terms of the scales on which we can accurately reproduce the results of the N-body code. Because the number of modes we measure scales as the wave number cubed in 3D, a factor of two improvement in the scales we can emulate represents, in a sense, a factor of eight improvement in the effective resolution over which we can reproduce results. And what's intriguing (I think this rests on the fact that the analytical approximation captures the large-scale behavior and there's some scale-invariant behavior being learned by the network, though it is very much work in progress to understand why this performs so well) is this: we trained the network on a universe with a matter density of about 0.3, and we've then taken the same network, the same U-Net trained at 0.3, and applied it to universes with much higher or much lower density. And we find that we get very good performance even though the network is being applied to a very different cosmological model.
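Here is a minimal sketch of the kind of quantitative comparison just described, the power spectrum and the cross-correlation coefficient r(k) between the N-body truth and the emulated field; the gridded toy fields, the crude binning, and the normalization convention are my own assumptions, not the actual analysis code.

```python
import numpy as np

def cross_power(field_a, field_b, boxsize, nbins=20):
    """Binned cross power spectrum of two gridded density fields."""
    n = field_a.shape[0]
    fa, fb = np.fft.rfftn(field_a), np.fft.rfftn(field_b)
    k = 2 * np.pi * np.fft.fftfreq(n, d=boxsize / n)
    kz = 2 * np.pi * np.fft.rfftfreq(n, d=boxsize / n)
    kmag = np.sqrt(k[:, None, None]**2 + k[None, :, None]**2 + kz[None, None, :]**2)
    bins = np.linspace(kmag[kmag > 0].min(), kmag.max(), nbins + 1)
    raw = np.real(fa * np.conj(fb)) * (boxsize / n**2) ** 3   # crude volume normalization
    power, _ = np.histogram(kmag, bins=bins, weights=raw)
    counts, _ = np.histogram(kmag, bins=bins)
    return 0.5 * (bins[1:] + bins[:-1]), power / np.maximum(counts, 1)

def correlation_coefficient(truth, emulated, boxsize):
    """r(k) = P_cross / sqrt(P_truth P_emulated); r stays near 1 where the emulator is faithful."""
    kk, p_te = cross_power(truth, emulated, boxsize)
    _, p_tt = cross_power(truth, truth, boxsize)
    _, p_ee = cross_power(emulated, emulated, boxsize)
    return kk, p_te / np.sqrt(p_tt * p_ee)
```

On the plots described above, "accurate down to wave numbers of about 0.5" corresponds to r(k) staying close to one and the two power spectra agreeing out to that wave number.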
My hypothesis is that what the network is learning is how to take the kind of fuzzy structures that come out of the analytical approximation and make them sharper. To go back to this picture: on the left you can see things look kind of fuzzy, and what the network learns to do is make fuzzy things sharp. As long as the analytical approximation gets the fuzzy things approximately right, the CNN is learning to do local corrections to make the details right. I think an important part of the physical structure of our problem is that on large scales we have a successful analytical theory; the small scales are where the nonlinearities are, so we want our networks to learn how to capture the small-scale behavior. Now, you have to remember I'm a cosmologist, so a million light years counts as small scales.

This motivates us to think about a whole program, which we've embarked on. Can we go from initial conditions to the results of a dark matter simulation by forward modeling? Can we learn how to go from the dark matter simulation to simulations that capture the full astrophysics, which often take months of supercomputer time, so that we can learn how to go from dark matter to galaxies and gas? Can we then model how to go from the simulated distribution of galaxies and gas to the observations, to be able to make a comparison with the observations? Then we can either think of forward modeling the whole thing and applying likelihood-free inference or techniques of that sort, or think about the inverse problem and ask if we can iteratively work our way from observations back to initial conditions. This simple figure conveys a big program that a number of us have embarked on, with many people working on various pieces of the problem.

Another way of quantifying this, if I think about it in terms of a likelihood function, is to ask: can I compute the likelihood of the data I observe? Here D might be data from our microwave background experiment and D′ the data from our large-scale structure experiment. We want to marginalize over all possible initial amplitudes, marginalize over our uncertainties in the astrophysics, marginalize over our uncertainties in observational systematics, and ask what the likelihood of those amplitudes is given the parameters, where I have my various priors on the parameters, the astrophysics, and the observational systematics. It's easy to write down this integral; it's pretty challenging to evaluate. We need to model a volume of about 10 to the 11 cubic megaparsecs, so that integral over initial amplitudes is, in a sense, an integral over 10 to the 9 to 10 to the 10 variables. Every evaluation of that forward model means going from initial conditions to the observed galaxy distribution. And when we do that forward modeling, we need to marginalize over uncertainties in the astrophysics and ultimately project into the observational plane and include systematic effects.
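Written out schematically, in my own notation rather than anything shown on the slides, the marginalized likelihood being described is:

```latex
\mathcal{L}(D, D' \mid \theta) =
  \int d\delta_{\rm ini}\, d\alpha\, ds\;
  P\!\left(D, D' \mid \delta_{\rm ini}, \alpha, s, \theta\right)\,
  P(\delta_{\rm ini} \mid \theta)\, P(\alpha)\, P(s)
```

where $\theta$ are the cosmological parameters, $\delta_{\rm ini}$ the initial amplitudes (the 10 to the 9 to 10 to the 10 variables), $\alpha$ the astrophysical nuisance parameters, and $s$ the observational systematics; every evaluation of the first factor requires running the forward model.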
So the next step in developing tools to do this was that Paco, Shy Genel, and Daniel Anglés-Alcázar embarked on a new suite of simulations, which we call the CAMELS simulations. They launched basically equal numbers of N-body and hydro simulations. The hydro simulations use two independently developed hydro codes with different realizations of the underlying physics: the IllustrisTNG code, developed by Volker Springel, Lars Hernquist, and their collaborators, and the GIZMO and SIMBA codes developed by Romeel Davé, who's now in Edinburgh, and his collaborators, together with the underlying Gadget-3 code for gravity. So we have dark-matter-only simulations, we have hydro simulations with one set of codes, and we have hydro simulations with the other set of codes. Our hope is that the difference between the two hydro simulations is of order the difference between a hydro simulation and the real universe, or at least gives us a sense of a piece of the uncertainty that is actually very hard to quantify: how do different treatments of the physics affect our uncertainties?

We've generated large numbers of simulations with the goals of providing theory predictions for the summary statistics, training neural nets to extract information and marginalize over the baryon physics, and learning how to go from the N-body simulations, which are expensive but not as expensive as hydro simulations, to the hydro simulations with full baryonic effects. Really, our goal is to have a neural net, or paired sets of nets, take us from initial conditions to what would be the results of gravity only, and then to what would be the results of the full baryonic simulation. Exploring things in this way lets us quantify the dependence of galaxy formation and evolution on astrophysical and cosmological parameters. This is a bit of a different spirit from the approach people have usually taken to galaxy formation simulations, where they want to fit the best model to the data. Here our goal is really to evaluate the uncertainty in the predictions of the model, given that we don't know all the astrophysical parameters.

So we vary the simulations, and this just shows, for example, how the hydro power spectrum on the right compares to the dark matter power spectrum as we vary cosmological and astrophysical parameters. What do we mean by varying astrophysical parameters? We vary things like the energy that supernovae put into the gas around them; if you remember those explosions driving gas out, we vary the amplitude of the parameter that describes that energy input. As we vary those parameters, the different models have different effects on the power spectrum, and this shows the variation in the matter power spectrum as a function of scale. You'll notice that the range in scale now extends to much smaller scales than we talked about earlier: k equals one is now on the left; that's the sort of small scale of the earlier plots. On those scales, one of the things we learn is that the corrections due to hydrodynamics are at the percent level, while when we get to much smaller scales, a factor of 10 smaller, wave numbers of order 10, the corrections are at the 20 or 30 percent level. And this shows pairs of simulations done with different codes with the same initial conditions, again giving us a measure of some of those variations. Here the two different codes evolve the universe forward, and of course they broadly agree on large scales and differ on small scales, and one of the things we want to do is capture that difference.

So again, the dream here is to have these thousands of cosmologies simulated with dark matter only with Quijote, and thousands of astrophysical models with CAMELS; we're developing some super-resolution techniques so we can improve the range; and then we want to find ways to marginalize over the baryons.
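To give a flavor of how such a suite of varied cosmological and astrophysical parameters might be laid out, here is a minimal sketch; I am assuming a simple Latin hypercube design and made-up parameter names and ranges, not necessarily the scheme the CAMELS team actually used.

```python
import numpy as np
from scipy.stats import qmc

# Hypothetical parameter space: two cosmological parameters plus four
# astrophysical (supernova and AGN feedback) amplitudes varied over broad ranges.
names = ["Omega_m", "sigma_8", "A_SN1", "A_SN2", "A_AGN1", "A_AGN2"]
lower = [0.1, 0.6, 0.25, 0.5, 0.25, 0.5]
upper = [0.5, 1.0, 4.0, 2.0, 4.0, 2.0]

sampler = qmc.LatinHypercube(d=len(names), seed=0)
unit_samples = sampler.random(n=1000)             # points spread over the unit cube
params = qmc.scale(unit_samples, lower, upper)    # rescale to the ranges above

for row in params[:3]:                            # one simulation per parameter set
    print(dict(zip(names, np.round(row, 3))))
```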
Let me give you a sense of some of the pieces we've been working on, and this is some of the team that's been contributing to various parts of this, starting with the mappings between the N-body sims and the hydro sims. Here's some work done by Leander Thiele, who's a graduate student at Princeton. The approach he's taken is to ask: can we learn from these CAMELS simulations how to go from dark matter only to the distribution of gas? Can we predict the electron pressure, the electron density, and the electron momentum given what the dark-matter, gravity-only simulations predict? The approach we've taken, which I think we've found to be successful for a lot of these problems, is to start with analytical models that get us part of the way there, and to train the network not on going directly from the dark matter simulation to the simulations of the gas physics, but instead to take the dark matter simulations, use an analytical approximation to predict what the gas does, and train the network on the difference between the analytical approximation and what the gas actually does. This turns out to be very valuable in a problem like this, because the key parameter in determining the gas properties for a cluster is the mass of the cluster, and rather than having to train the network on a very wide range of masses and sample that mass distribution very carefully, we can train the network to learn the correction to the analytical theory as a function of scale.

One of the challenges in this problem is that when we look at the large-scale distribution of gas (this shows in blue the PDF of the gas pressure), the typical pixel, the typical voxel, in the simulation has very low pressure, and only a tiny fraction of the volume is in the very high pressure regions. The observations measure the integrated pressure, so they care primarily about the highest-pressure regions, yet our simulations are not doing a very good job of sampling those most extreme regions. So we run a series of zoom simulations, multi-scale high-resolution simulations, where we sample the tail much better. What we actually use for our training is the dashed blue distribution, which shows that much of our simulation time and volume in the training set goes to the extreme cases, which are the ones we care about observationally. So you train not with equal volume but with unequal importance.

And this shows our ability to recover the gas properties. What we've done is train the network on these small CAMELS boxes, lots of small boxes, and we then apply it, for testing, to a much larger simulation, the big IllustrisTNG simulation that's now available from the Illustris team. Black shows the power spectrum predicted from their simulation; orange shows how well we do when we train the network to go from dark matter to the prediction; and blue represents the analytical theory, which gets close but doesn't quite capture what goes on. The plot on the right shows the one-point PDF, and shows that we're capturing a lot of the basic properties. I'm running a little late, so let me move quickly: we've also looked at how we go from dark matter to galaxies and can match the large-scale distribution of galaxies, so we're working on a program of making predictions for the large-scale galaxy distribution for surveys like the Euclid survey.

The next piece of the problem that we've begun working on, and this is some work that Chirag Modi has been pioneering, is the inverse problem: assume we know we have the right forward model; can we then invert it and recover the initial conditions?
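As a minimal sketch of the gradient-based baseline that the recurrent inference machine is compared against below, here is a toy Adam-optimized reconstruction; the "forward model" is just a fixed smoothing operator standing in for the real mapping from initial conditions to observations, so everything here is illustrative only.

```python
import torch

def toy_forward(delta_ini, kernel):
    """Toy differentiable forward model: smooth the initial field.
    A stand-in for the real mapping from initial conditions to the observed field."""
    field = delta_ini[None, None]                        # add batch and channel dims
    return torch.nn.functional.conv3d(field, kernel, padding=2)[0, 0]

n = 32
kernel = torch.ones(1, 1, 5, 5, 5) / 5**3                # crude smoothing kernel
truth = torch.randn(n, n, n)                             # "true" initial conditions
observed = toy_forward(truth, kernel) + 0.05 * torch.randn(n, n, n)

# Adam-optimized reconstruction: adjust the initial field until the forward model
# matches the observation, with a weak Gaussian prior on the initial amplitudes.
delta = torch.zeros(n, n, n, requires_grad=True)
optimizer = torch.optim.Adam([delta], lr=0.1)
for step in range(500):
    optimizer.zero_grad()
    loss = ((toy_forward(delta, kernel) - observed) ** 2).mean() + 1e-3 * (delta ** 2).mean()
    loss.backward()
    optimizer.step()
```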
What's shown on the left are his efforts to take the sort of recurrent inference machine approach, make a cosmological version of it, and compare its ability to recover initial conditions. Right now this is working well, but on a very small box; a 64-cubed box is what we can fit on the GPU. This compares the truth to the reconstructed model. We've also looked at doing the reconstruction not just on the dark matter field but also assuming it is being sampled as a galaxy distribution. The right-hand side shows the ability of the RIM to do this inverse problem: it's outperforming a kind of optimal, Adam-optimized approach and pushing out to reasonably small wave numbers. Chirag is still working to understand what sets the limitation of this reconstruction and whether one can push it to smaller scales. There's another challenge for this problem: it is still being done in very small boxes compared to the scales we would need in order to apply this to cosmological surveys, so we have a number of thoughts on how to generalize it.

So let me give you a summary of what I've tried to convey. A question we're trying to answer is: can we extract more information about the fundamental properties of the universe from cosmological observations than we're doing currently? I think the answer is an emphatic yes. In fact, with these new surveys coming, I think we're almost obligated, having asked the taxpayers to pay billions of dollars to build telescopes and satellites and the like, to extract all the information that's there, and it's clear that there's information we've been missing that's in these other summary statistics. I think applying these summary statistics to the data is going to require that we develop techniques for extracting all the information, and that requires very fast ways of simulating the universe and then marginalizing over the astrophysical uncertainties; it's a more sophisticated way of doing our statistics. Machine learning is going to be a key piece of this effort. We're very much approaching this as users of machine learning, not so much as developers of techniques, but as is often true when you're really pushing something to solve a demanding problem, you will hopefully in the end also do things that lead to insights into the underlying techniques and approaches. So let me stop there. Thank you for listening, and I'll take questions.

Excellent, thank you for this excellent talk, very interesting. Okay, we have time for a few questions. If you want to ask a question, please raise your hand, and I will have to actively unmute you so that you can speak.

I think I have the privilege of speaking without raising a hand. I typed a long question in the chat, but it's maybe easier to explain it. This very wonderful talk you gave displays a pattern that we've discussed in the seminar, where machine learning is a so-called surrogate for scientific simulation. The surrogate means it's much faster and allows one to explore regions of the parameter space not previously explored. I think the most profound example we heard about this was from Lawrence Livermore laboratory, which looked at fusion in an oval instead of a sphere; they had never been able to consider a different shape than a sphere (the sphere is round, that should be the best), and it wasn't, which was discovered by applying the surrogate first and then confirming it with
the traditional simulation. Now, I think you mentioned some great opportunities for this particular surrogate to make a transformational change in this research. Could you repeat that, or articulate it a little further? Because I think that's sort of the center point of the interaction between computation and astrophysics here.

Right, so let me actually generalize and talk about some things I didn't cover in terms of opportunities. The one I wanted to focus on today was, in a sense, the version you've described: what I want out of machine learning is the ability to do calculations faster than I could do before, and it's the fact that I can do these calculations at all. It's approximate, and in a way it's using machine learning as an efficient way to do interpolation at a high level. So that's a great example with very expensive numerical simulations, and that lets me address these questions, and our goal in doing this is, in this case, to get at fundamental properties of the universe: what's the mass of the neutrino? There are other pieces that we've been thinking about, which I think are really subjects for completely different talks. Can we use it to do multi-scale physics in ways we haven't done before? That's something that's actually hidden in this talk and that I have not addressed: can we take a problem like turbulence, where we can only simulate a certain dynamic range, and there's a lot of MHD turbulence and things like that going on when we get to the galaxy formation scale, and can the neural net actually learn better descriptions of multi-scale physics? Can it learn closure relationships that we haven't seen before? There's a group of people here at Flatiron who've been looking at both 2D and 3D turbulence to see if we can effectively learn new physics, and there's a kind of branch of that where one tries to learn new analytical expressions that better capture it. So I think there's another piece of the story, in terms of things we've been thinking about, that I just wanted to throw out there besides the story I told today, which is really using machine learning to do things that we in a sense already know how to do but can't do fast enough to get the full advantage of the data; that is what I focused on today.

Yeah, so I think you're hinting, with the multi-scale physics, that what you need to do with a student like machine learning is drag this student along to make great discoveries in science, and for us to learn how to train these students well, beyond the mechanical training on data, is I think a really great challenge.

Yeah, I think it's a great question.

That's not me, but next it is, yes. And you will come in person then, I hope?

You know, as soon as they let me into the country without quarantining, I'll visit.

Excellent, fantastic. Okay, thanks. Raymond, you have a question?

I've got my vaccination pass; it's just that people aren't accepting it yet.

Oh yes. So, are you okay? Yes, please go ahead.

Okay, good. Hi, yes, I really enjoyed that. It's interesting that the similarity of your program for cosmology to what various people in climate science want to do with machine learning is very striking: whereas your headache is baryonic matter, our headache is clouds.

Yes.

With the cloud problem, the clouds feed back extremely strongly on the equivalent of the dark matter, on the large-scale hydrodynamic
circulation. And so I was wondering, in terms of the structure of the problem, something that wasn't clear to me coming at this field from the outside: does the baryonic matter formation process feed back very strongly on the dark matter distribution, or is it more a matter of needing to go from the dark matter distribution to say what the baryonic matter is doing, embedded in all of that?

The importance of the feedback is very much scale dependent. On the scale of a galaxy, so thousands of parsecs, the feedback effects are very large: an order-unity change in the dark matter distribution. On the scale of a cluster, a million parsecs, it's significant; the typical gas motions are such that a gas particle will move of order a million parsecs, three million light years, in the age of the universe, so it will change things a lot on that scale. But if I go to the scale of five megaparsecs, all the motions are small compared to that, so what the feedback does is change the quadrupole moment of the distribution of matter, and that's what affects things on large scales. So as you get to larger and larger scales, it becomes a perturbative correction. In a sense, one way of viewing this marginalization-over-uncertainty problem is to say: I know all it's doing on this larger scale is changing the quadrupole a bit; I can learn that it changes within some range of values, and that's what I want to marginalize over in evaluating my uncertainty. Basically, because of mass conservation and momentum conservation, when I look at what happens in one region, the change in the gravitational field from over here is at most a quadrupole; the monopole and dipole terms are zero because mass and momentum are conserved. So that actually makes things converge on the sort of five to ten megaparsec scale.

Okay. Actually, what you said reminds me of a follow-up question. In a lot of the things people are trying to do with clouds, they've had the problem that the standard machine learning algorithms don't make it easy to impose physical priors, like conservation of the mass of water, conservation of angular momentum, things like that. So in your application of machine learning, are you actually able to impose certain physical priors, or does the machine learning algorithm have to learn those things that you already know are true?

You know, we're doing a bunch of problems, so there are a bunch of different things we've found valuable. We've often used equivariant networks that express in the network structure the underlying symmetry of the physics, so that's been one approach. Another, in these cosmological problems, as I mentioned, is that what's been powerful to use is the fact that we often have approximate analytical descriptions that get you much of the way there, and using that step has been helpful.

Great, yeah, thanks. It's always this question in numerical systems, right: do you impose your conservation laws, or use them as a check on your final result?

Yeah. And one of the things we've been talking about is that an important thing we know about the universe is that it's homogeneous and isotropic. When we get to the end of the day and project this to compare with the observations, the observations make things inhomogeneous: certain parts of the sky you have more data on, because it's been less cloudy on Earth, so you
were able to get more data when your telescope pointed in that direction. And because we measure things not in three physical dimensions but affected by galaxy motions, the observations are anisotropic. Now, we understand in our forward model how to capture that. When we go back to recover initial conditions, the approach we've been thinking of, and this is really very much work in progress, is to use our knowledge that the universe ought to look isotropic and homogeneous. If my reconstructed initial conditions all point towards me: there's something called the finger-of-god effect, which was identified by Fritz Zwicky, the original discoverer of dark matter, back in the thirties, and Margaret Geller, when I was a graduate student, used to run around and say, look at all those fingers of god pointing at you, telling you that you were wrong. I think at the end of the day, when we do this reconstruction, if we find fingers of god pointing at us in our initial conditions, that's telling us we're wrong. At the moment I prefer to use some of the symmetries that we know as a test of the effectiveness of our approximations and models, rather than imposing them through the mapping.

Okay, fantastic. We have one more question, which is by Martin. Please go ahead.

Hi, thank you for the talk. I appreciate the progression from the fairly general to the very specific questions. I was a bit curious: you've said many times that you're trying to use your very impressive suite of simulations to recover the physics, in particular the baryonic physics. If I understood well, the way you do this, of course, is to essentially map the observations onto specific simulations where that physics was implemented in a specific way. If I understood that right, then aren't you worried that in those simulations the physics was parametrized very crudely, I mean often with really barely any physics at all, often in fact with entirely empirical so-called laws like the Schmidt-Kennicutt law, for example? So are you worried that what you'll recover is the best recipe, but that you won't learn any physics from it, because there wasn't much physics to start with in those recipes?

Yes, but my hope is this. Let me talk in particular about things like the Schmidt-Kennicutt law as a parameterization. This is a parameterization that relates the surface density of gas in the disk to the star formation rate, which then relates to how much energy is put back in feedback. I think it's going to be very difficult to extract cosmological information on scales smaller than a megaparsec, a million parsecs, and all the details of star formation matter a lot on those small scales. But the way I like to think about it is marginalization: if I'm interested in what's happening on the five-megaparsec scale, and I can vary my astrophysics over a plausible region of parameters (it is a parameterization, but if it does capture the range of responses), then that may turn out to be enough, particularly if I can calibrate this with some of the observations. The observations for this particular problem that I've been quite interested in are our measurements of the distribution of electrons around galaxy halos. One of the things we've been doing, and we have a paper led by Stefania Amodeo on this from a couple of months ago, is part of our observational
program: we've been cross-correlating the large-scale distribution of galaxies with observations of the microwave background, measuring what's called the kinematic Sunyaev-Zel'dovich effect, for those of you who are cosmologists. It basically means we can measure the electron-galaxy cross-correlation. Remember, one of the key uncertainties is how the winds drive out the electrons, how the winds blow out electrons and protons; different parameterizations of the underlying physics of the winds will give us different distributions. What we care about is the overall distribution of matter, and because what changes the matter distribution is when you blow out the winds, which changes how matter is distributed, if we can calibrate our models so that they basically say, we put in our likelihood, roughly, that we only want models that reproduce the right electron-galaxy distribution, that will restrict us, if I think about the various parameterizations, to a sub-manifold of parameters (things like supernova energy injection, star formation efficiencies, the Kennicutt-Schmidt laws, and so on) that are consistent with the data. And the approach we've taken in the CAMELS simulations is to actually explore a range of astrophysical parameters that is much broader than people usually use in these simulations: we put in models that have more wind energy than we know is there, and much less wind energy than we know is there, so at least we can understand some of the dependencies. This lets us handle the known unknowns; what always haunts us is the unknown unknowns.

Thank you. Wonderful, thanks indeed, thanks a lot, David, for this great talk, and thanks a lot to the people who have asked questions. That's it for today. Talk to you again in Trinity term. See you then, thanks.