Okay. I had titled this third lecture "Observational Prospects," and I was going to talk a fair amount about aspects of survey design and about upcoming as well as recent surveys. Because I'm running fairly far behind on material from yesterday, I've decided to cut back a fair amount of what would have gone in today and mostly cover the baryon acoustic oscillation material that I planned for today's lecture. I'll begin by returning to what I was talking about yesterday on modeling galaxy clustering, and then, after the discussion of BAO, if there's time I'll come back and say some more about the Lyman-alpha forest. The discussion of BAO will bring up some of the general points about survey design. This little film clip I'm showing here is from the Sloan Digital Sky Survey, which has produced many of the major galaxy clustering data sets that we're currently using. I started working on the Sloan survey in 1992 when I was a postdoc; at that point we were doing design and then moving toward construction. The survey had its first-light data in 1998 and began full operations around 2000, which is when we really started being able to do science with the data. It's been through several phases since then. In 2005 we started a second phase called SDSS-II, for which I was the project spokesperson. Then from 2008 to 2014 there was a phase called SDSS-III, of which the survey this group is most likely to have heard about is BOSS, the Baryon Oscillation Spectroscopic Survey; I'll be showing you results from that later today. I was the project scientist for SDSS-III, and now we're on to SDSS-IV, which I'm still involved with but not quite so heavily. This is just a time lapse of one night at the SDSS telescope.
The SDSS did imaging of the sky through a giant mosaic camera, shown here along with Jim Gunn, who was my thesis advisor and who, more than any other single person out of the hundreds who were involved, really made the SDSS happen; Jim is the one without whom the whole thing couldn't have happened. But now the SDSS is mostly doing spectroscopy, measuring the redshifts of galaxies and the spectra of quasars and stars through fibers that are plugged into plates. There are cartridges that get mounted on the back of the telescope, and light falls onto the plate, goes down the holes and through the fibers to the spectrographs, and we measure the redshifts. So what you see are these periods when the telescope is pointing at something, collecting light and measuring all those redshifts, and then a period when someone has to go out, take one cartridge off the telescope, put another one on, and then the telescope gets repointed somewhere else; there are some calibration observations, and then it points that way for a long time. In BOSS we have a thousand fibers in the spectrograph, so we can measure a thousand galaxy redshifts at a time, and on the best nights we have nine of those cartridges, so we can get as many as 9,000 galaxy redshifts in a night. If you're a theorist who works with galaxy clustering data, like me, then mostly you start at the point where these things have become catalogs with lots of positions and redshifts and maybe colors and magnitudes, which is very convenient. But it's often important to remember that all of these things actually start, first of all, with people building telescopes and cameras and spectrographs, and then with people going out every night, under variable conditions and clouds and other complications, and collecting the data that end up producing this.
One other brief advertisement: one of the things I've done over the last ten years that's been pretty fun is collaborating with a sculptor named Josiah McElheny, who does a wide variety of things, but the ones I've been involved with are cosmologically inspired sculptures. We've worked on five different projects; this is an image of one of them, which has the COBE map of the cosmic microwave background represented on that central sphere, and we've done pieces about the expansion of the universe, multiverse inflation, and so forth. I had too much to cover in four and a half hours anyway, but for any of you who would like to see more pictures of this and hear more about it, I'll show up here around two, the last half hour of the lunch break, with various pictures from these projects; if you're interested enough to give up the last half hour of your lunch break, show up at two and I'm happy to tell you about it. It's fun, but it's somewhat off topic, so that's for two o'clock today. So let me pick up with modeling galaxy clustering. I talked about four different approaches yesterday in fairly rapid succession. First, hydro simulations, where we try to actually put the physics into our simulations to track where the galaxies will form. Second, semi-analytic population of halos and subhalos from N-body simulations: taking the history of each halo or subhalo in the N-body simulation, tracing it back, and pasting a semi-analytic galaxy formation model on top of that to predict the galaxy properties. Third, abundance matching, which is basically doing the same thing, populating the halos and subhalos from an N-body simulation, but instead of putting in a full model for the gas cooling, star formation, and feedback, the assumption in abundance matching is simply that more massive halos or subhalos contain more luminous galaxies.
You can extend this to age matching, where you say that at a given luminosity the halos that formed earlier host the redder galaxies. So it's replacing the theoretical model of galaxy formation in the semi-analytic approach with simply enforcing a match to the observed galaxy luminosity function or the observed color distribution. It's called abundance matching because the way you decide what luminosity to assign to a given halo is to match the abundance of galaxies above some luminosity threshold to the space density of halos above the corresponding mass threshold; this matching of the space densities of galaxies and halos gives you the rule by which you assign things. And fourth, halo occupation distribution (HOD) modeling, which populates the halos of N-body simulations, or sometimes uses analytic approximations to calculate galaxy clustering in halos in place of actually using a simulation; HOD modeling generally doesn't make use of subhalos. In terms of philosophy, I would say that each of the first two techniques imposes a complete, though not necessarily correct, theory of galaxy formation: by some means, numerical or semi-analytic, you're trying to calculate gas cooling, star formation, feedback, and so on. Abundance matching imposes a strong prior about galaxy formation, namely that more massive halos host more luminous or higher stellar mass galaxies; one can adjust that in some ways, introducing scatter or other things, but basically this says that instead of following all the physics we impose this particular assumption to populate the halos, and again it gives you a fully predicted galaxy population. The difference in philosophy with HOD modeling is to try to have a flexible representation of galaxy formation that is fit to data.
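Since abundance matching is at heart a rank-ordering rule, it can be sketched in a few lines. This is a toy illustration, not anyone's production pipeline: the halo masses and luminosities below are random mock arrays, and real implementations match the cumulative number densities n(>L) = n(>M) from a luminosity function and an N-body catalog, often adding scatter.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy inputs (illustrative only): halo/subhalo masses and a mock
# set of galaxy luminosities, one galaxy per halo.
halo_mass = rng.lognormal(mean=np.log(1e12), sigma=1.0, size=10_000)
luminosity = rng.lognormal(mean=np.log(1e10), sigma=0.8, size=10_000)

def abundance_match(halo_mass, luminosity):
    """Assign luminosities to halos by rank order: the abundance of
    galaxies above L is matched to the abundance of halos above M,
    so the most massive halo hosts the most luminous galaxy, etc."""
    order = np.argsort(halo_mass)[::-1]      # halo indices, most massive first
    lum_sorted = np.sort(luminosity)[::-1]   # luminosities, brightest first
    assigned = np.empty_like(lum_sorted)
    assigned[order] = lum_sorted             # i-th ranked halo gets i-th ranked L
    return assigned

L_assigned = abundance_match(halo_mass, luminosity)

# The assignment is monotonic: a more massive halo never hosts
# a less luminous galaxy than a less massive one.
idx = np.argsort(halo_mass)
assert np.all(np.diff(L_assigned[idx]) >= 0)
```

Age matching would then break ties in this assignment at fixed luminosity using halo formation time, assigning redder colors to older halos.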
When you compare the results of the first two approaches to the observed galaxy correlation function, for instance, you test the combination of your cosmological model and your galaxy formation theory. When you compare abundance matching to observations, you test the combination of the cosmology and this assumption about how galaxy formation works, and empirically it works surprisingly well. This assumption also turns out to be a pretty good description of what comes out of either of the first two approaches. With HOD modeling, if you fit for a fixed cosmology, then what you get out is information about the relation between galaxies and dark matter halos: what mass halo hosts galaxies of different luminosities, of different colors, and so forth. That's something you can use as empirical information about galaxy formation physics, or you can use it to test the predictions of your other galaxy formation theories. And from the point of view of cosmology, you say: I'll try different cosmological models, I'll always give each cosmological model the best chance it can have to fit the observations, and if it can't fit them, then I go to a different cosmology. So the HOD is a fully non-linear description of galaxy bias. In its standard form, the way I and others have most often used it, it assumes that P(N|M), the probability of finding N galaxies in a halo of mass M, is independent of the halo's large-scale environment. If I put galaxies preferentially in the halos that are in denser regions or in underdense regions, that will change the clustering for a fixed HOD, for a fixed P(N|M), so I would need to take that environmental dependence into account in order to actually reproduce everything about galaxy clustering. So the completeness of the HOD description is tied to this assumption, or you can generalize it to build in some sort of environment dependence.
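In the standard form, P(N|M) is usually split into a central term and a satellite term. The sketch below, with made-up parameter values, draws halo occupations from mass alone; the functional forms are a common parametrization (smoothed step for centrals, power law for satellites), but the numbers are illustrative and not fit to any survey.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameter values (not fit to any survey):
LOG_MMIN, SIGMA_LOGM = 12.0, 0.3   # central-occupation threshold and softening
M1, ALPHA = 10**13.2, 1.0          # satellite normalization and slope

def mean_ncen(m):
    """<N_cen>(M): a smoothed step in log M."""
    return 0.5 * (1.0 + math.erf((math.log10(m) - LOG_MMIN) / SIGMA_LOGM))

def mean_nsat(m):
    """<N_sat>(M): a power law, only in halos that host a central."""
    return mean_ncen(m) * (m / M1) ** ALPHA

def sample_galaxies(m):
    """Draw N from P(N|M): Bernoulli central plus Poisson satellites.
    Note the draw depends on halo mass only -- the standard-form
    assumption that P(N|M) is independent of environment."""
    ncen = rng.random() < mean_ncen(m)
    nsat = rng.poisson(mean_nsat(m)) if ncen else 0
    return int(ncen) + nsat
```

With these numbers a 10^14 solar-mass halo hosts a central plus about six satellites on average, while a 10^11 solar-mass halo is almost always empty.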
The initial motivation for this assumption came partly out of the extended Press-Schechter formalism for calculating the formation of halos, which, at least in its simple form, predicted that the assembly history of halos depended on their mass but not on their larger-scale environment. There was also some support for that from numerical simulations studying the formation history of halos, which showed that there was some correlation with the large-scale environment but that it was relatively mild; I think this was Sheth and Tormen, is that right? So that, at least, was what we were appealing to in thinking that this was a sensible approach to galaxy bias. However, as people did larger simulations to get better statistics, it became clear that there is some level of halo assembly bias. In particular, if we think of our halo mass function looking like this, with the characteristic halo mass I was referring to as M-star here, then for halos well below M-star, low-mass halos, the ones that formed earlier are more strongly clustered, and they tend to avoid the lowest-density regions; the effect is strongest for the very oldest halos, which are preferentially in dense regions. Once you get to garden-variety halos the effect is weaker, and it's also weaker as you get up around M-star, but there is some correlation between halo formation history and halo environment. So then the question for galaxies becomes: how tightly correlated are the galaxy properties with the formation history of their parent halo?
If the galaxy properties are basically stochastically related to the formation history, then even if the halos have assembly bias the galaxies won't. But if the older halos also form the redder galaxies, then you expect the galaxies to inherit assembly bias from their parent halos, and in that case it's something you would need to include in an HOD description if you're going to capture that effect. These two slides show results, on the left, from a hydro simulation; this is a paper that was submitted but not posted to astro-ph, and the first author has since moved into industry, so it remains to be seen whether we'll ever get it out into the world, but it was in Kushal Mehta's PhD thesis. Here we took results from hydro (SPH) simulations and identified galaxy populations above different thresholds. There are two different mass thresholds, shown by the heavy and light lines, and then we divided the halos into the ones in the most overdense environments, the ones in the most underdense environments, and all the ones in the middle. So there are curves for the top 20% and the bottom 20% of halos ranked by their environment on scales of a few megaparsecs, and what you can see is that, for each mass threshold, these curves lie on top of each other to within the noise of the simulations. So in this simulation, even though there may be halo assembly bias, it's not affecting the galaxies: the HOD is predicted to be independent of the large-scale environment.
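The environment test described here, splitting halos into the densest 20 percent, the least dense 20 percent, and the middle, is straightforward to set up on a catalog. In this sketch the mock occupations are drawn from mass alone, so the split curves agree by construction; in a catalog with galaxy assembly bias, the dense and underdense curves would separate systematically. All arrays and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Mock catalog (illustrative): halo masses, a large-scale environment
# density proxy for each halo, and a galaxy count drawn from mass alone.
n_halo = 50_000
log_m = rng.uniform(11.5, 14.5, n_halo)
env = rng.normal(size=n_halo)                 # few-Mpc overdensity proxy
n_gal = rng.poisson(10 ** (log_m - 12.5))     # occupation depends on mass only

def mean_occupation(mask, bins):
    """<N_gal> in bins of log M for the selected subset of halos."""
    counts, _ = np.histogram(log_m[mask], bins=bins)
    totals, _ = np.histogram(log_m[mask], bins=bins, weights=n_gal[mask])
    return totals / np.maximum(counts, 1)

bins = np.linspace(11.5, 14.5, 13)
lo, hi = np.quantile(env, [0.2, 0.8])
dense = mean_occupation(env > hi, bins)   # top 20% of environments
void = mean_occupation(env < lo, bins)    # bottom 20% of environments
# If P(N|M) is environment-independent, these agree within shot noise;
# a systematic offset between them is the signature of assembly bias.
```

In a real analysis one would rank halos by the smoothed matter density on a few-megaparsec scale around each halo, exactly as in the simulation comparison on the slide.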
However, semi-analytic models do predict some degree of assembly bias for the galaxies, and there have been papers on this. There's been very nice recent work by Andrew Hearin and Doug Watson and their collaborators using abundance matching and age matching to assign galaxy colors, and they find that there is galaxy assembly bias in the resulting catalogs. In particular, when they assign luminosities to galaxies they use the circular velocity of the halo rather than its mass. Halos that form earlier tend to be more concentrated, because the central regions of the halo formed at earlier times when the density of the universe was higher, so at a given mass it's the older halos that have the higher circular velocity. If you're using that as your way of assigning galaxy luminosity, then the galaxies tend to inherit some of this information about the formation history of their halos. They've had various papers about that; this plot puts their result in the same form, showing the average occupation for galaxies above some luminosity threshold, with the same construction as over here: all the halos in the middle 60 percent, the densest 20 percent, and the least dense 20 percent. These differences are fairly subtle, but the fact that a halo is more likely to host a central galaxy if it's in an overdense region than in an underdense region is enough to have a noticeable impact on the clustering. So this galaxy assembly bias, which I like to think of as environment dependence of the HOD, is a potentially important complication in HOD modeling, particularly for the program where we want to marginalize over all HOD parameters to infer our cosmology; the question is whether we need to include environmental parameters as part of that. This is one of the frontiers of the field: figuring out how important this is and how to account for it. I will talk about one particular application which is of
interest in itself and is also an illustration of this approach, which has to do with galaxy-galaxy lensing. Usually when we think of a weak lensing survey, like the CFHTLenS survey, or what Euclid is going to do, or what the Dark Energy Survey is doing, we typically think about cosmic shear. You take images of a large area of sky, you measure the shapes of lots of distant galaxies, and you ask whether, on average, nearby source galaxies are aligned with each other because of some intervening matter that's shearing all of them in the same direction and therefore producing correlations in their ellipticities. So in cosmic shear you are just looking at the sources being lensed by intervening dark matter, and you're trying to measure the degree to which galaxy shapes are correlated with each other as a statistical measure of the clustering of that intervening dark matter. In galaxy-galaxy lensing, the idea is that you pick some sample of foreground objects that you think of as your lenses, and then you look at the background galaxies around them, things at higher redshift but in the neighborhood of each lens; on average those should be stretched perpendicular to the line connecting the source galaxy and the foreground lens. This is the kind of thing we see in galaxy clusters, where giant arcs are stretched tangentially with respect to the direction to the center of the cluster. When the lenses are individual galaxies, the induced stretches are only changes of half a percent or one percent in the shapes of galaxies that already have strong intrinsic ellipticities, so you can't measure this for any individual source. But statistically, averaging over many pairs of source galaxies and lens galaxies, you can detect the signal of that average tangential shear, the stretching of galaxies perpendicular to the line of sight to the
galaxy. And so, yes, it's simply a matter of whether you've got enough lens-source pairs to detect it; I don't have a great plot of that here, but you need a survey that's deep enough that you've got a population of sources behind the things you're taking as your population of lenses. For instance, in the SDSS most of what's been done is to take the galaxies in the spectroscopic survey as the lens population, so these are at redshifts of say 0.1 or 0.2 (people are now doing it with BOSS, out at redshifts of 0.5 or 0.6), and then you take the galaxies in the imaging survey, perhaps using photometric redshifts to divide them into shells; on average the galaxies in the imaging survey are more distant, and you look for the average distortion of the background galaxies relative to the foreground ones. The first detections of the signal were in the late 1990s, but this was one of the big early results from the Sloan survey, because once you had a couple of hundred square degrees of good imaging you could actually measure the signal out to many megaparsecs. What this gives you, what analysis of galaxy-galaxy lensing (GGL) measures, is the product of Omega_m times xi_gm, where xi_gm is the galaxy-matter cross-correlation. If we're looking at, say, 100 kiloparsecs around the galaxy, you can basically think of this as measuring the halo mass profile: as the dark matter density drops off, the lensing signal gets weaker, and you can map the average halo mass profiles of your sample of lenses. But more generally, even 20 megaparsecs away, the matter around galaxies will be overdense compared to random places, and so that still
shows up as a signal, and at this point the measurements do extend to tens of megaparsecs. You're measuring xi_gm because you're sensitive to the excess matter density correlated with galaxies, but it's proportional to Omega_m because, again, if the universe were nearly empty there wouldn't be enough mass to produce any lensing; the more matter there is in the universe, the stronger the lensing signal for a given correlation. There are some complications in how you go from what you actually measure to this, because you've got projections and so forth, but this is really the physical quantity you can extract from galaxy-galaxy lensing measurements. This is interesting because we can also measure xi_gg, the correlation function of the galaxies with each other. In itself, this allows you to say things about the halo profiles of different classes of galaxies; there's a wonderful paper by Rachel Mandelbaum and collaborators in 2006 looking at how the halo masses and halo profiles of galaxies in the Sloan survey depend on the luminosity of the galaxies, their color, and the large-scale environment. But from the point of view of cosmology, what's interesting is combining galaxy-galaxy lensing and galaxy clustering, because in linear theory plus linear bias (by linear bias I mean that the density contrast of galaxies is just some bias factor b_g times the density contrast of matter), we expect that the galaxy correlation function xi_gg is just b_g squared times the correlation function of the matter, xi_mm, and that xi_gm, the galaxy-matter cross-correlation, is just one bias factor times xi_mm, because xi_gm is the expectation value of delta_g delta_m, so I pick up just one factor of b_g. And my weak lensing observable is really that multiplied by Omega_m. So let's write
this down: at large scales this is what we expect to hold, since linear bias should be a good approximation on sufficiently large scales. We can measure xi_gg just from the galaxies and Omega_m xi_gm from galaxy-galaxy lensing, so if we divide the observable Omega_m xi_gm by the square root of xi_gg, then the galaxy bias factor, the one number into which all the complicated physics of galaxy formation has been packed, cancels out, and the ratio should equal Omega_m times xi_mm to the one-half. I've dropped my r's here, but you can do this as a function of radius. The amplitude of the matter correlation function scales with sigma_8, so this combination is going to be proportional to Omega_m times the overall amplitude of mass fluctuations. So this is one route to constraining the amplitude of mass fluctuations, using the same kind of survey you might use for cosmic shear, which goes after something like the same combination of parameters, but using different information. There are a number of ways in which galaxy-galaxy lensing is a little bit easier to measure than cosmic shear, and there's a similar amount of information, since you're using the same measurements of the galaxy shapes; so in principle you should be able to get answers from this that are comparably precise to the answers you get from a cosmic shear analysis of the same survey. People have done this, and there have been a few papers that carried out versions of this program on large scales, but they've taken a fair amount of care to use only information beyond five megaparsecs or so, because they're worried about these approximations breaking down on smaller scales, and that's a limiting factor in the precision of the measurements. But in that figure I showed you on
the first day, with all the red points not lining up with the black points, one of the more convincing of the red points giving a lower amplitude of fluctuations is actually from Mandelbaum et al. 2013, where they're using galaxy-galaxy lensing plus galaxy clustering measured for SDSS galaxies. HOD analysis is one potential way of pushing this down to smaller scales, because really what we need is to be able to describe these two functions into the nonlinear regime; there we can't rely on having a single b_g, but we might have all of our HOD parameters to worry about. Yes? ... So I wouldn't even attempt to disentangle them; I think they are in some sense the same thing. When you're measuring cosmic shear you are measuring the fluctuations produced by the dark matter distribution that those galaxies are also tracing, but in terms of the actual observation, in the one case you're taking your background source galaxies and asking whether their orientations are correlated with each other, while in the galaxy-galaxy lensing case you're asking whether those background galaxies are tangentially stretched relative to the foreground galaxies. If you do these measurements from the same survey there should be some covariance of the measurement errors, because you're measuring the same structure, but it's actually pretty weak, basically because weak lensing measurements are always extremely noisy: for any one individual background galaxy, the signal-to-noise of the measurement is much, much lower than one. So one should think about the level of correlation of the answers you get from these two different analyses of the same survey region, but I think it's fairly weak, simply because weak lensing measurements are noisy. Here's one way of looking at this; it's just a theory paper, but it shows what happens if you consider a succession of
models with stronger matter clustering, going from sigma_8 of 0.6 up to 1, changing the HOD as you go: at higher sigma_8 there are more massive clusters, so you have to start putting fewer galaxies in halos of a given mass, or else you end up with too many galaxies and too much clustering. But if you do that, you can get the galaxy-galaxy correlation to be almost the same; the galaxy clustering matches up when you make these changes to the HOD across the different cosmological models. Yet the predicted galaxy-matter correlation, which is the direct thing you measure with weak lensing, goes steadily up as you go to stronger matter clustering, because the galaxies are residing in more massive halos, and you pick that up with the weak lensing measurements. So that's the principle: you use the HOD to fit the galaxy clustering, you predict the galaxy-matter correlation, and that becomes a test of the cosmological model. One of the questions is how much the possibility of assembly bias screws this up, because maybe this model is insufficiently complete, and one of my students is looking at this problem; we've got an encouraging result so far. What you really care about, from the point of view of this technique, is what's usually called the cross-correlation coefficient r_gm, which is xi_gm divided by the square root of xi_gg times xi_mm. This tells you how well the galaxy and matter fields are correlated with each other, and it's expected to go to one at large scales. Really, I should have had r_gm in that earlier equation, because if the galaxy and matter fields were uncorrelated with each other, I wouldn't get any xi_gm even if there was matter clustering; I left it out because the expectation is that it goes to one on large scales. So what we really need out of our galaxy clustering theory, for this particular approach, is to be able to
predict r_gm. So what we've done is taken those abundance matching catalogs, which have some assembly bias, as we saw earlier, and asked: suppose we fit them blindly with an HOD, ignoring the assembly bias, and then calculate this quantity. At large scales, say from one megaparsec out to ten megaparsecs, r_gm is going to be one, and at smaller scales it does some structure; it might dip below and come up above by 10 or 20 percent. It turns out that, at least down to scales of a megaparsec, what you end up predicting for r_gm, which is what you really care about for the purposes of this analysis, comes out right if you've chosen an HOD model that fits the observed galaxy clustering. Basically, assembly bias has an impact on xi_gm and an impact on xi_gg that are proportional to each other, and they cancel out, in much the same way that the bias factor cancels out in linear theory. So this is an example of where the field is now: we're worrying about effects beyond those in some of the earlier models, effects which may be important if we're trying to do precision cosmology. In some cases those effects will compromise the results and in others they won't, and you have to investigate each one and figure out what you need to do in order to extract cosmological information, generally not from galaxy clustering on its own but from galaxy clustering in combination with something like galaxy-galaxy lensing, redshift-space distortions, masses of clusters, and so forth. Now, this is a recent paper, recent as in last week, by Ying Zu and Rachel Mandelbaum, where they've taken measurements of projected galaxy correlation functions and galaxy-matter correlation functions, measured from weak lensing, for different galaxy mass samples in the SDSS. The top panel is the galaxy clustering;
bottom panel is the galaxy-galaxy lensing (excess surface density) measurement, and they're fitting them with HOD models, souped-up HOD models that Ying Zu has figured out how to use to make very good use of these samples. The focus of the paper is what you learn about galaxy formation and halos and so forth, but one thing I think is interesting is that they get a very good joint fit to the data for Omega_m of 0.26 and sigma_8 of 0.77, which is significantly lower than the Planck values of about 0.31 and 0.82. This is not independent of that previous red point I showed you, but given that our modeling so far suggests things should remain robust down to roughly megaparsec scales, I think it's an interesting direction for them to go next: to ask how far you can push the cosmology, and in particular the amplitude of matter fluctuations, before you can no longer jointly fit these data. Okay, I'm going to switch over now toward a little bit about surveys in general, but mostly about baryon acoustic oscillations. Any questions on what I've said so far? ... I think that right now we have two-sigma discrepancies from several, but not all, measurements of these quantities, where "several" means clusters, cosmic shear, galaxy-galaxy lensing, and redshift-space distortions, and the rest are more like one sigma. I would believe there's a problem when we've got three-sigma discrepancies from three different analyses, preferably using three completely different techniques. And on the weak lensing side, and I guess that in some sense gets us to here, when we think about the imaging surveys being used for weak lensing, currently the best weak lensing results are from the CFHTLenS survey, which covered about 150 square degrees.
That's 0.5 percent of the sky. If you don't know it, a number you should know is that there are about 40,000 square degrees on the sky; so if someone surveys 10,000 square degrees, they've surveyed a quarter of the sky. The Dark Energy Survey, which is ongoing now, is going to reach basically the same depth and image quality as CFHTLenS, but over a survey area that's about 30 times larger, about 5,000 square degrees. With a thirty-fold improvement in data volume, either these discrepancies will become quite convincing or the measurements will move closer together. In the somewhat longer term we'll get much larger galaxy redshift surveys, and those will enable redshift-space distortion measurements with small enough errors that you can see whether they give the same discrepancy. I should say I think it is most likely that the current discrepancies will go away, just because most interesting results go away; but not all of them do — the supernova results did not go away. And I do think that on a timescale of about three years, if the current central values of these measurements are correct, then we'll have much more convincing measurements of them. So, the current state of play: a lot of the cosmological data sets come from the Sloan survey, which has done imaging over about a quarter of the sky. If you've heard "Stripe 82" and not known what it means: the Sloan survey was divided into stripes that were scanned across the sky, and there was one particular stripe in the southern galactic cap that was scanned over and over again, to look for time-variable phenomena, to look for supernovae, but also to build up deeper imaging. That's a couple of hundred square degrees, so it's part of the Sloan survey, but one that was a deeper but
smaller area um and this is where most of the a lot of things have come from and now pan stars has completed observations over about three quarters of the sky and there's analyses of those going on but not yet we cleansing measurements from there and spectroscopic sdss originally did a sample of about a million galaxies of all types and a hundred thousand galaxies that were selected to be luminous objects probing a large volume and and then in sdss three uh we wanted to focus specifically on baryon acoustic oscillations and probe a very large volume so there we went after luminous galaxies uh and observed about one and a half million of those extending to higher red shifts uh out to about point seven and also measured the spectra of 160 000 distant quasars to measure the liman alpha forest absorption in those quasars so uh i'm going to say some general things about bao analysis first focusing particularly on galaxies and then i'll give you uh at least a lightning version of of the liman alpha forest and how you use that for cosmology um and then i'll stop okay so uh this is one view of the Sloan imaging survey uh so uh so about 8 000 square degrees in the northern galactic cap uh 2 500 square degrees in the southern galactic cap uh so this map is color coded by the uh by the density of galaxies in different pixels uh and you can see just uh there's some zoom ins uh that show you kind of the level of of detail you get down to so that's you know a star forming region in in a galaxy uh in there and um so when we wrote a press release on on this uh dataset uh bob nickel came up with the uh the memorable uh analogy that you could display this map at full resolution using half a million high definition televisions okay so there's uh there's a lot of a lot of detail in these imaging maps and then it's from that imaging that we selected galaxies put fibers you know drilled holes into uh into plates and plugged those holes with fibers and measured spectra and measured red 
shifts so you know this is a pretty picture of uh large scale structure in a slice uh where each point represents the galaxy and the color uh represents the the color of the galaxy so the red points are made of older stars uh and you can see large scale filaments and voids you can see that uh older galaxies like to live in denser regions uh and so forth um but actually a lot of the cosmological information uh and particularly the BAO measurements from uh from the original Sloan survey came not from that densely sampled galaxy distribution which you see here which in this picture is shown as as the white points but from this luminous red galaxy sample so things that were selected based on color to be objects uh that were at higher red shifts uh and uh and so they had to be luminous in order to uh to have the uh to have their observed apparent brightness and these objects are relatively easy to get red shifts of uh and so the luminous red galaxy sample did a much sparser map you can see you know this this doesn't look nearly as pretty uh but uh but it does trace structure over a much larger volume so sometimes for some purposes you're interested in detail uh with which you map a given volume and for other purposes uh you're interested in covering large volume and you don't need all of that detail um and this was the galaxy distribution uh from boss so uh yellow and red are the same sets of points you saw before but now white is from uh the boss survey from sds s3 uh and so here you know we're using the same kind of technique but pushing further out in redshift and going uh going somewhat higher in density so we've got a denser sampling of structure but also a much larger volume uh because we're going uh out further so the basic idea with uh b a o is that there are pressure waves uh that propagate in the pre recombination universe when the photons uh are tied to the electrons uh and the electrons are tied to the baryons and you can think of this as uh analogous to 
dropping a rock in a pond: it sends out a ripple that propagates. But at recombination the photons decouple from the baryons, because all the free-electron opacity goes away, and those waves stall, so you end up with a feature imprinted at some particular distance. This is a more quantitative version of that little cartoon, showing the evolution of a perturbation, where you start with an overdensity in the dark matter and let it evolve forward. The red and blue curves, which are basically superposed here, show the gas and the photons: the photons and the gas are locked together until we get to the redshift of recombination, which occurs between here and here. Then the photons continue to stream away from this initial perturbation, the dark matter is still overdense near the middle, but we've built up this peak in the baryon distribution. The baryons are only a small fraction of the total matter, so once there's no longer pressure driving them, they feel the gravity of the dark matter, but the dark matter also feels the gravity of the baryons. So as things continue, this baryon peak gets weaker, because the baryons are falling toward the dark matter, but we start to develop a bump in the dark matter profile, and finally at late times, by a redshift of 10 here, the baryons and the dark matter both have this excess of material away from the central peak. That's what happens for one individual spike in the initial density distribution. Really you have Gaussian fluctuations, so you have these ripples going out from all the overdense regions, but the consequence is that you predict a bump in the correlation function, or oscillations in the power spectrum.

So let's summarize: pressure waves in the pre-recombination universe imprint a characteristic scale on matter clustering, the sound horizon scale that I introduced in the first lecture when I was talking about the cosmic microwave background. This depends on when recombination occurs and on the evolution of the sound speed, but those basically just depend on the matter, baryon, and radiation densities of the universe, which are things we can now infer from modeling CMB fluctuations very well. The value is 147.49 comoving megaparsecs, and the uncertainty due to the uncertainty in the cosmological parameters is about 0.4 percent; actually it was 0.4 percent with Planck 2013, so it must be smaller now, but I didn't look up how much smaller. So we know what the scale is. That is making some assumptions about pre-recombination physics: in particular it assumes that dark energy was unimportant at that time, which we expect, because certainly for a cosmological constant dark energy is negligible at high redshift, and it assumes the standard number of neutrino species, without some extra degrees of freedom altering the age of the universe or the sound speed. So this is not absolutely model independent, but it depends only on things that would affect the pre-recombination universe.

That gives us a ruler of known scale, and it appears as a bump in the correlation function or oscillations in the power spectrum. So if I plot ξ(r) on a linear rather than logarithmic axis, it drops way down and then there's this little bump out where the scale is r_s, this roughly 150 megaparsecs. One caution in looking at papers: this axis might be marked in megaparsecs, or it might be marked in h⁻¹ megaparsecs. If it's in h⁻¹ megaparsecs, then this bump will be at about 100, because h is about 0.7; but the CMB determines this scale not in h⁻¹ megaparsecs but in absolute units, in centimeters or megaparsecs. Often in papers we want to show the BAO bump better, so we plot r²ξ(r), which is shown up here, and then the bump looks more prominent; but you should remember it looks more prominent because we've multiplied by a much bigger factor out there than at smaller scales. And the Fourier transform of a delta function is a sine wave. This is not quite a delta function, but it is a narrow peak, so if you plot P(k) versus k, then instead of a smooth power spectrum you get a power spectrum with oscillations that are basically the Fourier transform of this bump, and because the bump has some finite width, those oscillations are damped. That's how the feature appears in either the power spectrum or the correlation function. So if we can measure it in transverse clustering, we can infer the angular diameter distance by taking that length and dividing it by the angle, and if we can measure it in the line-of-sight direction as a velocity separation, we can divide a length by a velocity and get the Hubble parameter at that time.

Relative to supernovae as a distance indicator, there are several interesting properties of BAO. BAO distances are measured in absolute units, in megaparsecs or centimeters, rather than just giving you relative distances as a function of redshift. You can separately measure D_A, the angular diameter distance, and H(z); here I'm particularly contrasting BAO with supernovae. And the achievable precision increases with redshift, because at higher redshift there's more comoving volume, so you can measure correlation functions, and hence these distances, more precisely.
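To make the ruler arithmetic concrete, here is a minimal Python sketch of turning the feature's observed angular size and redshift extent into distances. The input numbers below are illustrative stand-ins, not actual survey measurements; only the sound horizon value comes from the lecture.

```python
C_KMS = 299792.458   # speed of light [km/s]
R_S = 147.49         # comoving sound horizon [Mpc], the value quoted above

def bao_to_distances(delta_theta, delta_z, z):
    """Convert the BAO feature's observed angular size and redshift extent
    into distances.

    Transverse: a ruler of comoving length r_s subtending delta_theta gives
    the comoving angular diameter distance D_M = r_s / delta_theta, and the
    proper angular diameter distance D_A = D_M / (1 + z).
    Line of sight: the ruler spans delta_z in redshift, and dr = c dz / H(z),
    so H(z) = c * delta_z / r_s.
    """
    d_m = R_S / delta_theta          # comoving angular diameter distance [Mpc]
    d_a = d_m / (1.0 + z)            # proper angular diameter distance [Mpc]
    h_z = C_KMS * delta_z / R_S      # Hubble parameter [km/s/Mpc]
    return d_m, d_a, h_z

# Illustrative (made-up) inputs, roughly the right order of magnitude for
# a z ~ 0.57 measurement: ~0.066 rad transverse, ~0.046 along the line of sight.
d_m, d_a, h_z = bao_to_distances(0.066, 0.046, 0.57)   # D_M ~ 2200 Mpc, H ~ 93 km/s/Mpc
```

The point is just that no cosmological model enters at this step: the same measured bump yields D_A and H(z) by division.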
Supernovae, on the other hand, get harder the further away you go. With BAO you have to do more work to map the more distant universe, so this achievable precision doesn't come for free, but it's a method that gets more and more powerful the higher the redshift at which you apply it. The CMB measures the same scale at z = 1100, so that angular scale in the CMB is using the same standard ruler. And what I think has made BAO seem particularly valuable is that even the most powerful BAO surveys are likely to be limited by statistics rather than systematics. That's partly because even if you map the whole universe, the statistics still only get you to maybe a tenth of a percent, or 0.05 percent; whereas in principle with supernovae, if you had 100,000 supernovae, which are not that hard to observe, you could divide by the square root of a very big number and get extremely small fractional errors, and you'd be dominated by systematics. But, and I'll justify this statement a little more, it does seem like BAO can in principle get to very high precision and still not be limited by systematic uncertainties in the measurements. So these are all virtues compared to supernovae, but really they carry complementary information. Supernovae can give you very high precision at low redshifts, and the difference between absolute and relative measurements is quite interesting in various ways. So this is not an argument that you should do BAO instead of supernovae; the two of them together are quite a bit more powerful than either one alone.

Okay, so now I'll talk a little more about the statistics and the sources of error in BAO measurements, but any questions on what I've said so far? [Question] So in the Planck papers, if they just say "Planck TT" or "TT plus EE" or something, that's using only the CMB data; if they say "TT plus EE plus BAO", then the BAO measurements they're referring to are the galaxy ones, from BOSS, from SDSS. [Question] Yes, and in some sense that's what we're doing in a BAO analysis: you can analyze the whole shape of the galaxy power spectrum, and you can analyze the wiggles themselves relative to the CMB. The wiggles in the galaxy power spectrum are much weaker, and they're much weaker because in the CMB you're directly seeing the photons: those baryon fluctuations were big at recombination, but they're small today because baryons are a subdominant contribution. So in the CMB the power spectrum is doing this, while in the galaxies it's only doing this, so there's less statistical power there. But it's quite analogous: you use the galaxy power spectrum to get at the BAO, but also to get at the tilt of the primordial spectrum and the radiation and matter densities and so forth. And in general you get better constraints out of using the CMB together with galaxy clustering than out of using either one on its own, because there are degeneracies that affect one measurement but that are broken by adding something that's 3D, or something in a different redshift range. [Question] Let me come back to that.

So I'm going to talk about sampling variance versus shot noise, and I'll do it in the specific context of BAO measurements, but many of these considerations also apply to other kinds of measurements, to redshift-space distortions, or, with some changes of terminology, to weak lensing. If we mapped the entire comoving volume in some redshift range z1 to z2, then the BAO measurement would be limited by cosmic variance. That's just the limit from the fact that we have only so much volume, only so many structures in that volume, so there's only so much precision with which we can measure that clustering. This notion is familiar from the cosmic microwave background: we measure all of the modes on large scales, and there are just only so many to measure, so we can only measure the CMB power spectrum so precisely. The same is true for galaxies, even though now we're measuring in 3D. Those cosmic variance errors for BAO are quite small; they're plotted here, and most of the figures here are from this review article, so if you want to look them up you can find them there. For instance, at z of about one, the cosmic variance error is about 0.2 percent on D_A and 0.35 percent on H(z). If instead of the whole sky you map a fraction f_sky (so if you map 10,000 square degrees, f_sky is a quarter), then the sampling variance error is the cosmic variance times f_sky^(-1/2). Just as you'd expect, errors drop like the square root of the volume, so if you measure only one quarter of the entire volume of the universe, your errors will be twice as big. So when I talk about achievable precision, I'm basically referring to mapping, well, we're never going to map the whole sky because the Galaxy is in the way, but we might be able to map half the sky, and sample it well enough to completely sample the structure in some redshift range. BOSS has basically done that for a quarter of the sky, out to a redshift of about 0.6. These numbers depend on how thick a redshift shell you're taking; this was for a redshift shell about 30 percent of its range. That, then, is the limiting factor due to finite volume.
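The f_sky scaling just described is simple enough to write down directly; a minimal sketch, using the cosmic-variance floors quoted above (illustrative round numbers, not survey results):

```python
FULL_SKY_DEG2 = 40_000.0   # the round number used in the lecture (~41,253 exactly)

def sampling_variance_error(cosmic_variance_error, area_deg2):
    """Scale a full-sky (cosmic variance) error to a survey of given area:
    sigma = sigma_cosmic * f_sky**(-1/2)."""
    f_sky = area_deg2 / FULL_SKY_DEG2
    return cosmic_variance_error * f_sky ** -0.5

# Cosmic-variance floors for a shell at z ~ 1: 0.2% on D_A, 0.35% on H(z).
# A 10,000 deg^2 survey (f_sky = 1/4) doubles both:
sigma_da = sampling_variance_error(0.002, 10_000)    # 0.4%
sigma_h = sampling_variance_error(0.0035, 10_000)    # 0.7%
```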
But you have to actually map that volume accurately enough to measure the clustering within it, and that's the other contribution. [Question] Spectroscopic, yes; and this is basically assuming that you're resolving structure well enough to resolve the width of that BAO peak. So if you map a volume V, the number of Fourier modes in an interval dk is 4π k² dk times the volume, up to some factor that comes in from your Fourier conventions; the point is that the number of Fourier modes is just proportional to the volume. And if you map with tracers of number density n, then the fractional error with which you measure the power spectrum is N_modes^(-1/2), so it drops like the square root of the number of Fourier modes you've measured, times (1 + 1/nP). The power spectrum has units of volume, so if you multiply it by a number density you get a dimensionless number, and this 1/nP piece is the shot noise contribution. It just comes from the fact that whatever structure is there, if I don't have many objects with which to trace it, I will make errors simply in mapping out the structure that I've got. So in general, if I want small errors on the power spectrum, I want to map a large volume, because the error is proportional to V^(-1/2) times (1 + 1/nP), but I also want a high enough density of tracers that I'm not paying a big penalty from mapping the structure poorly. For BAO the relevant scale is roughly k of about 0.2 h Mpc⁻¹; really you have to think about the overall impact on the feature you're trying to measure, but roughly speaking you can determine the importance of the shot noise from thinking about the power at about that scale.

For nP much bigger than one, the shot noise term is negligible and you're limited by sample variance; for nP less than one, you're limited by shot noise. So now we come to the question of what kinds of galaxies you use to measure this. A rough rule of thumb is that the number density required for nP equal to one is about 4 × 10⁻⁴ h³ Mpc⁻³ times (1/σ₈ of those galaxies)², because the P that matters here is the P of the galaxies: if there's stronger structure in the galaxy distribution, you can measure it more easily. More strongly biased galaxies have σ₈,galaxy = b_g times σ₈,matter, so their power spectrum is higher by b_g², and you need a lower number density. In general you would like to observe a highly biased sample if you can, because then you don't need to observe as many objects in order to avoid being limited by shot noise. In the case of BOSS, the space density of BOSS galaxies is a little below that rule of thumb, about 3 × 10⁻⁴, but the galaxies have a bias of about two; they're luminous objects that are strongly clustered. The result is that for BOSS this nP is about two, so shot noise is not completely negligible, but it's small, and that's how things were designed. If we'd had more observing time, we would have spent it going to a larger area, because it's more important to increase the volume than to observe fainter galaxies and increase n. On the other hand, if we'd been in the limit of, say, nP of a half, and we'd had more observing time, we would probably have been better off mapping the same volume but taking deeper exposures in order to increase n.
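The mode-counting arithmetic above fits in a few lines. A sketch, with one common Fourier convention for the mode count; the survey numbers are illustrative BOSS-like values, not official ones:

```python
import numpy as np

def pk_fractional_error(k, dk, volume, n_gal, p_gal):
    """Fractional error on P(k) in one k-bin:
    sigma_P / P = N_modes**(-1/2) * (1 + 1/(n P)),
    with N_modes = 4 pi k^2 dk * V / (2 pi)^3 (one Fourier convention)."""
    n_modes = 4.0 * np.pi * k**2 * dk * volume / (2.0 * np.pi) ** 3
    return n_modes ** -0.5 * (1.0 + 1.0 / (n_gal * p_gal))

# Illustrative BOSS-like inputs (assumed for this sketch):
k = 0.2          # h/Mpc, the scale relevant for the BAO feature
dk = 0.01        # bin width, h/Mpc
volume = 4.0e9   # survey volume, (Mpc/h)^3, i.e. 4 (Gpc/h)^3
n_gal = 3.0e-4   # tracer density, (h/Mpc)^3
p_gal = 6.7e3    # galaxy power at k ~ 0.2, (Mpc/h)^3, so nP ~ 2

err = pk_fractional_error(k, dk, volume, n_gal, p_gal)
# With nP ~ 2, shot noise inflates the error by (1 + 1/2) = 1.5 over
# the pure sample-variance limit.
```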
So this is always one of the basic trades in any cosmological survey: depth versus area. Is it better to increase your area, or to observe deeper? In the case of weak lensing, would you rather image a larger area and get more galaxy shapes that way, or image deeper and see a higher surface density of lensed galaxies? It's a quantitative question, and it depends: you have to think about what you're trying to measure and what's the most efficient way of using your observational resources. In the case of BAO surveys, this is affected by how you're measuring the redshifts, how big your telescope is, how hard it is to get further out, and so forth.

There are several topics that I'm not going to cover, but there are some things in the notes, and leads from the notes to other things, so look them up if you're interested. One of those topics is reconstruction. This has to do with the fact that as galaxies move, they diffuse out of those shells, and that lowers the peak; you can see it up there on the left, where it starts out as a fairly sharp peak. Really, the precision of your measurement comes down to how accurately you can determine the location of that peak, and the nice thing about BAO is that this peak is only about 10 megaparsecs in width and it's out at 150 megaparsecs, so the width is only about 10 percent of the scale. The precision with which you can centroid a peak is basically the width of the peak divided by the signal-to-noise of the detection, so if you have a 10 sigma measurement, you can centroid the peak to one tenth of its width. That's why a 10 sigma BAO measurement gives you about a one percent distance measurement: you've got a peak whose width is about 10 percent of its scale, and if you can centroid it to a tenth of its width, that's a one percent distance measurement. Non-linear evolution broadens this peak as galaxies diffuse out of that shell, and reconstruction is a way of trying to sharpen the peak back up by moving galaxies toward their original positions.

On the question of systematic uncertainties: this plot shows how the peak shifts in N-body simulations. For the matter distribution you get shifts of about 0.2 percent, and for highly biased halos the shifts are something like half a percent. If you completely ignored that, you would get answers that were wrong by noticeable fractions of your desired error. But you can calculate this shift and correct for it, and then your uncertainty becomes the uncertainty in your correction: if you've got a half-percent shift and you know the correction to 20 percent, that means you've got a 0.1 percent residual uncertainty. And actually one of the nice things about reconstruction is that it seems to remove even these small shifts. So the statement that BAO are likely to be limited by statistics rather than systematics largely comes down to these kinds of experiments, putting in ideas about non-linear evolution and galaxy bias; at least with reconstruction, the shifts that occur are still too small for us to measure even with simulations that cover hundreds of cubic gigaparsecs. So there's still more to be done to demonstrate that a survey like DESI or Euclid or WFIRST will be limited by statistics, but the signs there look good.
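The centroiding rule of thumb above is just one division; a sketch, with the numbers from the talk:

```python
def bao_distance_precision(peak_width, peak_scale, snr):
    """Fractional distance error from centroiding the BAO peak:
    the centroid is located to ~ width / SNR, and the fractional
    distance error is that divided by the scale of the peak."""
    return (peak_width / snr) / peak_scale

# A ~10 Mpc wide peak at ~150 Mpc, detected at 10 sigma, gives
# roughly a one percent distance measurement.
frac_err = bao_distance_precision(10.0, 150.0, 10.0)   # ~0.007
```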
So in BOSS: here was the galaxy sample, and here is the correlation function as a function of transverse separation and line-of-sight separation. You see this overall flattening, which is the redshift-space distortions, and then this ring, that excess of clustering at 150 megaparsecs. If we had no redshift-space distortions, this would be a perfect circle, but redshift-space distortions change the relative amplitude as a function of angle. This was the detection from the first year of data; this is that r²ξ(r), there's the peak, it's very clearly detected, and you can centroid it. And then this is a more recent analysis, now showing the correlation function for separations transverse to the line of sight and along the line of sight; from one of these you extract the angular diameter distance, from the other you extract the Hubble parameter, and then you can use those for various cosmological constraints.

I'll show you one particular application of those, which comes back to something we talked about on the first day, the Hubble parameter. If we take the ΛCDM model with Planck parameters, this is what the Hubble constant ought to be: 67, plus or minus about one. And these are several different measurements from the Hubble Space Telescope, using Cepheids to calibrate distances to Type Ia supernovae and then measuring their distances out into the Hubble flow. These aren't all independent; they're using a lot of the same assumptions and a lot of the same data, but there are details of the analysis that differ. So the question is how seriously we should take the discrepancy between these direct measurements and this prediction. Because BAO distances are measured in absolute units, if you had a BAO measurement at a redshift of 0.05, you could just divide that distance by the redshift and get the Hubble constant: velocity over distance is H0. Now, the problem is that low-redshift BAO surveys don't cover a very large volume, there's just not much volume there, so the errors from these low-redshift BAO measurements are around four percent, and that's not enough to definitively favor this value or that one. But out at redshifts of 0.3 and 0.5 we have BAO measurements with a precision of about one or two percent, and what we need is a way to extrapolate those inward to low redshift. You can do that for any particular cosmological model, like ΛCDM, but we'd like not to be sensitive to that choice, and this is where supernovae come in, because with supernovae we get very good measurements of the relative distance scale. The green points here are from supernova data, and basically what we're doing is calibrating the supernova data so that they fit this set of four BAO points, and then asking where you end up at a redshift of zero. So rather than calibrate supernovae based on the Hubble constant, we calibrate them based on BAO. You can see where this is going: when you do a joint fit to the BAO and supernova data, using BAO for the overall calibration and supernovae for the relative distances, you end up with an H0 of 67.3 plus or minus 1.1, basically in perfect agreement with the ΛCDM prediction. So this is why I said that I think the most likely resolution of this particular discrepancy is that the direct measurements are just too high, and that H0 really is below 70. The one way around this is that we are still assuming that the sound horizon is the one we compute with standard cosmology.
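The outside-in calibration just described can be illustrated with a toy numerical sketch. Everything here is fiducial and made up for illustration: the "supernova" shape is generated from an assumed flat ΛCDM model, and the BAO anchor stands in for the real one-to-two-percent measurements; in practice one does a full joint fit rather than reading off a single low-redshift point.

```python
import numpy as np

C = 299792.458  # speed of light [km/s]

def comoving_distance(z, h0, omega_m):
    """Comoving distance in flat LCDM by simple trapezoid integration."""
    zs = np.linspace(0.0, z, 5000)
    ez = np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    integrand = 1.0 / ez
    return (C / h0) * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(zs))

# Toy "supernova" data: relative distances only (the absolute scale is
# unknown), generated here from a fiducial model purely for illustration.
z_sn = np.array([0.01, 0.1, 0.3, 0.57])
shape = np.array([comoving_distance(z, 67.3, 0.31) for z in z_sn])
shape /= shape[-1]                  # keep the shape, discard the scale

# A BAO measurement anchors the absolute distance at z = 0.57
# (illustrative value, standing in for the real measurement).
d_bao = comoving_distance(0.57, 67.3, 0.31)

# Calibrate the supernovae to the BAO anchor and read off H0 from the
# lowest-redshift point, where D(z) ~ c z / H0.
d_sn = shape * d_bao
h0_est = C * z_sn[0] / d_sn[0]      # recovers ~67.3, up to O(z) corrections
```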
So about this approach: the difference between the red point and the magenta point is that the magenta point assumes a cosmological constant and the red point does not. The red point is making almost no assumption about dark energy, but it is still assuming that the pre-recombination physics that sets the sound horizon is the usual stuff. If you have extra neutrino species, for instance, then you can move both the red and magenta points up together, to coincide more with the direct measurements. So there are possible ways out, but they point you to things happening in the pre-recombination universe, not to things happening with dark energy at late times. [Question] The black points are measurements of H0 from the Hubble Space Telescope using Cepheids: using HST to measure Cepheid distances to galaxies that have hosted supernovae, and then using those supernovae to measure distances out into the Hubble flow. The green ones are supernovae as well, but calibrated not to match these, calibrated to match the BAO measurements. So really, instead of building a distance ladder from the inside out, starting with parallax, to star clusters, to Cepheids, to nearby galaxies, to distant galaxies, we're building a distance ladder from the outside in, starting at a redshift of 0.6, calibrating supernovae to that, and then moving them inward. That's one way of thinking about what's going on; in practice we do an overall joint fit.

Let me take two more minutes and stop, and then I can take a couple of questions, with more during the coffee break. Among the other things I'm skipping, unfortunately I just didn't get to the Lyman-alpha forest, but the Lyman-alpha forest is a very powerful tool for tracing structure in the high-redshift universe. Basically you're mapping fluctuating intergalactic hydrogen that traces the underlying dark matter. You can measure the power spectrum of matter in the Lyman-alpha forest and use it to learn about the power spectrum of matter on scales of one to a few megaparsecs, which gives you interesting constraints on neutrino mass, on the tilt of the primordial spectrum, and so forth; and you can measure baryon acoustic oscillations by looking at the correlation of Lyman-alpha forest absorption across sight lines. And I was wearing my NASA shirt because I was going to tell you about WFIRST, which now looks pretty likely to happen, to be launched probably in 2024, to do a big infrared imaging survey and a slitless spectroscopic survey. You can roughly think of WFIRST as doing at a redshift of one what the Sloan survey has done at a redshift of zero. There are similarities between WFIRST and Euclid, and there are a lot of interesting differences between them, so I'm happy to answer any questions you have about WFIRST and its comparison to Euclid. Anyway, it's been a pleasure talking to all of you about some of these aspects of precision cosmology with large-scale structure, and if you're interested in hearing about low-precision cosmology that still has something to do with large-scale structure, then show up at two and I'll show you more pictures like these. So thanks, and I'll take any further questions.