Okay, we are ready for more gravitational waves, and we have Michele Vallisneri. Thank you. So, before talking a bit about data analysis today, I wanted to share the new results that I couldn't talk about yesterday, you know, for legal reasons, or for secrecy. But some of you actually met in a room on the side to watch the press release by the LIGO-Virgo collaboration. So what did they show yesterday? I have some plots from the paper that was just published yesterday in Physical Review Letters. Basically there's one more event, one more good detection, which was named GW151226; as you can see we're still going with "GW", although the IAU wants "Grav", and that's the date. In honor of our British colleagues it was named the Boxing Day event; Boxing Day is December 26th, although for most of the US, at the time this thing was detected, it was still Christmas. So this generated a series of stories of people being taken away from their gift wrapping, or gift opening, to do data analysis. As you can see, this was not as strong as the first Monday event in September was; it was longer, though. What you see there is again the spectrogram, or actually something called an Omega scan, which is a more refined and tuned version of a spectrogram, showing very faintly the chirp at the bottom. But the time series, even though they've already been whitened and band-passed, really show nothing but noise; the signal is under there, and the black is not the signal as you get it from the data, it's the signal that you infer is in the data. So this needed digging out from the noise, and throughout today's lecture I'm going to try to show you how you can be certain that something like this really is under the noise, really is there, and is there with great confidence.
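As a rough order-of-magnitude sketch of why a longer, higher-frequency signal points to lighter black holes (my own back-of-the-envelope numbers, not from the lecture): the gravitational-wave frequency a binary can reach before merging is set roughly by the innermost stable circular orbit (ISCO), which scales inversely with total mass.

```python
import math

# Sketch: the GW frequency at the innermost stable circular orbit (ISCO)
# scales as 1/M, so lighter binaries chirp up to higher frequencies and
# spend longer in the detector's sensitive band before merging.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg

def f_gw_isco(total_mass_msun):
    """Approximate GW frequency (twice the orbital frequency) at ISCO, in Hz."""
    M = total_mass_msun * M_SUN
    return c ** 3 / (6 ** 1.5 * math.pi * G * M)

print(f_gw_isco(65))   # GW150914-like total mass: roughly 70 Hz
print(f_gw_isco(22))   # GW151226-like total mass: roughly 200 Hz
```

The actual merger happens somewhat above the ISCO frequency, but the inverse-mass scaling is the point: halving the total mass roughly doubles the frequency reach.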
Now the other thing that you may see is that the signal is much longer. The original Monday event was just eight cycles; here you can actually make out 55 cycles of the signal in the sensitive band of the instrument. So if you listened carefully to what I was trying to teach you, or if you saw yesterday's press release, you know already: how do you get a signal that lasts longer in the detector band, so that it actually goes to a higher frequency, to about 450 hertz, something like that? Come on, this has been interactive, surely it was inspiring yesterday, so let's try: what kind of masses would you need? Smaller, right. You need smaller masses to be able to last longer and get closer; smaller masses make smaller black holes, small black holes can get closer before they merge, and therefore you can push the system to a higher frequency. This lasted one second. And this is a comparison of the three signals that were seen in the 48 days of LIGO's first observing run; these were 48 days in coincidence, which is the only time when you trust yourself to be able to see something. This here is just the noise profile, the spectrum of noise for the two detectors, Hanford and Livingston, and these three are the reconstructions of the three signals. You see GW150914, the first Monday event, was very short and lower frequency; this one lasts a lot longer, more cycles; and this is LVT151012, the signal that happened in October, the second Monday event, which was kind of in between. It's not a confident detection: there's only a 90% probability that it's actually a signal, and the corresponding false alarm rate is something like one in three years. So, some of the properties: the SNR, the signal-to-noise ratio, the statistic that tells you both how strong the signal is and how confident you are in its detection, was 24 for the first one, very
strong; this one was 13; and the marginal one was 9.7. So between 9 and 10 is where your threshold of real certainty falls, and the certainty grows exponentially, as we'll see, with the SNR squared, so it's quite steep: if you're at 11 or 12 you're already in very good shape. Let's see what else. The distance of the Boxing Day event was comparable to the first Monday one, although there's a big uncertainty on this number, a redshift of about 0.1, and in this one it was possible to see a spin. But the most interesting thing perhaps is the masses: these are masses that are more reasonable for black holes. They're not the outrageous 36, for which you need to change the theory of binary evolution; they're the kind of thing we've seen also in X-ray binaries. So I guess the trivial statement is that nature produces black holes that merge across a range of masses, not just 36. That's interesting from the dark matter perspective. There's this paper that we were talking about briefly, which I looked at again, which proposes that effectively all dark matter is 30-solar-mass black holes, and then works out, if you take all the dark matter in galactic halos and so on, and you build all of that out of 30-solar-mass black holes which were produced as primordial black holes, not from stellar evolution, what kind of merger rate you get; and the merger rate is kind of compatible with what was inferred from the first Monday event. However, I think that lighter black holes, like this 14 and 7.5, would probably already be seen by microlensing; there are constraints that rule them out. So maybe not all dark matter is 30-solar-mass black holes; at least it could be a mix. It's a very interesting suggestion, though. If you look at the paper, I think it's a Physical Review Letter with a bunch of authors; most of the work there is to figure out what merger rate of gravitational waves you get if you put
all those black holes there, because you get different merger rates depending on the clumpiness of the black hole dark matter. Okay, so these are the significance plots that I was showing yesterday at the colloquium. The idea is: this number, the 24 or 13 that you compute, tells you how strong the signal is with respect to noise. If you really believed that the noise was perfect, Gaussian, described exactly by the spectrum without any bad surprises, well, then you could just compute what the false alarm rate is for a signal that you've detected. But since you don't trust that, since you don't trust the instrument to actually be ideal in that sense, what you want to do is measure how many triggers, how many false alarms, you get from noise alone; you do that by doing coincidence between time-shifted time series for the two detectors. Actually, who was at yesterday's colloquium, just so I know what I can assume? Okay, not quite everybody, so I'm going to repeat myself a little bit. Time shifting: the idea goes back to Joe Weber in the 1960s. You have a very noisy experiment that gives you triggers; you keep it going and every now and then it goes bing. You suspect that it will go bing when there's a gravitational wave, but it may do that also for other reasons: there's a cosmic ray, or something local went wrong. So what do you do? You build another identical one, and now you ask for coincidence. You put them in different buildings or in different cities, you synchronize your clocks, and now they have to go bing together for you to believe that it's a gravitational wave. Fine, it's a nice principle; but how do you quantify it, how do you actually compute probabilities on that?
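The remark that certainty grows exponentially with SNR squared can be made concrete in the idealized Gaussian case mentioned above. This is a toy sketch, not the actual LIGO significance computation (which uses the measured time-slide background): for a single Gaussian-noise statistic, the false-alarm probability is just the Gaussian tail, which falls roughly like exp(-ρ²/2).

```python
import math

def gaussian_tail(rho):
    """P(a Gaussian-noise statistic exceeds SNR rho): idealized false-alarm probability."""
    return 0.5 * math.erfc(rho / math.sqrt(2.0))

# The tail falls roughly like exp(-rho^2/2), so each step up in SNR
# buys many orders of magnitude in (idealized) significance.
for rho in (9.7, 13.0, 24.0):
    print(rho, gaussian_tail(rho))
```

This is why an SNR of 24 is overwhelming while 9.7 sits near the threshold of belief: the quantity that matters is ρ², not ρ.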
Well, what you could do is say: I'm going to take the probability that either one goes bing by measuring how many bings I have, multiply the two, and that's the probability of my false alarm. But what you can also do is effectively build a longer experiment by taking coincidences that are shifted in time. I have a day of data, of triggers, of bings, and another day, and I shift one by an hour, and then I figure out when I get random coincidences between them; and they have to be random, because the gravitational wave arrives essentially simultaneously at both detectors, it's not waiting an hour to go to the second one. That's building a background by way of time slides, and that's what these histograms are. So this is the first Monday event, the big one, and the background that you measure. Does this start... no, this starts here; this is the big event, way more significant than any coincidence that you get from the background. However, what's this hump here? These are the little dogs. By the way, "little dog" is a shibboleth. Who knows what a shibboleth is? Yes: a shibboleth is a word or a concept that shows that you belong to a community; I think the word is originally Hebrew. When you hear somebody talking, it's something that gives away that they're one of yours, basically, because nobody outside the community would know it. So this little dog thing, except that now people like me tell you, is something that's not in the papers, but it was in countless emails and internal collaboration discussions. The origin of the name is the following: there was this big blind injection exercise in 2006, maybe 2007, where a big signal was put in the data just to see if the data analysts could get it out. It was blind because they weren't told when it was across the observing run; also, we weren't told whether there were zero, one, or two events put in, so it could have been zero, it could have been two.
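The time-slide idea can be sketched in a few lines. This is a toy model (Poisson trigger times, a hypothetical 10 ms coincidence window, numbers all invented), not the real pipeline, but it shows the mechanics: slide one detector's triggers by unphysical offsets and count the accidental coincidences you get.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 86400.0                     # one day of triggers, in seconds
window = 0.01                   # toy coincidence window, +/- 10 ms

# toy background "bings": 200 random triggers in each detector
t1 = rng.uniform(0.0, T, 200)
t2 = rng.uniform(0.0, T, 200)
# plus one real signal, simultaneous in both detectors
t_signal = 5000.0
t1 = np.sort(np.append(t1, t_signal))
t2 = np.sort(np.append(t2, t_signal))

def count_coincidences(a, b, window=window):
    """Count triggers in a that have a partner in b within +/- window seconds."""
    b = np.sort(b)
    idx = np.searchsorted(b, a)
    lo = np.clip(idx - 1, 0, b.size - 1)
    hi = np.clip(idx, 0, b.size - 1)
    return int(np.sum((np.abs(a - b[lo]) < window) | (np.abs(a - b[hi]) < window)))

# zero-lag (physical) coincidences include the real signal
zero_lag = count_coincidences(t1, t2)

# background: slide detector 2 by many unphysical lags and recount accidentals
lags = np.arange(1, 101) * 100.0        # 100 slides, 100 s apart
background = [count_coincidences(t1, (t2 + lag) % T) for lag in lags]
print(zero_lag, np.mean(background))
```

The 100 slides effectively give you 100 days' worth of accidental-coincidence statistics out of one day of data, which is exactly the "longer experiment" being described.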
Now, this was detected, and it was detected as coming from the Canis Major constellation, that direction in the sky. A paper was actually written about that event internally, because the point of the exercise was not just to see if you could detect it; it was also to see how the collaboration would go about vetting it, being sure about it, and writing a paper. So there's an aborted paper somewhere in there. Because it was in Canis Major, the "big dog" constellation, that event became known as the Big Dog, and that was the first time one was really doing this exercise with a very strong signal. And so there was this question of what to do with these kinds of humps that you get: when you do the time shifting, you have some time shifts that have the big dog, the big guy, in one interferometer and nothing in the other one. You still get a large coincident trigger, because you get a very strong event in one interferometer, some smaller random chance thing in the other one, and together they give you something relatively significant. So these became known as the little dogs, and the little dog problem, and there were lots of discussions about how to deal with it statistically. Now we have a pretty well established procedure, which is: you start with the strongest signal, and then when you want to look at the weaker signals, you actually take it out of the data, so that you don't get the little dogs when you're doing time-shift coincidences. This takes you down from this black histogram to the purple one, and now you can assess the significance of GW151226 with respect to, I guess, the non-humpy one, with respect to this black curve; it's still not quite Gaussian, but it's down at the 10-to-the-minus-8 level and so on. And if you want to look at the last one, you take even this one out of the computation. So if you do this, both the first Monday, September event and the Christmas event have very large significance, so the false alarm rate is
actually less than you can measure: it's less than 6 times 10 to the minus 7 per year, which is about 1 in 600,000 years. Moving on to parameter estimation: these are the three events with the estimated masses and other parameters. I think these are one-sigma and two-sigma contours in parameter space, so they contain either 68% or 95% of the probability. You can see that GW150914 was definitely the highest mass, 36 and something like 29; this region is excluded because M1 is always taken larger than M2, just for definiteness; and GW151226 is definitely lower mass, with LVT151012 being kind of in the middle, but very uncertain; especially the mass ratio between the two is very uncertain. Correspondingly, these are the masses of the final remnant black hole and the spin of the final remnant black hole. These are mostly not measured directly; they're inferred effectively from your general relativistic models of what merging black holes do, through numerical relativity. Then we have q, the mass ratio, which goes from 0 to 1. GW150914, the original one, prefers almost equal masses; for LVT151012 and GW151226 you basically don't know, it's very uncertain. You have a good idea of the total mass but not so much of the mass ratio, and there's some degeneracy with the effective spin, which is the projection of the two spins onto the angular momentum: the combination of spins that affects the inspiral rate. Finally, distance: comparable distances for the two strong ones, and a much broader distribution, but probably a farther distance, for LVT151012, which is fainter after all. And sky positions: this one was 600 square degrees, and people were complaining about that already, but they would have had more to complain about with the other ones, where what you get is effectively a ring in the sky, a broken ring, except it's a weird projection, but that's what it is. You do timing, you do triangulation, but with only two detectors you don't get a triangle; you end up with a ring.
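The mass and spin combinations those posterior contours are drawn in can be written down directly. A small sketch (standard definitions, with the GW150914-like numbers from the lecture plugged in as an example): the chirp mass is the combination best measured from the inspiral, which is why the total mass is known better than the mass ratio.

```python
def chirp_mass(m1, m2):
    """Chirp mass: the mass combination best measured from the inspiral phase."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

def mass_ratio(m1, m2):
    """Convention q = m2/m1 with m1 >= m2, so 0 < q <= 1."""
    return m2 / m1

def chi_eff(m1, m2, chi1, chi2):
    """Effective spin: mass-weighted projection of the spins onto the orbital angular momentum."""
    return (m1 * chi1 + m2 * chi2) / (m1 + m2)

# GW150914-like component masses, in solar masses
print(chirp_mass(36, 29))    # ~28 solar masses
print(mass_ratio(36, 29))    # ~0.8, close to equal masses
```

Different (m1, m2) pairs with the same chirp mass produce nearly the same inspiral signal, which is the degeneracy that stretches the contours along the mass-ratio direction.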
So I haven't heard yet, I haven't really looked at the papers or paper drafts of electromagnetic follow-ups for this event, but I expect that for the first one, although it was a large patch in the sky, the completeness of the follow-up, the idea of taking all the telescopes that looked, and all the windows that they looked in, and seeing how much of the probability they covered, was very high; it was like 90% at least if you put several bands together. For this one it might have been less; I mean, one of these is 1500 square degrees in the sky. Rates: so now we have three events, or two and a half, you know, and you can again do some kind of inference on the rates. These are the rates for the three types of events individually. The idea is that you see one like this and you estimate how many there are; and the fact that you've seen a fainter event with smaller masses means that your rate must be higher. If you see only one strong thing, that leads you to a low rate: probably there aren't many like that, otherwise you'd see more. But if you see a fainter thing, that means there are going to be more like that. These two down below, as I was explaining yesterday, are rates where you do counts of observed events and then you have to assume something about the distribution of masses, and further about the distribution of strengths of the signals that you actually have in nature. So "flat" means flat in logarithmic mass: you could assume that's the distribution of the sources, and you take that and figure out how far you can see them. [Audience question] But astronomers are interested enough that they give it a chance and look; also, astronomers are not initially told what the masses are, they're just told the position in the sky. I'm not quite sure why the policy is like this; I suppose it's out of safety,
that if you give them the wrong masses, because you've done bad parameter estimation, they might not look, say because they're black holes rather than a neutron star binary, and then they'd lose out. [Audience question: could the black holes be electrically charged?] So, yes, charged electrically: I would say that's a question for the astrophysicists. Of course you can write down solutions for charged black holes in Einstein-Maxwell theory, what they're called I don't even remember, but astrophysically you think you lose the charge immediately to the interstellar medium and so on; there's really no way for any macroscopic body to remain significantly charged. It may have a little, I guess, but not in units where it changes the geometry. But if you believed there was a reason, I suppose you could certainly change your solutions to look for it. In this area there are some strange solutions, these Majumdar-Papapetrou solutions, is it, where you can actually arrange black holes in a static configuration by balancing out the gravitational field and the electric field? That's correct, okay. So, what else? I promised that we were going to try to look at data interactively. It's a very high-risk exercise, but I like you enough that I'm willing to be embarrassed in front of you, so we go to a browser. I must tell you, I'm a big fan of Python as a computer language, and as a computer language for science it's gotten more and more popular over the last few years, so much that it's even hard at this point to be original and say I like it, because of course that's what we use every day, right? So this was adopted... LIGO has done a very virtuous thing, I would say. Can I go full screen here? Probably like this, no, like this: the LIGO Open Science Center. I'm saying it's virtuous because I played a part in building it; at the same time, it was not spontaneous virtue, it was something that the National Science Foundation actually required LIGO to do, because they were saying: look, we've given you all
this money, and you're taxpayer-funded; if you're taxpayer-funded, you'd better offer your products and your data to the public. To which LIGO said: oh my god, no, we can't do that, because it's so complicated to look at; we haven't seen anything yet, and we're going to get a million crackpots who will look at the data, find events there, publish a paper, and then it will be a nightmare, a PR nightmare, you know. So some kind of compromise was struck: certainly, after every confirmed detection, LIGO immediately releases the data around the detection; but there's also a plan for releasing the full data after we've looked into it and cleaned it. It's not just a question of finding the gravitational waves; it's also a question of taking out the instrumental events that, again, would be interpreted as something exceptional or extraordinary by people who don't have cognizance of the experiment and the instrument. So the first exercise of the LIGO Open Science Center was to take all the data from the last two science runs of initial LIGO, S5 and S6, which together are more than two years of data, and just release those. So if you go to this website, losc.ligo.org, you can do things like, I think it's called Timelines, let's see... this is run S6; okay, the network works. So for instance here, it's a little small, but you can see this is all the data that was collected in science run 6; the height here is the duty cycle, so over some period it's the percentage of time that the detector was actually taking data. You can zoom in, I have to drag here, okay, and then you can request to download the data, and you get some links to the strain data. It comes to you in HDF5, which is a common format, and there are also software tools to load it into various languages and do things with it. The other thing that this site has, let's go back to the
homepage, is the data releases themselves. These are the discoveries; for instance, if you go to the September event, there's a bunch of information about it and plots, and then you can get to the data. There's also a tutorial that lets you play with the data. Since it's a little more complicated than I would like, I simplified it a bit and I wanted to show you a simpler version, but you could download it here. There's also a newer tutorial that they put up yesterday, which I hadn't seen before, which will let you do the same things with all three signals; and it also does some matched filtering, which is the standard technique we'll talk about a bit. But again it's somewhat confusing, and I think it could be simplified, perhaps by a factor of two: you know, you write code once and it's a mess; you write it again and you do a little better; it still needs that kind of distillation. So, in the spirit of doing things live, I'm going to take a new IPython notebook, and I've downloaded the data already into my directory, but again, you can actually run this yourself. Okay, let's not get ahead of myself. Who knows about Python? Okay, pretty good. It's an interpreted language; it's not something like C that you compile, you can just type commands and run them. Who knows about IPython and the IPython notebook, and Jupyter? A little less. So if you know Python and not the notebook, definitely look it up. It's a way to run code interactively; it's a notebook, something like a Mathematica notebook, for instance: you don't just type things and see the result, you actually collect what you're doing, both the things you type and the results, in a document. You can also annotate the document with comments and mathematics; you can save the document and give it to somebody, and they can just open it on their side and see all your computations with all your results and all the plots. It's a
great way to exchange exploratory data analysis or calculations, and it's also been hailed as a way to do reproducible science, because if an analysis is simple enough and doesn't require strange software packages and so on, you can publish the paper and also ship the data analysis with the data set, and somebody else can run it. Which is not quite reproducibility; it's just replayability, a little less; but they can play with it, doing the spectrum a little differently and seeing if that changes the final conclusions of the paper. So it's a very healthy way to do things, and a way to get some openness into computational science. There's a problem here that I think is vexing: a theory paper, at its best, is like mathematics. It proves things; it makes arguments that are tight; and you, the referee, or you, the reader, look at this thing, you spend time on it, and in the end it has almost the value of a proof for you; you can build on it, it's scientific, and so on. Now take a paper in numerical relativity, by a collaboration of ten people. They're using a numerical relativity code that was developed over ten years and that is lots of lines of code, but they don't even give it to you, because they don't want somebody else to run their code and write a paper with it. So all you see is a waveform; all you see is some claimed result. Is there any way to check that?
Not really. I mean, you'd have to trust them, at the same level that you trust an experimentalist who tells you: I ran this thing with photons and atoms in my lab, this is what I did, this is what I got, that's my experiment. Even then, in a good experimental paper there should be enough information that you can replicate the thing in your own lab, just as people tried to do with the initial resonant-bar detection of gravitational waves. So there's a way for science to do it; but replicating the numerical work can be even more daunting, because if it really took ten years to write that code, you just see a few equations in the paper, and there are myriad implementation details that you don't have; there's really no way to do it. So things like this, where you can simply offer your code and how it runs, and offer your procedure, are a way to improve on that a bit. The other thing I should say, and then I'll finish my rant, but these are the things that I care about, is that the standards in computational science are probably much lower than they are in experimental science. I bet that the average scientist keeps their code much worse than the average experimentalist keeps their lab, because it's just easier to do things more quickly, more loosely, and really nobody will look at it; maybe not even your students. Anyway: the IPython notebook is a great way to run these things, so it's great that there's a tutorial for gravitational waves; it's a great way to teach about it. So let me get to doing it. To run it you need a Python installation, which these days is pretty simple: you can download one of these big distributions, like Anaconda or, what's the other one, Canopy, and it gives you all the packages in one installer; it works on Windows, Linux, everything. However, you can even run things online: there's something called Binder now. This is a repository, but the idea is that there's a way to put one of these Python
notebooks online, where one click on this website, running a service that's offered by some very generous lab, will instantiate a Python instance running your notebook, and you can then just run it in your browser, connecting to it. It doesn't seem to come up; we'll see, maybe lots of people are doing it right now. I'll leave it there for a second and we'll see what it does. So there are ways to do these things in the cloud; you don't even need to do it on your machine. Okay, let's go now to my Python notebook. This is the environment; here I'm just importing some packages (I had written this in advance): NumPy is how you do matrices, SciPy is how you do some kinds of mathematics, these are all Python extensions; matplotlib is how you do plots; and then we're going to read data from disk. readligo is a little library that LIGO provided to more easily access this data. So let's go up a little. We're going to load the data, and I have the file here, it's called this thing, and I want to get data for interferometer H1; I'm going to get the strain, the time, and something called a channel dictionary that I don't really care about. Okay, let's see if this works. So what is strain? Strain is going to be some big array. How long is it? It's 131,072 samples. I happen to know, actually it's in that dictionary, that this data is sampled at 4 kilohertz, so for every second there are 4096 samples. Then the time, time_H1, is going to be another array, so if I look at the difference of the first two samples, it's going to be 1/4096 of a second. So let's plot it. What do you expect to see if I plot this? These are 32 seconds of data around the first Monday event, so you'd expect to see your nice waveform, perhaps, or not. Let's have a look. Here we go. See, in the IPython notebook it's quite nice, because the plots even stay inline within your document; if you save it, they stay in
the saved file, so it's all self-contained, which is nice. However, no, I don't seem to be getting a signal. It should be okay... ah, there's a big offset in the time values, so let me just grab the initial time so that we can plot relative times, minus t0. Here we go, from 0 to 32; okay, that's fine. What do we see here? We see that the overall amplitude is of order almost 10 to the minus 18. That's much larger than the signal we were promised, which was a little below 10 to the minus 21. The reason, as was just said, is that there's noise. Remember, the noise curve really goes up at low and high frequencies, so the high frequencies here will just give you a lot of fuzziness, but the low frequencies will give you a big wander, a big random walk effectively; that's probably what we're seeing. That means we do need to do something to see a signal that lies in the sensitivity band of the instrument. So I can also grab the other interferometer, L1 for Livingston, and change a few things here; okay, this should be enough, and then I can plot them together. And you'd say you don't want to plot them together because they'd land on top of each other, but they actually don't: the red noise, this low-frequency noise, for Livingston is so strong that the signal is even displaced from zero, it just has an average of minus two. There's nothing wrong with that: low-frequency noise can be big, and we just don't care, because that's not where the instrument is sensitive and that's not where we're trying to look. Okay, so what's next? Let's try a spectrum. For that, you'll have to take me on faith that I'm going to use some functions from my libraries. Oh, anyone Python-savvy and a wizard? I dare you to do it at the same time as I'm doing it, faster than me. Come on, you can do it... no? Actually, this Binder thing didn't come up, so it's possible that the service that lets
you run notebooks online is being overrun by Caltech students who want to look at this data. Anyway, so there's a PSD function. We're going to give it the strain, and we also need to tell it the sampling cadence of the data, which we said is 4096 per second. This gives me the PSD, the spectrum effectively, and an array of frequencies, and then I can plot them. I want to plot them on a logarithmic scale, and I need to plot the PSD as well. Okay, so this kind of looks like a spectrum. It's actually better to plot the square root of this, because there'll be less dynamic range and you can actually see something; so I'm going to use the square root function from NumPy, abbreviated np, which lets me take the square root of an entire array at once. Okay, this looks nice. It's very smooth, though; it doesn't have much fuzziness, so there must be some averaging going on, which is what you do in spectral estimation usually. One thing we can do is tell it the length of the individual FFTs that it should average over. Okay, and this is what we get; this looks reasonable. Here you see, well, not quite cosmic variance, but you see the fuzziness. At the very lowest frequencies down here, the instrumentalists don't even bother calibrating the signal; they cut it out because it doesn't matter. It's things that are changing slowly, probably due to the stretching, who knows, of the arm over a few hours as temperature changes; it's not where you're looking. Experiments, what was the word, tunnel vision? They have to have tunnel vision at the right frequencies. Then there are these lines. We discussed the lines a bit: lines are usually just instrumental noise. So if you count here, these are going to be hertz, there's going to be one line, you see my pointer there, at, let's count, 2, 3, 4, 5, 6... a strong line there. What
is that? If you've been to America: 60 hertz. What happens at 60 hertz? The power, right; the power grid is at 60 hertz in the US, and 50 hertz in most of the rest of the world, for some reason. And this one here is 30, so it's the half harmonic. These are definitely noise features that come in through your electricity, in any experiment; there's pretty much no way to take them out, they're always there. Some of these, like this one at, what is it, 200, 300, 400, 500, are related, I think, to the violin modes of the suspensions: the typical modes of vibration of the very thin silica threads that hold up the mirrors. And the point there is that you're going to have noise there whatever you do, because you're working at room temperature, or maybe you have some AC, but you're not working at a temperature of zero. Everything has its kT of thermal energy; everything is going to shake. However, things shake at their typical frequencies of oscillation: you describe the entire system and you analyze its modes of oscillation, and if it's something messy, even a glass has modes but it's not a very good resonator, when it's excited you're going to excite a complex motion with some broad spectrum. However, if you take this and make it into a better and better crystal, I don't know, a pure carbon lattice or something like that, eventually you'll get more and more symmetry in it, and the modes of oscillation are perhaps going to have a very pure frequency. So when it's excited you just get that mode, you get it very strongly, all the energy goes into it, but you get an oscillation that looks like a line like that. So that's one principle: you're going to have noise, but you can push all your noise into a very pointed, almost delta-like thing, and then you can just ignore it. You can notch it out if you want, which means you build a filter that just takes it out; or you simply know it's there. There are also some lines that are calibration lines that are
Those calibration lines are actually put in experimentally: at the end mirrors you have a parallel hanging pendulum structure that can push on the mirrors, using little magnets set on the mirrors and using currents, and when you push, you push at some definite frequency. So you're doing some kind of actuation on them, which you then read off in your science measurement, and the reason is that this lets you monitor the response of the interferometer, instant by instant, to motion at a known frequency. So that's a signal injection at a definite frequency, something that lets you monitor the experiment. But where were we? The idea was to use this shape of the noise to get rid of the low-frequency noise here, and this is known as, let's do more quizzes, what is white noise? Yes? Constant across frequencies. So what does it mean, then, to whiten a signal? The signal may have noise that looks like that, very strong at some frequencies; yes, in the frequency domain, that's true. So we're going to basically divide by this noise profile, so that all the noise components appear at the same level: equalization, effectively, in audio terms. So let's see how we do that. We have to work in the frequency domain, so we're going to take our strain, which is called strain_H1, and take a Fourier transform of it. It's a real process, so we take a real Fourier transform, rfft, and then, just to see that everything works, the inverse Fourier transform, irfft. If I do this, nothing should happen, so I can just plot it again, against time minus t0, and see if this works. Yeah, I'm back: these two operations are the inverse of one another, so I'm just getting back what I put in. Now, I'm going to call this the whitened data, except it's not yet, because I haven't done anything to it. Once we go to the frequency domain, we're going to have to divide it by something that I'll
call a kernel. And what's the whitening kernel? The whitening kernel is going to be this curve, effectively: I'm going to divide the signal by that curve. To do that I have a little problem, which is that this curve doesn't have the same length as the spectrum I computed, so I'm going to have to do some interpolation. So, that's done; let's see. I'm going to first use a little utility function that tells me what the frequencies are for a signal of the length I need; I also need to tell it what my delta t is between samples. Then I can build the kernel by interpolating between the spectral frequencies I got and the square root of the PSD. So let me define a dt equal to 1 over 4096, and here I'll use the dt. And now, for reasons that are always confusing to me, so I want to explain: I have to normalize a little, dividing by something like 2 over dt, which really just gives you the right dimensions. Let's see if this works. Nice error, let's see: PSD is not defined, because I called it PSD_H1. This is good enough; it seems the operation was done. So let's see how it looks now: pp.plot of time and the whitened H1. In fact, clearly something happened out here at the edges, and these things always happen when you do FFTs, but let's look somewhere in the middle, and in particular around the time of the event. The time of the event, I have to copy it, is GPS 1126259462.422, September 14, 9:50 UTC. Then I want to restrict these arrays to some window around it, and now we're getting into somewhat sophisticated Python: I'm going to build an index, like this, that selects just the elements of the array where the time lies between 5 seconds before and after the event. We'll call this index, and then let's look first at the non-whitened signal; with these brackets I select just those times. So this is just a blow-up of the big curve up there, which was very noisy.
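The whitening recipe above can be put together as a minimal sketch. The sqrt(psd/(2*dt)) normalization follows the convention of the public GW150914 tutorial, which is my assumption about the "2 over dt" factor the lecture glosses over; the strain here is simulated white noise, so the whitened output should come out in units of sigmas, i.e. with unit variance.

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.signal import welch

fs = 4096
dt = 1.0 / fs

# Toy strain: white noise with sigma = 5 stands in for the real data.
rng = np.random.default_rng(2)
strain = rng.normal(0.0, 5.0, 32 * fs)

# Estimate the PSD, then build an interpolant so it can be evaluated
# on the rfft frequency grid, which has a different length.
f_psd, psd = welch(strain, fs=fs, nperseg=4 * fs)
interp_psd = interp1d(f_psd, psd, bounds_error=False, fill_value=psd[-1])

def whiten(x, interp_psd, dt):
    freqs = np.fft.rfftfreq(len(x), dt)
    hf = np.fft.rfft(x)
    # Divide by the amplitude spectral density; the 1/(2*dt) factor
    # normalizes so each output sample is in noise standard deviations.
    white_hf = hf / np.sqrt(interp_psd(freqs) / (2 * dt))
    return np.fft.irfft(white_hf, n=len(x))

white = whiten(strain, interp_psd, dt)
```

With this normalization, "most of it will be within 1 sigma," exactly as described for the whitened plot.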
We couldn't really see what was going on up there, but here it's very clear that the data is dominated by low frequencies, things going up and down with periods of order 5 seconds; we definitely cannot see the event there. So let's look at our whitened plot. (Yes, sorry: I do the forward Fourier transform, I divide by my whitening kernel, the square root of the spectrum, and then I take the inverse transform, so I go back to the time domain, because I want to look at a time series. But yes, good point, I did it a little implicitly.) Let's see if it works. Well, it's whitened, but still kind of messy, so what else can I do here? Well, remember the signal was between 30 and 150 hertz. LIGO takes data up to 60 kilohertz or so, but in this case we're using data sampled at 4 kilohertz, so all the noise components between 150 hertz and the Nyquist frequency of 2048 are still in there. Although, this peak here, that may be something. Also, by doing this normalization, the units are now effectively sigmas of the white noise, so most of it will be within 1, there's a little 2, and the 4 sigma there may be the signal. So what else can I do in addition to whitening? I can do some band-passing: how about we kill off the frequencies we don't care about? For that I'm going to be very pragmatic. The other tutorial builds a Butterworth filter, which is a nice smooth filter that you could use in an audio amplifier to get a nice response; I'm just going to take the kernel and, how did I do it here, for brevity, yes: where the frequency is less than 30, maybe, or more than 200, actually let's do 20 and 200, I just set the kernel to, what, infinity, so that dividing by it zeroes out those frequencies. So this would be whitened and filtered; let's have a look. Did I run this? Oh, right. Okay, so maybe I want to look a little closer, so let's go between minus 1 and 1 second. This is still not very inspiring, but this guy is beginning to show up, so let's zoom in even closer.
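The pragmatic brick-wall band-pass described above, zeroing Fourier bins outside 20 to 200 Hz, can be sketched like this on a toy signal with three tones (the tone frequencies are invented for illustration; as the lecture notes, a smooth Butterworth filter is the nicer choice, since a brick wall can ring near sharp features).

```python
import numpy as np

fs = 4096
dt = 1.0 / fs
t = np.arange(0, 4, dt)

# Three tones: only the 100 Hz one lies inside the 20-200 Hz band.
x = (np.sin(2 * np.pi * 10 * t)
     + np.sin(2 * np.pi * 100 * t)
     + np.sin(2 * np.pi * 500 * t))

# Crude band-pass: zero every Fourier bin outside the band.
freqs = np.fft.rfftfreq(len(x), dt)
xf = np.fft.rfft(x)
xf[(freqs < 20) | (freqs > 200)] = 0.0
x_bp = np.fft.irfft(xf, n=len(x))
```

Projecting the filtered series onto each tone shows the in-band component surviving at full amplitude and the out-of-band ones gone.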
Remember this thing was 0.2 seconds long, so I'm changing my index. Still not visible there, but here it is. That's the plot that was in the paper; they did it more carefully, they were careful about the filtering and so on, but this is pretty much it. And we can do a little consistency test by doing the same thing for the other guy, the Livingston interferometer. I need to make its spectrum, I think, because I didn't make it up here; the spectrum is similar, but to be careful let's do it separately. So again, PSD_L1 as for H1, and then here I want PSD_L1, and then finally the whitened, filtered data, and let's plot them together. Okay, and they don't line up completely, because remember there's the time of propagation, so we need to shift them by, what was it, 7 milliseconds. Let's see if that works; I think it worked for me here, so it should work for us also: plus 0.007. Oh, and what's missing? Somebody was pointing it out yesterday: a minus sign, excellent. So here we go: they definitely line up, up here, and that's your coincidence; down here it's mostly noise. And just as a further check that the theorists also know their job, from the same webpage you can download a numerical relativity waveform that was computed at the best guess for the parameters. So it's this guy here; I have to transpose it to get the two columns, and then let's have a look: pp.plot of time_nr and strain_nr. The units are the physical ones, whereas here we went into units of sigmas, so we probably want to divide by something like 1e-21, maybe multiplied by 3, something like that, and plot it on top of the other two. And give me something longer. This looks fine; let's see, what did I do wrong? The figure size. Well, I put it at time 0, so I need to add the event time to this time index. Almost there, let's see. So, a little too big maybe; no, a little too long, so let's limit the axis: we'll go from t_event minus 0.2 as the x minimum, to t_event plus 0.05, something like that.
I think there's still a little shift, just because the numerical relativity people didn't quite put it at the right time, and I actually computed it: it's 0.02. Oh no, it doesn't go here, it goes here. Okay, here it is, and we can make it a little smaller... a little smaller... oh, what's missing? Something is missing here, it doesn't quite match. I promise I'll do something else after this, I'm just having fun, but it can get a little boring. We need to whiten this guy also, so let's just do it: I'll copy the code I have here and use it on strain_nr; let's use the Livingston PSD, it doesn't matter. So, strain_nr whitened, and then let's plot that. I probably don't even need to do the rescaling anymore, because the whitening will do it for me. The problem being: "x and y must have the same first dimension." This thing took me 20 minutes to figure out: for some reason the numerical relativists gave us a waveform with an odd number of samples, and therefore, when I take the Fourier transform and then the inverse Fourier transform, the lengths don't match anymore. But now it does. Okay, here's my demonstration that numerical relativity works, that the signal propagated at the speed of light, arriving 7 milliseconds apart at the two sites, and that if you do a sign flip they match nicely; and also, that making that figure for PRL was not that hard. Now, I won't do live the other thing I wanted to do; I'm just going to show you what it looks like. If you do a spectrogram of the data, for instance H1 (the function is called specgram), you get something like this. So this is time again, the seconds within this little 32-second data file, and on the y axis is frequency; I should annotate it. So what can you see? Definitely, what are these green lines? I should put a legend, but green means big here, and blue small. Those are the instrumental lines, the noise lines, really. Yellow down here at low frequencies; but still, there's a little something down there.
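The odd-number-of-samples trap mentioned above is easy to reproduce: numpy's `irfft` assumes the original length was even unless you pass `n` explicitly, so a round trip on an odd-length array comes back one sample short.

```python
import numpy as np

x = np.arange(7.0)  # odd number of samples, like the NR waveform

# Without n, irfft reconstructs an even length, 2 * (len(rfft) - 1) = 6,
# so the shapes no longer match and plotting fails as in the lecture.
bad = np.fft.irfft(np.fft.rfft(x))
assert len(bad) == 6

# Passing the original length restores a faithful round trip.
good = np.fft.irfft(np.fft.rfft(x), n=len(x))
assert np.allclose(good, x)
```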
There's a little cloud that's probably our signal. So what you can do is plot the spectrogram of the whitened data (we don't even need to band-pass it for this), and, you know, there's quite a bit of tuning you can do in how you make the spectrogram, how long the individual FFTs are going to be and so on, but it's there: it's the chirp. What I plotted on top here is just the thing we derived the other day, the evolution of the instantaneous frequency due to quadrupole emission alone, for a Newtonian binary. I matched it as well as I could just by eye; I didn't even do a fit. But it's not expected to match too well, actually, because remember, this is a very advanced stage in the inspiral, it's going very fast, so the post-Newtonian corrections, and eventually even the merger, are going to dominate the signal; it's not supposed to be Newtonian there. But if you do this job in the earlier part, and are a little careful and do a fit, that's how you get out this chirp mass of 30, which was the first thing you could measure about the signal, and the thing that convinces you these are two black holes. So I encourage you, if you like this kind of thing: definitely download the data and look at the other tutorial. You have to read through it a little more carefully, and play with it. I think it's competitive; I think if you had to play with the CMB it would probably be harder, to download the map and do a fit with CAMB, although we saw how to do it, so maybe it's comparable. You should do both: download the Planck map and fit it, and download this and try to plot it. Okay, let's see, I have 15 minutes, is that right? Okay, questions on this?
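The Newtonian frequency track overplotted on the spectrogram comes from the leading-order quadrupole result, f(tau) = (1/pi) * (5 / (256 tau))^(3/8) * (G Mc / c^3)^(-5/8), with tau the time remaining to coalescence and Mc the chirp mass. A quick sketch, with rounded physical constants and the chirp mass of 30 solar masses quoted in the lecture:

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
c = 2.998e8          # m/s
MSUN = 1.989e30      # kg

def f_gw(tau, mchirp_msun):
    """Newtonian quadrupole chirp: GW frequency a time tau (s) before
    coalescence, for a chirp mass given in solar masses."""
    tm = G * mchirp_msun * MSUN / c**3   # chirp mass expressed in seconds
    return (5.0 / (256.0 * tau)) ** 0.375 * tm ** -0.625 / math.pi

# For Mc ~ 30 solar masses, a tenth of a second before merger the
# frequency is around 40 Hz, i.e. just entering LIGO's sensitive band.
f = f_gw(0.1, 30.0)
```

Inverting this relation on the measured frequency track is exactly how the early-inspiral fit yields the chirp mass.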
Since I had so much fun... yes? You cannot clean it up enough to see it directly; what they plot is actually the reconstruction, and the idea is that with matched filtering, okay, that's what I can talk about in 15 minutes, with matched filtering you can actually draw out even a signal that's under the noise. The rule of thumb is this: normally, looking at noise, you discuss things in terms of sigmas, right? If something is above one sigma you more or less see it; if it's above two sigmas, or four like there, it's definitely big. However, you also have many cycles, so if you know the shape of your signal, if it's just a sinusoid and you know the frequency, you effectively get an enhancement of the square root of the number of cycles. Here it's 55 cycles, so the signal may be something like 0.2 sigma and you can still see it, in a statistical sense; you can reconstruct it, but you won't see it in the raw data. So, Raphael, did you convince yourself that you were seeing it, or you don't know?
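The square-root-of-N rule of thumb above is easy to demonstrate on toy data: a sinusoid at 0.2 sigma per sample is invisible by eye in the noise, yet correlating against the known template accumulates it into a double-digit detection statistic. All numbers here are illustrative, not taken from the real event.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples = 8192

# A known template: 55 cycles of a sinusoid, echoing the Boxing Day count.
template = np.sin(2 * np.pi * 55 * np.arange(n_samples) / n_samples)
signal = 0.2 * template                    # 0.2 sigma per sample
data = rng.normal(0.0, 1.0, n_samples) + signal

# Matched-filter statistic: project the data onto the unit-norm template.
norm = np.sqrt(np.dot(template, template))
snr = np.dot(data, template) / norm        # expected ~ 0.2 * sqrt(N/2) ~ 13
```

So an amplitude that never rises above a fifth of the noise floor still produces a statistic of order ten sigma once the cycles are summed coherently.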
Maybe. So, I think, if it were just a monochromatic signal you could really make a very tight filter and then you should get it out, but it's not; the filter you need is not spectral, it's a chirp filter. But okay, that's a good point: you can filter out some frequencies and still get something, so maybe there's a chance; they didn't do it, but maybe if you're careful about it you can actually filter in a way that lets you see something, at least. I think Raphael was doing it live yesterday during the press conference, and it was impressive. Okay, so let's go back to my presentation; I think it gets to matched filtering pretty quickly, so that's what I can do today. I had planned to tell you also about Markov chain Monte Carlo, which is my other passion in life, and I'll do that tomorrow. So: gravitational-wave science in a nutshell. Really, yes, they look like chirps. That would be a specific thing that they use in LIGO; it's written by LIGO people, and it's called the LIGO Algorithm Library. You can actually download it, there's a repository. It's a little painful to build: you need the right compiler, you have to have some libraries and so on, but I can build it in half an hour, 20 minutes maybe. That library has lots of stuff; people started it maybe in the 90s, some very idealistic people, they even wrote the matrix libraries in there, lots of basic things. Today there's also an interface with Python: the library is a C library, but it's got Python wrappings, it's wrapped in Python, so you can actually write little Python scripts that use it. Maybe I'll try it tonight, just to show you an example. For instance, with that you can easily generate a waveform using one of the standard approximants, one of the standard models that have all the post-Newtonian corrections and even numerical relativity corrections, and you can certainly make a bank,
that is, make many different templates, and then you can filter the data through them. It will take some time, because it's a lot of data, and in LIGO it's normally done on big computing clusters, but if you know where the signal is you can cheat, do just 32 seconds, and you'll find it quickly. Okay: gravitational-wave science in a nutshell. Data is signal plus noise. Okay, that doesn't quite go without saying, because the hypothesis here is that noise is additive; noise that is multiplicative, or something else, is so scary that almost nobody wants to think about it, although I think it comes up, for instance, in CMB analysis in some cases, in Planck analysis. Okay, so here's my signal, just a little blip up and down, and this is white noise, so I'm hiding the signal in the white noise. Once you're here you can kind of see it going up and down, but it's pretty well hidden, and an SNR of 5 is already kind of high. Now, one way to look at detection is the following. You ask: what's more likely, that this thing I see is just noise, or that there's a signal in it? If there's a signal, it means that the real noise is the thing I see minus the signal, and therefore it's this guy here. So, in very qualitative terms, I'm asking which one is more likely, which one is more probable. And how can I tell if this is more probable than that? Remember, it's white noise; I have only one detector, and I'm doing it even more simply than that. So, assumptions about the noise: we're saying it's white noise, and actually, I'll tell you something else, it's Gaussian noise, distributed in the simplest way I can think of, and also the average of the noise is zero, which people usually like to assume. In that case, what you can do is just ask which of the two is bigger, in particular which one has more power: take the square of all these samples and sum them up, and the one with more power is going to be
the less likely one. Right? Smaller noise is always more likely than larger noise, in this sense. So that would mean that if I compute this thing, the power of the residual after subtracting the signal, it should be smaller than the power of the data alone. And if you actually work out the probability, well, maybe we'll do it tomorrow, it's very simple, but the idea is that the ratio of the probability of this noise happening as opposed to that one is something like the exponential of the SNR squared over two; for this SNR of 5, that's about 270,000. So the two look kind of the same, but if you compute the power, one is much more likely than the other, and the difference is the signal itself. In practice there's another way to look at it, which is matched filtering. Matched filtering is based on correlation, and the general idea is this. Take my signal again; not very inspiring, but simple. If I take the integral of the signal, I get zero, actually, because it's symmetric. However, if I multiply it by itself, I get something that is always positive, and if I take the integral of that, I get, say, 25, which is the square of 5; that's not random in this case. Now let's say I take just noise, some realization of noise, and I multiply it by the signal. What do I get? More noise, right? Because this thing is going to be up or down, positive or negative, randomly, so the product is noise again, and the integral of noise, which by symmetry cannot prefer positive or negative, is going to be small. So the idea is that I can compare these two numbers: I take my data, multiply it by the signal, and integrate. If there's no signal in the data, I get the small number; if there is a signal in there, identical to my template, I get the big number, 25, plus the small one, of course. Something like that. This is the basis of matched filtering.
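The power-comparison argument can be written down directly: compare the Gaussian likelihood of the data taken as pure noise against the likelihood of the residual after subtracting the template. The log of the ratio has expectation SNR squared over two, which for SNR 5 is 12.5, so the ratio itself is about e^12.5, roughly the 270,000 quoted above. A toy version, with an invented template shape:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4096

# A toy signal scaled to matched-filter SNR 5, buried in unit white noise.
template = np.zeros(n)
template[1000:1200] = np.sin(2 * np.pi * np.arange(200) / 50)
template *= 5.0 / np.sqrt(np.dot(template, template))
data = rng.normal(0.0, 1.0, n) + template

# Gaussian log-likelihoods of the two hypotheses:
#   noise only:     the residual is the data itself
#   signal present: the residual is data minus template
logL_noise = -0.5 * np.sum(data ** 2)
logL_signal = -0.5 * np.sum((data - template) ** 2)
log_ratio = logL_signal - logL_noise   # expectation: SNR^2 / 2 = 12.5
```

The realized log-ratio fluctuates around 12.5 with the noise, but the signal hypothesis wins by an enormous factor on average.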
And it's really the same as what I was describing before, except that there I was doing it with subtraction and here with multiplication; it's really the same thing. So then, to show it to you in practice before I finish: let's say this is the data, whitened and a little filtered to make it easy for you, but let's say I doubted there's actually a signal here, that I wonder whether this may not be only noise. What I'm going to do is take my theoretical notion of the signal, the perfect chirp, and do this job of multiplying it by the noisy data, correlating them, and I'm going to slide the template and do this multiplication at different time shifts, kind of like this. You see, as I do it, I build this red curve. I'll do it again: while this is sliding through, you see a red bar here that marks the values on this red curve, and then I can make it go out the other way. So this red curve is the correlation between the detector data and my signal template, my ideal form for the signal, time-shifted, as a function of the time shift. Then I see a peak, which tells me the optimal alignment, where the template is aligned as well as possible with the signal. This is the time of coalescence, if I put the zero of my template at the coalescence, and if you do it for the two detectors, you get again the seven milliseconds. Also, the height of this peak with respect to the random fluctuations you have elsewhere is, again, the signal-to-noise ratio; it's what tells you how much confidence you have in your detection, that that thing is really signal. In practice you can get fooled by the following: the noise of the instrument is not going to be perfect, it's not going to be Gaussian. Sometimes the instrument just goes a little crazy and has a fit of big noise, some big peak, something that you don't expect statistically.
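The sliding-template construction is just a cross-correlation. A sketch that recovers the injection time of a toy chirp-like burst (all parameters invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
fs = 4096
n = 8 * fs

# A short windowed tone as the "template", hidden in noise at a known
# offset with some amplitude.
template = np.sin(2 * np.pi * 80 * np.arange(400) / fs) * np.hanning(400)
true_offset = 12345
data = rng.normal(0.0, 1.0, n)
data[true_offset:true_offset + 400] += 3.0 * template

# Slide the template across the data: one correlation per time shift.
# The peak of this curve marks the best alignment, and its height
# relative to the fluctuations is the signal-to-noise ratio.
corr = np.correlate(data, template, mode="valid")
best = int(np.argmax(corr))
```

The argmax lands at the injection time (the two-detector version of this peak comparison is what yields the 7 millisecond delay).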
A big unexpected peak like that is known as a glitch, an interferometer glitch. So one problem is that if my noise has a huge peak at one point, even just a delta, then whatever template I slide against it will still give me something very high, something that will make me suspect there's really a signal there, a detection. And the reason is that I don't expect noise that big; it's just outside my model, in a way. To cure this problem, people have come up with ways to make sure not only that this statistic is big, but also that, for instance, you get reasonable contributions from all parts of the inspiral. That's what's known in the business as a chi-square test: you split up your signal, your template, maybe in frequency bands, and you check that you're matching all parts of it, that it's not just one little piece giving you a huge match. So that's this chi-square function you see here; in this case big is bad, so if this green curve is big, it's because I'm getting some SNR but the alignment is very bad. Then the blue is the red corrected by this data-quality check, which makes sure all parts of the waveform match. And the reason is that you want to avoid things like this, from a webcomic by one of the LIGO students. What she's showing here is: that's a chirp, that's a real signal, and these other things are all glitches, things that just happen in the instrument due to some strange coupling in the interferometry or whatnot. You can get the raindrop; this one is a broadband burst of noise, very strong, which you probably don't expect statistically; some even have a repeating frequency pattern. I don't actually know what a Tomty is; I tried to google it yesterday and couldn't find it, so if you know, you can enlighten me afterwards. But getting rid of this kind of glitch is
something you need to do, and this kind of chi-square statistic helps you do it. So let me stop here and send you to coffee; if you have a quick question I can take it, otherwise just come ask me, and we'll continue tomorrow.