Thanks again, Sally. Thanks, Steve, for inviting me. I'm very pleased to be in front of a varied audience that comes both from the application side and from data analytics, because I do believe that dealing with large data sets of many different kinds, and in particular with data collected by physical sensors, requires cooperation between many sides of science: you need the domain person who understands the physics, or whatever the phenomenon is behind the data being collected, as well as the data analytics person.

In particular, I'm interested in seismology, and in the past few years there has been very fast development of seismic sensors. That has to do with cost, with power consumption, and with the potential of deploying them in very varied configurations. I believe that is making a huge difference in seismology, particularly when we just listen to what the Earth has to tell us rather than stimulating the Earth with active sources. I should say that I come from the active-source side of seismic surveying; that is my day job. Just to give you a sense of the level of effort: I have about 15 to 17 PhD students, and at any given time in the past few years one or two of them have been working on passive data. I do think that may increase, because there are indeed a lot of opportunities there.

I'm going to talk mostly about using dense sensor arrays to monitor subsurface processes. All the data and all the results I'm going to show you were really produced by Sjoerd de Ridder, who was a student of mine until a few months ago; he has since moved to Hefei, China, to be junior faculty at USTC. Whatever goes beyond the results — the speculation, the bolder thinking that is probably not necessarily correct — that is due to me, and I take the blame; the good results are surely not mine.

So, active seismic surveying is a big data industry. I don't think there are many questions about that, but you may notice on the slide that "big data" is in lower case. We do indeed collect a large amount of data, with modern seismic vessels like the one on the right pulling 10 to 15 streamers carrying a lot of sensors. The most modern seismic vessels have about 100,000 sensors collecting at a rate of one sample every two milliseconds. If you do the simple math, that is about 20 terabytes of data a day, and since these are expensive vessels, they operate 24 hours a day (a quick sanity check of that arithmetic is sketched below). There are about 90 of them operational in the world, not all of them top-of-the-line; the numbers I just gave you are for the top one. But this is an arms race among the geophysical contractors, so the rest will upgrade to those kinds of numbers pretty soon, and the top vessels will reach even larger numbers; that is to be expected. And this is just marine surveying; land surveying collects about the same amount of data as marine, so you can multiply by a factor of two.

So that is active seismic surveying. In the past 10 years, though, as sensors have become cheaper, many companies have been deploying large, dense arrays of permanent sensors. So far these have been justified, in cost terms, by repeated active surveying. The two sensor arrays shown on this slide are in the North Sea, and they really pay for themselves: every three to six months an active survey shows how the reservoir is changing, which lets the operators optimize production. And they have a substantial number of sensors.
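To make the data-rate arithmetic concrete, here is a minimal back-of-the-envelope check in Python; the 4-bytes-per-sample figure is my assumption for illustration, not a number from the talk.

```python
# Sanity check of the quoted marine acquisition data rate.
sensors = 100_000            # sensors on a top modern seismic vessel
samples_per_second = 500     # one sample every 2 ms
bytes_per_sample = 4         # assuming 32-bit floating-point samples
seconds_per_day = 24 * 3600  # the vessels record around the clock

bytes_per_day = sensors * samples_per_second * seconds_per_day * bytes_per_sample
print(f"{bytes_per_day / 1e12:.1f} TB/day")  # ~17.3 TB/day, i.e. "about 20 TB"
```

At roughly 17 TB a day of raw samples per vessel, the quoted "about 20 terabytes" is consistent once some formatting overhead is allowed for.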
To give numbers: about 10,000 sensors for the array on the left and 16,000 for the one on the right, which uses fiber-optic sensing. And since the sensors are already there, we went to the companies and asked: what if you collect data not just when you're shooting, but also when nothing is going on? That is what started our projects there, and I'm going to talk about how to use that kind of passive data to monitor subsurface processes. This can be applied to conventional oil and gas reservoirs, but it can also be applied to unconventional uses, like monitoring CO2 sequestration projects; indeed, GCEP was very generous a couple of years ago in supporting some of the work I am going to show.

Another very important application of passive seismic sensing is microseismic monitoring. I think it will have a huge impact in conjunction with hydraulic fracturing — fracking, you have probably heard the term — for unconventional oil and gas reservoirs, sometimes called shale gas. Not only can it improve the economics, it can actually decrease the number of wells and the number of frac jobs, so we have much less impact on the environment: less use of fresh water, less production of wastewater.

That's it from the industry side. There are potentially great applications on the earthquake seismology side, and at the end I have slides about that. I just noticed that my colleague Greg Beroza is sitting there in the corner; he has been leading those efforts. There is the potential for real-time earthquake detection, and if you are able to detect an earthquake as it is happening, you can shut off a lot of infrastructure, save lives, and save infrastructure. That is a huge impact. On the more scientific side, using big data analysis you can start to see the kinds of earthquakes that in that community are called tremors, which are not apparent in conventional, smaller-data seismic analysis; those give us more insight into earthquake mechanisms, and potentially into understanding and maybe even predicting earthquakes.

So those are the applications. Let me go instead to the main topic of my talk, passive seismic interferometry for reservoir-scale imaging. It really consists of two steps. The first is transforming the noise that is going on in the Earth into what we call virtual sources: what looks like incoherent noise is transformed into coherent data, which we can then turn into estimates of parameters, in particular seismic velocities. The first part has a statistical data-analysis component that starts to bring in Big Data with capital letters. Seismic imaging, the next step, also uses statistics — and maybe we abuse them sometimes — but it is really driven by an understanding of the physics of what is happening. The first step, which is what I'm mostly going to explain today, is where the statistical data analysis is somewhat new: we are trying to use statistics, in particular spatial correlations of the data. In even more general terms, as you will see, the first step involves some science discovery, some data discovery, which again gets closer to the Big Data problems. As you will see, we first went down some blind alleys because we didn't look at the data, didn't listen to the data, carefully enough; we brought our conventional blind spots and misinterpreted what we were seeing. So there is a lot of data discovery — of science, if you want — there.
The next step, what we do in our day job, is going from a coherent experiment, a virtual experiment, to estimates of parameters using some kind of large-scale inverse theory; that part is definitely less relevant to today's big data discussion.

Now let me run very quickly through the history of passive seismic interferometry, because I think it is quite interesting how science sometimes goes through detours. It was first hypothesized by Jon Claerbout, here at Stanford, in exploration seismology in the 1960s. We never had the sensors to do anything useful with it, and it was actually the helioseismologists who first did something useful with it, at NASA and at Stanford, starting I believe in the 80s and 90s. Then, and I have to say this really hurt the pride of us reflection seismologists, the global seismologists did something useful using their arrays: in 2005 they showed the first examples of tomography using interferometry. And here is a little of the history of doing it with dense industry arrays instead. We actually saw it coming; we got data from companies to try it, but as I said, the blind spots were our own fault, and we failed. We were looking at the wrong wave modes: we were looking for what we use every day, body waves and reflections, while most of the energy was in the surface waves. And we looked at the wrong frequencies: the 5 to 25, 40, 100 hertz that we are accustomed to using, while a lot of the signal was at lower frequencies. The good news is that the student who did that work ended up graduating; his thesis was inspired by interferometry, but it was not really about interferometry. Then Sjoerd, who is a very perseverant, sometimes stubborn, student, said, "I want to make it work." He came here, we looked at the same data and at some different data, and indeed around 2010 he produced the first reservoir-scale images from interferometry.

Now, where is the data discovery? I'm showing it in the next few movies. This is a frequency band we are used to looking at, and in that band you have a lot of noise coming from the platforms — this is a producing field — and also noise coming from nearby fields: active surveys 30 or 40 kilometers away do contaminate our data. Even at lower frequency we still get a lot of the platform noise. We needed to go to even lower frequencies, ones we are not accustomed to looking at, to start to have pure noise, or at least noise we could use in a relatively simple way, with simple correlation (a minimal sketch of this kind of band-pass preprocessing follows below). So this is lower frequency, and this is even lower frequency. Frequency was important, understanding what the Earth was telling us was important, and so were the wave modes.

The next few slides depict, for the non-seismologists, what surface waves are. In this case they are interface waves, at the interface between the ocean bottom and the water column. They come in two wave modes: the first are called Rayleigh waves on land and Scholte waves at the ocean bottom; the second are Love waves. I have a few movies here to give you a sense of them. If you live in the Bay Area, as we do a few miles from the San Andreas, these are the waves you need to fear: the first ones shake you up and down pretty badly — those are the Scholte waves — and the next ones are the Love waves, which shake you and your building left to right and may do the real damage.
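As an illustration of that preprocessing step, here is a minimal sketch of band-passing a continuous record down to a low-frequency band. The quarter-to-two-hertz corners echo the band mentioned later in the Q&A, but the exact corners and the synthetic input here are my assumptions, for illustration only.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(trace, fs, f_lo, f_hi, order=4):
    """Zero-phase Butterworth band-pass of a continuous noise record."""
    sos = butter(order, [f_lo, f_hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, trace)

fs = 500.0                             # 2 ms sampling, as in the talk
raw = np.random.randn(int(fs * 3600))  # stand-in for one hour of raw noise
low = bandpass(raw, fs, 0.25, 2.0)     # keep only the low-frequency noise
```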
In our case, though, these waves are actually useful: they let us see something in the subsurface that we otherwise cannot see. Now, the statistical tool we use is very basic: spatial cross-correlation between the sensors. In seismology we believe in spatial heterogeneity and in stationarity over time, at least within the time span of the experiment. So we try not to be sophisticated in the spatial correlation: you can go to higher orders of correlation, and people have tried, but we basically believe in simple first-order correlations. We do, however, believe in redundancy over time, and that is what helps us extract signal out of the noise. We compute a spatial correlation between every sensor and every other sensor, and from there we can create a virtual experiment that gives us data as if we had a source at a given point — and we can do that for every point on the array. The fact that we can do it for every point is what allows us to estimate the heterogeneities in the subsurface. This is one of those virtual sources, and as the movie goes on, the virtual source keeps moving to different locations, because you can indeed do this for every location.

We also use the stationarity of the subsurface, at least over a few hours, to average over time. This is the result between two sensors, a correlation averaged over time, and you see that from basically no signal we slowly converge to something that is actually signal. Sjoerd did a very sophisticated statistical analysis there, and depending on field conditions we can get good data every 6 to 24 hours. That allows us to monitor changes in the subsurface on time scales of 6 to 24 hours — nothing an active survey could do economically; we cannot go back and shoot another active survey within that time frame, but important phenomena sometimes happen in the subsurface within that time frame. That was just two sensors. Here is a set of sensors, same thing: as you average longer and longer, the signal comes out of the noise, and this is one of those virtual sources.

This slide is just to introduce the concept of the distance between source and receiver, which we call offset, so that I can compare the data synthesized from the noise with the active data. On the left is the active data, which is much richer. The things we use every day, the reflected body waves, are there; that is what we first looked for in this data. What we have reliably been able to get, though, are the surface waves, which also exist at higher frequencies in the active data; these are the ones extracted from the passive data. One thing you will see is that different frequencies travel at different speeds; the lower frequencies tend to arrive earlier. That is what lets us do what we call dispersion analysis: the apparent velocity is a function of frequency. And it is because of that phenomenon that we can investigate not just the surface but down in depth; frequency dispersion is what gives us some depth resolution. These are surface waves, Rayleigh-type waves in particular, propagating at low frequency on the top and at higher frequency on the bottom; you can see the low frequencies sample much deeper than the high frequencies, and that is what gives us depth resolution.

So far, from a big data point of view, I have really been talking about uniform data, and while the statistics were somewhat interesting, we have not yet had multi-parameter data.
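To make the correlate-and-stack step concrete, here is a minimal sketch of turning one sensor pair's noise records into a virtual-source trace. The hour-long windows and the one-bit normalization are common choices in the ambient-noise literature, used here as assumptions for illustration; the actual processing, as noted, was far more sophisticated.

```python
import numpy as np

def virtual_source_pair(rec_a, rec_b, fs, win_s=3600.0):
    """Stack windowed cross-correlations of two noise records.

    The stack approximates the response at sensor B to a virtual
    source at sensor A (up to scaling), converging as more windows
    of ambient noise are averaged in.
    """
    n = int(win_s * fs)
    n_win = min(len(rec_a), len(rec_b)) // n
    stack = np.zeros(2 * n - 1)
    for k in range(n_win):
        a = rec_a[k * n:(k + 1) * n]
        b = rec_b[k * n:(k + 1) * n]
        # One-bit normalization: a common, simple guard against
        # bursts of strong coherent noise dominating the stack.
        stack += np.correlate(np.sign(a), np.sign(b), mode="full")
    return stack / max(n_win, 1)

# Repeating this over every sensor pair lets each sensor in the
# array play the role of a virtual source in turn.
```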
In reality our sensors have multi-component sensing, so you can start to do a multi-parameter analysis. Compared to some of what is done on web data this is fairly trivial, or simple, but we can start to think about the different components of the seismic sensors: we can interpret them physically in terms of Scholte waves and Love waves, as depicted in this diagram, and we can start to think about doing more sophisticated data analytics on these different components. This is the first move towards big data, if you want, at least as I understand it, the way Jeff was showing with the big book about an hour ago. As you will see, the next step will be adding genuinely different physical measurements, and I will mention that in a short while.

The good news is that we had the active seismic surveys to ground-truth what we were getting. We tend to be skeptical: it's nice to see nice patterns and nice figures, as Margot pointed out, and claim they are reality, but with statistical methods that is very dangerous. The good news here is that those arrays were also collecting active data, which gives much more reliable information. Here is a comparison of a depth slice, on the right, obtained from active surveying with pretty sophisticated, expensive, computation-intensive processing and imaging, against the frequency slices of the group-velocity maps I was showing, where different frequencies tell us about different depths. Of course the active sensing has much higher resolution, but we do see similar features in the active and the passive results.

That is a snapshot, a static image. An important confirmation was that we can monitor changes over time, and here again it helped that active surveys had been done every few months for many years in a row, and we had passive data from different years. These are now differential images, changes in seismic velocity, and if you look at the scale bars we are talking about small changes, on the order of one or two percent. But we can sense those one-to-two-percent changes in seismic velocity (a sketch of how such small changes can be measured follows below), and again the general pattern is confirmed by what was extracted using the active sensing.

We can not only find the velocity; we can also estimate the change of velocity with the direction of propagation, which we call seismic anisotropy. One of the arrays was on a field that has undergone subsidence: the bottom of the ocean is going down, creating differential stresses in the near surface, and those differential stresses make the seismic waves propagate with different velocities in different directions. On the bottom here is the gradient of the bathymetry of the ocean bottom, and on the right is a display of the seismic anisotropy: the length of each little vector is the amount of anisotropy, and its direction is the fast direction of the seismic velocity. This map correlates incredibly well with the differential bathymetry on the left, and it agrees with our understanding of how this anisotropy is induced by the subsidence of the ocean floor. So we can take different snapshots, look at differential changes, and, beyond just velocity, look at changes in stress and potential fracturing in the reservoir using passive data.

So here is the conclusion of the reservoir-scale imaging and monitoring part. We can estimate seismic P and S velocities, which say a lot about fluid flow and changes in fluid-flow patterns in the subsurface.
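As an aside on how one-to-two-percent velocity changes can be detected at all, here is a minimal sketch of the "stretching" technique commonly used in ambient-noise monitoring: a repeat correlation is stretched in time by trial factors and compared against a reference, and the best-matching stretch gives dv/v. This is a standard method from the monitoring literature, shown only for illustration; the talk does not specify that this exact estimator was used.

```python
import numpy as np

def dv_over_v(reference, repeat, max_eps=0.03, n_trials=601):
    """Estimate relative velocity change by the stretching method.

    A velocity change dv/v = eps shows up as a uniform time stretch
    of the waveform; we scan trial stretches of the repeat trace and
    keep the one maximizing correlation with the reference.
    """
    t = np.arange(len(reference), dtype=float)
    best_eps, best_cc = 0.0, -np.inf
    for eps in np.linspace(-max_eps, max_eps, n_trials):
        stretched = np.interp(t * (1.0 + eps), t, repeat)
        cc = np.corrcoef(reference, stretched)[0, 1]
        if cc > best_cc:
            best_eps, best_cc = eps, cc
    return -best_eps  # dv/v has the opposite sign of the time stretch

# e.g. dv_over_v(ref_corr, daily_corr) -> ~0.01 for a 1% speed-up
```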
anisotope that tells us about stress fields and changes in stress fields and the formation as I said it seems that we can extract useful information with a time frequency of 6 hours, 24 hours and so we can see very fine and temporal scale at least in the scale of subsurface changes and can be a continuous monitoring today for the fields that justify for active serving putting both expensive permanent sensors the monitoring by passive seismic is justified because the additional cost is minimal and we may actually extract very useful information for managing the reservoir avoid environmental disasters and so that is really something that can be done at least for the fields that already have permanent sensors and there are more and more fields that are instrumented with permanent sensors all over the world both on land and in the ocean tomorrow as the sensor technology keeps improving and gets cheaper and cheaper I think that it is conceivable that a permanent sensor may be deployed just for the sake of passive seismic to monitor in a cost effective way the changes going on in the subsurface so that was one if you want the industry application I do think that probably in the near future even one that may have even more impact and companies are working active on that is for really helping to optimize and minimize environmental impact of the shale gas and oil revolution of fracking and hydrofracking probably of a basic technology that is going to I believe a lot of impacts is this idea that you can deploy fiber optics and get sensors out of this fiber optic very densely and not extensively so this for seismic we will have distributed acoustic sensors or DAS these fiber optics they can go up to 30-40 kilometers long you can have a sensor every meter so here again the big data coming and the kind of density that we like to sample other way shields and even more intriguing they are very flexible so really the configuration of your array is somewhat up to your imagination and the tools that you have deployed them so you can be really creative in a way that you deploy this those arrays and the good news is that they are cheap so companies are considering of basically putting by default when they complete the welds they put in every single weld and you may be collecting passive data or maybe active data but really the economics might be there now on the big data and the analytic side the interesting and intriguing part is that this same fibers will be able to collect temperature data as well as pressure data so here we have not only multi component seismic but different physical data that can be correlated with each other so there is a lot of data discovery a lot of interesting statistics understanding the physical phenomenon and the application that can happen out of this data that is in the future now the last couple of minutes I would like just to talk about the earthquake application I have a slide that Greg gave me I think for further question I really will direct to you during the break that is coming this is his slide and I think that actually he is using most sophisticated statistics that we have been using that are one that I will be familiar with some of the data analytics expert in which basically he is trying to deal with the problem that not only there is a large amount of data but the information is useful if analysis done in seconds we are accustomed in seismic imaging using tens of thousands of processors for months to get out an image you cannot afford to do that if you want an early 
So they need to apply much more sophisticated statistics to extract, and distinguish, seismic noise from an actual serious earthquake that is hitting the area.

Finally, I would like to thank above all the people in the companies that gave us the data sets behind the results I showed you, which were Sjoerd's thesis results — in particular BP for one of the data sets and ConocoPhillips for the other. I would like to thank Steve again for inviting me, and all of you for listening, and I would be very happy to take any questions. We've got one back there.

Hi, I'm Trevor DeMail with Chevron. Very good presentation. I had a question on one of your last slides, where you talked about using these distributed sensors to reduce the number of hydraulic fracturing jobs and potentially reduce water usage. I'm just wondering if you could elaborate a little on how you propose to do that.

I should say that I am not a microseismic expert, as I mentioned; this is what I see going on around me, so my comments come from an educated layman, if you want. One way of understanding the fracturing process, and how it releases hydrocarbons, that has been proven — I would not say it is a standard tool yet, but it is very much deployed — is identifying the microseismic events generated in correlation with the hydraulic fracturing. That helps you understand where you may want to frack the formation, and where instead it may not be worthwhile because not many hydrocarbons would be produced from that particular area under those conditions. That is where you may save wells or, if you already have wells with sensors in them, save a frac job. Hydraulic fracturing does need a lot of fresh water and produces a lot of wastewater, and there is also the problem of induced seismicity, which is another interesting twist between unconventional oil and gas and seismology; a group of my colleagues has very recently started an industrial consortium on that, and again Greg would be able to tell you more.

Sven Kruger with SpakerUse. A very, very interesting presentation. My question is on the example you put on the screen: you touched on this briefly, and in fact we have seen that images that come from DAS compare surprisingly well with geophone measurements. How active are you in this area? How much work have you done related to big data and distributed acoustic sensing?

I have not done much of it yet, but I am very interested in getting into it, because I think the kind of expertise we have in more conventional reflection seismology — dealing not just with large data, but with large data coming from dense arrays of many different shapes and forms — has a lot of relevance. Put that together with a better understanding of data analytics, with our colleagues in the statistics department or other parts of Stanford; you really need to bring together those two fields, which traditionally have not been together, to get the best out of that kind of DAS data. So as I said, I am very much willing to learn, and I have a lot of students who can really learn.

I've got a technical question; I know nothing about your field, but it looked like you were filtering down to those low frequencies to dodge noise that was contaminating your data set. When you do that, obviously you're throwing away information. Is the real phenomenon limited to the frequency bands that got through those filters, so that you really have sufficient information to work with?
And a related question: obviously there's a real strong signal if there's a contaminant from one of these other active operations. We've had good luck in other fields monitoring that signal and then being able to subtract it out, basically still getting to use the portion of the spectrum that at first blush would appear to be contaminated. Can you get rid of the contaminating signal using more sophisticated techniques?

That is an excellent question, but it has many facets to the answer. Let me first talk about the low-frequency signal. That has to do not only with the noise but with the nature of the Earth-generated energy: that part of the frequency spectrum is mostly generated by ocean waves interfering with each other and pounding on the sea floor, and the energy actually extends to even lower frequencies — global seismologists, whose seismometers go well below one hertz, to much longer periods, have been using it. There is not much of that energy at higher frequencies, so part of the answer is simply what the signal is. However, I have not lost hope of finding body waves in that data. In fact, I should have said, and will say now, that one of the posters being shown later this morning, by a student of mine and a postdoc in the department, looks at a slightly different kind of data — on land, at higher frequencies — and at body waves. We are still searching for those, because they interrogate the Earth way down, kilometers deep, not just the first few hundred meters, and I have a few hypotheses I would like to test with the data we have, and potentially with additional data, on where we might eventually find body waves created by different mechanisms in the Earth. As for the contaminating signals: my background, from a few decades ago, is in EE and signal processing, so I am quite familiar with array processing techniques — at least with some of them — and we use them in active seismology as well, because there we also have many kinds of interfering noise. But it is an excellent question.

Do you want to work to characterize these seismic patterns and relate them to physical phenomena, like your ocean waves, and then be able to reverse those out of your big seismic jungle there?

In some cases, yes; take microseismic, for example. When we are using, say, the quarter-hertz to two-hertz energy propagating in the Earth to estimate parameters in the subsurface, we don't particularly care about the source; the physical phenomenon generating the energy is not our primary objective. In the microseismic case, by contrast, one of the principal objectives is characterizing the physical phenomenon creating the waves, which there are the small earthquakes induced by fracturing the formation to produce oil and gas. The physical characteristics of the sources do help, in practice, in understanding the stresses in the subsurface, and that helps in minimizing wells, minimizing frac jobs, and optimizing production. So it can work both ways; it really depends on the particular situation. And I think that illuminates what is crucial, at least for physical sensors: you need a domain expert who understands the physics, or the chemistry, or whatever you are looking at, alongside the data analytics people who help you bring the most sophisticated analytics tools to your data.

Thanks again.