Pleasure to be here. So, ten minutes, you say? I'll have to rush through the slides. Just kidding. I'll stay, I think. Do I get food for my efforts? Do I get lunch also? If you're willing to stay for lunch, you get food, and I'm sure there are a lot of people here. Okay, so I'll stay. I'll give you my ten minutes, well, nine minutes now, I guess, and I'll hang around for lunch, so if you have questions or want to talk about this, I'll be happy to talk to you.
So, as Steve said, we're trying to put together this center, initiative, institute. It doesn't have a formal name because it doesn't exist yet. We're just in the process of getting organized. Why do we want to do this? Well, I guess for this audience I don't have to explain why big data or data analytics is important, so I'll skip this slide.
On the next slide, as I said, we're working on this, so things are still being defined, but this slide sort of tells you why we want to do this. There are a lot of activities related to data, big data, analytics, and data science at Stanford and elsewhere, and we're trying to bring together the resources: have people who work on the core algorithms and engineering side of data science work together with domain scientists in energy and environment, in medicine, social science, and so on, because we think it's a real benefit for researchers in one area to interact with researchers in another area.
But in addition to just fostering research, we want to provide a resource for this community. For example, a lot of us are interested in Twitter data. I don't have to explain what Twitter is, but it's a lot of work to get Twitter data. It's a huge data set. It's changing constantly. You have to negotiate with Twitter or pay lots of money to get it, and a lot of professors at Stanford and companies are doing this. So wouldn't it be nice to have this center get one copy for Stanford that we could all share and use together? So that's a goal.
We are already negotiating with Twitter. We're trying to get Google Books, for example, to give us data. Some of the corporate partners might give us additional data. Some of it might be restricted in the way we can use it; some of it will be open. But that's another one of the goals: provide a repository of tools and useful data sets to support the research.
This diagram sort of illustrates that the initiative I'm describing doesn't try to cover everything at Stanford. There's too much going on. So what we're trying to do is a federation of centers and organizations, with ours being at the center, not because it's the most important one but because we deal with generic tools that can be used by multiple disciplines. Most of the faculty who are going to be involved here are going to be in either statistics or engineering, disciplines where our goal is to develop tools and solutions, as opposed to the goals of, say, our colleagues in the medical school, who want to cure cancer. That's their main goal. They use data as a tool. They have a big initiative, and we're interacting with them, so we will be able to get some of their data sets, and hopefully they'll learn from us and we'll learn from them. So we're in the process of setting up connections with some of these pebbles. There are others to be determined. We probably won't have the resources to interact with everybody on campus, but that's sort of the big picture.
Okay, I already said this. Yeah, in addition to research and this infrastructure that I described, we also want to have a teaching side, to teach short courses, maybe on new techniques or on how to use some of these data sets that we'll have, both to our students and to our corporate affiliates.
I don't have time to go through all of the research topics. We don't have the list yet. I mean, these are suggestions, ideas that we've been discussing. This quarter we've been running a seminar with Stanford faculty.
It's not a real traditional seminar. It's more short presentations, so we get to know each other. The last two sessions, next week and the last week of the quarter, are going to be sort of open for discussion of flagship projects. We want to define some real big, challenging projects for the initiative to take on, projects that involve several faculty from different disciplines and, ideally, several of our corporate partners. So the topics go from the more statistics topics and the machine learning topics to some traditional data management topics: how to ingest the data, how to clean it, how to keep track of the lineage, uncertainty in the data, and so on. So we're trying to cover the whole spectrum of data science.
So we obviously need some funding to realize this vision, for supporting students, for supporting this infrastructure. We're going to try to get foundation and government grants, of course, but our initial effort is getting a relatively small set of corporate partners who really want to work with us closely and support this initiative. So we're looking at on the order of five to ten corporations to help us get it started and help us define the initiative. Since it's not quite off the ground, we want help in defining what our flagship projects are, what the interesting problems are, and what good data sets we might be able to get. So, just like the previous speaker was looking for customers for his book, I'm looking for potential companies. Slightly more expensive than the book. I don't know exactly how expensive that book was.
Well, these are sort of the benefits of working with us. But I guess, as you're all members of this energy and environment initiative, you know what the advantages are of being involved with Stanford research.
Otherwise, you wouldn't be here. But as I said, the price is a little bit heftier than the book, because we're looking for partners who really want to support us and, for example, have a person on campus working with us. I can announce that Google has already said yes, at least verbally. We don't have the check yet, but it's in the mail, as they say. We have three or four other companies that are in, like, the 80% probability state, and then we're talking to a number of other companies. The set of companies that we'd like to have is relatively diverse. We have some technology companies, like Google and others that we're talking to, and then we have application companies: insurance companies, banking, manufacturing companies, oil companies. We've been talking to some oil companies that I think are quite interested in this.
So, last slide. The schedule is that we don't have a schedule yet, because we're just getting started, but this is our wish list. We are in conversations as we speak with various companies. Hopefully by the end of the summer or the start of next quarter we'll have at least an initial set of companies and we can officially kick off. So I still have maybe a minute for any short questions, and then, as I said, I'll be around for lunch if you want to talk to me.
So, as Hector said, we have some time for questions now. He'll be available at lunch, and in addition, for those of you who are interested in learning about this in more detail, we'll be happy to set up meetings with you in the future. Any questions now?
Yes. Hector, I wish you had more time for your presentation. My question is just one question. You usually want to understand, in a short way, in the form of a motto, what a particular science is doing. If I take inversion, for example, which is probably part of the problems you're describing here, I would say that what I expect from inversion is taking data and obtaining parameters of the earth or some other objects, which can be... Is there such a short expression of what this broadly formulated problem is about: what it takes and what it delivers?
So you're asking sort of for a definition of data science? Is that the question, or what?
Probably. But in general, I know that I take data, and that's what I deliver.
That's right. Well, there's a result of analysis.
Yes.
The way I like to think of it: there are a lot of people who are using data for their research, and we're not being exclusive. So I'm not defining what we will or will not cover; whatever our faculty and our sponsors are interested in, that's what we're going to do. But the way I think of it personally, the exciting part of this data science initiative is when people are getting the data and using the data to drive discovery, right? They don't know exactly what they're going to find or what they're going to prove, but by looking at the data and analyzing the data, you discover new hypotheses, anomalies, interesting things, and that, I think, is sort of the exciting side. If you already have your formula, your theory, and you just want to verify it, that's fine. But the new, exciting direction is to just collect lots of data from your instruments, from your sensors, from social sites, from your cars, from your phones, and examine it to make discoveries. That's one of the exciting things that we're going to focus on.
Did you want to follow up, or is that fine?
Well, you're the timekeeper.
It can take a lot of time, but this is enlightening, because you're targeting some process, and you're trying to guide, with these processes, the discovery process, the discovery which I want to reach or obtain. I buy into this, and I have this feeling, but you're confirming that that's the target here: not a specific model, not a specific something, but to give us the ability to be guided.
That's right.
And guidance means that we can say that these hypotheses are more likely and these are not likely, and that's fair enough.
One term, I guess, is data-driven discovery, right? So you discover based on... Okay, do we have time for one last short question back there?
Sure. Last question.
Thanks. I know you didn't have very much time to talk about your program here. I was curious about the two programs that we've now heard about, ICME and your program: how do you see them interacting or complementing each other? Because it sounds like a very similar target.
That's right. We are working together. Margo is part of our working group. And remember I talked about this teaching component that we want to have? Well, it's the same as the ICME teaching component. It's not different. It's going to be the same effort, because Margo is going to be doing both sides of this. So this is complementary. I mean, what we're trying to do is focus on getting some additional corporate funding and trying to get more researchers who are interested in data to participate. But it'll be in conjunction; a lot of them are going to be from ICME.
Yeah. Thank you.