here. We're on the ground at the Arrillaga Alumni Center at Stanford University for a pretty neat little conference, the Women in Data Science Conference. It's the first year they've ever had the conference, and I think they're gonna do it over and over again, so we wanted to come over and take a look at what's going on. I think there are probably 500 women here, some great panels and presentations this morning, so we grabbed a few of the smartest people we could find and brought them on air. We're excited to have Carrie Grimes Bostock, Distinguished Engineer from Google. Welcome.

Thanks. It's great to be here.

Absolutely. So we were talking about some interesting things offline about kind of the merger of data science plus computer science plus math, right? For math and computer science people, it's green light, red light, it's on and off, but for statisticians there's no right answer, right? As smart as they get, there's always a confidence level. So talk about where data science plays in all of these algorithms that we're applying to the Internet in places like Google.

Well, I think this is really the confluence of math and computer science and statistics. Traditionally there have been some distinctions between the fields, but with data growing so much and so many new applications, we really have to look at where they can help each other and where they overlap. There's this kind of joke people always make about a common algorithm called K-means clustering. K-means lets you extract clusters or meaning out of different types of data, like users that are similar or movies that are similar. And the joke is that K-means converges, but we don't know what it converges to. I think it's a really good example of algorithms that have long been used to try to extract meaning from data, but now we have to actually ask questions about how you make a business decision based on that. And that's something where you need the computational ability of computer science, but you also need the confidence from statistics to really know that Netflix or Amazon or Google could make a decision based on it.
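To make that K-means joke concrete, here is a minimal, textbook-style sketch in Python with NumPy. It is purely illustrative and not anything specific to Google or Netflix: the loop always converges, but which clusters it lands on depends on the randomly chosen starting centroids, so it only finds some local optimum.

```python
import numpy as np

def kmeans(points, k, iters=100, seed=0):
    """Plain K-means: alternate between assigning points to the nearest
    centroid and moving each centroid to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k random points; the final clusters depend on
    # this choice, which is the heart of the "we don't know what it converges
    # to" joke -- the algorithm only reaches a local optimum.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged -- to *some* clustering
        centroids = new_centroids
    return centroids, labels

# Toy example: two obvious blobs of "similar users".
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
centers, assignments = kmeans(data, k=2)
print(centers)
```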
Now it's interesting too, you talked about making some business decisions, because there's just so much data, and even Google, which arguably has as much computing horsepower on demand as anyone, still has to make all these trade-off decisions on something as simple as search. And you brought up a good example I'd love to get more insight on, like searching on a tweet: how much effort do you need to find the tweet? How do you get the tweet? What's the business value of the tweet, and then what's the expected behavior to return it? It's a quickly evolving and growing challenge.

Yeah, and it's something we don't always think about. You hear a lot about traditional business intelligence, looking at customers, looking at users, what's their value. But on the back end, where data science is really important is in helping a company decide, hey, I can't brute force this. Or if I do, it means I have no computing power left to adapt to a new tweet or to a new kind of content that comes along. So how do I use data science to pick the optimal point? How much computational power do I put into indexing that tweet versus what's the value of going out and getting this whole new set of content that somebody's created on their own homepage that's more static? And even the algorithms we use to actually index and crawl and find the right meaning of that content and look for keywords, you can't brute force it. There's just too much stuff being created. And if you do, then you've spent a ton of cash on compute power that maybe would have been better spent on a feature or a cool new rendering or the ability to understand structured data. And that's the kind of tradeoff that I think products that recommend things for users or look for search results are making internally every day.

That's interesting, because there's so much talk about machine learning and the machines taking over. But you bring up such a great point that, without context, you could spend so many more resources trying to get to the place where somebody with a little bit of knowledge, a little bit of experience, could get you started on the journey, so you're not grinding through all that massive amount of stuff on the back end.

Yeah, I think it's very similar to when, in math, you want to optimize something. You're looking for the optimal point on a function, but you want to pick a good starting point, because otherwise you're spending a lot of wasted cycles grinding through options that never made a lot of sense for you.

It was interesting this morning at some of the keynotes, talking about different types of algorithms based on what your objective is. Are you trying to bundle? Are you trying to get something back quickly? Because there is no perfect algorithm, right? It's all based on an objective, and you have to make tradeoffs. Still, with all the computing horsepower that Google has, all the money that search has driven, you still have to make decisions. You still have to make tradeoffs.

Yeah, because something like running an algorithm in batch form can give you a lot of power, because you have time to find the best document for a query or find the most recent document, because you can look at everything at once. But you really want to be able to do it online too, because then you can always be incorporating new content as it's created. And so making that choice, or finding some hybrid where you're passing some stuff through online and some stuff is being handled as a batch, is a really important part of how compute systems work today.
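As a rough illustration of that batch-versus-online hybrid, here is a toy keyword index in Python. The class and method names are hypothetical, and this is in no way a description of how Google's real indexing works: new documents become searchable immediately through a small online buffer, while a periodic batch rebuild reprocesses the whole corpus when there is time for heavier work.

```python
from collections import defaultdict

class HybridIndex:
    """Toy keyword index mixing a periodically rebuilt batch index with a
    small online buffer for newly created documents (e.g. fresh tweets).
    Purely illustrative, not a real search system."""

    def __init__(self):
        self.batch_index = defaultdict(set)   # rebuilt offline over the full corpus
        self.online_index = defaultdict(set)  # updated immediately as new docs arrive
        self.corpus = {}

    def add_document(self, doc_id, text):
        # Online path: index the new document right away so it is searchable
        # seconds after creation, at a small constant cost per document.
        self.corpus[doc_id] = text
        for word in text.lower().split():
            self.online_index[word].add(doc_id)

    def rebuild_batch(self):
        # Batch path: periodically reprocess the whole corpus, where there is
        # time for heavier work (ranking signals, deduplication, etc.), then
        # drop the online buffer.
        self.batch_index = defaultdict(set)
        for doc_id, text in self.corpus.items():
            for word in text.lower().split():
                self.batch_index[word].add(doc_id)
        self.online_index = defaultdict(set)

    def search(self, word):
        # Queries merge both views: stable batch results plus fresh online ones.
        word = word.lower()
        return self.batch_index[word] | self.online_index[word]

index = HybridIndex()
index.add_document("t1", "new tweet about data science")
print(index.search("tweet"))   # found immediately via the online buffer
index.rebuild_batch()
print(index.search("tweet"))   # still found after the batch rebuild
```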
So you graduated from Stanford, I think you got your PhD in 2003, and you've been at Google ever since, so you've been there for a while. How have things really changed? Because it's funny, they were talking about all this different research in the keynotes, and most of the dates on the research were like '04, '07, '08. So maybe I'm just old; I'm like, wow, that seems really recent. Most of this science is recent; I think there was only one piece that was 1999, everything else was in the 2000s. Talk a little bit about how it's evolving, how it's changing, both as the sophistication of the tools gets better, but also as the mass of data is growing. And then of course Moore's Law is the gift that keeps on giving, the ability to add so much more horsepower to these problems. Give a little historical perspective.

Yeah, well, I think one thing that doesn't get mentioned often is just the pressure from users, right? As systems like Siri or web search get more intelligent, as they appear to understand more of what you want, people have higher and higher expectations that they'll continue to understand. So when I first worked at Google, a lot of time and effort went into the fact that content was static: how do we scale the ability to handle all of this? Even moving this much data around was kind of a new thing. But now, in a lot of these different fields, if you listen to what they talk about at Netflix, it's about personalized recommendation. It's about systems that need to really understand who you are. They need to understand what type of content you're looking for. Are you looking for reviews? Are you looking to buy a product? Are you looking for real estate listings? And all that content is kind of dynamically generated. The same thing is true of, say, movies: are you looking for a traditional movie, a documentary, a short film somebody made on their own home computer? Because there are a lot of different types of content. Are you looking for a self-published book? Are you looking for a Kindle book? Really understanding those kinds of nuances about what an object is, what this data record is, how recently it was created, and how fast we can handle it is a much larger focus of a lot of this data science today than it was 10 years ago. Ten years ago it was: this is a user, this is revenue, this is a document. It was more about those kinds of traditional categories.

Right. It's funny, you've all become victims of your own success, right? You just keep moving the expectation curve up, and people complain, and we're carrying so much power in our pockets these days. I know my phone works great, so I expect it to work perfectly all the time. I used to have some Motorola Razr that was a great phone, but it didn't do any of this stuff, right? And now I expect my phone to talk to me while I'm in the car.

Right, right, right. You can talk to it, it tells you where to go, it tells you if there's bad traffic.

But it's funny too, again, back on recommendation: what exactly are you trying to do? Give me something that you think I'll like? Well, maybe I'm in the mood to discover. So there are so many subtle nuances in trying to answer that question, and there really isn't an answer, right? There was a lot of conversation in the earlier sessions about experimentation and being able to experiment, but at the same time making sure you're providing a level of service to all the customers, even if one of them happens to be subject to experiment one and somebody else is subject to experiment two.

Yeah, I mean, it takes a lot of finagling to understand what people want, but at the same time, businesses want to feel confident that they've made these decisions based on real data, and that's where it gets a little fuzzy, right?

Right, and then, of course, there's always that little thing, causation versus correlation, which always gets in the way.

Yeah, I mean, I just had somebody talk to me today about: is a user valuable? Is that something to do with the user, or is that something to do with the product? Would they be more valuable if the product were better? What does it even mean to be a valuable user? Are they someone who comes back, or someone who recommends you to their friends? What does that mean?

Lots of opportunity still ahead. We haven't solved the puzzle yet.

No, we definitely haven't.

All right, Carrie, thanks for taking a few minutes to stop by. It's a great conference.

Thanks a lot, and I hope the rest of the day is just as interesting.

Absolutely, I'm sure it will be.
So I'm Jeff Frick. We're at the Women in Data Science Conference at Stanford University, at the Arrillaga Alumni Center. Thanks for watching.