Live from San Francisco, it's theCUBE, covering Google Cloud Next 2018. Brought to you by Google Cloud and its ecosystem partners. Okay, welcome back everyone. This is theCUBE live in San Francisco, coverage of Google Cloud Next '18. I'm John Furrier with Jeff Frick. Day three of three days of coverage, kind of getting day three going here. Our next guest, Sudhir Hasbe, Director of Product Management at Google Cloud, has the luxury and great job of managing Bigtable and BigQuery. Welcome back to theCUBE, good to see you. Thank you. So you guys had a great demo yesterday. I want to get your thoughts on that. I want to explore some of the machine learning things that you guys announced. But first I want to get perspective of the show for you guys. What's going on with you guys at the show here? What are some of the big announcements? What's happening? A lot of different announcements across the board. So I'm responsible for data analytics on Google Cloud. One of our key products is Google BigQuery, a large-scale, cloud-scale data warehouse. A lot of customers are using it for bringing all their enterprise data into the data warehouse and analyzing it at scale. You can do petabyte-scale queries in seconds. So that's the kind of scale we provide. So a lot of momentum on that. We announced a lot of things, a lot of enhancements within that. For example, one of the things we announced was a new experience, a new UI for BigQuery. Now you can literally do the query, as I was saying, a petabyte-scale query or any query that you want. And with one click, you can go into Data Studio, which is our BI tool that's available. Or you can go into Sheets and from there quickly fire up a connector, connect to BigQuery, get the data in Sheets, and do analysis. So ease of use is a focus? Ease of use is a major focus for us.
As we are growing, we want to make sure everybody in the organization can get access to their data and analyze it. That was one. Another thing, which is pretty unique to BigQuery, is real-time collection of information. There are customers that are actually collecting real-time data from clickstream, for example, on their websites or other places, moving it directly into BigQuery and analyzing it. For example, in-game analytics. If you're actually playing games and you're going to collect those in-game events and do real-time analysis, you can literally put it into BigQuery at scale and do that. So a lot of customers are using BigQuery at different levels. We also announced clustering, which allows you to reduce cost, improve efficiency, and make queries almost 2x faster. So a lot of announcements other than the machine learning. Well, the one thing I saw in the demo, I mean, it was machine learning, so that's the hot topic here, obviously, is that you don't have to move the data. And this is something that we've been covering. Go back to the Hadoop days, back when we first started doing theCUBE: data pipelines, all the complexities involved in moving the data. And at the scale and size of the data, all this wrangling was going on just to get some machine learning in. So talk about that new feature where you guys are doing it inside BigQuery. I think that's important. Take a minute to explain that. Yeah, so when we were talking to our customers, there were a couple of big challenges they were facing with machine learning in general. One, every time you want to do machine learning, you have to take data from your core data warehouse. Like in BigQuery, you have petabyte-scale data sets, terabyte-scale data sets.
Now, if you want to do machine learning on any portion of it, you take it out of BigQuery, move it into some machine learning engine, ML Engine, AutoML, anything. Then you realize, oh, I missed some of the data that I needed. You go back, then again take the data, move it, and you go back and forth, and it takes so much time. There are analyses, I think, that different organizations have done: 80% of their time, the data scientists say, they spend on moving data, wrangling data, and all of that. So that was one big problem. The second big challenge we were hearing about was the skill set gap. There are just not that many PhD data scientists in the industry. How do we solve that problem? So what we said is, first problem, how do we solve it? Why do people have to move data to the machine learning engines? Why can't I take the machine learning capability and move it inside where the data is, so bring machine learning closer to the data rather than the data closer to machine learning? So that's what BigQuery ML is. It's an ability to run regression-like models inside the data warehouse itself, in BigQuery. Second, we said, the interface can't be complex. Our audience already knows SQL, they're already analyzing data. These folks, the business analysts that are using BigQuery, are the experts in the data. So what we said is, use your standard SQL, write two lines of code: create model, the type of model you want to run, give us the data. We will just run the machine learning model on the back end, and you can do predictions pretty easily. So that's what we are doing with that. So Sudhir, I love to hear that you were driven to that by your customers, because one of the things we talk about all the time is democratization. If you want innovation, you've got to democratize access to the data, and then you've got to democratize access to the tools to actually do stuff with the data, and that goes way beyond just the hardcore data scientists in the organization.
And that's really what you're trying to enable the customers to do. Absolutely. If you just go on LinkedIn and search for data analysts versus data scientists, there are 100x more analysts in the industry. And our thing was, how do we empower these analysts, who understand the data and are familiar with SQL, to go ahead and do data science? Now, we realize they're not going to be expert machine learning folks who understand all the intricacies of how gradient descent works, all that. That's not their skill set. So our thing was, reduce the complexity and make it very simple for them to use the framework: just use SQL, and we take care of the internal hyperparameter tuning, the complexity of it, model selection. We try to do that internally within the technology, and they just get a simple interface for that. So it's really empowering the SQL analysts within organizations to do machine learning with very little to no knowledge of machine learning. Right. Talk about the history of BigQuery. Where did it come from? I mean, Google has this DNA where they do it internally for themselves first. It's a tough customer. With Cloud Spanner, we had the product manager for Cloud Spanner on, Deepti, and she was amazing. It was baked internally. Was it the same with BigQuery? Take a minute to talk about that, because you're now making it consumable for enterprise customers. It's not just "here's BigQuery." Talk about the origination, how it started, why, and how you guys use it internally. So BigQuery internally is called Dremel. There's a paper on Dremel available, I think in 2012 or something we published it. Dremel has been used internally for analytics across Google. So if you think about Spanner being used for transaction management in the company across all areas, BigQuery, or Dremel internally, is what we use for all large-scale data analytics within Google. So the whole company analyzes data with it.
So our thing was, how do we take this capability? Imagine, when you have seven products that each have more than a billion active users, the amount of data that gets generated, the insights that we're giving in Maps and all the different places. All of those things are first analyzed in Dremel internally, and we're making it available. So our thing was, how do we take that capability that's there internally and make it available to all enterprises? As Sundar was saying yesterday, our goal is to empower all our customers to go ahead and do more. So this is a way of taking the piece of technology that's powered Google for a while and also making it available to enterprises. Hardened and tested, it's not like it's vaporware. Yeah, it's not. I mean, this is what I think is important about this show this year. You guys have done a really good job of taking the big guns of Google, the big stuff, and not just saying, "We're Google, you could be like Google." You've taken it and you've made it consumable. This has been a big focus. Explain the mindset behind the product management. Absolutely. One of the key things Google is good at is taking what's used internally, but also the research part of it. Actually, Corinna Cortes, who is head of our AI side, does a lot of research on SQL-based machine learning. So again, BigQuery ML is nothing new. We internally have a research team that has been developing it for a few years. We have been using it internally for running all these models. And so what we were able to do is bring in product management from our side, saying, hey, this is really a problem that customers are facing, moving data and the skill set gap. The research team was already enabling it, and then we had an engineering team which was pretty strong.
We were like, okay, let's bring all three together and make sure we provide real value to our customers with all that we are doing. So that's how it came to light. So, I want to get your take. Early days, like when there was the early Google Search Appliance, I just picked that one and that was ages ago. But one of the digs was that it didn't work as well in the enterprise per se, because you just didn't have the same amount of data as when you applied that type of technique to a Google flow of data and a Google flow of queries. So how's that evolved over time? Because you guys, like you said, have seven applications with a billion users, and most enterprises don't have that. So how do they get the same type of performance if they don't have the same kind of throughput to build the models and to get that data? How's that evolved? So when we think about scale, we think about scaling up and scaling down, right? We have customers who are using BigQuery with a few terabytes of data. Not every customer has petabyte scale, but what we are also noticing is that these same customers, when they see value in data, collect more. I will give you a real example. Zulily is one of our customers, and I used to be there before. When they started doing real-time data collection for real-time analytics, they were collecting like 50 million events a day. Within 18 months, they started collecting 5 billion a day, a 100x increase. And the reason is they started seeing value. They could take this real-time data, analyze it, and make some real-time experiences possible on their website. With all of that, they were able to get real value for their customers and drive growth. So when customers see that kind of value, they collect more data.
So what I would say is, yes, a lot of customers start small, but they all have an aspiration to have lots of data and to leverage that to create operational efficiency as well as growth. And as they start doing that, I think they will need infrastructure that can scale down and up all the way. And I think that's what we are focusing on providing. You guys look at the possibility, and I've seen some examples where customers are just shell-shocked. You're almost too good, right? I mean, it's like, "We've been doing this at large scale, I put this data warehouse in like 10 years ago, what are you talking about?" There's a reality that enterprises have been buying IT, and in comes Google, the gunslinger, saying, hey man, you can do all this stuff. There's a little bit of a shell-shock factor for some IT people. Some engineering organizations get it right away. How are you guys dealing with this as you make it consumable? There's probably a lot of education for you as a product manager. Is that something that you think about? Is that something you guys talk about? So I actually see a difference in what customers need, enterprise customers versus cloud-native companies. As you said, cloud-native companies are starting new, starting fresh, so it's a very different set of requirements. Enterprise customers are thinking about scale, thinking about security, and how do you do that? So BigQuery is a highly secure data warehouse. The other thing about BigQuery is that it's a completely serverless platform. We take care of the security: we encrypt all the data at rest and when it's moving. The key thing is when we share what is possible, how easy it is to manage, and how fast people can start analyzing. You can actually get started with BigQuery in minutes. You just bring your data in and start analyzing it. You don't have to worry about, how many machines do I need?
How do I provision it? How many servers do I need? So enterprises, when they look at the... So cloud-native ready. Yeah. All right, so take a minute to explain Bigtable versus BigQuery. Yes. What's the difference between the two? One's a data warehouse and the other one is a system for managing data. What's the difference between... So it's a NoSQL system. I will give you a real example of how customers use it, right? BigQuery is great for large-scale analytics, for people who want to take petabyte-scale or terabyte-scale data, analyze historical patterns, and do complex analysis. If you want to do machine learning model creation, you can do that. What Bigtable is great at is really fast serving once you have pre-aggregated data. If you have a website, I don't expect you to run it and back it with BigQuery. It's not built for that. Bigtable is built exactly for that scenario. So for example, you have millions of people coming to the website. They want to see some key metrics that have been pre-created, ready to go. You go to Bigtable, and that can actually do high performance, high throughput. The last I read on that was almost 10,000 requests per second per node, and you can just create as many nodes as you want. So you can really create... Auto scale and all kinds of stuff there. Exactly. And that's good for unstructured data as well. Yes, exactly. Okay, so BigQuery for structured data, SQL, basically large scale; Bigtable for real time, new kinds of data, different data types. What else do you have in the bag of goodies that you're working on? The one big thing that we also announced this week was GIS capability within BigQuery. GIS is geographic information. Everything today is location based. Latitude, longitude. Our customers were telling us it's really difficult to analyze it, right?
An example would be: we are here, and I want to know how many restaurants are within a two-mile radius of here. Which ones are those? Should we create the next one here or not? That kind of analysis is really difficult. So we partnered with the Earth Engine team within Google, and with Maps. And what we are launching is the ability to do geospatial analysis within BigQuery. Along with that, we also have a visualization tool that we launched this week. So folks who haven't seen that should go check it out. One great example I will give you is Geotab. Their CEO, Neil, is here. He was showing a demo in one of the sessions and talking about how he was able to transform his business. So I will give you an example. Geotab is basically into vehicle tracking. They have these sensors that track different things on vehicles, and they collect all of that and store everything in BigQuery. And his thing was, with BigQuery ML and our GIS capability, what he is now able to do is create models that can predict which intersections in a city are going to be dangerous when it's snowing. And for smart cities, he can now recommend to cities where and how to invest in these kinds of scenarios. It's completely transforming his business, because his business was not smart cities. His business was vehicle tracking. But with these capabilities, they're transforming what they were doing. And new discoveries. New discoveries, solving new problems. It's amazing. Just to dig in a little bit too. You've got these seven billion-active-user apps that you can leverage, with specific functionality or goals or objectives or priorities in those groups, and you can now pull that data, that knowledge, those use cases into a completely different application on the enterprise side. I mean, is that an active process? How does that happen?
No, we don't. As a customer, they're completely different, right? Our focus in Google Cloud is primarily enabling enterprises to collect their data, process their data, and innovate on their data. We don't bring in the Google side of it at all. That's a completely different area. For enterprises, all their data stays within their environment. We don't touch it, and we don't get to access it at all. No, no, I didn't mean that. I meant, say, Maps, for instance. It's interesting to see how Maps has evolved over all these years. Every time you open it up, it's directions; oh, now it's better directions. Oh, now it's got gas stations. It triggered because you mentioned the restaurants that are close by. So it's adding value to the core app on that side. And as you just said, geolocation can now be used on the enterprise side for lots of different things. So that's the kind of connection I meant, in terms of the value of what I can do with geolocation. Exactly. That's exactly what we did. With Earth Engine, we had a lot of learnings on geospatial analysis. And our thing was, how do you make it easy for our enterprise customers to do that? We partnered with them closely and we said, okay, here are the core pieces we can add in BigQuery that will allow you to do better geospatial analysis and visualize it. One of the big challenges is lat-longs. I don't think they're that friendly for analysts, like, oh, numbers and all that. So we actually built a UI visualization tool that allows you to just fire a query and see visually on a map where things are and what the points look like. So just simplifying what analysts can do with all these. Thanks for coming on, really appreciate it. And congratulations on your success. You've got a lot of great, big products there, hardened internally, and now you're making them consumable.
It's clear here at Google Cloud that you guys recognize you're making pre-existing, proven technologies consumable. So I want to give you guys props for that. Congratulations. Thank you, thanks a lot. Thanks for coming on, Sudhir. It's theCUBE coverage here at Google Cloud, covering Google Cloud Next 2018. I'm John Furrier with Jeff Frick. Stay with us. All day, more coverage for day three. Stay with us after this short break.
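Editor's sketch of the "two lines of SQL" BigQuery ML workflow described in the interview (create a model with standard SQL, then predict with it). This is a minimal illustration, not code from the session. It only builds the SQL text in Python and does not contact BigQuery. The dataset, table, model, and column names (`demo.order_value_model`, `demo.past_orders`, `demo.new_orders`, `order_value`) are hypothetical, and the `linear_reg` model type is assumed as one of the regression-style models mentioned.

```python
"""Build (but do not run) the SQL statements for a BigQuery ML workflow.

All dataset, table, model, and column names used here are hypothetical.
"""


def create_model_sql(model: str, source_table: str, label: str,
                     model_type: str = "linear_reg") -> str:
    """Return a CREATE MODEL statement that trains on every column of source_table."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def predict_sql(model: str, input_table: str) -> str:
    """Return an ML.PREDICT query that scores every row of input_table."""
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT * FROM `{input_table}`))"
    )


if __name__ == "__main__":
    # Step 1: the training data never leaves the warehouse; the model is
    # defined by a query over a table that is already there.
    print(create_model_sql("demo.order_value_model", "demo.past_orders", "order_value"))
    print()
    # Step 2: predictions are just another query.
    print(predict_sql("demo.order_value_model", "demo.new_orders"))
```

The generated statements could then be pasted into the BigQuery UI or submitted through a client library. That step is left out here because it needs a project and credentials.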