Live from San Francisco, it's theCUBE. Covering Google Cloud Next 19, brought to you by Google Cloud and its ecosystem partners.

Hello everyone, welcome back to theCUBE. Live coverage here in San Francisco, California. We're in the Moscone Center on the ground floor. Day three of three days of coverage for Google Cloud Next 2019. I'm John Furrier, with my co-host Dave Vellante. Stu Miniman is out there getting stories; he's also been hosting. Dave, great to see you. Our guest is Evren Eryurek, director of product management at Google Cloud, doing all the data, streaming the data. We're streaming data right now.

Absolutely, this is it.

So let's stream some data. Streaming data has certainly been around for a while. Dave and I, when we first started theCUBE 10 years ago, as part of SiliconANGLE Media, Hadoop was just a small little project. That was really kind of the catalyst moment for big data. That's now evolved into its own position. Now you have streaming data, you have cloud scale. So the cloud has really changed the game on big data, changed the nature and dynamics of it. And one of the things is streaming data, streaming analytics as a core value proposition for enterprises. And this is fairly new.

Very true.

What's your take on it, and how does it relate to what's going on at Google Cloud?

I'm glad we're talking about that. This is an exciting time for us. Streaming, like you said, is growing. Batch is not going away, but streaming is actually overtaking a lot of the applications that we're seeing today. We're seeing more streaming applications taking place than batch. One of the things we're seeing is, everybody is gathering data from all over the place, from your websites, from your mobile phones, from your IoT devices, just like we're doing right now. There's data coming in and people want to make decisions in real time.
Whether it's in the banking industry, in healthcare, retail, it doesn't matter which vertical you're working with. And we're seeing how those messages, how those events are coming in, and how the decisions are being made in real time, in milliseconds we're talking about.

Why is it happening? What's the real catalyst here? Just the tsunami of data, the nature of the value, all of the above, what's the...

We believe one of the things is, like you mentioned, cloud really changed the game, where people actually can reach global data and messages at scale. We're talking about billions of messages coming in, and the processing capacity is available. Now we can actually process it and make a decision within milliseconds and get to the results. To me, that was the biggest catalyst. And many of us have grown up using batch data to make decisions. Now everybody's talking about ML and AI. You need that data coming in real time, and we can actually process it and make the decision. To me, that's the catalyst.

Personally, we love streaming data as a topic. We're streaming video, but data in real time has been one of the key things. You see self-driving cars, munging of data, mixing and matching of data to get better signaling, better machine learning. And I've got to ask you, because batch, certainly there's a role for batch, but it's kind of old school. It's an old technique, it's been around for a while. It's not going to go away though.

It's not going to go away, it's going to stay in place.

But the knee-jerk reaction of existing old-school people who haven't migrated to the new modern version is to go to the batch kind of mindset. I want to get your reaction: data lakes. There's nothing flowing in a lake, okay? So there's a role for a data lake. Streaming gives me the impression of an ocean or a river, something moving fast. Talk about the differences, because it's not just a data lake; that's a batch kind of reaction.
It is complementary. Actually, it's not going away, because all that data that we have in batch is something we rely on to really augment and see what's changing. So if you're in a retail house and you're buying something, you're going to make a decision, and the support is actually behind it. Okay, here's Evren, he's actually shopping around this and he wants this for his son. The model built around it is looking at what my behavior was and, in the moment, making a decision for me. So that's not going away.

The other thing is, batch users are able to take advantage of the technology today. If you look at our Dataflow, the same code, the same capability can be used by the same folks who are used to batch. You don't have to change anything. So we actually help folks upskill using the same set of tools and become much more experienced and expert in streaming too. So that's not going away. We help both sets of developers. So it's complementary.

Very complementary.

So data lakes are good for kind of setting the table, a store somewhere, but that's not the end game though.

No, that's not the end game.

I wonder if we could talk about the evolution from batch to real-time streaming. And my favorite example, because I think people can relate to it, is fraud detection. Ten years ago, it was up to the user to go through his or her bill, right? Oh, that wasn't me. And then you started to get inundated with false positives.

Very much so.

And now lately, the last couple of years, it's getting better and better. Fewer false positives. Usually no news is good news; news is usually bad news now. So take that example and use it to describe how things have evolved.

Yeah. Well, I am a student of AI. I did my master's and PhD in that. And I went through that change in my career, because we had to collect the data, batch it, and then analyze it and actually make a decision about it.
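The "same code for batch and streaming" idea he describes is the heart of Beam's unified model. A minimal plain-Python sketch of the concept (illustrative only, not the actual Beam SDK; the function and variable names here are hypothetical):

```python
# One transform definition, applied to a bounded (batch) source or an
# unbounded (streaming) source without changing the transform itself.
# A hypothetical sketch of the unified model -- not the Beam SDK.

def parse_and_filter(events):
    """Shared transform: works on any iterable of raw events."""
    for raw in events:
        user, amount = raw.split(",")
        amount = float(amount)
        if amount > 0:            # drop malformed/zero events
            yield {"user": user, "amount": amount}

# Batch: a bounded list, processed all at once.
batch_source = ["alice,10.0", "bob,0", "carol,7.5"]
batch_result = list(parse_and_filter(batch_source))

# Streaming: an unbounded generator, processed element by element.
def stream_source():
    for raw in ["dave,3.0", "erin,12.5"]:  # stands in for a live feed
        yield raw

stream_result = list(parse_and_filter(stream_source()))
```

The same `parse_and_filter` function serves both paths, which is the upskilling point he makes: a batch engineer's code carries over to streaming unchanged.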
And we had a lot of false positives, and in some cases false negatives, misses, too, which you don't want either. What happened is, our modeling capabilities became much better. And with this rich data, you actually tap into the data lake. You can go in there, the data is there, and it's disparate data. We can pull in data from different sources, actually remove the outliers, and make our decision in real time right there. We didn't have the processing capability before. We didn't have a place like Pub/Sub, where globally you can bring in data at hundreds of gigabytes. That's messaging that you want to deal with at scale, no matter where it is. And processing that wasn't available for us. Now it's available. It's like a candy shop for technologists. All the technology is in our hands, and we wanted all these things.

Talk about that. You were talking about, I think, the simplicity of, I'm able to use my batch processes and apply them. One of the complaints I hear from developers sometimes is that the data pipelines are getting so complicated. You were talking about grabbing stuff from websites, from financial databases. And so depending on what data store you're using and what streaming tools you're using, or other AI tools, the pipeline gets very complicated. The APIs start to get complicated. But I'm hearing a story of simplicity.

Absolutely.

Can you elaborate on that and add some color?

I'm glad you're asking that question. You may have heard yesterday we announced a whole bunch of new things, and ease of use is top of the line for us. We really are trying to make it easy. If you look at the SQL pipelines that we're building with Dataflow, it helps you end to end. A data engineer, no matter which angle they're coming from, should be able to use their known skill sets and build their pipelines end to end, so that they can achieve their goals around streaming without having to really go through a lot of the complexity of the pipelines.
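The real-time fraud pattern he sketches, scoring each event against a running model as it arrives rather than in a nightly batch, can be illustrated with a simple online z-score check in plain Python (a toy stand-in for a real fraud model, using Welford's algorithm for running statistics):

```python
import math

class StreamingOutlierDetector:
    """Flags events that deviate sharply from the running mean.

    Toy stand-in for a real fraud model: Welford's online algorithm
    keeps a running mean/variance so each event is scored in O(1)
    as it arrives -- no batch pass over historical data needed.
    """
    def __init__(self, threshold=3.0):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0
        self.threshold = threshold

    def observe(self, amount):
        """Return True if `amount` looks anomalous, then update the stats."""
        is_outlier = False
        if self.n >= 2:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(amount - self.mean) / std > self.threshold:
                is_outlier = True
        # Welford update of running mean and sum of squared deviations
        self.n += 1
        delta = amount - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (amount - self.mean)
        return is_outlier

detector = StreamingOutlierDetector(threshold=3.0)
charges = [12.0, 12.5, 13.0, 12.2, 11.8, 12.5, 900.0]
flags = [detector.observe(c) for c in charges]   # only the 900.0 is flagged
```

A production system would use a trained model and richer features, but the shape is the same: state updates incrementally, and the decision happens per event, in milliseconds.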
We are continuing to push that ease of use over and over. We're not going to let it go, because if we make it easier, everyone will adopt faster.

You mentioned you got a PhD in AI, a master's in AI. AI has been around for a while. A lot of people have been saying that. But machine learning certainly has changed the game.

Totally.

Machine learning plus cloud has been a real accelerant in the academic and now commercial aspects of AI. So I want to get your thoughts on the notion of scale, which you talk about, plus the diversity of data. If you can bring in data at scale, get more signaling points, more access to data signaling, the diversity of data becomes very key. But cleanliness, data cleaning, used to be an old practice: you get a bunch of data, stack it up, put it in a pile, a corpus, and you've got to go clean it. With streaming, if it's always flowing, there's kind of a behavioral characteristic of data cleanliness, data monitoring. Talk about that: diversity of data, clean data, and how that feeds machine learning and makes better AI.

Good one. So that's where we actually are able to, if you look at Pub/Sub, you're actually able to join your table datasets with streaming datasets. You can actually put it into Dataflow to really do that analysis. And with both, we provide enough of a window for you to be able to go back: hey, are there things that I should be looking at? Up to seven days, we can provide a snapshot, because you will always find something. You can go back and say, you know what? I'm going to remove this outlier. Over and above all the processing that we do before we bring in the data. So there's a lot of cleanliness that takes place, but we have the built-in tools, we have the built-in capabilities for everyone to get going. It's ready to scale for you from the moment you open it up. That's the beauty of it. That's the beauty of it: when you start from Pub/Sub to Dataflow to the streaming engine, it's ready for you to run.
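The window-and-go-back capability he describes (Pub/Sub snapshots let you replay recent messages, up to seven days) can be illustrated with a plain-Python tumbling-window aggregation over timestamped events; retaining the raw events is what makes the second, cleaned-up pass possible:

```python
from collections import defaultdict

def tumbling_window_sums(events, window_secs=60):
    """Group (timestamp, value) events into fixed windows and sum them."""
    windows = defaultdict(float)
    for ts, value in events:
        window_start = (ts // window_secs) * window_secs
        windows[window_start] += value
    return dict(windows)

# Raw events are retained, so we can "go back" and reprocess --
# e.g. recompute the same windows after dropping an outlier,
# loosely analogous to replaying a Pub/Sub snapshot through a pipeline.
events = [(0, 5.0), (30, 7.0), (61, 2.0), (90, 4.0), (95, 500.0)]
first_pass = tumbling_window_sums(events)            # {0: 12.0, 60: 506.0}
cleaned = [(ts, v) for ts, v in events if v < 100]   # remove the outlier
second_pass = tumbling_window_sums(cleaned)          # {0: 12.0, 60: 6.0}
```

This is a sketch of the idea only; in the real services, windowing is configured in the Dataflow pipeline and replay is done via Pub/Sub snapshot/seek rather than by re-running a Python function.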
Talk about what's changed though. When people hear diversity of data, they get scared: oh my God, a lot of work, heavy lifting. Now it's a benefit.

Absolutely.

It's easier now to deal with all these diverse data sets. What's the...

If you remember big data, the three Vs of big data, right? Volume, velocity, variety. People were scared about the variety. Now I can actually bring in my data from different places. Again, let's go back to the shopping example. Where I shop, what I shop for, that actually defines my behavior around it. Those data sit somewhere else. We bring those in to make a decision about, okay, everyone wants to go buy a scooter or whatever else. That's the diversity of the data. We are now able to deal with this at scale. That was not available before. It used to be much more sequential. We are now able to bring all of them together, process it at the same time, and make the decisions.

What's the key product that will make all this happen, through the portfolio? If I want that, what you just said, which is a great value proposition, it does sound like not a heavy lift. All I've got to do is point the data sources into this engine. What are the products that make up that capability?

If I look at the overall portfolio on Google Cloud from our data analytics point of view, you can actually bring in your data through Pub/Sub. Lots of messaging capability globally. And you can actually do it regionally, because we have a lot of regional requirements coming from various countries. And Dataflow is where we actually digest the data. That's where you do the processing. And you use all these advanced analytics capabilities through the streaming engine that we released. And you have your BigQuery, you have your AutoML, you have all kinds of things; you can bring in your Bigtable and what have you.
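The Pub/Sub → Dataflow → BigQuery flow he outlines is a classic source → transform → sink composition. A hedged plain-Python sketch of that shape (these functions are hypothetical stand-ins for the actual services, not their APIs):

```python
import json

def message_source():
    """Stand-in for Pub/Sub: yields raw messages as they arrive."""
    for msg in ['{"country": "US", "clicks": 3}',
                '{"country": "DE", "clicks": 5}',
                '{"country": "US", "clicks": 2}']:
        yield msg

def process(messages):
    """Stand-in for a Dataflow transform: parse and enrich each message."""
    for msg in messages:
        record = json.loads(msg)
        record["weighted"] = record["clicks"] * 1.5  # example enrichment
        yield record

def sink(records):
    """Stand-in for a BigQuery sink: collect rows for later analysis."""
    return [row for row in records]

# Compose the stages end to end; each stage only knows the shape of
# its neighbor's output, which is what keeps the pipeline simple.
table = sink(process(message_source()))
```

Because the stages are generators, records flow through one at a time rather than being materialized between steps, which loosely mirrors how a streaming pipeline moves events through its transforms.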
That's all easily integrated end to end for any analyst to be able to use.

What is Beam?

Beam, yes, great. I'm so glad you asked that question; I almost forgot. Beam is one of our open-source projects. We donated it to open source, just like we did with Kubernetes a few years ago. It's growing. This year, actually, it won a technology award. So the source is open. The community really took it up. They use that toolkit to build their pipelines. You can use any language that you want, Java, Go, whatever you want to do it with, and they contribute. We use it internally and externally. It's one of those things that's going to grow. We have a lot of community events coming up this year. We invite everyone, and I've seen the increase. I'm really, really proud of that community.

Evren, I love the AI. I can't get my mind off your background and academics, because I studied AI as well in the '80s and '90s, all that good stuff. Young kids are flocking to computer science now.

Very much so.

Because AI is very sexy. It's very intoxicating. And it's so easy to deal with now. You guys had a hackathon here with the NCAA using data, really kind of real time. Kind of cool things are happening. So it's a moment now for AI.

This is it. This is the moment.

What's your advice? You've been through the wars. You've waited, you've done your tour of duty all those years. Now it's actually happening. What's your advice for young people who want to come in, get their hands dirty, build things, use AI? What's your advice on how they should tackle that?

I am living it. Both of my sons, one is finishing junior high, the other is a senior in high school, they're both in it. So when I hear my young kids come and say, hey, Baba, we just built this using TensorFlow, it makes me really proud. At the middle school level they were doing it. So the good news is we have all this publicly available data for them. I encourage every one of them.
If you look at what we provide from Google Cloud, you come in there, we have the data for them, we have the tools for them. It's all ready for them to play. Schools get free access to it too.

It's a maker culture, but how does someone who's interested but has never coded before get started? How do they jump right in and get immersed in the code? What do they do?

We have some community outreach that we're actually doing as Google. We go out to them, and we're establishing centers to build community events for them to learn new skills. And we're making this easy for them. And I'm happy to hear more and do it. But I am an advocate. I go to middle schools, I go to high schools, I go to colleges. Colleges are a different story. We provide classes and we provide our technologies at universities, because enterprises need that talent, need that skill. When they graduate, we're going to hire them, just like I'm going to hire them into my organization.

So the number one complaint my kids have about school, they're saying, oh, school's a waste. It's so linear. I can learn everything on YouTube and Google.com. All the stuff I learned in school I'm never going to use in the real world. So the question is, what skills should kids learn that could be applied to machine-learning thinking, the kinds of constructs, the data structures, or methodologies? What are some of the skills and classes that could tease that out and be a natural lead-in to computer science and machine learning and AI?

You know, actually, they're going to develop the skills. The languages will evolve and so forth. As long as they have that inner curiosity, asking the questions, how can I find the answer a little faster, that will push them towards different sets of tools, different areas.
If you go to Berkeley here, you will see a whole bunch of high school kids working side by side with graduate students, asking those questions, developing those skill sets. But it all comes down to the curiosity.

And I think that applies to business too. I mean, there's a big gap between the AI haves and the have-nots, I always say. And the good-news takeaway here is: you're going to buy AI. You're going to buy it from people like Google, and you're going to bring it together, build it, and apply it. You're going to spend time applying it. And that's how these incumbents can close the gap. That's the good news here.

Very true. If you look at all the APIs that we have, from text recognition to image recognition, whatever it is, those are all built models. And I've seen some customers build some fantastic applications starting from there. They use their own data, bring it in, and they update the model for their own business.

It's composition, it's composing.

Exactly. It's not coding, it's composing. We are taking it to the next level. That abstraction is going to help others come into the field, because they know their field of expertise. They can ask the right questions. You and I may not know it, but they will ask the right questions, and they will go with all the tools available to them, wherever their curiosity reaches.

What's the coolest thing you're working on right now?

Coolest thing? Streaming is my baby. We are working on it. I want to solve all the streaming challenges. Whatever the industry is, I really want to welcome everyone to bring them to us. I think if I look at it, one of the things that we discussed today, Anthos, was fantastic, right? I mean, we're going to really change the game for all enterprises, to be able to provide those capabilities at the infrastructure level. But imagine what we can do with all the data analytics capabilities that we have on top of it.
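The "composing, not coding" point, starting from prebuilt model APIs and adding only a thin layer of domain logic, can be sketched in plain Python. The two model functions below are hypothetical stand-ins for hosted services like text or language recognition APIs, not real endpoints:

```python
# Hypothetical stand-ins for prebuilt model APIs. In a real app these
# would be calls to hosted services; the point is that the application
# is a composition of ready-made models plus a little domain logic.

def detect_text_language(text):
    """Stand-in for a hosted language-detection model."""
    return "de" if "straße" in text.lower() else "en"

def classify_sentiment(text):
    """Stand-in for a hosted sentiment model."""
    return "negative" if "broken" in text.lower() else "positive"

def route_support_ticket(ticket_text):
    """The only custom code: domain logic composed from the models."""
    language = detect_text_language(ticket_text)
    sentiment = classify_sentiment(ticket_text)
    queue = (f"{language}-urgent" if sentiment == "negative"
             else f"{language}-standard")
    return {"language": language, "sentiment": sentiment, "queue": queue}

ticket = route_support_ticket("My device arrived broken")
```

The domain expert only writes `route_support_ticket`; the models themselves are bought, not built, which is the incumbents-closing-the-gap argument made above.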
I think the next five years are going to be fantastic.

What's the coolest use case you see emerging out of streaming?

Ah, you know, yesterday I actually had one of my clients with me on stage, AB Tasty. They have a fantastic capability that they built. They tried everything, and we were not their first choice, I'll be very open. They said the same thing to everybody: you guys were not our first choice. They went around, they looked at all the toolkits, everything. Then they came, they used Pub/Sub, they used Dataflow, they used the streaming engine, and they do A/B testing for marketing. And they do it at scale. Billions of messages every minute. And they do it within seconds, milliseconds, 32 milliseconds at most, because they have to make the decision. That was awesome.

Gojek, I don't know if you're familiar with that, is one of our customers. They provide real-time delivery in Indonesia. Imagine where things are globally, where you can actually ask for food to be delivered. They have to optimize depending on what the traffic is, go with their scooters, and provide you this delivery. They are doing it as well. Ocado, they, I believe, provide food in the UK to 70% of the population. They use our technologies for real-time delivery. Those are some great examples.

Evren, great insight, great to have you on. Just a final word here. Over the next couple of years, how do you see the trajectory of machine learning, AI, and analytics feeding into the value of making life easier, society better, and businesses more productive?

We are seeing really good pull from enterprises, from every vertical that you can think of. Regulated, retail, what have you. And we're going to solve some really hard problems, whether it's in the healthcare industry, the financial industry, the retail industry. We're going to make the lives of people much easier, and they're going to benefit from it at scale. And I believe we're just scratching the tip of it.
And you're seeing this energy here. Year over year, this has gotten better and better. I can't wait to see what's going to happen next year.

Evren, you're right, great energy. Expert in AI, streaming analytics. Again, these are the early days of a grand new shift that's happening. Get on the right side of history: it's AI, machine learning, streaming analytics. Thanks for coming on, appreciate it.

Thank you so much. Thank you guys.

More live coverage here on theCUBE in San Francisco at Google Cloud Next 2019. We'll be back after this short break.