Hello, welcome to this CUBE Conversation here in Palo Alto, California in CUBE Studios. I'm John Furrier, host. Today we've got a great guest talking about data analytics, the future of AI, and Google Cloud. My guest is Bruno Aziza, head of data analytics at Google Cloud. 11th CUBE appearance, CUBE alumni. Bruno, you've been in the space for 25 years, with GCP for three, previously Microsoft, Oracle. You helped launch three startups, Alpine Labs, Sisense, AtScale, many more. Great to have you back.

Well, thanks for having me here today, John. It's always great to talk to you.

You know, we've been talking about big data going back to 2010, we've had many conversations here on theCUBE. We were there, Gen 1, big data, Hadoop, then you get Spark, everything else is happening with the data warehouse in the cloud. You see the birth of the Databricks, the Snowflakes, cloud is expanding, next level. Now you see kind of this next-gen action, right? And this is kind of what we're seeing: app developers building data apps, data infrastructure is emerging, AI is a forcing function, and all everyone's talking about is this next level, generative AI and the role of data being the value. You guys have some new functionality for BigQuery, that's a product that you're working with there at Google, a serverless data warehouse product that's been very popular and successful. So what's the scoop? What do you guys got going on? What's the news?

Well, first of all, you know, every AI journey starts with your data, right? And that's what we've been really focused on here at GCP. You know, we're moving beyond the data warehouse. In fact, the way we design BigQuery is way more than just a data warehouse. It's an analytics system, which is what people want. And so what do I mean by that?
It's a system that handles any data at any speed, that has embedded machine learning as a key principle, embedded business intelligence as a key principle, and supports any type of data, structured, semi-structured, unstructured, within the same environment. And it's also open to other data platforms. So in our case, we have BigQuery Omni, which allows you, from BigQuery, to query data that's in Amazon, that's in Azure. And we have this amazing data sharing platform: in any given week, over 6,000 organizations securely share about 275 petabytes of data. So that's the type of scale that customers need in order to build their next generation applications.

You know, one of the themes you mentioned is security. Obviously with data, people are bringing their own data models to the table in a new way, and they've got to secure their old ones. But democratization of workloads, secure data and choice are three themes I'm hearing a lot of. How does that relate to some of the things that you've got working right now with BigQuery? Because I'm hearing a lot of good things around how BigQuery has kind of vaulted itself out into the center of the action with some of these features. Can you elaborate on some of those things?

Absolutely, well, thanks for saying that. And you know, we've been working on this problem for quite a long time. BigQuery went general availability in 2011. So we've had this experience of working with customers across the globe at a very large scale. And they've really blown us away with the amount of innovation. You know, I'll talk about a few here that might be companies you've not heard of, that are gigantic organizations in their geography. Tokopedia is an e-commerce giant. Using our technology, they're able to cut analytics computing costs by 25%. This is a company that has an online marketplace connecting 10 million merchants and 100 million customers every month around products, selection, payment, delivery.
I think about Mercado Libre. This is a gigantic organization in Latin America: 35,000 employees, 107 million buyers, and they process 35 purchases every second. They migrated from an environment with Teradata into BigQuery, 35,000 queries migrated. And talking about adoption, their data stack is adopted by 80% of their employees. And so these are the types of examples where I feel like I see the future, working with these customers across the globe who really are pushing to the next generation. You know, there's countless examples of customers I'm sure we'll talk about today. But I think that's where people need to go: look at these amazing organizations that have been innovating with our platform as a blueprint for how they can get there as well.

Auto scaling is a big feature. You guys have distributed environments, hybrid cloud is a big part of it, open source. This is kind of the current situation, and applications are going to be built where they're going to want to have access to data, all kinds of data. There's kind of a data moat developing. It's the new value proposition, where people are realizing that data in motion is valuable. But you don't want to just put it in motion and give it away for free into these public data sets. So people are trying to rethink specifically how to deal with their data. And they're bringing more data to the table. Now more proprietary foundation models are out there, and you've got the surge of open source developing. How do you guys look at that from a BigQuery perspective? How do I work with you with that trend going on in the mainstream?

I love that you're saying data moat. That's actually a term that one of our customers, Lytics, who has saved a lot of money using our platform and been able to innovate, is using for what they're doing. From our perspective, what we're trying to do is make it as easy as possible for customers to onboard on a platform and innovate with it.
So a lot of the innovations that we've introduced are really configured around that, right? This idea of having BigQuery editions is about making it really easy for you as a customer to assign the right version of BigQuery, with its associated capabilities, to the workload that you're interested in running. And we'll do the rest, right? Autoscaling is a foundational capability of BigQuery editions. And what it does, put very simply, is follow your usage at the second level and charge you only for that. So if you think about your differentiation as a data team building data applications, we want you to think about these features as just forget-about-it features, meaning we will take care of the infrastructure for you. We will scale it to give you the best price performance so you can do your job, which is building these highly differentiated data products for your constituents, internal or external.

What's interesting is there's ways to consume things differently based upon the use cases. I want to get to some of the customer examples, but first I want to ask you about editions. I mentioned auto scaling, that helps. You guys have created this thing called editions. Can you explain what BigQuery editions are?

Absolutely. So there were in the past really two ways to think about how you'd consume BigQuery. The very popular way when we started with the technology was just pay per query. So you can imagine any data analyst, any data engineer can onboard on the platform very quickly, just start querying the system and just pay us for that. And then over time we evolved that to reservations. So now you could reserve capacity and know that it would be available. And as we learned from customers who are innovating across, just like you said, John, all types of data, all types of use cases, with more data and more people, we realized, well, there's a better way that we can assist them.
And so the first step was, let's look at the specific workloads: where do they start and how do they mature? They typically start with easy workloads around reporting and so forth. And then they'll mature to machine learning, and then they mature to even multi-region, more sophisticated needs. And so we created these three editions to make it really easy for an organization to say, oh, okay, this workload, I'm going to associate it to a standard edition, which gives me all the basics that I need to get started. This other workload over here, I need machine learning, so I want to activate that. One of the great functionalities of editions is you can mix and match. So it gives you the flexibility, but it also gives you the predictability of how much you're going to essentially spend on this platform. Auto scaling is a way for us to give you the best capacity and the best price performance as we're following your usage. And so this is kind of our way to bring, if you will, a key competitive advantage for us, which is artificial intelligence, where we can get a really good sense of how you're using the platform and optimize all this functionality for you. You know, the other vendors in the market, they think about capacity as VMs, right? So they give you a box here, and then when you need more capacity, you get double the size of the box. We think about it as, how do we make it easy for you to onboard? Choose the right version. And then after that, we'll just do the rest. The infrastructure will just follow your usage and you won't pay more than what you're actually using.

You know, it's really good. That's kind of next level features. You've got the flexibility, which gives you choice in how you want to consume for the use case, for the app, for your developer, and the predictability is more for the CFO, okay? I don't want to get charged more, so you can scale up. I love the combination there. What does this mean from a customer standpoint?
Can you give some examples of customers that have gone this way? And what have they seen in terms of capacity and savings?

Absolutely. So what it means for a customer is that today, you can safely onboard on the platform and just start working with it. You know, one of the great companies that I can think about is L'Oreal. I mean, I know you're going to blame me for using a French company, but you know, we've got great French customers, L'Oreal and Carrefour. And what I like about L'Oreal is that just a few years ago, they had almost no data in BigQuery, but quickly, BigQuery became the heart of their systems. They now have petabytes of data in this organization that has thousands of SKUs. I think they sell about seven billion products, but their environment is a very distributed environment across markets, across departments. So they have that need of saying, well, if in the US I've got a very mature business, and maybe in another geography I've got a business that's getting started, I shouldn't really use the same level of functionality. So this mix-and-match ability has really helped them a ton. Now, auto scaling is a feature where, when you watch the video with Antoine there, who's their enterprise architect, he says, look, this is the feature I've waited for the most, because essentially, if I have bursty or spiky workloads, I don't have to worry about having someone watching that. Google does it for me. And that's kind of the benefit, if you want, the Google magic behind these functionalities: because of our vertical integration, we can really return exceptional savings for you. Another great example is companies building applications on top of BigQuery. Lytics is a CDP, so a customer data platform, with lots and lots of customers. They've got seven billion profiles. They run 400 billion events in real time.
The company experienced a 15% performance increase while experiencing a 20% cost reduction, because they're building this data moat on top of BigQuery. So there are lots and lots of examples of organizations like this who are scaling rapidly with us, and where the infrastructure and the Google artificial intelligence kind of magic allows us to optimize our systems for them. It's really unique.

Talk about the announcement you've had on the storage side. You've got compressed storage. Can you give a quick summary of what that's all about?

Absolutely. I mean, coming with editions and auto scaling, customers now also have access to something called compressed storage. So simply put, what is compressed storage? It's basically a way for us to take care of a higher compression level on your data, so you pay less on the storage side. Preview customers have seen compression rates of seven times, up to 35 times, in particular on data like logs, for instance, or log analytics. That's data that's highly compressible. And so you'll see as you watch some of the videos, like I think about a go car list, for instance, these companies are saving anywhere between 20 and 70% on their storage and compute bill. Because if you think about compressed storage, because you're paying less and storing less, you're also helping in other parts of the stack, where you're able to query differently. In the case of L'Oreal, for instance, what I really love about the testimonial we got from Antoine is that it also is helping him with his sustainability goals, because you're essentially storing less, so your footprint, if you will, is more efficient. And so these innovations, editions, auto scaling, compression, to your point earlier, are really the platform that's going to enable companies to get to the next level with these data apps.

Yeah, that carbon footprint is just an upside. You guys have great initiatives there.
I think that's one of the benefits of cloud. I've got to ask you about what you see this week. Okay, we were at MongoDB last week in New York City. This is kind of data week, and I kind of put Mongo in the mix because it just happened. But this week we've got Snowflake having their event head to head with Databricks. Obviously theCUBE will be there. They're forcing analysts to choose sides kind of thing. It's a cage match kind of thing we see. It's a data week. We'll be at both simultaneously. And you've got an event in Seattle as well. This is like data week. So give us the scoop on what you have going on in Seattle this week, and what is the focus?

Well, I'm calling you actually today from Seattle. So I'm already there. Customers are coming in. And so we have this event. We call it the Data Engineering and Analytics Day. All of you will be able to join for free on Thursday morning. The keynote starts at 9 a.m. What you should know is that the first two days are actually in person here. We've invited our key customers and practitioners. It kind of connects with what you and Dave talked about in last week's podcast, where we really stay true to the genesis of what these events need to be. At least for us, we feel like if we create a platform where it's built by the practitioners for the practitioners, we all get value from it. So what's going to happen throughout this week is essentially three things. One is, of course, we are sharing our vision with our customers, and they give us very frank, very direct feedback, so together we can advance this platform we're building with them. Second is we're going to hear from these customers. They're going to tell us their best practices, their worst practices. And then the third thing, which I'm really excited about, is this creates a platform where they connect with each other. I'm not interested in being in the way of information with these customers.
I think that's one of the challenges sometimes of these events: they turn into marketing and commercial events. That's really not our intent. This is a platform for customers. In fact, we have no marketing and no salespeople at this event. It's really the platform for customers who can help each other. And as we watch that, we also learn and build the next systems that they want us to build. Editions, autoscaling and compressed storage really came directly from our customers' feedback. And that's how we get better.

And now these events too, when you have these major shifts in the market, are where tribes kind of get reshuffled. People want to find their tribe. They want to find their community, and they don't want marketing messages jammed down their throat. They want it to be open and authentic. It's a very bottoms-up market right now. And I think that's good to call out there. Yeah, we did bring it up in the pod, because we're seeing a lot of people do that in the events just to make money, and they kind of structure it a little bit like they're losing touch with their customer. The podcast was great, we talked about all the top trends. And since I've got you here, I might as well ask you what you see as the top trends, because there is a tsunami coming with this new wave. It's bigger than before. Data apps are a big part of it. We've been talking about that, data products. I'd love to get your thoughts on it, because you've been scratching this itch for over a decade, Bruno. Now we're on the beach and the waves are the biggest. And we're out there surfing them. What do you see as the major trends developing in the space?

Yeah, so I learned a lot of that working with customers. I mean, what's been incredible to me is, over the last three years, how we've been able to scale these platforms, right? Because when I talked to you about these events, this is not 10 people getting together.
This is thousands of people understanding the space and building the next generation product. So what I'm learning from my customers is that three megatrends are changing their world. The first trend is you're not building a data lake anymore, right? And that was actually directly from the CIO of Vodafone, who told me, the great thing about a lake is that it's defined. I see the end of the water, but in a way it's nowhere near the reality of my data. My data looks more like a data ocean. I never see the end of my data. And I know that some of the data I'm going to need is going to be in somebody else's environment, maybe because I've acquired them, maybe because I need to partner with them and so forth. So this first idea of the data ocean is a key trend that we're seeing customers really gravitate towards. So what does that mean? Multi-cloud platform by default. Transactional and analytical workloads coming together, right? So it's no longer two different workloads. They have to be able to come together to make it easy for you to build apps. The ability to catalog your data rapidly as it hits the system is paramount, because customers need to be able to trust the data and the metadata that they're bringing into their environment. And data sharing as a key principle is tremendously important because, as I said earlier, you need to have an ability to go out and get data in other systems very seamlessly so you can complete and register information. So that's trend one. The second trend we see is around what I call governance with a big G, right? We used to think about governance as this ability to restrain access, but in fact, the way that people are working on governance is they actually think about, how do I create pockets of innovation across my organization with decentralized data but centralized policies? Really, really hard to do. Luckily, about two years ago, we shipped a product called Dataplex, which really marries itself well with BigQuery.
So it allows you to auto-discover metadata from the data all the way up to business intelligence. So it reads metadata in Looker. And so this idea of understanding your estate is really important to gravitate around this concept of data mesh, which we've heard a lot about over the last two years. And the third trend is what you were talking about, John, this idea of building these really intelligent data apps. And a big part of powering that for us is Looker. So if you think about the stack of BigQuery, Dataplex, and Looker, what customers are really trying to do is turn their relationship with their organization from spending time, spending money, securing, restricting access, to opening up access and creating an artifact where you, as a data team, are actually creating value, bottom-line value for your organization. And the best artifact for that is creating a data product. Data products need a lot of components, but the main one is a consumer-grade experience on an enterprise-grade platform. You've got to build on a platform that will never go down, highly vertically integrated, so you get the best price performance. That's what we're focused on.

That's an awesome vision. I love that hot take, because you want the horizontal scale of cloud, availability of data, protected, governed, but also vertically integrated at the app. This idea of data products is interesting, and I want to get your reaction, because we've riffed on this on theCUBE before, but it's kind of playing out in real time here in plain sight in the industry, and that is you have data engineering and now you've got data products. Remember back in the cloud, you had SREs, site reliability engineers, and then you had developers. Their job was to set up the guardrails for the developers to code in line in the CI/CD pipeline and do infrastructure as code. DevOps. Okay, now you're seeing a similar pattern.
The engineers, data engineers, set up the guardrails so that the developers can program with data, and they need the products. So you start to see this trend where you have data engineering, data products, and now data developers. What's your reaction to that? Because that almost completes the stack of the personas. You get the engineers, which then can be automated, by the way, with AI, and managed and scaled. Data products could be turnkey, consumer grade, like you mentioned, real time for this, maybe more historical for that. Mix and match your Lego blocks, whatever you want to call it, but the coders are now coding in the applications at the point of code. The data developer. What's your reaction to that?

Well, you know, what I've learned from customers, watching them build these data products, is, first of all, there's three dimensions to what a data product is. I think the first one is, you've got to design for what we talked about here, limitless data, right? Your data products need to be able to handle structured, unstructured, semi-structured. You need to be able, for instance, to imagine a scenario where you're going to do machine learning on unstructured data. So how do you do that? And so I think that's probably step one: think about it as any type, any volume, multi-cloud, open infrastructure for your data. So that's step one. Step two is there's a dimension of time. You just talked about it here. You can't afford to not build on real time now, right? So this idea of real time, high concurrency is really important to build these next generation data products. And the third one is what you talked about, people, right? So you need to start thinking about different roles. We learned from customers that the data teams that they build are really starting to look more like software engineering teams. You have a data product manager who's going to write the product requirement document, right, the PRD.
And what they are supposed to be is the CEO of this product, from ingestion all the way to activation. They lead the team towards, what's the output of this data? What are we creating? You have a program manager who helps them drive the development of this product. You have a data engineering team that helps them implement. Now what's great about infrastructure like ours is you don't need a lot of administration, right? So data engineers can now rely on a platform. In the case of Spark, for instance, with Serverless Spark, they can just work with Spark without having to worry about the infrastructure, and we'll just charge them for their use. So it's a very unique kind of transformation of the data engineering role. We have the UX leader, who is really focused on what we talked about earlier, the consumer-grade experience. The best UI is no UI. So you need someone to think about the fact that the audience for these data products is not data experts. So they need to have a consumer feel to them. And then finally, the chief data officer, who needs to be driving the strategy around that. The good news here is that 10 years ago, when you and I started talking about this, John, 10%, 12% of organizations had chief data officers. Now about 70, 80% of organizations have chief data officers. So we're really starting to see this group, if you will, this typical team around building data products, and it's really encouraging, because people like us, we've been talking about this space for a long time. Nobody cared because we worked in the back office, and now finally everybody cares.

I think, I mean, I go back to 2007. I remember saying, you know, this is going to be a developer angle. We saw big data come in. We were all early. I think all the data folks have been early on this. It's the timing of how cloud and everything kind of comes together, the confluence, how the world spun, that is really kind of key. And I think now the data nerds, the data geeks, the hardcore data ops folks, they're into this.
We're in a prime time moment, and it's built on top of clouds. It's not a bolt-on. It's great. So it's going to be an abstraction. I love your vision. I think that, you know, DevSecOps is here. We're seeing more of that. I think security has gone through this. They had that team. Now they're part of the engineering. They put guardrails up for developers. People are shifting left. Now data is in there. So is it going to be DevSecDataOps? Is that the full validation? Certainly we'll see.

Yeah, I mean, it's really exciting to be in 2023 in the data world. The industry really now has taken advantage of all the innovation that we've really been working on for decades, and it's all coming together for the acceleration of this value. And I think also chief data officers, CEOs, and CIOs are now paying attention to data and building their differentiation on it. I mean, you said it earlier when we started, this idea of a data moat. You know, 11 years ago when you and I started talking about this, I don't think you could get CEOs to care about that. Now they're realizing that this is a key differentiation for their organization. There's one more company to tell you about, and that's Carrefour. Carrefour is this retailer in France. They've got 80 million loyalty customers, and they realized that their business actually was in the activation of these consumers who are coming to their stores. So they created a different company called Carrefour Links, which is about, how do we create more compelling experiences? How do we partner with the rest of the ecosystem, so we know more about these people coming into our stores and provide them with the products that they need? You know, retail is doing this, financial services is doing this. I think every industry will get to building their data moat, and then, you know, hopefully they'll choose us to do that, because we've been thinking about this and building for this moment for over a decade.
Data is the lifeblood of the company. It's their competitive advantage. It's their intellectual property. You've got to take care of it. They're going to blend it in with these large proprietary data models. They've got to be more agile. They've got to be more predictive, flexible. Bruno, thanks for coming on theCUBE. I know we went a little bit over, but it was a great conversation. Last minute, I'll give you the final plug for the last 30 seconds to a minute. What's the pitch? What's the Google Cloud Analytics pitch? What's BigQuery's pitch? How would you share the value proposition statement for the world?

Absolutely. Well, first of all, thanks for having me, John. It's always great fun to talk to you. I'd really say that, look, if you're looking to build a platform where you are going to generate value for your organization, we're trying to be the best partner for you across limitless data, limitless workloads and limitless users. There's really only a few vendors that can do that for you. And you can talk to many of our customers. You can watch the video series and hear directly from them what they're doing. So we together can help power the next generation of data apps.

Bruno, thanks for this masterclass conversation. You nailed it. I love it. And I think there's a lot more coming. This is just the beginning of how the world is changing, and data's at the center of the value proposition. We've been saying it for 11 years, and it's good to see you. Thanks for coming on, and congratulations on all the work you're doing as head of data analytics at Google. Great to see you.

Great to see you. Thanks, John.

Okay, this is a CUBE Conversation.
I'm John Furrier here in Palo Alto, going up to Seattle, where Bruno Aziza, the head of data analytics at Google Cloud, is having his event with data engineering in the community, as this next-gen cloud data tsunami of AI-powered applications is coming fast and every company is trying to refactor and figure out how to develop and then how to run AI applications all powered by data. Of course, theCUBE, we're open, we're data-driven. I'm John Furrier, host. Thanks for watching.