Welcome back to theCUBE's coverage of HPE's GreenLake announcements. You know, we're seeing the transition of Hewlett Packard Enterprise as a company. Yes, they're going all in for as a service, but we're also seeing a transition from a hardware company to what I look at, increasingly, as a data management company. We're going to talk today to Vishal Lal, who leads GreenLake Cloud Services Solutions at HPE, and Matt Maccaux, who's a global field CTO for Ezmeral Software at HPE. Gents, welcome back to theCUBE. Good to see you again.

Thank you for having us here. Thanks, Dave.

Vishal, let's start with you. What are the big mega trends that you're seeing in data? When you talk to customers, when you talk to partners, what are they telling you? What do your optics say?

Yeah, I mean, I would say the first thing is data is getting even more important. It's not that data hasn't been important for enterprises, but as you look at the last, I would say, 24 to 36 months, it has become really important, right? And it's becoming important because customers look at data and they're trying to stitch data together across different sources, whether it's marketing data, supply chain data, or financial data. And they are looking at that as a source of competitive advantage. So enterprises that are able to make sense out of that data really do have a competitive advantage, right? And they actually get better business outcomes. So that's really important.

If you start looking at where we are from an analytics perspective, I would argue we are in maybe the third generation of data analytics. The first one was in the 80s and 90s with data warehousing, kind of EDWs. A lot of companies still have that; I think of Teradata, right? The second generation, more in the 2000s, was around data lakes, right? And that was all about Hadoop and others.
And really the difference between the first and the second generation was that the first generation was more around structured data, right? The second became more about unstructured data, but you really couldn't run transactions on that data. And I would say now we are entering this third generation, which is about data lakehouses, right? What enterprises really want is structured data and unstructured data all together. They want to run transactions on them, right? They want to mine the data for machine learning purposes, use it for SQL as well as non-SQL, right? And that's kind of where we are today. So that's really what we are hearing from our customers in terms of at least the top trends, and that's how we are thinking about our strategy in the context of those trends.

So, lakehouse, you used that term. It's an increasingly popular term. It connotes, okay, I've got the best of the data warehouse and I've got the best of the data lake. I'm going to try to simplify the data warehouse, and I'm going to try to clean up the data swamp, if you will, Matt. So talk a little bit more about what you guys are doing specifically and what that means for your customers.

Well, what we think is important is that there has to be a hybrid solution. Organizations are going to build their analytics and deploy their algorithms where the data either is being produced or where it's going to be stored. And that could be anywhere. That could be in the trunk of a vehicle. It could be in a public cloud, or in many cases it's on premises in the data center. And where organizations struggle is they feel like they have to make a choice and a trade-off going from one to the other.
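To make the lakehouse idea above concrete, here is a minimal, hypothetical sketch of the core trick behind table formats like Delta Lake: data files sit in a plain file or object store (here, a temp directory), and an append-only transaction log is what lets readers see consistent, committed snapshots. The class and file names are illustrative, not any vendor's API.

```python
import json
import os
import tempfile
import uuid

class TinyLakehouse:
    """Toy table: data files plus an append-only commit log.
    Readers only see files referenced by committed log entries, which is
    what gives a plain file store ACID-style snapshot reads."""

    def __init__(self, path):
        self.path = path
        self.log = os.path.join(path, "_txn_log.jsonl")
        open(self.log, "a").close()  # create an empty log on first use

    def commit(self, rows):
        # Write the data file first; it stays invisible to readers
        # until the log entry referencing it lands.
        fname = os.path.join(self.path, f"part-{uuid.uuid4().hex}.json")
        with open(fname, "w") as f:
            json.dump(rows, f)
        with open(self.log, "a") as f:
            f.write(json.dumps({"add": fname}) + "\n")

    def snapshot(self):
        # Replay the log to assemble the current committed view.
        rows = []
        with open(self.log) as f:
            for line in f:
                entry = json.loads(line)
                with open(entry["add"]) as df:
                    rows.extend(json.load(df))
        return rows

with tempfile.TemporaryDirectory() as d:
    t = TinyLakehouse(d)
    t.commit([{"sku": "A1", "qty": 3}])              # structured row
    t.commit([{"sku": "B2", "note": "rush order"}])  # semi-structured row
    print(len(t.snapshot()))  # 2
```

The same log-replay mechanism is what lets real table formats layer SQL, transactions, and machine learning reads over one copy of the data.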
And so what HPE is offering is a way to unify the experiences of these different applications, workloads and algorithms, while connecting them together through a fabric, so that the experience is tied together with consistent security policies, without having to refactor your applications, and deploying tools like Delta Lake to ensure that the organization that needs to build a data product in one cloud, or deploy another data product in the trunk of an automobile, can do so.

So Vishal, I wonder if you could talk about some of the patterns that you're seeing with customers as you go out and deploy solutions. Are there industry patterns? Are there any sort of things you can share that you're discerning?

Yeah, absolutely. As we hear back from our customers across industries, I think the problem sets are very similar, right? Whether you look at healthcare customers, telco customers, consumer goods, financial services, they're all quite similar. What are they looking for? They're looking for making business value from the data, breaking down the silos that Matt spoke about just now, right? How do I stitch intelligence across my data silos to get more business intelligence out of it? They're looking for openness, right? I think the problem that's happened is, over time, people have realized that they're locked in with certain vendors or certain technologies. So they're looking for openness and choice, right? That's an important one that we've at least heard back from our customers. The other one is just being able to run machine learning algorithms on the data. I think that's another important one for them as well. And the last one I would say is that TCO is important. Customers over the last few years have realized that it's starting to become quite expensive to run really large workloads on public cloud, especially as they want to address data.
So cost-performance trade-offs have started to become really important and are starting to enter the conversation now. So I would say those are some of the key things and themes that we are hearing from customers, cutting across industries.

And you talked, Matt, about basically being able to essentially leave the data where it belongs, bring the compute to the data. We talk about that all the time. And so that has to include on-prem. It's got to include the cloud. And I'm kind of curious about the edge, you know, where do you see that? Because is that an eventual piece? Is that something that's actually moving in parallel? There's a lot of fuzziness, as an observer, in the edge.

I think the edge is driving the most interesting use cases. The challenge up until recently has been, well, I think it's always been connectivity, right? Whether we have poor connection, little connection or no connection, being able to asynchronously deploy machine learning jobs into some sort of remote location, whether it's a very tiny edge or it's a very large edge like a factory floor. The challenge, as Vishal mentioned, is that if we're going to deploy machine learning, we need some sort of consistency of runtime to be able to execute those machine learning models. Yes, we need consistent access to data, but consistency in terms of runtime is so important. And I think Hadoop got us started down this path: the ability to very efficiently and cost-effectively run large data jobs against large data sets. And it attempted to do that through the open source ecosystem, but because of the monolithic deployment, the tight coupling of the compute and the data, it never achieved that cloud native vision.
And so what Ezmeral and HPE, through GreenLake cloud services, are delivering, with open source Kubernetes, open source Apache Spark, and open source Delta Lake libraries, is those same cloud native services that you can develop on your workstation and deploy in your data center, in the same way you deploy through automation out at the edge. And I think that is what's so critical about what we're gonna see over the next couple of years. The edge is driving these use cases, but it's the consistency to build and deploy those machine learning models, and connect them consistently with data, that's gonna drive organizations to success.

So you're saying you're able to decouple compute from the storage?

Absolutely. You wouldn't have a cloud if you didn't decouple compute from storage. And I think this is sort of the demise of Hadoop, it was forcing that coupling. We have high speed networks now. Whether I'm in a cloud or in my data center, or even at the edge, I have high performance networks. I can now do distributed computing and separate compute from storage. And so if I want to, I can have high performance compute for my really data-intensive applications, and I can have cost-effective storage where I need to. And by separating that off, I can now innovate at the pace of those individual tools in that open source ecosystem.

So can I stay on this for a second? Because you certainly saw Snowflake popularize that. They were kind of early on, I don't know if they were the first, but they're certainly one of the most successful. Then you saw Amazon Redshift copy it. And Redshift was kind of a bolt-on. What essentially they did is they tiered it off. You could never turn off the compute. You still had to pay for a little bit of compute. You know, that's kind of interesting. Snowflake's got the T-shirt sizes, so there are trade-offs there. There's a lot of ways to skin the cat. How did you guys skin the cat?

What we believe we're doing is we're taking the best of those worlds.
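A rough sketch of what "separating compute from storage" buys you, under the assumption of a fast network between the two: storage is a plain, durable service, and stateless compute workers read from it over the wire and can be scaled up or torn down independently, without moving the data. All names here are illustrative stand-ins, not any real service's API.

```python
from concurrent.futures import ThreadPoolExecutor

class ObjectStore:
    """Stand-in for S3-style storage: durable, cheap, always on."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]

def analyze(store, key):
    # Stateless worker: fetches its partition over the "network",
    # computes, and holds no local state afterward.
    return sum(store.get(key))

store = ObjectStore()
for i in range(4):
    store.put(f"partition-{i}", list(range(i * 10, i * 10 + 10)))

# Compute scales elastically: four workers now, zero when the job ends,
# while the storage layer stays exactly as it was.
with ThreadPoolExecutor(max_workers=4) as pool:
    totals = list(pool.map(lambda k: analyze(store, k),
                           [f"partition-{i}" for i in range(4)]))
print(sum(totals))  # 780, i.e. sum of 0..39
```

Because the workers carry no data of their own, you can size the compute tier for the job and the storage tier for the data, which is the trade-off Hadoop's coupled deployment model couldn't make.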
Through GreenLake Cloud Services, the ability to pay for and provision, on demand, the computational services you need. So if someone needs to spin up a Delta Lake job to execute a machine learning model, you spin that up. We're, of course, spinning that up behind the scenes. The job executes, it spins down, and you only pay for what you need. And we've got reserved capacity there as well, of course, just like you would in the public cloud. But more importantly, we're able to then extend that through a fabric across clouds and edge locations, so that if a customer wants to deploy in some public cloud service, like we know they're going to, again, we're giving that consistency across that and exposing it through an S3 API.

So Vishal, at the end of the day, I mean, I'd love to talk about the plumbing and the tech, but the customer doesn't care, right? They want the lowest cost. They want the fastest outcome. They want the greatest value. My question is, how are you seeing data organizations evolve to sort of accommodate this third era, this next generation?

Yeah, I mean, if you look at it from a customer perspective, what they're trying to do is, first of all, I think Matt addressed it somewhat, they're looking at a consistent experience across the different groups of people within the company that do something with data, right? It could be SQL users, people who are just writing SQL code. It could be people who are writing machine learning models and running them. It could be people who are writing code in Spark, right? Right now, the experience is completely disjointed across them, across those three types of users or more. So one thing that they're trying to do is just get that consistency, right? We spoke about performance. The decoupling of compute and storage does provide agility, because customers are looking for elasticity, right? How can I have an elastic environment?
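The spin-up, run, spin-down, pay-for-what-you-use pattern described above can be sketched roughly like this. This is a toy metering model under assumed names and a made-up rate, not the GreenLake API: capacity is provisioned when the job starts, released when it finishes, and billed only for the time it was actually held.

```python
import time

RATE_PER_SEC = 0.002  # hypothetical $/second for one compute unit

def run_on_demand(job):
    """Provision -> execute -> deprovision, metering only the held time."""
    start = time.monotonic()             # capacity provisioned here
    try:
        result = job()                   # e.g. a Spark or Delta Lake job
    finally:
        held = time.monotonic() - start  # capacity released here
    return result, held * RATE_PER_SEC

# The "job" is any callable; once it returns, the meter stops.
result, cost = run_on_demand(lambda: sum(range(1_000_000)))
print(result)  # 499999500000
print(cost >= 0.0)  # True: charged only for actual runtime
```

Reserved capacity, as mentioned in the conversation, would simply be a second pricing tier in front of the same lifecycle: a floor you pay for regardless, with on-demand metering above it.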
So that's kind of the other thing they're looking at. And performance and TCO are, I think, a big deal now, right? So I think that's definitely on customers' minds. So as enterprises are looking at their data journey, those are at least the attributes that they are trying to hit as they organize themselves to make the most out of the data.

Matt, you and I have talked about this sort of trend to the decentralized future. We're sort of hitting on that. And whether it's a first-gen data warehouse, second-gen data lake, data hub, bucket, whatever, that essentially should ideally stay where it is, wherever it should be from a performance standpoint, from a governance standpoint, and a cost perspective, and just be a node, you know I like the term data mesh, but be a node on that. And essentially allow the business owners, those with domain context, you've mentioned data products before, to actually build data products, maybe air quotes, but a data product is something that can be monetized. Maybe it cuts costs, maybe it adds value in other ways. How do you see HPE fitting into that long-term vision, which we know is going to take some time to play out?

I think what's important for organizations to realize is that they don't have to go to the public cloud to get that experience they're looking for. Many organizations are still reluctant to push all of their data, their critical data that is going to be the next way to monetize their business, into the public cloud. And so what HPE's doing is bringing the cloud to them, bringing that cloud from the infrastructure, the virtualization, the containerization, and most importantly those cloud native services. So they can do that development rapidly, test it using those open source tools and frameworks we spoke about. And if that model ends up being deployed on a factory floor on some common x86 infrastructure, that's okay, because the lingua franca is Kubernetes.
And, as Vishal mentioned, Apache Spark. These are the common tools and frameworks. And so I want organizations to think about this unified analytics experience, where they don't have to trade off security for cost, or efficiency for reliability. HPE, through GreenLake cloud services, is delivering all of that where they need to do it.

What about the speed-to-quality trade-off? Have you seen that pop up in customer conversations, and how are organizations dealing with that?

Well, I guess it depends on what you mean by speed. Do you mean computational speed?

No, accelerating the time to insights, if you will. We've got to go faster, faster, agile to the data. And it's like, well, move fast, break things. Whoa, whoa, what about data quality and governance? They seem to be at odds.

Yeah, well, that's because the processes are fundamentally broken. You've got a developer who maybe is able to spin up an instance in the public cloud to do their development. But then to actually do model training, they bring it back on premises. But they're waiting for a data engineer to get them the data. And then the tools have to be provisioned, which is some esoteric stack, and then the runtime is somewhere else. The entire process is broken. So again, by using consistent frameworks and tools, bringing that computation to where the data is, and sort of blowing this construct of pipelines out of the water, I think that is what is going to drive that success in the future. A lot of organizations are not there yet, but that's, I think, aspirationally where they want to be.

Yeah, I think you're right. I think that is potentially an answer as to how you, not incrementally, but revolutionize sort of the data business. Last question: let's talk about GreenLake and how this all fits in. Why GreenLake? Why do you guys feel as though it's differentiated in the marketplace?

So, I mean, something that you asked earlier as well: time to value, right?
I think that's a very important attribute and kind of a design factor as we look at GreenLake. If you look at GreenLake overall, what does it stand for? It stands for experience, right? How do we make sure that we have the right experience for the users? We spoke about it in the context of data: how do we have a similar experience for different users of data, but also broadly across the enterprise? So it's all about experience. How do you automate it? How do you automate the workloads? How do you provision fast? How do you give folks the kind of experience that they have been used to in the public cloud, or on an Apple iPhone? So it's all about experience. I think that's number one.

Number two is about choice and openness, right? As we look at GreenLake, it's not a proprietary platform. We are very, very clear that one of the important design principles is choice and openness. And that's the reason you hear us talk about Kubernetes, about Apache Spark, about Delta Lake, et cetera, right? We're using those open source models where customers have a choice. If they don't want to be on GreenLake, they can go to the public cloud tomorrow, right? Or they can run in our colos, or in their own colos, if they want to. They should have the choice.

Third is about performance. What we've done is, it's not just about the software; we as a company have the know-how to configure infrastructure for the workload, right? And that's an important part of it. I mean, if you think about machine learning workloads, we have the right Nvidia chips that accelerate those transactions, right? So that's the third one. And the last one, as I spoke about earlier, is cost. We are very focused on TCO.
But from a customer perspective, we want to make sure that we are giving a value proposition which is not just about experience and performance and openness, but also about cost. So if you think about GreenLake, that's the value proposition that we bring to our customers across those four dimensions.

Guys, great conversation. Thanks so much. Really appreciate your time and insights.

Thanks for having us here, David.

All right, you're welcome. And thank you for watching everybody. Keep it right there for more great content from HPE's GreenLake announcements. You're watching theCUBE.