From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE. Now, here's your host, Dave Vellante.

Hi everybody, welcome to this special CUBE conversation. Big data workloads have evolved, and the infrastructure that runs them is evolving too: big data, AI, and other emerging workloads need infrastructure that can keep up. Joining me is Patrick Osborn, Vice President and GM of Big Data and Secondary Storage at Hewlett Packard Enterprise, @Patrick_Osborn on Twitter. Great to see you again, thanks for coming on.

Great, love to be back here.

So, as I said upfront, big data is changing, it's evolving, and the infrastructure has to evolve with it. What are you seeing, Patrick, and what's HPE seeing, in terms of the market forces driving big data and analytics right now?

Well, in the data center we see a continuous move from bare metal to virtualization, everyone's on that train, and on to containerization of existing apps, your apps of record, your business and mission-critical apps. But what a lot of folks are really doing right now is adding new services on top of those applications and data sets: new ways to interact, new apps, and a lot of those are being developed with techniques that revolve around big data and analytics. So we're definitely seeing pressure to modernize what you have on-prem today, but you can't sit there and be static; you've got to provide new services around what you're doing for your customers, and a lot of those are coming in the form of this Mode 2 style of application development.

One of the things we're seeing: everybody talks about digital transformation, it's the hot buzzword of the day. To us, digital means data first. Presumably you're seeing that. Are organizations organizing around their data, and what does that mean for infrastructure?
Yeah, absolutely. We see a lot of folks employing not only technology to do that, but organizational techniques as well, building dedicated teams that bring together a lot of different functions. Organizing around the data has also become very different now that you've got data on the edge coming into the core, and a lot of folks are moving some of their edge, or even their core, to the cloud. So you've got to make a lot of decisions to organize around a pretty complex set of places, physical and virtual, where your data is going to live.

So there's a lot of talk about the data pipeline. The data pipeline used to be an enterprise data warehouse, and the pipeline was a few people who would build some cubes and then hand off a bunch of reports. The data pipeline is getting much more complex: you've got the edge coming in, you've got the core, you've got the cloud, which can be on-prem or public. Talk about the evolution of the data pipeline and what that means for infrastructure and big data workloads.

Yeah, so we've got a pretty interesting business here at HPE. We do a lot with the Intelligent Edge, with our Edgeline servers and Aruba, where a lot of the data sits outside the traditional data center. Then there's what's going on in the core, where a lot of customers are moving from either a traditional EDW, or Hadoop 1.0 if they started that transformation five to seven years ago, to a lot of things happening in real time, or a combination thereof. So the data types are pretty dynamic. Some of that data is always getting processed out at the edge, with results sent back to the core. We're also seeing a lot of folks move to real-time analytics, what some people call fast data, sitting in the core data center.
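The "fast data" pattern described here, continuous event streams aggregated over short time windows rather than in nightly batch jobs, can be sketched in a few lines. This is a toy, pure-Python stand-in for illustration only; in production this role is played by systems like Kafka and Spark Streaming, whose real APIs are not used here.

```python
# Toy sketch of a fast-data building block: count events per key over a
# sliding time window, evicting anything older than the window as new
# events arrive. A production pipeline would do this with Kafka topics
# and a stream processor; this stand-in just shows the windowing idea.
from collections import defaultdict, deque

class WindowedCounter:
    """Count events per key over a sliding time window (in seconds)."""
    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # (timestamp, key), in arrival order

    def add(self, timestamp, key):
        self.events.append((timestamp, key))
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()

    def counts(self):
        totals = defaultdict(int)
        for _, key in self.events:
            totals[key] += 1
        return dict(totals)

counter = WindowedCounter(window_seconds=60)
counter.add(0, "sensor-a")
counter.add(30, "sensor-a")
counter.add(90, "sensor-b")   # evicts both sensor-a events (t <= 30)
print(counter.counts())       # {'sensor-b': 1}
```

The contrast with batch is that results are always current as of the latest event, rather than computed after the fact over a full day's data.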
So they're utilizing things like Kafka and Spark, and a lot of the techniques for persistent storage are brand new. What it boils down to is that it's an opportunity, but it's also very complex for our customers.

What about some of the technical trends behind what's going on with big data? You've got data sprawl, you've got workload sprawl, you've got developers dealing with a lot of complex tooling. What are you guys seeing there in terms of the big mega-trends?

As you know, HPE has quite a few customers in the mid-range and enterprise segments, and some of those customers are very tech-forward. A lot of them are moving from that Hadoop 1.0 or 2.0 system to a set of essentially mixed workloads that are very multi-tenant. So we see customers with a mix of batch-oriented workloads, who are now introducing streaming workloads, and folks bringing in things like TensorFlow and GPGPUs to apply AI and ML techniques in those clusters. What we're seeing is that this causes a lot of complexity, not only in how you build your apps, but in the number of applications and the number of tenants using that data; it's getting used all day long for all sorts of purposes. So it's grown up: it started as an opportunity, a science project, a POC, and now it's business-critical, even mission-critical, for a lot of the services it drives.

Am I correct that those diverse workloads used to require a bespoke, very siloed set of infrastructure? And I'm inferring that technology today lets you bring those workloads together on a single platform. Is that correct?
Yeah. A couple of things we offer to help customers get off the complexity train while still providing flexibility and elasticity: a lot of the workloads we did in the past were either very vertically focused and integrated, one app with its own server, networking, and storage, or, in the early analytics phase, built around symmetrical clusters and scaling them out. Now we've got a very rich and diverse set of components and infrastructure that lets a customer build a very scalable data lake: compute-oriented nodes, storage-oriented nodes, GPU-oriented nodes. It's very flexible, and it helps customers take complexity out of their environment.

So when you talk to customers, what are they struggling with specifically as it relates to infrastructure? Again, we talked about tooling, and Hadoop is well known for the complexity of its tooling, but from an infrastructure standpoint specifically, what are the big complaints you hear?

A couple of things. The first is, "my budget's flat for the next year, or couple of years." So, as we discussed earlier in the conversation, I have to modernize, virtualize, and containerize my existing apps, and at the same time introduce new services with a very different, DevOps-style mode of operations, all with the existing staff. That's the number one issue we hear from customers, so anything we can do to help increase the velocity of deployment through automation helps. And frankly, the battle now is over whether these types of workloads will run on-prem versus off-prem.
So we have a set of technology as well as enabling services with Pointnext (remember the acquisition we made of Cloud Technology Partners) to right-place where those workloads are going to go, act as a broker in that conversation, assist customers in making that transition, and then ultimately give them an elastic platform that's going to scale for this diverse set of workloads: well-known, sized, easy to deploy.

As you get all this data... Hadoop sort of blew up the data model and said, we'll leave the data where it is and bring the compute to it. And you had a lot of skunkworks projects growing. What about governance, security, compliance? As you have data sprawl, how are customers handling that challenge? Is it a challenge?

Yeah, it certainly is a challenge, and we've gone through it just recently with GDPR being implemented. You've got to think about how that's going to fit into your workflow, and certainly security. The big thing we see is that when data resides outside of your traditional data center, that's a big issue. For us, with Edgeline servers, a lot of data is coming in over wireless, and there's a big build-out coming with the advent of 5G. That's certainly an area customers are very concerned about: who has their data, who has access to it, how can you tag it, how can you make sure it's secure? That's a big part of what we're trying to provide here at HPE.

What specifically is HPE doing to address these problems? Products, services, partnerships, maybe you could talk about that a little bit. Maybe even start with your philosophy on infrastructure for big data and AI workloads.
Yeah. Over the last two years we've really concentrated on essentially two areas. One is the Intelligent Edge, which has been enabled by fantastic growth with our Aruba products in the networking space and our Edgeline systems: taking that type of compute and getting it as far out to the edge as possible. The other piece is making hybrid IT simple. In that area we want to provide a very flexible yet easy-to-deploy set of infrastructure for big data and AI workloads, so we have this concept of the Elastic Platform for Analytics. It helps customers deploy for a whole myriad of requirements: very compute-oriented, storage-oriented, GPUs, cold and warm data lakes, for that matter. And a third area we're really focused on is the ecosystem we bring to our customers, which as a portfolio is evolving rapidly. As you know, the software side of this big data and analytics space is super dynamic, so bringing a vetted, well-known ecosystem to our customers as part of a solution, with advisory services, is definitely one of the key things customers love to come to HPE for.

What about partnerships around things like containers and simplifying the developer experience?

Yeah, we've been pretty public about some of our efforts in this area around OneSphere, and some of the models around advisory services here with some recent acquisitions. For us it's all about automation, and we want to be able to provide that experience to customers whether they develop and deploy those apps on-prem, which we love (I think you guys tag it as true private cloud), or elsewhere.
But we know the reality is that most people are quickly embracing a hybrid cloud model, and the ability to develop those apps, then run them on-prem or off-prem, is pretty key for OneSphere.

I remember when you guys announced Apollo and you had the astronaut there. Antonio Neri was just a lowly GM and VP at the time, and now, of course, he's CEO, so he knows what's in the future. At the time I looked at Apollo as a high-performance computing system, and we've talked about those worlds of HPC and big data coming together. Where does a system like Apollo fit in this world of big data workloads?

Yeah, we have a very wide product line for Apollo, and some of the systems are very tailored to specific workloads, given the way people deploy these infrastructures now: multi-tenant, with many different workloads. We have compute-focused systems like the Apollo 2000; very balanced systems like the Apollo 4200, which offer a very good mix of CPU and memory, and customers are certainly moving to flash and storage-class memory for these types of workloads; and then the Apollo 6500, some of the newer systems we have, with a big memory footprint and NVIDIA GPUs, allowing very high calculation rates for AI and ML workloads. We take all of that and aggregate it together, and we've made some recent acquisitions, like Plexxi, for example. A big part of this is simplifying the networking experience, so you can probably see into the future: automation at the networking level, automation at the compute and storage level, and then a very large and scalable data lake for customers' data repositories, whether object, file, or HDFS. Some pretty interesting trends in that space.

Yeah, I'm actually super excited about the Plexxi acquisition, and I think it's because of flash.
The bottleneck used to be the spinning disk; flash largely pushes the bottleneck to the network. Plexxi can allow you guys to scale, and I think actually leapfrog some of the other hyper-converged players out there. Super excited to see what you do with that acquisition. So it sounds like your focus is on optimizing the design for I/O.

Yep.

I'm sure flash fits in there as well.

Absolutely. And that's a huge accelerator even when you look at our storage business: 3PAR, Nimble, all-flash, certainly moving to NVMe and storage-class memory for acceleration of other types of big data databases. Even though we're talking about Hadoop today, certainly SAP HANA, scale-out databases, Oracle, SQL Server, all of these play a part in the customer's infrastructure.

Okay, you were talking before a little bit about GPUs. What is this HPE Elastic Platform for Big Data Analytics? What's that all about?

Yeah, a lot of the sizing and scalability in this space falls on the shoulders of our customers, especially in some of these new areas. So we have what's both a product and a concept called the Elastic Platform for Analytics. All those different components I rattled off are great systems on their own, but when it comes to very complex multi-tenant workloads, we try to take the mystery out for our customers so they can deploy cookie-cutter modules. We're even going to get to a place pretty soon where we can offer that as a consumption-based service, so for an elastic type of acquisition experience you don't have to choose between on-prem and off-prem; we'll provide that as well. So it's not only a set of products, it's reference architectures. We do a lot of sizing with our partners.
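The kind of sizing arithmetic that such reference architectures take off customers' shoulders can be illustrated with a back-of-the-envelope calculation. The numbers below (3x HDFS-style replication, 25% headroom, 48 TB per storage node) are generic assumptions for illustration, not HPE's actual sizing figures.

```python
# Illustrative sketch of data-lake sizing arithmetic: given a raw data set,
# account for replication and growth headroom, then divide by per-node
# capacity. All constants are assumed defaults, not vendor numbers.
import math

def storage_nodes_needed(raw_tb, node_capacity_tb=48.0,
                         replication=3, headroom=0.25):
    """Estimate how many storage-oriented nodes a raw data set needs."""
    usable_needed = raw_tb * replication * (1 + headroom)
    return math.ceil(usable_needed / node_capacity_tb)

# 500 TB raw -> 500 * 3 * 1.25 = 1875 TB usable -> 40 nodes of 48 TB.
print(storage_nodes_needed(500))  # 40
```

A real reference architecture layers workload mix, tenancy, and compute/GPU node ratios on top of this, but the basic capacity math is the same.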
So the Hortonworks, the Clouderas, the MapRs, and a lot of what's out in the open-source world. It's pretty good.

So we've been covering big data, as you know, for a long, long time. In the early days it was, oh, this is great, we'll just put white boxes out there with off-the-shelf storage. That changed as big data workloads became more enterprise-mainstream; they needed to be enterprise-ready. But my question to you is this: I hear you've got products, you've got services, you've got a philosophy, and obviously you want to sell some stuff. What has HPE done internally? With regard to big data, how have you transformed your own business?

Yeah, for us, we want to provide a really rich experience, not just the products, and to do that you need to provide a set of services and automation. With products and solutions like InfoSight, we've been able to deliver what we call AI for the data center; certainly the tagline of predictive analytics is something Nimble brought to the table for a long time. To provide that level of service (InfoSight, predictive analytics, AI for the data center), we're running our own big data infrastructure. It started a number of years ago, even on our 3PAR platforms and other products, where we had scale-up databases; we transitioned to batch-oriented Hadoop; and now we're fully embedded with real-time streaming analytics coming in from our customers' telemetry every day, all day long. We're using AI and ML techniques not only to improve what we've done in automating the support experience and making the platforms easy to manage, but now to introduce things like learning engines and recommendation engines for our customers, taking the hands-on work of managing the products, automating it, and putting it into the product.
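The predictive-analytics loop described here, learning what "normal" telemetry looks like across a fleet and flagging readings that deviate sharply, can be sketched with a simple statistical test. This is a toy illustration of the general idea, not InfoSight's actual method; the real pipeline runs on streaming infrastructure and far richer models.

```python
# Minimal sketch of telemetry anomaly detection: compute a baseline mean
# and standard deviation from healthy-system samples, then flag any new
# reading whose z-score exceeds a threshold. Illustrative only.
import statistics

def find_anomalies(baseline, readings, threshold=3.0):
    """Return readings more than `threshold` std devs from the baseline mean."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return [r for r in readings if abs(r - mean) / stdev > threshold]

# Baseline latency samples (ms) from healthy systems, plus fresh telemetry.
baseline = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.3, 5.1]
readings = [5.0, 5.2, 9.7, 5.1]
print(find_anomalies(baseline, readings))  # [9.7]
```

The step from detection to a recommendation engine is attaching a known remediation to each class of anomaly, which is the "automate the hands-on approach" idea in the answer above.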
So we've gone through a multi-phase, multi-year transition that's brought in things like Kafka, Spark, and Elasticsearch, and we're using all of those techniques in our own systems to provide new services for our customers as well.

Okay, great. So you're practitioners, you've got some street cred.

Absolutely.

Can I come back to InfoSight for a minute? It came through the acquisition of Nimble. It seems to us that you're a little bit ahead, maybe a lot ahead, of the competition with that capability. How do you see it? Where do you see InfoSight being applied across the portfolio, and how much of a lead do you think you have on competitors?

I'm paranoid, so I don't think we ever have a good enough lead; you've always got to keep grinding on that front. But we think we have a really good product, and it speaks for itself; a lot of customers love it. We've applied it to 3PAR, for example: we came out with VMVision for 3PAR, which is based on InfoSight, and we've got things in the works for other product lines that are imminent. You can imagine that what we've done for Nimble and 3PAR, we can apply similar logic to the Elastic Platform for Analytics, running at that cluster scale to automate a number of items that are pretty tedious for customers to manage. So there's a lot of work going on within HPE to scale that as a service we provide with most of our products.

Okay, so where can I get more information on your big data offerings and what you're doing in that space?

Yeah, you can always go to hp.com/bigdata; we've got some really great information out there. And we're in the run-up to our big end-user event that we do every June in Las Vegas, HPE Discover. We'll have about 15,000 of our customers and trusted partners there, and we'll be doing a number of talks. I'm doing some work there with British Telecom, which will make for some great talks.
Those will be available online as well, so you'll hear not only what we're doing with our own InfoSight and big data services, but how customers like BT and 21st Century Fox are applying some of these techniques and making a big difference for their businesses.

That's June 19th to the 21st at the Sands Convention Center, between the Palazzo and the Venetian. It's a good conference, so definitely check it out live if you can, or watch online if not. Excellent, Patrick, thanks so much for coming on and sharing this big data evolution with us. We'll be watching.

Yeah, absolutely.

And thank you for watching, everybody. We'll see you next time. This is Dave Vellante for theCUBE.