Live from Las Vegas, it's theCUBE, covering Edge 2016, brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman.

We're back, this is theCUBE, the worldwide leader in live tech coverage. Bernie Spang is here, he's the Vice President of Software-Defined Infrastructure. Cube alum. Bernie, it's great to see you again. Thanks very much for taking some time with us.

Good to be back, great to be back.

So, software-defined. What is software-defined infrastructure?

Software-defined infrastructure is the topic, and I'm focused on software-defined computing and storage, primarily. So when we say software-defined, we're talking about software that can be deployed on a variety of servers and can work with a variety of storage media from various providers, so IBM and non-IBM providers. And we make the software available for clients to implement themselves, to put it on their own servers and put it together. Or, through partners, IBM integrates that software with storage and servers and sells it as a system. And we also make that software available on the IBM Cloud and through other cloud service providers, so it can be accessed as a service.

Okay, so, we were talking off camera. What is software-defined computing? Isn't that just virtualization?

So that's the easiest one for people to understand. We had the early days of client-server: you bought a server for each new application. We had server sprawl, highly inefficient, a waste of money. We created software-defined computing in the form of hypervisors and virtual machines. And that's great, and for those traditional workloads, you still want to virtualize them in those kinds of environments. But when you look at the new generation of workloads that are built on open source type projects, like NoSQL databases like MongoDB and Cassandra, or the new generation analytics frameworks like Apache Hadoop and Apache Spark, these are scale-out architectures.
You don't load them into a VM or onto a server. They span dozens, in some cases hundreds and thousands, of nodes. And so now you need a software layer that can schedule the workload across your compute resources, whether they be bare metal, VMs, or container environments now with the latest generation of microservices-based architectures. So you need a software layer that virtualizes that complete compute grid to run multiple workloads.

It sounds a little bit like a term we put out a couple of years ago at Wikibon called ServerSAN. And the first instantiation most people look at is hyperconverged infrastructure. So VMware vSAN, Nutanix, and the like: there's compute in there and there's storage in there, but it's a distributed architecture driven by software. So how does that fit in, because you've got the compute and you've got the storage pieces?

Well, it depends on the workload and how you use it. So if you have a hyperconverged environment, and you're running multiple VMs and the VMs are sharing the storage, and the workloads you're putting into those VMs are traditional workloads that go into a VM, that's not what I'm talking about. Now, you could have that environment where I'm going to run a multi-node solution across my VMs in that hyperconverged environment, in which case it would line up with what I'm talking about. But then how do you schedule multiple scale-out workloads on that hyperconverged infrastructure? That's what I'm talking about. It's a layer of software-defined computing above that, one that has to schedule the scale-out workloads.

Okay, and if I look at most public clouds, they probably run something similar to this, correct?

Yes, but you know, we're finding with the new generation applications that we're working on with the cloud providers, there really hasn't been the focus on the shared infrastructure, these shared clusters if you will, for multiple workloads.
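The scheduling layer described here can be sketched in miniature. This is an illustrative toy only, not IBM Spectrum Computing's actual algorithm; all node and task names are invented. It greedily places a scale-out job's tasks across a mixed pool of bare-metal, VM, and container nodes:

```python
# Toy sketch of a grid scheduler placing a scale-out job across a mixed
# pool of nodes (bare metal, VM, container). Illustrative only; node and
# task names are invented, not from any real product.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    kind: str        # "bare_metal", "vm", or "container"
    free_cores: int

def place_job(nodes, tasks, cores_per_task):
    """Greedily assign each task to the node with the most free cores."""
    placement = {}
    for task in tasks:
        best = max(nodes, key=lambda n: n.free_cores)
        if best.free_cores < cores_per_task:
            raise RuntimeError(f"no capacity left for {task}")
        best.free_cores -= cores_per_task
        placement[task] = best.name
    return placement

pool = [Node("bm-1", "bare_metal", 16),
        Node("vm-1", "vm", 8),
        Node("ct-1", "container", 8)]
print(place_job(pool, ["t0", "t1", "t2", "t3"], cores_per_task=4))
```

A production grid scheduler would layer priorities, fair sharing, data locality, and preemption on top of this kind of placement loop; the point here is just that one software layer sees the whole heterogeneous pool.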
So you're still seeing, even amongst cloud providers, a level of inefficiency, because I go to the cloud, I define my 100-node cluster and I put my app on it. Now I have another app I want to run, so I go define another 50-node cluster and I put it on that. I don't have a shared environment across those clusters. Now we are working on the IBM Cloud, and with other cloud service providers, Microsoft Azure as an example, to support this software layer I'm talking about, to then virtualize the grid.

What does the shared environment get you? How do you utilize it? Is it metadata sharing for high-performance interactions? Is it for data protection? Talk about both how you share that data, is it some kind of high-speed pipe, and what you're doing with that shared infrastructure.

Okay, so there are two layers. There's the compute layer, where what you're sharing is the compute nodes: the Spectrum Computing software is scheduling the work on the compute nodes, and it's scheduling across multiple types of work.

So it's a resource negotiator.

Yeah, it's a resource scheduler, right?

That's time slicing, right?

So it's virtualization, but for a grid, for cluster-based computing. And then there's the storage layer. So as you're doing this across this shared grid, you have to have shared access to the storage you need for that application, whether it be a Hadoop thing or a MongoDB thing or a Spark-type workload, right? The Watson cognitive computing, deep learning, and machine learning type applications are also these kinds of scale-out architectures. So you have to have the shared storage layer as well, and that's where our Spectrum Scale comes in.

So that's for efficiency, like you say, scale. Otherwise, I'm scaling in stovepipes and it just doesn't scale.
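The "resource negotiator" idea, carving a shared cluster among multiple workloads instead of giving each its own silo, can be illustrated with a weighted fair-share split. This is a hedged sketch: the tenant names and weights are invented, and a real grid scheduler like the one described adds time slicing, priorities, and preemption on top of a split like this.

```python
# Illustrative sketch: weighted fair-share division of cluster slots among
# tenants. Tenant names and weights are invented for the example.
def fair_share(total_slots, weights):
    """Split total_slots among tenants proportionally to their weights,
    handing leftover slots to the largest fractional shares first."""
    total_weight = sum(weights.values())
    exact = {t: total_slots * w / total_weight for t, w in weights.items()}
    shares = {t: int(v) for t, v in exact.items()}   # floor of each share
    leftover = total_slots - sum(shares.values())
    for t in sorted(exact, key=lambda t: exact[t] - shares[t], reverse=True):
        if leftover == 0:
            break
        shares[t] += 1
        leftover -= 1
    return shares

print(fair_share(100, {"hadoop": 3, "spark": 5, "mongo": 2}))
```

Every slot is always assigned to somebody, which is the efficiency argument being made: a tenant's idle capacity is immediately available to the other workloads instead of sitting in a stovepipe.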
Well, right, and if you talk about it from the compute point of view, you've got the cluster-creep problem, and poorly utilized clusters, right, that are sitting idle a lot of the time because the resources aren't being shared, if you don't have this virtualization layer. From a storage point of view, you've got people talking about wanting to create a data lake. Well, creating data lakes is all well and good, but they're just bigger silos. You don't want multiple data lakes. You want your data ocean, as we've been saying.

You've been hanging out with my friend Eric Herzog.

I did, the data ocean brothers. That's Furrier, I think, who started that. You know, you want to have an environment where all the data can be shared and accessed by multiple workloads. Well, how do you manage that, right? You can't do it with these hyper-converged clusters; you have to have a shared infrastructure.

The differentiation here, though, is the applications, as you said. I totally agree. Most hyper-converged infrastructure, if you took a traditional SAN and you took the HCI stuff, they're running the same apps. It's not those new applications, the Spark and Hadoop kinds of things, where this is built out. That's interesting.

Yeah, and you know, I started this discussion with a lot of analysts six months, a year ago, at one of these conferences, about how the light bulb went on for us among what was the Platform Computing team. We acquired Platform Computing, software that is now Spectrum Computing. This is software that's traditionally been for high-performance computing, supercomputing workloads, and Spectrum Scale, formerly GPFS, has its origins among those workloads, too. We realized that these new generation apps, analytics, and cognitive computing are just high-performance computing, supercomputing workloads. It's just a new generation with a new name on it, big data analytics or cognitive computing.
And it requires the IT infrastructure that you'd put under the traditional HPC workloads, not the IT infrastructure you'd put under a traditional relational database application. And the tricky thing is, most clients don't fully appreciate that yet. Just within the last six months, the light bulbs are going off over people's heads, and they're realizing the inefficiencies of using traditional IT for these new generation workloads. But the skill base is something we need to grow in the industry.

So what's the size of the opportunity for this today? David Floyer, our CTO, has been talking for years about how HPC is going to be bleeding into the enterprise. And obviously we see a lot of those interesting workloads there, but it's still a little bit of a niche in the overall marketplace, it seems.

In the overall marketplace? Yes, I'd agree. But we're talking opportunities here in the billions, and rapidly growing. And really, one way you could look at big data analytics is that it's just commercial HPC and A, high-performance computing and analytics. It's really just bringing that out of the back rooms. And I talk to banking clients, insurance clients, manufacturing, healthcare, and we talk to their IT leaders, and they're like, oh yeah, we've got that. We've got that over here. Like, the genomics research guys are down the hall on a different IT infrastructure. The trading apps and analytics are a different thing over there. And it's like, you guys have got to get together, right? You've got to bring that knowledge base forward into the commercial analytics space.

Okay, so we've tracked for a number of years how a lot of those applications don't fit in your typical virtualized environment. I gave a presentation at Interop years ago to IT people, and it was like, big data, why it's important, here's where it is, and it's not sitting on your SAN.
So are things like containers going to allow us to have more of a common infrastructure, or are we going to have kind of a separate form for my new apps as opposed to some of my other apps?

Well, certainly there's a separation of the traditional IT for the traditional apps: relational databases, SAN architectures, that kind of thing. That doesn't go away. Just because there's a new thing for new generation workloads doesn't mean the old stuff wasn't the right design point for those workloads. When you bring in containers, and microservices-based architectures that work with a container environment, that's yet another type of cluster or grid, right? It's just a different one. Now, what we've done with Spectrum Computing, and specifically the new Spectrum Conductor offering, is we support the scheduling of workloads whether it's bare metal OS systems, virtual machines across hypervisors, or containers in, like, a Docker container environment. So you can have a mix of things, and this can be a heterogeneous environment with heterogeneous servers, x86 and POWER servers as an example. And it can be heterogeneous storage you use, media types from flash to disk to tape, and from different providers. Doing it all at the software layer gives greater flexibility to absorb these new workloads and deploy these new technologies without having to do a rip and replace.

Well, we would agree the old stuff doesn't go away. At the same time, it doesn't capture as much of the spending. And if it doesn't capture much of the spending, the vendor community doesn't put as much R&D into it. The practitioners don't want to be part of that, right? They want to go on to the new stuff, and it just sort of becomes this managed business. Okay, great. But the TAM for the new stuff, as you're pointing out, is enormous, because the spending shifts to the new.

Oh, that is true.
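Scheduling across a heterogeneous grid means matching each workload's requirements, CPU architecture, runtime type, and so on, against node attributes. Here's a minimal, hypothetical sketch of that matching step; the node names and attribute keys are invented for illustration, not taken from Spectrum Conductor.

```python
# Illustrative sketch: matching a workload's requirements to nodes in a
# heterogeneous grid (x86/POWER, bare metal/VM/container). Node names and
# attribute keys are invented for the example.
def eligible_nodes(nodes, requirements):
    """Return names of nodes whose attributes satisfy every requirement."""
    return [n["name"] for n in nodes
            if all(n.get(k) == v for k, v in requirements.items())]

grid = [
    {"name": "x86-vm-1",  "arch": "x86_64",  "runtime": "vm"},
    {"name": "pwr-bm-1",  "arch": "ppc64le", "runtime": "bare_metal"},
    {"name": "x86-ctr-1", "arch": "x86_64",  "runtime": "container"},
]
print(eligible_nodes(grid, {"arch": "x86_64", "runtime": "container"}))
```

The value of doing this at the software layer is exactly the flexibility described: adding a new node type means adding attributes, not replacing the scheduler.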
Now, one of the things that's exciting, and why we're focused with our software-defined storage across both the traditional and the new generation, is that clients are still very much focused on optimizing the cost efficiency of the traditional workloads. There's a lot of cost to be taken out, or reduced, in the traditional environment by doing the storage virtualization, by doing the grid-based kind of hyper-converged environments where that's appropriate, and taking the cost out there is what's enabling our clients to shift investments to the new generation.

We fund that. Yeah. And how about real-time? You and I have talked about real-time; anybody who's done any kind of analytics work is thinking about real-time, and it's starting. Rosamilia talked about it in his keynote yesterday, the imperative of real-time. It's really starting to happen: fraud detection, many other use cases, et cetera. So how does that fit into the whole software-defined discussion?

Well, I mean, it kind of comes back to, I mentioned Apache Spark, and that's certainly one of our primary focuses. In fact, our first version of Spectrum Conductor is Spectrum Conductor with Spark. We include the Apache Spark framework along with our compute and the multi-tier storage in that solution, because we are seeing clients in financial services, healthcare, security, and the public sector moving to this environment. The real-time, in-memory analytics capabilities of a Spark environment are critically important, and we're seeing traditional high-performance computing analytics type applications evolve forward into versions that are written as Spark applications. Even traditional ETL, Extract, Transform, Load software: we've seen a case where the software is now a Spark-based application, to take advantage of the in-memory analytics optimization of that open framework. So I think this is going to become huge, well beyond what we traditionally think of as analytics.
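The real-time fraud-detection use case mentioned here typically relies on windowed, in-memory aggregation over an event stream, the pattern that Spark's in-memory engine runs at scale. Here is a plain-Python sketch of that pattern; the threshold, window size, field names, and data are all invented for illustration.

```python
# Illustrative sketch of windowed stream aggregation for fraud detection:
# flag any account that makes too many transactions inside a sliding time
# window. All thresholds and data are invented for the example.
from collections import deque

def flag_bursts(events, window_seconds=60, max_per_window=3):
    """events: (timestamp, account) pairs, assumed sorted by timestamp.
    Returns the set of accounts exceeding max_per_window transactions
    within any window_seconds span."""
    recent = {}      # account -> deque of timestamps in the current window
    flagged = set()
    for ts, account in events:
        q = recent.setdefault(account, deque())
        q.append(ts)
        while q and ts - q[0] > window_seconds:
            q.popleft()          # drop events that fell out of the window
        if len(q) > max_per_window:
            flagged.add(account)
    return flagged

stream = [(0, "A"), (10, "B"), (20, "A"), (30, "A"), (40, "A"), (300, "B")]
print(flag_bursts(stream))
```

Keeping the window state in memory, rather than re-reading history from disk for every event, is the essence of the in-memory advantage being described.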
We got to talk to Red Bull Racing yesterday, and they're one of your customers. Obviously, tons of data, lots of different sources, real interesting things they're doing. Can you share any other kind of marquee, lighthouse examples of new things that people are doing?

Yeah, so the interesting thing about Red Bull Racing is they evolved from the high-performance computing doing the design on the car, to then taking that environment, appreciating its efficiency and performance, and using that same infrastructure to do their in-race analytics from the telemetry coming off the sensors in the car. So that Internet of Things, real-time analytics of sensors and things coming off of meters, is critically important, as is, in financial services, the real-time trading and the real-time implications that sentiment analysis can have. So we're seeing it in financial services, and there's another great case of this: they've traditionally been running these big, complex Monte Carlo simulations for financial models, and now they want to do all these new generation analytics with sensors and meters, with sentiment analysis off the internet and whatnot. Video analytics is another one, so they're bringing that forward. Healthcare and life sciences is another one. Medical imaging, right? And not just the storage and the archiving of it, but the analysis of it, and then patient analytics, other examples.

You mentioned financial services, healthcare, public sector, that's where all the chief data officers are hanging out, right?

Absolutely, right.

We're doing an event on Friday in Boston; Bob Picciano's going to be there, and we're going to be interviewing him. And that whole compliance thing now becomes an issue, and it's sort of the pendulum swinging back from skunkworks into traditional IT really owning the governance and compliance and the platform, if you will. Do you see that?

Oh yeah, definitely. So you need that governance and the security responsibility to centralize.
You also need the virtualization that I've talked about to make it cost effective, because you can't have all these skunkworks with poorly utilized IT infrastructure, so both of those factor in. And I used to work for Bob in the early days of InfoSphere and big data analytics, and our whole focus with the analytics team on working with the data scientists and applying the software for these new workloads is important. That's an important connection for IBM and our clients and our partners, that we have both pieces, including the Watson layer of analytics. And now we've got the IBM Cloud Spark service on the IBM Cloud in Bluemix, and a number of the Watson services are examples of workloads that are running on our software-defined infrastructure, the Spectrum computing and storage infrastructure in the IBM Cloud, to deliver that kind of efficiency. So there is that connection.

Competition is shifting, right? We saw Dell acquire EMC, you've seen, you know, Oracle's sort of reemergence as a systems player. You're seeing some interesting dynamics, of course, with Amazon. When you talk to IBM, it's a different conversation. You guys talk about workloads, you talk about analytics a lot, you know, a platform for cognitive. How do you compete? What's the narrative like when you're talking to customers? Why IBM?

Well, certainly, at the top level, you've heard our chairman say, you know, cognitive solutions and a hybrid cloud platform company, right? And it's really that hybrid cloud infrastructure that I'm focused on in our part of the business, what we're talking about here at Edge from the infrastructure point of view. And then from the cognitive computing side, that's where these new generation analytics, cognitive computing, data analytics come into play. So IBM's differentiator is having both of those pieces, and being focused from an infrastructure point of view on hybrid cloud, not just cloud, right?
The integration and the proper balance of on-premises and in-cloud IT infrastructure is an important part of our differentiation.

So, you say not just cloud, okay, but what about cloud? What about just cloud for a second? Is your stated intention to take this software-defined architecture with the sharing architecture, which in part, as we talked about, came out of the hyperscalers, and you've improved on it, is it the intent to drive that as aggressively as possible into the software infrastructure and provide that sort of homogeneity, whether it's on-prem or in the cloud? Or is it more selective than that?

Well, I can give an example. I mean, this is one of our public references: there was an IBM press release about Halliburton doing analytics for oil and gas on the IBM Cloud, in fact doing it on Spectrum Computing and Spectrum Scale software, on the software infrastructure in the IBM Cloud. And that is wholly an in-the-IBM-Cloud, on-cloud solution. So we are seeing cases where clients totally want to be on the cloud, don't want any on-premises capital expense, and some the other way. But I would say in 99% of the cases, it's a hybrid cloud infrastructure. Nobody's 100% or 0% in either direction.

So the IBM Cloud is a distribution channel for you, and a customer, presumably, and you've got other cloud customers that you want to sell to, is that right?

Well, yeah, I mean, in a sense, the cloud service providers are no different than the systems providers, traditional and current. So x86 systems providers, POWER server systems providers, the cloud service providers, they're the physical platform providers, right? Our software, as I said at the beginning here, runs across all of those, whether it be on-premises or in the cloud, right? So we're looking to partner with all of these players, because for our clients, that gives the greatest flexibility, right?
Our clients don't want to be locked in, you know, to anything, as much as they can avoid it. They want the flexibility to do on-prem and off-prem, to change service providers, and they want open source, you know, technology-based solutions. That's where we're focused, because that gives the client value.

There's a school of thought that says, and of course it's perpetuated by Amazon, Google, and Microsoft, that there won't be many infrastructure-as-a-service providers, there'll be a handful, maybe, some could argue three, they would typically argue that. There's another school of thought that says cloud services, and services generally, are global and distributed in nature, that there are gazillions of services companies that add value in different ways, that there's vertical focus. I presume you're in the latter camp, but what are you seeing in the marketplace? I mean, how do you respond to those who say it's all going to coalesce around, you know, three monsters?

Yeah, so I don't think that's viable, for the same reason that we have channel partners. Think traditional, pre-cloud, right? Channel solution partners who locally have the relationship with the client; they're the trusted provider, they bring the solutions in, they're the physical presence. We're seeing a similar thing with cloud service providers: there are regional, country-level cloud service providers, where you've got to worry about local, you know, regulations and rules, and just customs. And then you've got the relationships with the software providers who are evolving into service providers, right? So I see that continuing. I mean, by our count there are more than 30,000 service providers at the tier two and tier three levels, you know, the full set globally, and we're engaged with a number of them of all sizes, not just IBM and Microsoft and AWS.

Well, the other wild card is SaaS. I mean, everybody's a SaaS player now, right?
It's like Benioff says, more companies are going to be SaaS players outside of technology than technology companies, right? So, I mean, if you're not a software company, then what are you doing?

Right, well, you know, all that software still has to run on physical servers and physical storage.

But to the point, though, those SaaS companies are infrastructure providers, is my point.

Well, yeah, that's exactly right.

They need hardware and they need databases.

Well, that's right. So if you get your complete solution as software as a service, well, you're actually getting infrastructure as a service within that, right? You didn't go procure it at that level, but it's baked in, right? So that's certainly true.

And nobody's saying that the SaaS business is going to coalesce. I mean, that's exploding, right?

No, right, exactly.

All right, we'll give you the last word. On Edge: customer conversations, futures, fun stuff you're seeing.

So first of all, it's a great conference. There's a lot of energy, and we're having a lot of great discussions with clients. Like I said, the light bulbs are starting to go off over people's heads, that these new generation workloads require a new generation infrastructure, and when they talk about hybrid cloud infrastructures, they've got to plan ahead for these new generation workloads. So we're seeing that come to life here, and I'm looking forward to it continuing.

Bernie Spang, rising through the ranks of IBM, you know, I love it. Always a pleasure seeing you.

All right, thank you guys.

Thanks for coming on. All right, keep it right there, everybody. We'll be back with our next guest right after this short break. This is theCUBE, we're live from IBM Edge in Vegas. Right back.