From theCUBE Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation.

Hey, welcome back, everybody. Jeff Frick here with theCUBE. We are in our Palo Alto studios. COVID is still going on, so all of the interviews continue to be remote, but we're excited to have a CUBE alumnus. He hasn't been on for a long time, but this guy has been in the weeds of the storage industry for a very, very long time, and we're happy to have him on and get an update, because there continue to be a lot of exciting developments. He's Phil Bollinger, SVP and General Manager of the Data Center Business Unit at Western Digital, joining us, I think, from Colorado. So, Phil, great to see you. How's the weather in Colorado today?

Hi, Jeff, it's great to be here. Well, it's a hot, dry summer here, I'm sure like a lot of places. Yeah, enjoying the summer through these unusual times.

It is unusual times, but fortunately there are great things like the internet and heavy-duty compute and storage out there, so we can get together this way. So let's jump into it. You've been in this business a long time. You've been at Western Digital, you were at EMC, you worked on Isilon, and you were at storage companies before that, and you've seen this never-ending, up-and-to-the-right slope in storage demand that we see almost ad nauseam. It's not going anywhere but up, with ever-increasing complexity in terms of unstructured data, sources of data, speed of data, all the classic big Vs of big data. So before we jump into specifics, I wonder if you can share your perspective, because you've been sitting in the catbird seat, and Western Digital is a really unique company. You not only have solutions, but you also have media that feeds other people's solutions. So you guys are really seeing it all, and ultimately all this compute has got to put this data somewhere, and a whole lot of it is sitting on Western Digital.

Yeah, that's a great intro. It's been interesting. Through my career, I've seen a lot of advances in storage technology, speeds and feeds as we often say, but the advancement comes through mechanical innovation, electrical innovation, chemistry, physics. The relentless growth of data has been driven in many ways by the relentless acceleration and innovation of our ability to store that data, and that's been a very virtuous cycle through what for me has been more than 30 years in enterprise storage. There are some really interesting changes going on, though. In a relatively short amount of time, data has gone from just an artifact of our digital lives to the very engine that's driving the global economy. Our jobs, our relationships, our health, our security, they all depend on data now. And for most companies, irrespective of size, how you use data, how you store it, how you monetize it, how you use it to make better decisions to improve products and services, it becomes not just a matter of whether your company is going to thrive, but in many industries it's almost an existential question: is your company going to be around in the future? And it depends on how well you're using data. So this drive to capitalize on the value of data is pretty significant.

Yeah, it's a really interesting topic. We've had a number of conversations around trying to get at a book value of data, if you will.
And I think there are a lot of conversations, whether it's in an accounting kind of way, or finance, or kind of goodwill, of how do you value this data? But I think we see it intrinsically in a lot of the big companies that are really data-based, like the Facebooks and the Amazons and the Netflixes and the Googles, those types of companies where it's really easy to see. If you look at the valuations they have compared to the book value of their assets, it's really baked in there. So it's fundamental going forward. And then we have this thing called COVID hit, and I'm sure you've seen the meme all over social media: what drove your digital transformation, the CEO, the CMO, the board, or COVID-19? It became this light-switch moment where your opportunities to think about it are no more; you've got to jump in with both feet. And it's really interesting, to your point, that it's the ability to store this data and think about it differently now, as an asset driving business value versus a cost that IT has to accommodate to put this stuff somewhere. It's a really different mind shift, and it really changes the investment equation for companies like Western Digital: how people should invest in higher performance, higher capacity, and more unified storage, kind of democratizing the accessibility of that data to a much greater set of people, with tools that can make much more business-line and inline decisions, not just the data scientists on mahogany row.

Yeah, like you mentioned, Jeff, at Western Digital we have such a unique perch in the industry to see all the dynamics in the OEM space, the hyperscale space, the channel, really across all the global economies, around this growth of data. I have worked at several companies and have been familiar with what I would have called big data projects and fleets in the past, but at Western Digital you have to move the decimal point quite a few digits to the right to get the perspective that we have on just the volume of data that the world is relentlessly and insatiably consuming. Just a couple of examples from the drive projects we're working on now, our capacity enterprise drive projects: we used to do business case analyses and look at their life cycle capacities measured in exabytes. Not anymore; now we're talking about zettabytes. We're actually measuring capacity enterprise drive families in terms of how many zettabytes they're going to ship in their life cycle. And if you look at just the consumption of this data, the last 12 months of industry TAM for capacity enterprise, compared to the 12 months prior to that, the annual growth rate was north of 60%. It's rare to see industries growing at that pace. So the world is consuming immense amounts of data. And as you mentioned, the COVID dynamics have been an accelerant in some areas as well as a headwind in others, but it's certainly accelerated digital transformation. A lot of companies were talking about digital transformation and hybrid models, and COVID has really accelerated that, and it continues to drive this relentless need to store, access, and take advantage of data.

Yeah. Well, Phil, in advance of this interview, I pulled up the old chart with all the different bytes: kilobytes, megabytes, gigabytes, terabytes, petabytes, exabytes, and zettabytes. And per the Wikipedia page, what is a zettabyte? One zettabyte is as much information as there are grains of sand on all the world's beaches. And you're talking about thinking in terms of those units. I mean, it is just mind-boggling that that is the scale at which we're operating. It's really hard to get your head wrapped around a zettabyte of storage.

I think a lot of the industry thinks that when we say zettabyte-scale era, it's just a buzzword. But I'm here to say it's a real thing: we're measuring projects in terms of zettabytes.

That's amazing.
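To put those units in perspective, here's a quick back-of-the-envelope sketch in Python. The 20 terabyte drive capacity is the leading figure mentioned later in the conversation; the arithmetic itself is our own illustration, not a figure from the interview.

```python
# Back-of-the-envelope illustration (not from the interview): how big is a
# zettabyte, and how many of today's largest hard drives would one fill?
KB, MB, GB, TB, PB, EB, ZB = (10**e for e in (3, 6, 9, 12, 15, 18, 21))

drive_capacity_tb = 20                       # a leading capacity-enterprise HDD
drives_per_zb = ZB / (drive_capacity_tb * TB)
print(f"1 ZB = {ZB:,} bytes")
print(f"{drive_capacity_tb} TB drives needed for 1 ZB: {drives_per_zb:,.0f}")
# -> 50,000,000: fifty million 20 TB hard drives per zettabyte.
```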
Let's jump into some of the technology. I've been fortunate enough here at theCUBE to be there at a couple of major announcements along the way, as we talked about before we turned the cameras on. The helium announcement, with the hard drive sitting in the fishbowl to show off all the interesting benefits of that less dense gas, helium versus air. I was down at the MAMR and HAMR announcement, which was pretty interesting: big, heavy technology moves there to again increase the capacity of hard-drive-based systems. You guys are doing a lot of work on RISC-V, and I know you have a lot happening in open source projects. But now there's this new thing called zoned storage. So first of all, before we get into it, why do we need zoned storage, and what does it bring to the table in terms of capability?

Yeah, great question, Jeff. So why now, right? As I mentioned, I've been in storage for quite some time. In the last decade, we've seen the advent of the hyperscale model, and certainly a whole other level of explosion of data, and just the voracity with which the hyperscalers can create, consume, process, and monetize data. With that has also come a lot of innovation in the compute space around how to process that data, moving from what was just a general-purpose CPU model to GPUs and DPUs. So we've seen a lot of innovation on that side. But frankly, on the storage side we haven't seen much change at all in terms of how operating systems, applications, and file systems actually use the storage, or communicate with the storage. Sure, we've seen advances in storage capacities. Hard drives have gone from two to four to eight to 10 to 14 to 16, and now we're leading with 18 and 20 terabyte hard drives. Similarly, on the SSD side, we're now dealing with capacities of seven and 15 and 30 terabytes. So things have gotten larger, as you would expect. And some interfaces have improved. NVMe, which we'll talk about, has been a nice advance for the industry. It has brought a very modern, scalable, low-latency, multi-threaded interface to NAND flash, to take advantage of the inherent performance of transistor-based persistent storage. But really, when you think about it, the interface model hasn't changed a lot. What has changed is workloads. One thing that has definitely evolved over the last decade or so is that much of this explosion of data in the industry comes from workloads I would characterize as sequential in nature. They're sequentially captured and written, and they have a very consistent life cycle. You write the data in a big chunk, you might read it back in smaller pieces, but the life cycle of that data can be treated as a chunk of data.
But the problem is, applications, operating systems, and file systems continue to interface with storage using paradigms that are many decades old. The old 512-byte or even 4K sector-size constructs were developed in the hard drive industry as convenient paradigms to organize what is an unstructured sea of magnetic grains into something structured that can be used to store and access data. But the reality is, when we talk about SSDs, structure really matters. So what has changed in the industry is that the workloads are driving very fresh looks at how more intelligence can be applied to that application-OS-storage interface to drive much greater efficiency.

Right. So there are two things going on here that I want to drill down on. On one hand, you talked about the introduction of NAND flash and treating it generically, like you did a regular hard drive. You could get away with that, because the interface wasn't taking full advantage of the speed the NAND was capable of. But NVMe has changed that, and has forced out some of those inefficient processes you could previously live with. So it's a classic next-level step up in capabilities: first you get the better media and just plug it into the old way; then you put in processes that take full advantage of the speed that flash offers. And obviously prices have come down dramatically since the first introduction, and where before flash was reserved for super-high-end, super-low-latency, super-high-value apps, it just continues to spread and proliferate throughout the data center. So what did NVMe force you to think about in terms of maximizing the return on NAND flash?

Yeah, NVMe, which we've been involved in standardizing, has been a very successful effort, but we have to remember NVMe is about a decade old, or even more if you count when the original work started around defining the interface. It's been very successful; the NVMe standards body is a very productive cross-company effort. It's really driven a significant change, and what we see now is the rapid adoption of NVMe in all data center architectures, from very large hyperscale to classic on-prem enterprise to even smaller applications. It's just a very efficient interface mechanism for connecting SSDs into a server. And we continue to see evolution in NVMe, which is great; we'll talk about ZNS today as one of those evolutions. We're also very keenly interested in the NVMe protocol over fabrics. One of the things Western Digital has been talking about a lot lately is incorporating NVMe over Fabrics as a mechanism for connecting shared storage to multiple hosts. We think this is a very attractive way to build the shared storage architectures of the future: scalable, composable, with a lot more agility with respect to rack-level infrastructure and applying that infrastructure to applications.

Right. Now, one thing that might strike some people as counterintuitive about zoned storage is that zoning off parts of the media, and thinking of the data in these big chunks, feels contrary to the atomization we're seeing in the rest of the data center, right?
Smaller units of compute, smaller units of storage, so that you can assemble and disassemble them in different quantities as needed. So what was the special attribute you had to think about that actually comes back and provides a benefit in re-chunking the data, if you will, into these zones, versus trying to get as atomic as possible?

Yeah, it's a great question, Jeff, and it's maybe not intuitive why zoned storage actually creates a more efficient storage paradigm when you're storing things in larger blocks of data. But this is really where the intersection of structure, workload, and the nature of the data all come together. If you turn back the clock maybe four or five years, when host-managed SMR hard drives first emerged on the scene, that was really taking advantage of the fact that the write head on a hard disk drive is larger than the read head; put another way, the read head can be much smaller. So the notion of overlapping or shingling the data on the drive, giving the read head a smaller target to read but the writer a larger pad to write the data, turned out to increase areal density significantly. That was really the emergence of this notion that sequentially written, larger blocks of data can be stored much more efficiently when you think about how they're physically laid down. What is very new now, and really gaining a lot of traction, is the SSD corollary to SMR on the hard drive. On the SSD side, we have the ZNS specification, which is very similar: you divide up the namespace of an SSD into fixed-size zones, and those zones are written sequentially. But now those zones are intimately tied to the underlying physical architecture of the NAND itself, the dies, the planes, the read pages, the erase blocks, so that in treating data as a block, you're actually eliminating a lot of the complexity and the work an SSD has to do to emulate a legacy hard drive. And in doing so, you're increasing the performance, the endurance, and the predictability of the device.
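To see that contract concretely, here is a minimal sketch in Python of the sequential-write rule that host-managed SMR zones and ZNS zones share. It's a toy model of our own, not Western Digital code or the actual NVMe ZNS command set: writes must land exactly at a zone's write pointer, and space is reclaimed by resetting a whole zone rather than garbage-collecting individual blocks.

```python
# Conceptual sketch of a zoned block device's write rules (illustrative only).
class Zone:
    def __init__(self, start_lba: int, size: int):
        self.start = start_lba
        self.size = size
        self.write_pointer = start_lba   # next LBA that may be written

    def write(self, lba: int, nblocks: int) -> None:
        if lba != self.write_pointer:
            raise IOError("zoned device: writes must be sequential at the write pointer")
        if self.write_pointer + nblocks > self.start + self.size:
            raise IOError("zoned device: write would cross the zone boundary")
        self.write_pointer += nblocks    # data appended; pointer advances

    def reset(self) -> None:
        # Invalidates the whole zone at once; the device can erase and reuse
        # the underlying media without per-block garbage collection.
        self.write_pointer = self.start

zone = Zone(start_lba=0, size=1 << 16)
zone.write(0, 8)        # OK: at the write pointer
zone.write(8, 8)        # OK: sequential
try:
    zone.write(0, 8)    # rejected: not at the write pointer
except IOError as e:
    print(e)
zone.reset()            # whole-zone reclaim
```

That single rule is what lets the device tie a zone directly to physical erase blocks (or shingled tracks) and skip the bookkeeping a drive-emulating SSD otherwise has to do.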
I just love the way you twist the lens on the problem. On one hand, just looking at my notes here, zoned storage devices introduce a number of restrictions, limitations, and rules relative to the full set of things you might otherwise do. But in doing so, in aggregate, the efficiency and performance of the system as a whole get much, much better, even though at first glance it looks like a limiter. I wonder if there are any performance stats you can share, any empirical data, to give people a feel for what that comes out to.

So think about the potential of zoned storage in general. And again, when I talk about zoned storage, there are two components: an HDD component that we refer to as SMR, and an SSD version that we call ZNS. With SMR, the value proposition is additional capacity. Effectively, in the same drive architecture, with roughly the same bill of materials used to build the drive, we can overlap or shingle the data on the drive and generate additional capacity for the customer. Today, with our 18 and 20 terabyte offerings, that's on the order of just over 10%, but that delta is going to increase significantly going forward, to 20% or more. And when you think about a hyperscale customer that has not hundreds or thousands of racks but tens of thousands of racks, a 10 or 20% improvement in effective capacity is a tremendous TCO benefit. The reason is obvious: the economic paradigm that drives large at-scale data centers is total cost of ownership, both acquisition cost and operating cost. If you can put more storage in a square tile of data center space, you're generally going to use less power, you're going to run it more efficiently, and from an acquisition standpoint you're getting a more efficient purchase of that capacity. Our innovation benefits us, and our customers benefit from it. So the value proposition for zoned storage in capacity enterprise HDD is very clear: it's additional capacity.

The exciting thing is that on the SSD side, ZNS opens up even more value for the customer. Because SSDs have had to emulate hard drives, there's been a lot of inefficiency and complexity inside an enterprise SSD, dealing with things like garbage collection and write amplification, which reduce the endurance of the device. You have to over-provision: you have to insert as much as 20, 25, even 28% additional NAND bits inside the device just to provide the working space to deal with deletes of data that are smaller than the erase block the device supports. So you have to do a lot of reading and writing and cleaning up of data; it makes for a very complex environment. ZNS, by mapping the zone size to the physical structure of the SSD, essentially eliminates garbage collection and reduces over-provisioning by as much as 10X. So if you were over-provisioning by 20 or 25% in a conventional enterprise SSD, in a ZNS SSD that can be one or two percent. The other thing to keep in mind is that enterprise SSDs typically incorporate DRAM, and that DRAM is used to help manage all the dynamics I just mentioned. With a much simpler structure, where the pointers to the data can be managed without all that DRAM, we can actually reduce the amount of DRAM in an enterprise SSD by as much as 8X. And if you look at the bill of materials of an enterprise SSD, DRAM is number two on the list of most expensive BOM components. So ZNS SSDs have a significant customer total-cost-of-ownership impact. It's an exciting standard, and now that we have it ratified through the NVMe working group, it can really accelerate the development of the software ecosystem around it.
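Those over-provisioning and DRAM claims are easy to sanity-check with rough arithmetic. The sketch below uses assumed values for drive capacity, zone size, and mapping-entry size, so the exact outputs are illustrative only; real devices also keep DRAM for buffering and other metadata, which is why the interview cites an 8X DRAM reduction rather than the far larger shrink of the mapping table alone.

```python
# Illustrative arithmetic only; capacity, zone size, and entry size are our
# assumptions, not Western Digital specifications.
TB = 10**12

capacity = 16 * TB              # assumed enterprise SSD capacity
page = 4096                     # conventional FTL maps every 4 KiB page
zone = 256 * 2**20              # assumed ZNS zone size: 256 MiB
entry = 4                       # assumed bytes per mapping-table entry

# Page-level mapping (drive-emulating SSD) vs zone-level mapping (ZNS).
dram_page_map = capacity // page * entry
dram_zone_map = capacity // zone * entry
print(f"page-level map: {dram_page_map / 2**30:.1f} GiB")   # ~14.6 GiB
print(f"zone-level map: {dram_zone_map / 2**10:.0f} KiB")   # ~233 KiB

# Over-provisioning: spare NAND held back for garbage collection.
conventional_op = 0.25          # 20-28% is typical, per the interview
zns_op = 0.02                   # ~1-2% once GC is essentially eliminated
freed = capacity * (conventional_op - zns_op)
print(f"NAND freed per drive: ~{freed / TB:.1f} TB")        # ~3.7 TB
```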
Right. So let's shift gears and talk a little bit less about the tech and more about the customers and the implementation. You've talked in general terms, but are there certain types of workloads in the marketplace where this is a better fit, or is it really just the big heavy lifts, where they simply need more? And secondly, within both the hyperscale companies and the regular enterprises that are also seeing their data demands grow dramatically, is this a solution they want to bring in at the margin, for the next data center, the next extension of their data center, or the next cloud region? Or are they doing lift-and-shift and ripping stuff out? Or do they have enough organic data growth that there's plenty of new stuff to put in these new systems?

Yeah, well, the large customers don't rip and replace. They ride their assets for a long life cycle, because with the relentless growth of data, you're primarily investing to handle what's coming in over the transom. But we're seeing solid adoption. In SMR, as you know, we've been working on it for a number of years, and we've got significant interest and co-investment, our engineering and our customers' engineering adapting their application environments to take advantage of SMR. The great thing is that we now have the ZNS standard ratified in the NVMe working group, alongside the SMR standards that have been approved for some time in SATA and SCSI, so we have a very similar, fully approved situation on both sides. And once a company goes through the lift, so to speak, of adapting an application, file system, or operating system ecosystem to zoned storage, it pretty much works seamlessly between HDD and SSD, so there's no incremental investment when you're switching technologies. Obviously, the early adopters of these technologies are going to be the large companies that design their own infrastructure, that have mega-fleets of racks, where these efficiencies really make a difference in how they can monetize that data and how they compete against their landscape of competitors. For companies that are totally reliant on off-the-shelf standard applications, that adoption curve is going to be longer, of course, because there are software changes you need to make to enable zoned storage. One of the things Western Digital has done, and taken the lead on, is creating a landing page for the industry at ZonedStorage.io. It's a page where many companies can contribute open source tools, code, validation environments, and technical documentation. It's not a marketing website; it's a site built to host actual open source content that companies can use, leverage, and contribute to, to accelerate the engineering work of adapting software stacks to zoned storage devices, and to share those things.

Let me follow up on that, because again, you've been around for a while, and I'd like your perspective on the power of open source. It used to be that the best secrets, the best IP, were closely guarded and held inside. Now we're really in an age where that's not necessarily so, because by definition there are more brilliant minds, use cases, and engineers outside your building than inside your building. How has that changed development strategy, when you can leverage open source?

Yeah, open source has clearly accelerated innovation across the industry in so many ways, and it's a paradigm around which companies have built business models and innovated on top. I think it's always important as a company to understand what value-add you're bringing and what value-add the customers want to pay for. What unmet needs of your customers are you trying to solve, and what's the best mechanism to do that? Do you want to spend your R&D recreating things, or leveraging what's available and innovating on top of it? It's all about ecosystem. The days when a single company could vertically integrate, top to bottom, a complete end solution are fewer and farther between. It's about collaboration, building ecosystems, and operating within them.

Yeah, it's such an interesting change.
And one more thing to get your perspective on. You run the data center group, but there's this little thing happening out there that we see growing: IoT, the internet of things, the industrial internet of things, and edge computing, as we try to move more compute, storage, and power outside the pristine world of the data center, out towards where the data is being collected and processed, because of latency issues and all kinds of other reasons to shift the balance of where the compute sits, where the storage sits, and how much we rely on the network. So when you look at this from a storage perspective, with your history in the industry, and you see that basically everything is now going to be connected and generating data, and a lot of it is even open source (I talked to a company the other day doing open source computer vision on surveillance video), the amount of data coming off these machines is growing in crazy ways. At the same time, it can't all be processed at the data center; it can't all be shipped back, have a decision made, and then have that decision shipped back out. So when you sit back and look at the edge from your historical perspective, what goes through your mind? What gets you excited? What are some of the opportunities that maybe the layman isn't paying close enough attention to?

Yeah, it's really an exciting time in storage. I get asked from time to time, having been in storage for more than 30 years, what was the most interesting time? There have been a lot of them, but I wouldn't trade today's environment for any other, in terms of the velocity with which data is evolving, how it's being used, and where it's being used. The TCO equation may describe what a data center looks like, but data locality will determine where it's located. And we're excited about the edge opportunity. We see it as a pretty significant, meaningful part of the TAM as we look out three to five years. Certainly 5G is driving much of that; any time you speed up the connected fabric, you're going to increase storage and increase the processing of data. So the edge opportunity is very interesting to us. We think a lot of it is driven by low-latency workloads, so the concept of NVMe is very appropriate there. In general, we think SSDs deployed in edge data centers, defined as anywhere from a meter to a few kilometers from the source of the data, are going to be a very strong paradigm. And the workloads you mentioned, especially IoT: machine-generated data in general has now, I believe, eclipsed human-generated data in terms of the amount of data stored, and we think that curve is just going to keep going. Much of that data is so well suited to zoned storage, because it's sequentially captured and written, and it has a very consistent, homogeneous life cycle associated with it. So we think what's going on with zoned storage in general, and ZNS and SMR specifically, is well suited to where a lot of the data growth is happening, and certainly we're going to see a lot of that at the edge.

Well, Phil, it's always great to talk to somebody who's been in the same industry for 30 years and is as excited about today and the future as they have been throughout their whole career. That really bodes well for you, and it bodes well for Western Digital.
And we'll just keep hoping the smart people you have over there keep working on the software and the physics and the mechanical engineering to keep moving this stuff along. It's just amazing, and just relentless.

Yeah, it is relentless. What's exciting to me in particular, Jeff, is that we've driven storage advancements largely through, as I said, a number of engineering disciplines, and those are still going to be important going forward: the chemistry, the physics, the electrical and hardware capabilities. But as is widely recognized in the industry, it's a diminishing curve. The amount of energy, the amount of engineering effort and investment, the cost and complexity of these products to get to the next capacity step, it's all getting more difficult, not less. So things like zoned storage, where we now bring intelligent data placement to this paradigm, are what I think makes this current juncture very exciting.

Right, right. Well, it's applied AI, right? Ultimately, you're going to have more and more compute power driving the storage process and how that data is managed. And as more cycles become available and compute gets cheaper and cheaper, as you said, you guys just keep finding new ways to move the curve. And we didn't even get into the totally new materials science that's also coming down the pike at some point. Well, it's been great to catch up with you. I really enjoy the Western Digital story; I've been fortunate to sit in on a couple of its chapters. So again, congrats to you, and we'll continue to watch and look forward to our next update. Hopefully it won't be another four years.

OK, thanks, Jeff. I really appreciate the time.

All right, thanks a lot. He's Phil, I'm Jeff, you're watching theCUBE. Thanks for watching; we'll see you next time.