Good afternoon, hardware nerds, and welcome back to beautiful Denver, Colorado. We're here at Supercomputing 2023. My name is Savannah Peterson, joined by my fabulous co-host, David Nicholson. Day two, lots going on. What's been the most standout, interesting thing to you so far? I have to say the llama. It was notable. Yeah. We had Groq on the show earlier. There's a llama roaming downtown Denver. Have you decided if it's a local herd or if it's a marketing activation? Have you made up your mind yet, or are you TBD? Either way, I think we need to reintroduce wild llamas to the Denver Valley, or whatever this is. Doesn't matter if they're not native. I think they'd like to live here. So yeah, definitely llamas. I think they do well at altitude, if I'm recalling my basic biology. That's right, yeah. Yeah, all right, well that's great to know. Speaking of interesting conversations, whether they're about llamas or not, please welcome David Flynn of Hammerspace to the show. David, thank you so much for being here with us. Thanks for having me on. Did the llama... Dave, you've seen the llama? I did not see the llama. I feel like I missed out on seeing the llama. I know, I feel totally gypped. I mean. Right, I feel llama-gypped. It's not a good feeling. Outside of our four-legged friends, what has you most excited about the show so far? I'm just excited to see how big a deal it's become, you know, after COVID and all. But with the resurgence of AI, AI is making HPC sexy again. You know, we've kind of been saying that as a bit of a theme. I'm biased, I'm a hardware nerd. I've always thought hardware was pretty sexy, but it does feel like the spotlight is back on the machines that are going to power our artificial intelligence. And it's a really exciting moment. Hammerspace is all about getting information out of silos, which is also a big theme of the show. That's right, yeah.
Does it feel like, for you, we're at an intersection where all of a sudden everybody gets what you do a little more? I think of it as the perfect storm. Yes, but in a good way, right? In a good way. In a beach-weather-storm way. It's the perfect storm for driving demand for what we do. Because, you know, in the old world, you had applications with their data on storage. A one-to-one relationship between the application and the storage. You might have shared the storage with another application, but that was always incidental. Right. What we have here is now you need to get the data from many different applications into the models. And those models have to run in potentially many different places, just so you can get availability of GPUs or specialized hardware. Absolutely. So the old mantra of moving the compute to your data because of data gravity won't work. We have to orchestrate data to where you can do the compute. Which, oh, by the way, is very bursty too, so that really ought to be shared compute infrastructure. So that moves from a one-to-one relationship to a many-to-many relationship. That cross product is really what is forcing folks to go to a data-orchestrated world instead of the old world of having data in storage. So when you use the term orchestration, what do you mean by that? So data orchestration is not just a new name for data management. It's not a marketing term. It has a very literal meaning. Most of the time, data management means you're copying or replicating things. When you back data up, you're copying it out of one system into another. Because the data is an emergent property of the storage system, its very identity is tied up in there, because the metadata is trapped with it. So with traditional data management, when you copy things around, you're making new data. It's like cutting off the Hydra's head. You're just making more of it.
Data orchestration is very different, because the data movement is happening from behind the file system in a way that's transparent, where you keep the metadata unified. The identity of the data stays unified even while the data moves around. So with data orchestration, the main difference is that it's happening behind the facade of the data presentation layer, and it's transparent to the ongoing access and use; it doesn't disrupt that. So it's really the diametric opposite of data management as we know it. So just to immediately dive down into something that elicited: one of the banes of any IT infrastructure administrator's existence in the past is this idea that, okay, if I'm moving from what I have now to what I need now, there's this very complex, fraught-with-danger migration process. Is it a feature of your orchestration that you provide the ability to not have to worry about that anymore? Once data is inside an orchestration system, it can move transparently between different storage systems, between whole separate data centers. But not only once it's in; we take that all the way to the on-ramp. You can simply point Hammerspace to your existing data sources, to your existing network file systems, your existing NAS data sets, and we can immediately serve that data indirectly while we scrape in all the metadata, elevate it to transcend any of the storage systems, and free up the data to move. So even the on-ramp doesn't involve a migration. We had a movie studio with 600 artists and 300 nodes in a render farm on an Isilon system that was maxed out. They were able to install Hammerspace in an afternoon and have the whole studio and the render farm up and running, and it doubled the rendering capability. They were able to go to a 600-node render farm with the same Isilon. That's amazing.
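The idea David describes here, a file's identity living in a unified metadata layer while the bytes underneath move between storage systems, can be sketched in a few lines. This is an illustrative toy, not Hammerspace's actual implementation; all class and location names are invented.

```python
# Toy sketch of a unified metadata plane: the file's path (its identity)
# never changes, even as the bytes are relocated between storage systems.

class GlobalNamespace:
    def __init__(self):
        # path -> record; "instances" tracks where the bytes live right now
        self.files = {}

    def ingest(self, path, store):
        """On-ramp: absorb an existing file's metadata; bytes stay in place."""
        self.files[path] = {"instances": {store}}

    def orchestrate(self, path, dst):
        """Move the bytes behind the facade; the path/identity is untouched."""
        self.files[path]["instances"] = {dst}   # relocated under the covers

    def resolve(self, path):
        # Applications always use the same path, wherever the bytes are.
        return next(iter(self.files[path]["instances"]))

ns = GlobalNamespace()
ns.ingest("/projects/shot42.exr", "on-prem-nas")
ns.orchestrate("/projects/shot42.exr", "cloud-west")
print(ns.resolve("/projects/shot42.exr"))   # same path, new location
```

Contrast this with copy-based data management, where the copy gets a new path and thus a new identity, which is the "cutting off the Hydra's head" problem above.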
I mean, and what a cool... One of the things that we're talking a lot about at the show is the application of this technology to, say, our families' lives and to everyone else's lives. And when we're thinking about a movie studio being able to operate faster to create awesome content, what a fun example. Are there any other interesting customer examples that you're seeing? Well, first on that, it's cool to be able to say that Hammerspace has been used for production of Disney's The Mandalorian, Netflix's Stranger Things, as well as dozens of other movies and TV series. That's so cool. The moviemaking guys, they're the early adopters. And by the way, this is game-changing, because it not only allows them to scale up the amount of throughput that they get, but it allows them to branch out into other regions of the world. So we've had studios that have literally changed their business model, because they can hire artists in South Africa, in India, in Australia, and they're able to spin up regions in the cloud and have the file system with all the data there nearby, where they can use virtual desktops within a close enough proximity to have very low latency with it. And then they push a button, and the rendering happens in rural Canada, where electricity is cheap and you can get server instances. So this ability to delocalize the data means you can engage teams all over the world, and you can be competitive with the talent that you acquire. And it means you can engage compute gear anywhere in the world. I was just going to say, you can engage edge devices anywhere as well, and connect those teams, and it's all about making that transfer of information so much more efficient. It's really... Another good example of that branching out into other industries is in the space science world. Yes. Blue Origin. Yeah. It uses Hammerspace to orchestrate data across their different facilities within the U.S. Can't say worldwide, because it's rocket science; it's export-restricted, so locally in the U.S.
Yeah, that's true, the government's doing it. There you go, there you go. But it allows them to have the same data set, the same scratch pad for where they're designing the rocket engines, where they're manufacturing, where they're doing the test firings. You know, they can dump the telemetry into the same file system. You know, where they're launching the rockets. All of that has a singular global data environment, because the metadata plane is now elevated and not embedded within any given piece of storage. It transcends all of it and allows data to move across it. So it's really quite an endorsement that, you know, a Jeff Bezos company is using Hammerspace as their way to do hybrid cloud. Now they can do computing in the cloud, they can do storage in the cloud; all of that's made possible by delocalizing data with data orchestration. And this is really... I'm glad that you brought that up. I mean, this is really the evolution of legacy HPC architecture. This is where we're going, right? This is... we're here to stay. Well, if you think about it, we've already gotten there when it comes to our personal consumer data. You don't even think twice when you get a new cell phone. Or when you go between your cell phone and a tablet or a laptop, your data's there. Every last photo you've taken, every last video, all of your text messages, all of your emails. You know, what's stored here, what's stored in the cloud on a server? You don't even care. It's all orchestrated for you. And it now outlives any device in your possession. It transcends all of that. Well, we're talking about the same thing. A lot of that's us really being SaaS-y. Yeah, we're talking about how do you do that with the enterprise unstructured file data that is the foundation of all of these foundation models. How do you have that data be orchestrated so it can literally outlive the storage?
And that's transformative, to think that through data orchestration, it's actually a new paradigm for how data becomes permanent. It's not permanent because you stored it. It's permanent because it's presumed to always be trivially in motion across whatever you need. It now outlives any of those things. Right. One of the things that we've talked a lot about just in the last couple of days are the constraints, sometimes thought of as bottlenecks to progress, from an infrastructure perspective in the world of HPC, supercomputing. We talk a lot about thermodynamics, just as an example. Managing heat. Yeah. Managing heat. Hot topic. Yeah. And when you meet a company that has an immersion scientist, where that's core to their business, and you learn what immersion can do, it's mind-boggling. It's cool. But in sort of the more traditional sense, you have processing power that could be a bottleneck. Well, we can't process things quickly enough. We can't get in and out of memory fast enough. We can't get in and out of storage fast enough. Where are you seeing bottlenecks in the HPC supercompute world, and how does Hammerspace help to address those? So am I thinking of it the right way when I think abstraction layer? Hey, you need more bandwidth, more throughput; you can plug in more physical resources that the orchestration layer then routes data to. I would say principally two areas. And they're really the same problem at different scale factors. The first one is between whole data centers. It's the bottleneck of how do you get the data into the data center where you need to have it. Because nowadays computing is so special-purpose, and it needs to be shared, because it's very bursty workloads. We need to be able to move data across data centers just to get the training done and the other things. So we eliminate that bottleneck through orchestration.
And the way we eliminate it is by having policy-based push that can do things in advance, and reactive pulling for the remaining parts that might not have gotten caught, so that you can have the best of both worlds. And you can do that because it's behind the file system now. So when it goes to access something, if it's not there, it'll go get it. But you can have most of it already there before you even start. And both of those rely on the same principle as other orchestration platforms, like compute orchestration, Kubernetes, and that's granular, lightweight encapsulation. So, moving things at the granularity of a single file, so that what you're moving is just what you need. Not whole volumes, not whole projects. You can move anything. And you can move it without changing the organizational structure, because it still exists within the same namespace, within the same file system. It's just: where is it instantiated under the covers? So that's the first bottleneck. The second one is the ability to feed data at performance into the systems, because in the supercomputing world, we've always had this huge divide between the world of exotic supercomputer file systems that are a pain in the ass to maintain, like Lustre and GPFS or WEKA, these things that require special clients and special storage nodes. It's like having to raise your own herd of unicorns, right? And running them in your infrastructure. But on the other side, you had enterprise NAS, which was dog slow, because the protocols there were very, very slow. So what we at Hammerspace did was we fixed the NFS protocol. We introduced NFS 4.2. My CTO is the kernel maintainer of the NFS client stack in Linux. We tricked it out so it has performance as high as, or higher than, these exotic parallel file systems.
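The "best of both worlds" mechanism described here, policy-based push that pre-stages files in advance plus a reactive pull on first access for anything the policy missed, can be sketched roughly like this. A hypothetical toy, not Hammerspace's API; all names and the policy are invented.

```python
# Sketch: pre-stage files by policy (push), and fetch stragglers on
# first access (pull), transparently to the reading application.

class Site:
    """A compute location with its own local copies of files."""
    def __init__(self, name):
        self.name, self.local = name, set()

class Orchestrator:
    def __init__(self, source_files):
        self.source = set(source_files)   # files known to the namespace

    def push(self, site, policy):
        """Policy-based push: stage, in advance, everything the policy selects."""
        for f in self.source:
            if policy(f):
                site.local.add(f)

    def read(self, site, f):
        """Access path: if the file isn't local, pull it on demand (a miss
        is serviced behind the file system, not surfaced as an error)."""
        if f not in site.local:
            site.local.add(f)             # reactive pull from wherever it lives
        return f"data:{f}"

gpu_site = Site("gpu-cluster")
orch = Orchestrator(["a.train", "b.train", "notes.txt"])
orch.push(gpu_site, policy=lambda f: f.endswith(".train"))  # bulk pre-staged
orch.read(gpu_site, "notes.txt")                            # straggler pulled on miss
```

Because movement is per-file ("granular, lightweight encapsulation"), the policy can select exactly the working set rather than whole volumes.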
So that allows us to solve how to deliver data at the more local scale, from the storage systems into the compute array, into your GPUs with GPUDirect, or into your special processors, your tensor processors, all of that directly. So it's really the same problem. One is at a macro scale, between data centers. The other is at a micro scale, between nodes, from a storage cluster into the array. And both of those are solved by the same separation of metadata and the control plane from the data path. Once you have the metadata and the data path separated, it makes the data orchestratable, and it also makes it parallel-feedable. So it's ironic that the same architecture solves both ends of that spectrum. That's awesome, and what a great solution for the folks that you're working with. Hammerspace has a pretty big announcement this week. What happened with Vcinity? Oh, it's exciting. I have a soft spot in my heart for RDMA. Back when I was... You're probably not alone in this room. Yep, yep. Back when I was the chief architect at Linux Networx, we deployed the first large-scale InfiniBand clusters. The first hundred nodes, the first thousand nodes, the first 10,000 nodes, mostly for the U.S. Department of Energy back in the early 2000s. So InfiniBand was kind of my thing; I helped create the OpenIB stack, now the OpenFabrics stack, and RDMA. So that's where I became good friends with Eyal Waldman. He's actually on our board here and an investor. But so, I've always loved InfiniBand and RDMA. But you know, it's an interesting parallel here, because while InfiniBand is great, what will really make that technology succeed is when it's Ethernet. Right, RoCE. And RoCE, right? Especially as HPC goes mainstream, you need those solutions to be in the mainstream with Ethernet. And I'd say it's the same thing with NAS.
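The "separation of metadata and the control plane from the data path" that makes data parallel-feedable is the core idea behind pNFS-style access in NFS 4.2: a metadata server hands the client a layout describing which data node holds which stripe, and the client then reads all stripes directly and in parallel, with no single storage head in the data path. The following is a toy model of that access pattern in spirit only; the node names, stripe map, and functions are all invented for illustration.

```python
# Toy pNFS-flavored read: control plane returns a layout; data flows
# directly and in parallel from the data nodes to the client.

from concurrent.futures import ThreadPoolExecutor

DATA_NODES = {                      # data node -> stripes it serves (assumed)
    "dn1": {0: b"Hello, "},
    "dn2": {1: b"parallel "},
    "dn3": {2: b"NFS!"},
}

def get_layout(path):
    """Control plane: maps stripe index -> data node. No file data
    ever passes through this service."""
    return {0: "dn1", 1: "dn2", 2: "dn3"}

def read_stripe(node, stripe):
    # Direct node-to-client transfer, bypassing the metadata server.
    return DATA_NODES[node][stripe]

def parallel_read(path):
    layout = get_layout(path)
    with ThreadPoolExecutor() as pool:
        chunks = pool.map(lambda s: read_stripe(layout[s], s), sorted(layout))
        return b"".join(chunks)

print(parallel_read("/data/model.bin"))  # b'Hello, parallel NFS!'
```

The same split shows up at both scales David names: between data centers it enables orchestration, and within a cluster it enables parallel delivery.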
We need NAS to finally have the performance of these parallel file systems, so that you don't have to go to something more exotic and you can get both the convenience and the performance. Yeah, RoCE should be a 10x word score on a Scrabble board for having an acronym encapsulated in another acronym, because the R in RoCE stands for RDMA. Which is nested, yeah, yeah. So Vcinity is kind of cool, because they're taking that RDMA all the way across data centers to speed up how fast you can move data between data centers. And that's really at the heart of this thing: we now have enough universal connectivity between data centers that you can move to an orchestrated world where data sets are flowing freely between them. The same way with your cell phone: it's now connected universally enough that you can be given the appearance that all of your data is everywhere, on all of your devices, all of the time. So how can we have data appear to be in all of your data centers, wherever you need it, all of the time? It's only a matter of time before the pipes are big enough, before the technology is mature enough, that you can have that orchestrated facade put in front of it. You are also named as one of HPCwire's top five new products or technologies here on the show floor. How does that feel for you and the team? Well, it's super exciting. I mean, we're still fairly early as a company, and so to be recognized like that is very exciting. Thank you. Yeah, we're proud of you. I can feel the momentum. Like you said, perfect storm, but not everyone gets awards or industry nods like that in the middle of a perfect storm. So I think that's really... Well, I have to say, I mean, it's a really great technology, and we have a great marketing team and some lighthouse customers where it's making a big impact and changing the way they do business. I'll give you another example.
In the movie studio world, thank goodness, we just resolved the actors' strike, and the writers' strike before that, which has its roots in the AI problem, right? Right, right. With the disruption. But one of the things is, that's a very cutthroat business: very low margins, highly competitive, building content. And now it has to be done as a collaboration with other studios. So even the largest of the big major studios don't do it all in-house anymore. They outsource it to other, smaller studios. And the interesting thing is that this concept of data orchestration actually changes how that collaboration occurs, because now they can all be working from within the same file system, which can feed their rendering farms natively, directly, and yet it's still a shared file system where the artists are working within each of the separate organizations. So I like to think of it as redefining the supply chain of the product that is data. The data supply chain has heretofore been done through copy and merge. But now you're talking about being able to actually live inside the same live data environment across organizations, in a collaborative way, where, thanks to the parallel file system architecture, it's able to feed the most demanding applications, like AI training or animation rendering, et cetera. Yeah, I never drew that parallel, and that's really interesting. I think those of us who have been in this world like yourself for a long time, we're sort of aware of what movie production entails, but with the world of AI and modeling and test and training and parallel access from all over the place, it really is a good analogy to look at what we've done. Well, they both use GPUs, at least. Yeah, exactly. And large numbers of them. And it's a parallel task. Yeah, yeah. You know, one is actually building images, and the other one is building ways to interpret what's in images. Right, right, right. Perceiving images. Similar approach.
Or designing microchips now, that's using AI technology, right? Which makes it very circular, because now you're using AI to design the layout of the chips that go into AI. That's why this is an exponential thing. As we accelerate different platforms along the way, it accelerates the total process. And I would like to believe that what Hammerspace is doing, in making it possible to automate the movement of data in a way that you could not before, allows you to position the data to be trained on and to be used. So it is another recursive thing, where you're using AI, because at the heart of our product we're using machine learning to make decisions about where to place the data and when to move it, to try to achieve the stated objectives of where the data needs to be at what point in time. So it allows you to incorporate AI and machine learning to solve the challenge of how to get the data where you need it, when you need it, in advance, so that you can do the machine learning on it. So I've got kind of a practical question, a go-to-market strategy question for you. Some would argue that when going to market with anything in our world, there aren't many strategic seats at the table, especially for big customers. If anything, they're trying to consolidate the number of partners and vendors that they deal with. What's the primary way that you go to market? Is it directly, or is it... As with any new technology, you have to drive demand. You have to have an evangelist sales team. But just like when VMware introduced server virtualization, and we take that as an analogy: we're virtualizing the data from the storage storing it; they virtualized the compute from the computers running it. It's really the same kind of abstraction-layer approach. You have to first show people how to use it to drive that demand. And then you need the channel to amplify that, to teach lots of evangelists. So we are very channel-friendly.
Every deal we do is with the channel, even if we brought the opportunity. But more and more, we're seeing the opportunities brought to us by the channel. And just like with VMware, there is an awesome opportunity for the channel partners to learn the new paradigm for how to get data where you need it, when you need it, using this data orchestration paradigm, as opposed to the store, copy, and merge world. Yeah. And so we see that as part and parcel: your own sales team, then VAR- and channel-amplified. Makes sense. Well, very clearly, and deservedly so, a very exciting time for you, David, at Hammerspace. Thank you so much for being here with us. David, thank you for being here. I've got a duo of Davids. I feel smarter after the last 20 minutes, which is saying something. And I hope all of you watching at home, at work, or on Mars are also feeling smarter as we continue our coverage here at Supercomputing 2023 in Denver, Colorado. I'm Savannah Peterson, and you're watching theCUBE, the leading source for tech news.