From theCUBE Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation.
Welcome to theCUBE's coverage of HPE Discover 2021. I'm Lisa Martin. Janice Zdankus joins me next, the Vice President of Innovation for Social Impact in HPE's office of the CTO. Janice, welcome to theCUBE.
Hi, Lisa, great to be here.
So let's talk about this. You lead HPE's Tech for Good program. I always love talking about programs like this. Talk to me about that: industry, tech, academia, and government partnering to solve key challenges that society's facing. Unpack that for us.
Yeah, so we are really proud to be able to look at big challenges in the world and look at where our strengths, our innovations, our emerging technologies and our employee expertise could actually contribute to a problem. And so we began a program to pick some projects, particularly in food systems, world hunger and health systems, where we thought some of our technologies could really be impactful. And so we have been working with a number of clients and partners on AI contributions, high performance compute contributions and a contribution around this notion of data spaces that we are talking about. All of these emerged through complex interactions around social good engagement.
So, you mentioned data spaces. The concept of data spaces isn't new, but do explain it. Give us an overview, Janice, for those folks that might not be familiar with what it is.
So the notion of data spaces is to connect data producers to data consumers. In the past, connecting producers and consumers has really been limited by questions like: Where is your data located? Do you have access to the right data? Is the data a good quality set of data? What's the provenance of that data? And is it trustworthy? And so our concept of data spaces is actually trying to address all of those notions with a new approach.
So collecting and sharing data isn't anything new, but of course what we talk about every day on this program is the volume of data. In that context, what are some of the challenges that you're seeing with clients, and how can you help them eliminate those challenges and be able to make data-driven decisions?
So the first challenge is finding the data. And that is a big challenge. I mean, there are new roles emerging called data hunters, and a great amount of time is being spent by data scientists just trying to find sources of data. And then when you find this data, is it in the right format? And how expensive is it to move the data so that you can have it in a place where it can actually be analyzed? So what we're working on starts from recognizing that there's a vast amount of data at the edge, data that's probably never going to move from the edge and from those locations. What we're trying to do is recognize that and actually work to bring the algorithms and the analytics to the data, and to make sure the data is accessible and can be understood and processed in a consistent way. And today, there are a lot of silos in place around where data exists. And so our approach here is to address this from an open source community perspective: to build and provide a metadata layer, a standard of all standards, kind of a super metadata layer that gives a non-technical way to represent that. And then to use that to help connect to analytics platforms both citizen users, subject matter experts who may not be data scientists, as well as the data scientists. So actually being able to connect a broader set of users to the data analytics that are currently available, and to give them the knowledge to be able to get information and insights out of that data.
So, democratizing that access to data: that's a cultural shift. You talked about some of the new roles, like data hunters, and people get very sort of territorial about that. I'm just curious, what are some of the things you've seen where HPE and data spaces have been able to help companies democratize that access and also kind of transform their culture?
Well, so a few things. First of all, there has to be a strong motivation for someone to share data. And in order for them to feel safe in sharing that data, there has to be security and trust established. And most data producers want to control who gets to see their data and under what conditions. So there needs to be governance of data as well. So those are important aspects that have to be in place. Our approach is to kind of build an exchange for that, so that data consumers understand the conditions under which they can access and use the data, and can also potentially contribute the new data sets that they're creating through their analytics back into a cataloging and provisioning of data. Improving the standardization and the simplicity of how data gets exchanged today in effect allows a greater democratization of access to data, so that you don't have to be a data scientist. I mean, data scientists today can spend seven to eight months actually getting the data that they're going to use into a format that they can actually process. And we think that that's inefficient. We think there's a lot that can be done. The other challenge around this is that oftentimes data is multi-entity. Even inside of a company, you could find data in different departments and different businesses.
But even when you think beyond a company, if you think about entities that are globally distributed and maybe multi-entity, there are new challenges about how data can come together from those sources and still be of the right provenance and be understood and be trustworthy.
Well, one of the many things I think we've learned during the last year is that the need for, and access to, real-time data has been a critical factor in helping businesses pivot and survive versus those that might not. What are you seeing there? Like you said, data scientists are spending so much time getting access to clean data, and with that comes the risk of missing opportunities for new products and new services and to meet customer demand in new ways. Talk to me about how data spaces can facilitate that faster real-time access.
Right, so by having an exchange that can be implemented inside of an enterprise or across enterprises, we actually think it allows some of that kind of pre-work to be done. It allows that cataloging and provisioning, so you can come to a place where an exchange can occur and actually be able to get more ready access to the data. You don't have to necessarily go through a cleansing process and through a deep investigation on provenance. And then oftentimes you learn as you process data about new data, or the data sets change, right? So can there be improvements around keeping those ML algorithms current and helping you do that in a very efficient way, without having to rewrite code and rerun your algorithms every single time? So we think there's a lot of improvement that can be done there as well.
So you did a great job of explaining data spaces, the challenges that we've seen, the opportunities. But help the audience understand what makes what HPE is doing with data spaces different, unique. What are some of the differentiators there?
A few things.
One is we're approaching this from an open source approach. So we expect to be able to contribute back to the open source communities and allow for a greater ecosystem to develop around these solutions. And that will enable greater sharing and trustworthy sharing. The second thing is security. We intend to apply a great security layer into this that allows data to be trusted. And then the governance capability. So being able to use things like our data fabric to actually help support the governance that producers and consumers wanna have is also important. And then finally, being able to work multi-cloud, across on-prem and in the cloud, is a great advantage. So you don't get vendor lock-in. You'll be able to kind of minimize your data egress, because maybe you're not gonna be doing data egress out of the cloud, and instead you'll be able to process your data right where it's at without having to pay for that movement.
And I imagine that would facilitate that speed of real time that I mentioned a minute ago.
That's right, that's right.
Let's now look at HPE data spaces compared to data marketplaces. Give me the compare and contrast with respect to those two.
So data marketplaces are typically very siloed and very specific to a sector or an industry today. And they're typically built on their own platforms end to end; they're not always open by design. So we expect to be able to support multiple data marketplaces through a plug-in into the data spaces platform that we build. And that will allow greater connectivity and greater access to many different marketplaces. And so data spaces is not intended to be siloed by industry or narrowly focused.
So helping to remove those silos, which is another thing that we talk about. I'm just curious, what is some of the feedback from the open source community about what you're doing here, building on this open foundation?
So it's actually been very positive.
So the very first thing we did came out of our work in agriculture, as I said at the top of the conversation, which is a great example of where there are immense amounts of data that are not well standardized or structured in a way that can be used towards addressing things like world hunger and some of the food supply and food system challenges we have. In working through this, we kind of distilled some of the problem set down to this lack of access to data. And so one of the questions we explored was, why is there this lack of use of data and lack of access to data? And it came down to not being able to access the data where it's generated and not being able to actually share it broadly across entities. The Linux Foundation has a new open source community called AgStack, and so what we did is we joined it; we are a founding company as part of that new community. And we have shared the concepts around data spaces, and the metadata layer standardization that we've envisioned, into the community. And that's just getting kicked off, but it's also a great first step for us to kind of build an open source community around it.
Excellent, that sounds like, as you said, positive feedback. If we crack open the hood of data spaces, what are some of the technologies that we see underneath that are making it and its evolution possible?
Right, so multi-cloud, cross-data support, edge processing, data fabric, the Ezmeral solution as well. So being able to kind of move data. And then of course a key layer to this is this notion of a metadata layer standard on top of metadata layer standards.
And what is that going to allow in terms of connecting the data consumers with the data producers?
It's going to make it easier. It's going to make it faster. It's going to minimize costs. And it's going to allow for a quality exchange, with more information for consumers to have that trust. And most importantly, the security.
And it will also create kind of the motivation, kind of the give and take, because the exchange has to be equitable for producers and consumers to both be at the table.
That's a great point about being equitable. So this whole initiative that we've been talking about is coming out of the office of the CTO at HPE, as we talked about. So the focus is on projects that are emerging, not yet on the roadmap. So what can we expect? What can your audience expect in the next 12 to 18 months?
So our approach in the office of the CTO is to take emerging technologies and ideas and actually bring them into what we would call advanced development stages. So we do proofs of concept. We do a lot of piloting. We work with customers and clients directly to kind of tune and test the commercialization possibilities and value of a solution that we're evolving, and to kind of get it ready for market if it makes sense to do that. And so we have proofs of concept with dozens of customers right now in this topic area, and more that want to join and get involved in having access to it as well. So I would say most of the work we do in the coming 12 months will be driven by what these proofs of concept with these clients actually uncover for us. And so we know first and foremost we're working with a large financial services company. We're working on the agricultural front with a number of important customers that are testing kind of the multi-entity data sharing aspects. We're working also with a healthcare industry client which is looking at extremely large sets of data that are kind of unanticipated data sets, ones you would not normally think would be important for disease prediction. And so all of those different use cases are helping us think about which features are most important and by when. I can tell you security, the trustworthiness, the data provenance, the data governance are essential elements that are gonna have to be there.
I think those are essential elements in any industry, especially on that security front.
Yes, very much so.
So in terms of the event at HPE, what are some of the things that the audience is gonna be able to learn and glean about data spaces?
So we've had kind of a great three days, first starting out with Antonio Neri and FIS to talk about the insights, and how data is actually becoming the currency of the future, if you will. And so we started that way. And then on day two, we had a panel of some of our clients talking about what's happening with data in their particular industries. So you start to see the kind of sharing out of requirements and how urgent these requirements are becoming. And then on day three, we actually go into more technology. So you'll see there, we have a number of demos and sessions, one specifically around an agriculture use case, another around a healthcare use case as well. And then we go into a little bit more detail around the data spaces concept in the keynote for day three.
So an action-packed three days. Janice, thank you so much for joining me, talking to us about data spaces, what you guys are doing for social impact out of HPE's office of the CTO. We appreciate your time.
Thank you, Lisa.
For Janice Zdankus, I'm Lisa Martin. You're watching theCUBE's coverage of HPE Discover 2021.