John: Hello and welcome to theCUBE Conversation here in Palo Alto, California. I'm John Furrier, host of theCUBE. We've got a great security conversation with Ed Casper, the founder and CEO of Cloud Storage Security. Great cloud background — cloud security, cloud storage. Welcome to theCUBE Conversation, Ed. Thanks for coming on.

Ed: Thank you very much for having me.

John: I got FOMO on that background — you've got the nice look there. Let's get into the storage blind-spot conversation around cloud security. Obviously at re:Inforce it came up a ton. We heard a lot about encryption and automated reasoning, but ransomware was still hot. All these things continue to be issues in security, and they all involve data and storage, right? So this is a big part of it. Tell us a little bit about how you guys came about — the origination story. What is the company all about?

Ed: Sure, sure. So we're a pandemic story. We started in February, right before the pandemic really hit, and we've survived and thrived because this is such a critical thing. If you look at the growth that's happening in storage right now — we saw this at re:Inforce, and we saw it at a recent AWS Storage Day — S3 in particular houses over 200 trillion objects. Just 10 years ago, in 2012, Amazon touted how they were housing one trillion objects. So in a 10-year period it's grown to 200 trillion, and most of that has happened in the last three or four years. The pandemic, and the shift in the ability and the technologies to process data better, has really driven the need and driven the cloud growth.

John: I want to get into some of the issues around storage. Obviously the trend on S3 — look at what they've done. I saw Mai-Lan at Storage Day; we've interviewed her, she's amazing. Just EC2 and S3, the core pistons of AWS. The silicon is getting better, the IaaS layer is getting so much more innovation, you've got more performance, abstraction layers, and the PaaS that's emerging.
Cloud operations on premises, now with hybrid, is becoming a steady state. And if you look at all the action, it's all these hyper-converged kinds of conversations — but it's not hyper-converged in a box, it's cloud storage. So there's a lot of activity around storage in the cloud. Why is that?

Ed: Because companies are defined by their data. If a company's data is growing, the company itself is growing; if it's not growing, they're stagnant and in trouble. What's been happening now — and you see it with the move to cloud, especially over the on-prem storage sources — is that people are starting to put more data to work, and they're figuring out how to get the value out of it. A recent analyst made the statement that if the Fortune 1000 could just share and expose 10% more of their data, they'd have net revenue increases of $65 million. So it's the ability to put that data to work, and it's so much more capable in the cloud than it has been on-prem to this point.

John: You know, it's interesting — data portability is being discussed, data access, who gets access. Do you move compute to the data? Do you move data around? All these conversations are kind of around access, and security is one of the big vulnerabilities around data, whether it's an S3 bucket with a manual configuration error or a tool that needs credentials. How do you manage all this stuff? This is really where a rethink comes in. So can you share how you guys are surviving and thriving in that kind of crazy world we're in?

Ed: Yeah, absolutely. Data has been the critical piece, and moving to the cloud has really been this notion of: how do I protect my access into the cloud? How do I protect who's got it? How do I think about the networking aspects, my east-west traffic, after I've blocked them from coming in? But no one's thinking about the data itself. Ultimately, you want to make that data very safe for the consumers of the data.
They have an expectation, and almost a demand, that the data they consume is safe. So companies are starting to have to think about that, and they haven't — it has been a blind spot, as you mentioned before. In regards to "I am protecting my management plane": we use posture-management tools, we use automated services — if you're not automating, you're struggling in the cloud. But when it comes to the data, everyone thinks, oh, I've blocked access, I've used firewalls, I've used policies on the data. They don't think about the data itself. It is that packet you talked about that moves around to all the different consumers and workflows, and if you're not ensuring that data is safe, you're in big trouble. We've seen it over and over again.

John: I mean, it's definitely a hot category and it's changing a lot. So I love this conversation because it's a primary one — primary, secondary data storage; kind of a good joke there. But all kidding aside, it's hard. You've got data lineage — tracing is a big issue right now. We're seeing companies come out there with kind of an observability tangent. The focus on this is huge. I'm curious, what was the origination story? What got you into the business? Were you having a problem with this? Did you see an opportunity? What was the focus when the company was founded?

Ed: Yeah, it's definitely to solve the problems customers are facing. What's been very interesting is that they're out there needing this — needing to ensure their data is safe. As the whole story goes, they're putting it to work more. We're seeing this. I thought it was a really interesting series — one of your last series, about data as code — where you saw all the different technologies that process and manage that data, which companies are leveraging today.
But still, once that data is ready and it's consumed by someone, it causes real havoc if it's not either protected from being exposed or safe to use and consume. And so that's been the biggest thing. We saw a niche. We started with this notion of cloud storage being object storage, and there was nothing there protecting it. Amazon has the notion of access, and that is how they protect the data today — but not the packets themselves, not the underlying data. So we created a solution to say, okay, we're going to ensure that data is clean. We're also going to ensure you have awareness of what that data is — the types of files you have out in the cloud, wherever they may be, especially as they drift outside of the normal platforms you're used to seeing that data in.

John: It's interesting. People were storing data lakes — "oh yeah, just store it, we might need it" — and then it became a data swamp. That was the conversation six, seven years ago. Now the conversation is: I need data, it's got to be clean, it's got to feed the machine learning. This is going to be a critical aspect of the business model for the developers who are building the apps — hence the data-as-code reference, which we focused on. But then you say, okay, great — does this increase our surface area for potential hackers? All kinds of things open up when you start doing cool, innovative things like that. So what are some of the areas your tech solves around the blind spots with object storage? What are people overlooking? What are some of the core things you're seeing and solving?

Ed: So it's a couple of things. Right now, the biggest thing you still see in the news is configuration issues, where people are losing their data or accidentally opening up writes. That's the worst-case scenario.
Reads are a bad thing too, but if you open up writes — and we saw this with a major API vendor in the last couple of years — they accidentally opened writes to their buckets, hackers found it immediately and put malicious code into their APIs, which were then downloaded and consumed by many, many of their customers. So it is happening out there. So the notion of ensuring configuration is good and proper, and ensuring the data has not been augmented inappropriately and is safe for consumption, is where we started. And we created a lightweight, highly scalable solution. At this point we've scanned billions of files and petabytes of data for customers, and we're seeing what a critical piece it is to make sure that data is safe. The big thing — and you brought this up as well — is that they're getting data from so many different sources now. It's not just data they generate. You see one centralized company taking in data from numerous sources, consolidating it, creating new value on top of it, and then releasing that. And the question is: do you trust those sources or not? And even if you do, they may not be safe.

John: You know, we had an event around SuperCloud, a topic we brought up to bring attention to the complexity of hybrid — on-premises, which is essentially cloud operations. And the successful people doing things on the software side are essentially abstracting up the benefits of infrastructure-as-a-service from AWS, right, which is great. Then they innovate on top, so they have to abstract up. Storage is a key component of where we see the innovations going. How do you see your tech connecting with that trend that's coming, which is that everyone wants infrastructure as code? That's not new — that's the goal, and it's getting better every day. But DevOps — the developers — are driving the operations, and security teams have to keep pace.
So policy — we're seeing a lot of policy, and some cool things going on that abstract up from, say, storage and compute. But then those are being put to use as well. So you've got this new wave coming around the corner. What's your reaction to that? What's your vision? How do you see that evolving?

Ed: I think it's great, actually. The biggest problem you have, as someone who is helping them with that process, is making sure you don't slow it down. Just like cloud at scale, you must automate, and you must provide different mechanisms that fit into workflows and allow them to do it just how they want to do it. Don't slow them down; don't hold them back. So we've come up with different measures to provide pretty much a fit for any workflow any customer has come to us with so far. "We do data this way — I want you to plug in right here. Can you do that?" It's really about being able to plug in where you need to be, and not slowing them down. That's what we've found so far.

John: Yeah, exactly — you don't want to solve complexity with more complexity. That's the killer problem right now. So take me through the use case. Can you walk me through how you engage with customers, how they consume your service, how they deploy it? You've got some deployment scenarios. Can you talk about how you fit in, and what's different about what you do?

Ed: Sure, sure. So what we're seeing — and we'll go back to this data coming from numerous sources — is different agencies and different enterprises taking data in, where maybe their solution is intelligence on top of data. They're taking these data sets in, whether it's topographical information or investing-type information; then they process it, scan it, and distribute it out to others. We see that happening as a big common piece, through data-ingestion pipelines. That's where these folks are getting most of their data.
The other case is where the data itself — the document, or the document set — is the actual critical piece that gets moved around. We see that in pharmaceutical studies, in the mortgage industry, in fintech and healthcare. Let's take a very simple example: I have to apply for insurance, so I'm going to upload my Social Security information, a driver's license, whatever it happens to be. I want to, one, know which of my information is personally identifiable, so I want to be able to classify that data. But because you're taking data from untrusted sources, you have to consider whether it's safe for your own folks to use, and then also for the downstream users as well.

John: It's interesting — in the security world we hear zero trust, and then we hear software supply chain, where we have to trust everybody. So you've got two things going on. You've got the hardware and infrastructure guys saying don't trust anything, because we have a zero-trust model. But as you get into the software side, trust is critical — with containers and cloud-native services, trust is critical. You guys are kind of on that balance, where you're saying: hey, I want data to come in, we're going to look at it, we're going to make sure it's clean. That's the value here, isn't it? Is that what I'm hearing? You take it and say, okay, we'll ingest it, and during the ingest we'll classify it, do some things to it with our tech, and put it in a position to be used properly. Is that right?

Ed: That's exactly right — that's a great summary. Ultimately, if you're taking data in, you want to ensure it's safe for everyone else to use, and there are a few ways to do it. Safety doesn't just mean whether it's clean or not, whether there's malicious content or not. It means you have complete coverage, control, and awareness over all of your data.
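The classification step Ed describes — flagging personally identifiable fields like Social Security numbers during ingest — can be sketched as a simple pattern-based pass. This is a minimal illustration only; the pattern names and rules are assumptions for the demo, not Cloud Storage Security's actual engine, which would use far richer detection:

```python
import re

# Hypothetical pattern set for the demo -- real classifiers use many more
# rules (and often ML) to cover formats, locales, and context.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def classify(text: str) -> set[str]:
    """Return the set of PII categories detected in an uploaded document."""
    return {label for label, pat in PII_PATTERNS.items() if pat.search(text)}

doc = "Applicant SSN 123-45-6789, contact jane@example.com"
print(sorted(classify(doc)))  # ['email', 'ssn']
```

The point of the sketch is the workflow, not the patterns: tagging data at ingest is what later lets you notice PII drifting outside its expected workflow, which Ed picks up next.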
So I know where it came from, I know whether it's clean, and I know what kind of data is inside it. The interesting aspect we see is that the cleanliness factor is critical in the workflow, but classification expands outside of it — because if your data drifts outside your standard workflow, that's when you have concerns. Why is PII over here? That's what you have to stay on top of. Just like AWS's control plane — where you have to manage it all and make sure you know which services have suddenly been exposed publicly, or whether something's been taken over — you have to do that with your data as well.

John: So how do you fit into the security posture of, say, a large company that might want to implement this right away? It sounds like it's right in line with what developers want and what people want. It's easy to implement — from what I see, it's about 10, 15, 20 minutes to get up and running. It's not a heavy lift to get in. How do you know you're successful once you're operationalized?

Ed: Yeah, it's a lightweight, highly scalable, serverless solution. It's built on Fargate containers, and it goes in very easily. Then we offer either native integrations through S3 directly, or we offer APIs. The APIs are what a lot of our customers who want inline, real-time scanning leverage, and we're also looking at offering actual proxy capabilities: for folks who use the S3 APIs native to AWS — puts and gets — we can offer our put and get as an endpoint, so when they retrieve a file or place a file, we scan it on access as well. So it's not just one-time scanning of data at rest; it can be data in motion, scanned as you retrieve the information.

John: You know, we were talking with our friends the other day about companies like Datadog. This is the model people want.
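The put/get interception Ed describes — scanning on write and again on read, rather than only scanning data at rest — can be sketched with an in-memory stand-in for the bucket. Everything here is a placeholder assumption for illustration: the dict stands in for S3, and `is_clean` stands in for a real malware engine; the actual product wires this behavior into S3 endpoints and APIs:

```python
# Sketch of a scan-on-access wrapper around object storage.
TEST_SIGNATURE = b"EICAR-TEST"  # stand-in malware signature for the demo

def is_clean(data: bytes) -> bool:
    """Placeholder scan: flag objects containing a known test signature."""
    return TEST_SIGNATURE not in data

class ScanningBucket:
    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if not is_clean(data):          # scan on write (put)
            raise ValueError(f"infected object rejected: {key}")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        data = self._objects[key]
        if not is_clean(data):          # re-scan on read (get)
            raise ValueError(f"infected object quarantined: {key}")
        return data

bucket = ScanningBucket()
bucket.put("report.pdf", b"quarterly numbers")
print(bucket.get("report.pdf"))  # b'quarterly numbers'
```

Scanning on both paths is the design choice that matters: even an object that was clean when written gets re-checked on retrieval, covering signature updates and objects written through other channels.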
They want to come in — and developers are driving a lot of the usage and operational practice. So I have to ask: this fits right in there, but you also have corporate governance and policy — the compliance folks who want to make sure things are covered. How do you balance that? Because that's an important part of this as well.

Ed: Yeah, we're really flexible in the different ways they want to consume and interact with it, but that is also such a critical piece. Among our customers, we probably have a 50/50 breakdown of those inside the US versus those outside. So you have those in California, with their information-protection act; you have GDPR in Europe; and you have Asia, with its own policies as well. The way we solve for that is we scan close to the data, and we scan in the customer's accounts — we don't require them to lose chain of custody and send data outside of the account. That is so critical to that aspect. And then we don't ask them to transfer it outside of the region. That's another critical piece: data residency has to be part of that compliance conversation.

John: How much does cloud enable you to do this in a way you couldn't before? This really shows the advantage of being natively in the cloud — taking advantage of the IaaS and SaaS components to solve these problems. Share your thoughts on how this is possible. If there were no cloud, what would you do?

Ed: Yeah, it really makes it a piece of cake, as silly as that sounds. When we deploy our solution, we provide a management console for them that runs inside their own accounts — so again, no metadata or anything has to come out. And it's all push-button click. Because the cloud makes it scalable, and because the cloud offers infrastructure as code, we can take advantage of that.
Then, when they say "go protect data in the Ireland region," they push a button, and we stand up a stack right there in the Ireland region and scan and protect their data there. If they say "we need to be in GovCloud and operate in GovCloud (US-East)," there you go — push the button, and you can operate in GovCloud (US-East) as well. With serverless and the region support and all that goodness, it's a really good opportunity to manage these cloud-native services along with the data interaction.

John: So really good prospects. Final question for you. We love the story — I think this is going to be a really changing market in this area in a big way. I think the data-storage relationship relative to higher-level services will be huge as cloud-native continues to drive everything. What's the future? Do you see yourselves as an all-encompassing, all-singing-and-dancing storage platform, or as a set of services that will enable developers and drive that value? Where do you see this going?

Ed: I think it's a mix of both. Ultimately — you saw it even on Storage Day with the announcement of File Cache, which creates a new common namespace across different storage platforms — the notion of being able to use one area to access your data and have it come from different spots is fantastic. That's been in the on-prem world for a couple of years, and it's finally making it to the cloud. I see us following that trend and helping support it. We're super laser-focused on cloud storage itself. On EBS volumes, we keep having customers come to us and say, "I don't want to run agents in my EC2 instances; I want you to snap and scan." And, "I've got all this EFS and FSx out there that we want to scan." So we see all of the cloud storage platforms — Amazon WorkDocs, EFS, FSx, EBS, S3 — coming together, and we'll provide a solution that's super simple and highly scalable and can meet all the storage needs. That's our goal right now and what we're working towards.
John: Well, "Cloud Storage Security" — you couldn't get a more descriptive name for what you're working on. I've had many conversations with Andy Jassy when he was running AWS, and he always loves to quote "The Innovator's Dilemma," from one of his teachers at Harvard Business School. We were riffing on that the other day, and I want to get your thoughts. It's not so much the innovator's dilemma anymore relative to cloud, because that's kind of a done deal — it's the integrator's dilemma. The integrations are so huge now; if you don't integrate the right way, that's the new dilemma. What's your reaction to that?

Ed: 100% agree. It's been super interesting: our customers have come to us for a security solution, and they don't expect us to be — because we don't want to be either — our own engine vendor. We're not the ones creating the engines; we integrate other engines, so we can provide a multi-engine scan that gives you higher efficacy. This notion of offering simple integrations without slowing down the process — that's the key factor we've been after. We are about simplifying the cloud experience of protecting your storage. And it's been funny: I thought customers might complain that we're not a name-brand engine vendor, but they love the fact that we have multiple engines in place and that we're bringing them this higher-efficacy, multi-engine scan.

John: I mean, developer trends can change on a dime. You make it faster, smarter, higher velocity, and more protected — that's a winning formula in the cloud. So, Ed, congratulations, and thanks for spending the time to riff on and talk about cloud storage security, and congratulations on the company's success. Thanks for coming on theCUBE.

Ed: My pleasure. Thanks a lot, John. I really appreciate your time.

John: This is theCUBE Conversation here in Palo Alto, California. I'm John Furrier, host of theCUBE. Thanks for watching.