Welcome, everyone, to this CUBE Conversation featuring Baffle. I'm your host, Lisa Martin, and today I'm excited to be joined by one of our alumni, Ameesh Divatia, co-founder and CEO of Baffle. Ameesh, great to have you. Thanks so much for joining us today.

Thank you, Lisa. It's great to be here.

Tell me a little bit about Baffle. What does your solution do, and why did you co-found the company?

Well, Baffle is what we call the next big thing in security, because we focus on the data itself. As you can imagine, security has had a lot of different approaches in terms of how data is protected and how compliance is achieved. We believe the next thing that's going to happen is that data is going to be protected at the record level, which ensures that, in spite of all kinds of bad things that can happen, if the data gets stolen or the credentials get stolen, the data itself is still protected, because it is either encrypted or tokenized. So it's not in its original form.

Share with me, Ameesh, what some of the gaps were in the data protection market when you and your co-founders decided to launch Baffle. What was it there that made you think, we can solve this problem?

Yeah, so we embarked on this journey with a very ambitious goal. And the goal was that there's got to be a way where data can be encrypted but still be processed. Traditionally, encryption has been a very, very difficult control to adopt. You have to manage keys, you have to transform the data, you have to keep track of the keys when you actually need to access that data. All of that causes friction. The prevailing way of protecting data was to protect data at rest, which means that you protect the data while it is in storage. Because back in the day, what used to happen is that disks, physical media, would get stolen. Data centers were not secure. Everybody had a data center in their basement, and that was where the problems used to reside.
Today, fast forward, you have data centers that are extremely secure because they are centrally hosted. The cloud is where the data center is. So you don't have the same kind of threats. You don't have disks getting lost. You don't have data being corrupted in a way where it cannot be retrieved. So you need a different type of control, because hackers do not go and look for disks that are being discarded. They just get the credentials and go steal the data, which means that the protection boundary now moves from the storage all the way up to the application. And that's exactly where we saw the gap. No database vendor provides the ability to actually protect your data in use, or in memory, while it is being manipulated. There are approaches that are starting to emerge which require hardware modifications or specific CPU requirements. The Baffle solution is all software; it can be hosted in any cloud and can even be on-premise.

All software, any cloud, on-prem. When we talk about data, more and more data is being created. The stats are staggering, but also more and more data is being used with cloud-based analytics tools. Of course, there's Gen AI. As companies are migrating data and apps to the cloud, talk about how Baffle specifically helps them protect data during this migration process.

Yes, so this is actually another gap that we found as we started to work with customers who wanted to migrate to the cloud. So traditionally, what customers do is go and talk to the cloud vendor and ask them to migrate their data to the cloud. Most of these migration solutions are actually hosted in the cloud. The data is extracted from on-premise environments and dropped in the cloud in the clear, before the cloud data protection mechanisms kick in, because they're two separate groups, right? There's a migration team and there's a data protection team, and they're not necessarily working together.
What we are doing is integrating with a data migration service like AWS DMS, where we are just a bump in the wire. So as data is migrating to the cloud, it is being encrypted inline while it is actually moving. So at no point is the data in the clear outside the enterprise firewall. As soon as it lands in AWS, it's already protected. It's already encrypted.

Automatically, that's awesome. Can companies still use cloud-based analytics tools on encrypted data?

Yes, so now you're talking about the second half of this problem. So we've talked about the ingestion. When you're ingesting the data, you're encrypting it as it is moving to the cloud. Now, when it comes to using that data, that's when the consumption side comes in. So one of the things that we do a really good job of is determining what kind of transformation to apply depending on the downstream use case. I talked about encryption; that is one of the things that we do as far as transforming the data. The second is tokenization, which is also known as format-preserving encryption. Tokenization allows that same data not only to be analyzed, but to be manipulated in the cloud, while it looks exactly like the original data. It's a token. So a credit card looks like a credit card; it passes all of the Luhn checks. A name looks like a name. So that is the second transformation. The third transformation is pure masking, where you've just decided you don't want to reveal that data at all downstream. So you can mask it irreversibly, so that nobody can get to the original data downstream, but they can still use the masked data. So the transformation, again, is determined by the use downstream, which is something that is unique to the Baffle implementation. But it all boils down to analytics. The reason the data is moving to the cloud is the sophisticated analytics tools that are available there.
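The format-preserving idea described here can be sketched in a few lines of Python. This is purely illustrative, not Baffle's implementation: the `tokenize_card` helper and its seed are hypothetical, and a real system would use keyed format-preserving encryption (such as NIST FF1) with proper key management. The point is only that the token keeps the shape of a card number, replaces the real digits, and ends with a check digit chosen so the result still passes the Luhn check.

```python
import random

def luhn_checksum_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    digits = [int(d) for d in number][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def tokenize_card(pan: str, seed: int) -> str:
    """Illustrative only: replace the middle digits with pseudo-random ones,
    then pick a final check digit so the token still passes Luhn."""
    rng = random.Random(seed)   # a real system uses keyed FPE, not a seed
    body = pan[:6] + "".join(str(rng.randint(0, 9)) for _ in pan[6:-1])
    for check in "0123456789":  # exactly one digit makes the checksum valid
        if luhn_checksum_valid(body + check):
            return body + check
    raise RuntimeError("unreachable")

token = tokenize_card("4111111111111111", seed=42)
print(token, luhn_checksum_valid(token))  # token looks like a card; second value is True
```

Because the token has the same length, character set, and a valid checksum, downstream applications and analytics tools can handle it exactly as if it were a real card number.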
Gen AI capabilities are just the latest, but traditionally, new analytics tools are always available much more easily in the cloud. And that's what is driving customers to move the data there. The Baffle solution is designed to make that consumption completely seamless. So we allow both the ingestion and the consumption to happen with a no-code model. There are no code modifications needed at all. There's no app dev involved. In addition to that, the key management is completely handled by the Baffle service; it is not done by the application at all. So you don't need to hire cryptography experts of any sort. We manage the entire process. So end-to-end, from ingestion to consumption, you're using the Baffle solution to ensure that data can be analyzed without any disruptions in terms of performance or in terms of complexity.

Okay, so you're maintaining that performance while dialing down the complexity. I do want to touch on data residency requirements. We're seeing so many more of those pop up. So when a company migrates data to the cloud, how is Baffle ensuring that the data residency requirements that are in place are met?

Sure. So let's talk about compliance in general first, and then residency in particular. Compliance is really what creates the compelling event here requiring the adoption of these controls. The regulators have figured out that enterprises are now collecting data and storing it in places where they don't control the infrastructure. So they're coming up with these restrictions to make sure that if that data is stolen, it is the enterprise's responsibility. The cloud vendors are very clearly saying it's a shared responsibility model, where the data itself is something that the enterprises have to be responsible for. So let's take the simplest compliance use case, which is data residency. So the data is moving from one geography to the other.
The data, irrespective of where it resides, is governed by the compliance regulations of where that data originated. The EU, for example, has very clear requirements that say it doesn't matter where the data goes: EU residents' data has to be protected just like it was in the EU. So when it comes to residency, what we can do is allow the encryption of the data, where the keys for that encryption are never stored in the same geography as where the data resides. So now you can cryptographically prove that that data is not revealed in a geography where it is not supposed to be. Take the US and Europe, for example. If European data moves to the US, the keys would still reside in Europe. So irrespective of where the data lands, you can still say the data is resident in Europe, because that's the only place where it can be decrypted. That's how we solve the data residency problem without having to replicate infrastructure all over the place.

Right. From what I understand, storage-level encryption isn't enough to meet compliance regulations like this. Why is that?

Well, we talked about the issue with storage encryption. It protects against a threat that doesn't exist anymore. Every time I think about storage encryption and the threat, I think about the Mission: Impossible scene, right? Where Tom Cruise is descending from the ceiling and trying to steal that disk that's sitting there. That's not reality anymore. There are much easier ways of hacking data. So what we are very clearly seeing is that regulators are starting to wake up to that. The latest one is PCI DSS, the Payment Card Industry Data Security Standard, which protects credit card data in financial services environments. They are clearly saying that in version 4, which goes into effect the first of January, 2025, it is no longer enough to just protect data at rest, or in storage. You need what are called supplemental controls.
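The keys-stay-home idea can be sketched as follows. Everything here is hypothetical scaffolding: the in-memory `EU_KEY_STORE` stands in for an EU-hosted KMS or HSM, and the XOR "cipher" is a placeholder for real AES (never use XOR in production). The point is only the architecture: ciphertext can replicate to any geography, but decryption works solely where the key service is reachable.

```python
import secrets

EU_KEY_STORE = {}          # stands in for an HSM/KMS hosted in an EU region

def eu_generate_key(key_id: str) -> None:
    """Key material is created and held only inside the EU key service."""
    EU_KEY_STORE[key_id] = secrets.token_bytes(32)

def xor(data: bytes, key: bytes) -> bytes:
    # placeholder "cipher"; a real system uses AES via a proper library
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def encrypt_in_eu(key_id: str, plaintext: bytes) -> bytes:
    return xor(plaintext, EU_KEY_STORE[key_id])

def decrypt(region: str, key_id: str, ciphertext: bytes) -> bytes:
    if region != "eu":     # keys are only reachable from the EU region
        raise PermissionError("key material is not available outside the EU")
    return xor(ciphertext, EU_KEY_STORE[key_id])

eu_generate_key("customer-42")
blob = encrypt_in_eu("customer-42", b"EU resident record")
# The ciphertext can be copied to a US data center freely...
try:
    decrypt("us", "customer-42", blob)
except PermissionError as e:
    print("us decrypt blocked:", e)
# ...but it is only readable where the keys reside.
print(decrypt("eu", "customer-42", blob))
```

This is the cryptographic proof mentioned above: the data can physically land anywhere, yet it remains "resident" in the EU because that is the only place it can ever be turned back into plaintext.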
So supplemental controls mean that you need to have other controls in place to make sure that that data is protected. So storage-level encryption alone is not enough. You have to protect the data in a way where it is protected in memory, or you have to use 256-bit keys to make sure that the data is protected, and the keys have to be rotated. All of those regulations are now going beyond just storage-level encryption.

Got it, needing those supplemental controls is really key. You touched on the infrastructure part, the fact that when companies are migrating data to the cloud, they don't control the infrastructure. With that said, what's the best way to protect data? Because the company doesn't control the infrastructure, whether it's AWS or whatnot.

Well, there are technical capabilities now that allow you to do that, which is exactly what encryption is all about. What you do is protect the data with what is known as a bring-your-own-key model. The data owner is assigned a key, and the data is encrypted. Now, if it's moved to the cloud, it's okay if you don't control the infrastructure, because you still control the data. The only way that data can be accessed is if you make the keys available in the cloud, in your own environment, in what is known as a VPC. So the database, which is in the database administrator's domain, never has clear data. That's the threat we are protecting against: a database administrator whose credentials are compromised.

Gotcha. So I want to touch on Gen AI for a second. We can't have a conversation these days without talking about it. It's one of the hottest topics on the planet. Every company, not every, but so many are dipping their toes in the water. In terms of Gen AI queries, will protecting the data impact the accuracy or the speed of those queries?

This is a very, very interesting problem.
And we have an even more interesting solution, because one of the biggest issues with Gen AI is that your outcomes are only as good as what you put in. What makes it worse is that it actually lies to you if it doesn't have accurate data. So one of the biggest requirements now is to make sure that you keep the data protected while it is being fed into a large language model. There are various models emerging for how this can be done. There's more to come in the coming weeks and months that we'll be talking about, but to give you a sneak peek: the most critical way of making sure that you have good outcomes from Gen AI is to feed it data that can be processed just like it was the original data. So we talked about the fact that Baffle enables this capability where we are able to process data with analytics tools. The same exact model applies to Gen AI. What we're able to do is tokenize the data as it gets into object storage, which is usually the store for any kind of learning model. And then we allow processing, what are known as embeddings, to happen on that data while that data is protected. All of this is happening on the ingestion side. When it comes to the consumption side, the queries have to be monitored very carefully so that no sensitive data is leaking, because if you allow uncontrolled querying, there are no secrets. So we have a control on the extraction part to make sure that that data is protected as it is being extracted, so that you get that complete end-to-end control and protection of data without affecting the actual outcome of that query.

Got it. It's that end-to-end protection that you did a great job of really double-clicking into and describing: how Baffle is delivering that, and how you're really facilitating customers' migration to the cloud and their analytics and Gen AI use cases.
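One way to picture the tokenize-before-embedding step: pseudonymize obvious identifiers before the text ever reaches an embedding model, keeping the reversible mapping in a local vault. This is a hedged sketch, not Baffle's implementation; the regexes and the `pseudonymize` helper are hypothetical, and a production system would detect sensitive data far more robustly.

```python
import re

# Hypothetical patterns for two common kinds of PII.
PATTERNS = {
    "CARD": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def pseudonymize(text: str, vault: dict) -> str:
    """Replace PII with stable tokens; the mapping stays in a local vault,
    so the embedding model and vector store never see the raw values."""
    for label, pattern in PATTERNS.items():
        for match in pattern.findall(text):
            token = vault.setdefault(match, f"<{label}_{len(vault)}>")
            text = text.replace(match, token)
    return text

vault = {}
doc = "Contact jane@example.com, card 4111 1111 1111 1111."
safe = pseudonymize(doc, vault)
print(safe)   # → Contact <EMAIL_1>, card <CARD_0>.
# `safe` is what goes to the embedding pipeline; `vault` never leaves the enterprise.
```

The same vault can be consulted on the consumption side, which mirrors the query-monitoring point above: responses are screened, and only authorized callers ever get a token mapped back to its original value.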
Do you have a favorite customer story that you think really shines a spotlight on Baffle's value prop?

I have many. And the one common thread that runs between all of them is that customers always come to us and say, where were you when we were trying to do this ourselves? What we do is help customers completely get rid of the existing resources required to migrate data securely. So my favorite customer story is still the one where a very large industrial conglomerate was moving data to the cloud. They started out with the open source protection mechanisms that were available, and those completely failed; their application stopped working. In this case, their application was coded 20 years ago, and the application developers did not exist anymore. What we were able to do is move that application to AWS using the database migration service, with the data originating in the field (this is massive amounts of IoT data), encrypt the data as it's moving into the cloud, move the application over in what's called a lift and shift, and then start processing the data. My favorite part of the customer story is that they started extracting the data out to process it, which is our standard encryption model, only to come back and ask us, hey, is there a way around this? Why do we have to extract our data? And that's where the Baffle advanced encryption came in, where we were actually able to let them analyze that data without decrypting it, using the application that they had always been using, Tableau, which is now owned by Salesforce. So they were able to analyze that data without actually having to extract all of it and decrypt it. And that's where the value proposition really came in.
So it was the perfect two-step process, where we sold them a solution and then were able to upsell that solution, obviously for more revenue, but more importantly, for much more differentiated functionality that they continue to expand beyond just that initial application.

Ah, that's outstanding. You're really meeting customers where they are and making things a lot less complex, which in today's world of complexity is incredibly important. You gave us a couple of teasers about some of the things that are next for Baffle, but what are you looking forward to the most in the next six to 12 months?

We're looking forward to enabling more and more of these massive analytics use cases, something called data lakes. While we've talked about databases, we're continuing to expand beyond that into data lakes and data warehouses. As you know, there are so many interesting solutions out there, starting with Snowflake and Redshift, and now Databricks. So we're seeing a lot of good data processing capabilities emerging. We believe we are the one that unlocks access for data scientists to a lot more data, because you don't always have to make that trade-off. We call it the have-your-cake-and-eat-it-too solution: they can still get to sensitive data for better outcomes, especially when it comes to generative AI, but even for generic applications that just do analytics. So what we are able to do is unlock a lot more data that can be processed now with tools that they're already familiar with, without compromising the privacy of the data or getting into trouble with all of those compliance regulators. To me, that's the most exciting part. We're not building yet another security company; we're building a data processing company that has security built into the process.

Security built in, and I like the analogy of being able to have your cake and eat it too. I think a lot of customers will clearly see the value in the solution.
Thanks so much for coming on theCUBE as part of this CUBE Conversation and helping us understand what Baffle is doing, what's different, and the value in it for customers. We really appreciate your time.

Thank you, Lisa. It was a pleasure, and I'm looking forward to many more conversations like this as we make progress.

Me too. We're going to keep our eyes on baffle.io. We want to thank you for watching the program, and remind you to keep it right here for more action on theCUBE, the leader in hybrid tech event coverage. Thank you.