The challenges of legacy data warehouses and traditional business intelligence systems have been well documented. They're built on rigid infrastructure and they're managed by really specialized gatekeepers. Data warehouses in the past were, as one financial customer once said to me, like a snake swallowing a basketball. Right, imagine that. The amount of data ingested into a data warehouse just overwhelmed the system. Every time Intel came out with a new microprocessor, practitioners would chase the chip in an effort to compress the overly restrictive elapsed time to insights. And this cycle repeated itself for decades. Cloud data warehouses generally, and Snowflake specifically, changed all this. Not only were resources virtually infinite, but the ability to separate compute from storage and actually turn off the compute when you weren't using it permanently altered the cost, the performance, the scale and the value equation. But as data makes its way into the cloud and is increasingly democratized as a shared resource across clouds and at the edge, practitioners have to bring a SecDevOps mindset to securing their cloud data warehouses.

Hello and welcome to this week's Wikibon CUBE Insights powered by ETR. In this Breaking Analysis, we take a closer look at the fundamentals of securing Snowflake, and to do so, we welcome two guests to the program. Ben Herzberg is an experienced hacker and developer and an expert in several aspects of data security. He's currently working as the chief data scientist at Satori, and he's joined by his colleague, Yoav Cohen, who is a technology visionary and currently serving as CTO at Satori Cyber. Gentlemen, welcome to theCUBE. Great to see you.

Great to be here.

Now these two individuals have co-authored a book on Snowflake security. It's a comprehensive guide to what you need to know as a data practitioner using Snowflake. So guys, congratulations on the book.
It's really detailed, packed with great information, best practices and practical advice and insights all in one place, so really good work. But before we get into the discussion, I want to share some ETR survey data just to set the context. We're seeing cybersecurity and data colliding in a really important way. And here are some data points that we've shared before from ETR's latest drill-down survey. They asked more than 1,200 respondents, we're talking CIOs, CISOs and IT professionals, which organizational priorities will be most important in 2022, and these were the top seven. There were a lot of others, but these were the most important. So it's no surprise that security is number one, although as we shared in our predictions post, the magnitude of its relative importance does vary by the degree of expertise within the organization. The delta's maybe not as significant, for example, in large companies. And you can see where analytics and data fit. And we've tied these two domains together and picked up on a term that our two guests have used. In fact, you guys may have even coined it: DataSecOps, which to me is the idea that you bring agile DevOps practices to data operations and build in security as part of the full cycle of managing data, whether creating the data, using the data or accessing the data. It's not a bolt-on, it's fundamental. So guys, what do you make of this data and what's your point of view on DataSecOps?

So it definitely aligns with what we're seeing on the ground in the market. In what you saw there, you had cybersecurity and data warehousing. In the middle, you had cloud migration. And that's basically what's pushing companies to invest in both security and data warehousing, because the cloud changed the game for cybersecurity. The tools that we used before are not the same tools that we need to use now. And it also unlocks a lot of performance, value and capabilities around data warehousing.
So all of that comes together into a big trend in the industry for investment, for replacement. And we're definitely seeing that on the Snowflake platform, which is doing really, really well recently.

Yeah, well, thank you, Yoav. To that point, I want to share another data point and then dive in. Maybe Ben, you can comment. And I want to address why we are always talking about Snowflake. Of course it's a hot company. Everybody knows that. You can see it in the company's financials. But the ETR survey data tells a really compelling story about the company. Here's a chart from the most recent ETR January survey. You can see at the top that blue line, which represents net score, or spending momentum. And the darker line at the bottom represents presence, or pervasiveness, in the survey sample. Just for background, there are 165 Snowflake customers that responded to this past survey. 10% of companies within the Fortune 500 were in the sample and around 4% of Global 2000 companies participated. Just under 30% of the respondents were C-suite executives and about 20% were analysts, engineers or data specialists. Around half were in VP, director or manager roles, that fat middle. There was a very broad mix of industries and a bias toward larger companies.

Now, back to the chart and net score for a moment; that is that top line. It's derived by asking customers: Are you adopting Snowflake new in 2022? That's the 27% lime green number. Will you be spending 6% or more on Snowflake relative to 2021? That's the 57% forest green. Is your spending flat? That's the gray. Is it down by 6% or worse? That's the pink area. Are you leaving the platform? That's the bright red, and that's zero defections. So there's none there. So you subtract the reds from the greens and you get net score, which calculates out to 83% in this past survey. But what's remarkable is that Snowflake has held this elevated score for more than 12 quarterly surveys.
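As a rough illustration of the net score arithmetic described above (greens minus reds), here is a minimal Python sketch. The function name `net_score` and its parameter names are hypothetical, not ETR's. The 27%, 57% and 0% figures come from the discussion; the 1% "spending down" figure is not quoted anywhere in it and is back-solved from the stated 83% result, so treat it as an assumption that may hide rounding.

```python
def net_score(adopting_new, spending_more, spending_less, leaving):
    """Return net score in percentage points: greens minus reds.

    Flat spending (the gray segment) does not enter the calculation.
    """
    greens = adopting_new + spending_more
    reds = spending_less + leaving
    return greens - reds


# Figures quoted in the discussion: 27% new adoption, 57% spending more,
# 0% leaving. The 1% spending-less value is an assumption inferred from
# the stated 83% net score, not a number given by the speaker.
score = net_score(adopting_new=27, spending_more=57, spending_less=1, leaving=0)
print(score)  # 83
```

Anything above 40 on this scale is what the discussion calls elevated.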
That score is in the stratosphere among the many thousands and thousands of customer companies in the ETR survey. Remember, anything above that 40% line is elevated, and Snowflake is like glued to the ceiling. And the bottom line shows that the company's market presence continues to grow. That darker line at the bottom, that green shade, shows that the pace from last quarter is actually accelerating. Snowflake is becoming ubiquitous and customers are becoming intimately familiar with its platform, and it's scaling like we've never seen before. And it's building a pretty hard to penetrate fortress, we think, and an ecosystem. Ben, I wonder, in your view, what accounts for Snowflake's performance?

Okay, so I would say that we could spend a full session on just that. So I'll try to say what I think. I think first of all, it does what it says on the box. You get from zero to having a data warehouse easily. You have very rich support for the capabilities and features that you need in a cloud data warehouse. You're multi-cloud, so you're not dependent on one of the big public clouds. And it's fast and scalable, and you don't need to worry about the infrastructure behind it. You don't need to, God forbid, add any indexes or do things like that. You don't need to do that, at least not often. Indexes never, but other maintenance. And the innovation rate: they innovate fast. They add a lot of new capabilities, like the move to unstructured data and a lot of security and governance capabilities. So a high innovation rate as well.

Okay, good, and we'll talk about that move. Let's get deeper into the topic now of securing Snowflake. My first question is, look, when you talk to practitioners and customers, Snowflake gets pretty high marks on security, largely because of the simplicity. So why did you feel the need to write a book on the subject?

So definitely Snowflake is investing a lot of effort and putting a lot of emphasis on security.
However, Snowflake is a cloud service, and like any other cloud service, there's a shared responsibility model between Snowflake and its customers when it comes to fully securing their data cloud. So Snowflake can build amazing features, but then customers have to really adopt them and implement them in the best way. One of the things that we've seen by working with Snowflake customers is that we typically interact with data engineers, but then they have to implement security features and security capabilities. We thought writing a book about the topic would help these customers understand the features better, benefit from them more, and really structure their implementation and decide what's most important to implement at every step of their journey.

Yeah, I think that when I was researching this topic, I could find a lot of good information on the web, but I kind of had to hunt and peck for it. It was really sort of dispersed, and you put the information all in one place. You have a nice table of contents so I can just zip right to where I want to go. So that was quite useful, I thought. What are the very basic fundamentals of securing Snowflake? In other words, you get this world that's flexible, it's globally distributed, you're democratizing data. How do you really make sure that only those folks that should have access do have access? I mean really, let's talk about that a little bit.

I think that, of course, there are a lot of different aspects, but I would start with the big blocks. For example, when you get a Snowflake account out of the box, it's open to the world in terms of network. I would start by limiting that. That should be easy for an organization. It's a couple of commands and you've lowered your risk significantly, both security and compliance. Then one of the common areas where you can get a good decrease in your risk is authentication.
For example, do you have applications that are accessing Snowflake using a username and password? Okay, change that to using a key. Do you have users with a username and password? Change that to an Okta integration or your IdP integration. So I would start with the big blocks that can remove most of my risk. And then of course, there is a lot to do, from getting to the data warehouse to auditing and monitoring.

Okay, thank you for that. But Yoav, how are these fundamentals that we just heard from Ben different? This is kind of common sense. What's unique about Snowflake?

So a couple of things. First of all, with security, we like to say that it's 80% good security hygiene. You have to make sure that your basics are locked down and tightly configured, and that brings a lot of value. But two points to consider. First of all, all of these types of controls are pretty static, in the sense that once you get in, you get in, and then you have pretty, pretty broad access. We'll talk about authorization concepts and everything perhaps today, but these are really static gatekeepers around your data. Once you have access, then it's a free-for-all. Compare that to other types of environments and what we're seeing in other domains: a move to more dynamic types of controls, such as additional authentication steps before you get elevated access. And what we're thinking is that beyond those static controls, the market is going to move towards implementing more dynamic, more fine-grained controls, especially because Snowflake, like any other data warehouse or large-scale data store, becomes an aggregation point of data in the company. We work with really big companies, and they bring in data from multiple jurisdictions across the world so they can get an overview of the business and run the business in a much more efficient way, but that really creates a pressure point when it comes to securing that data.
Okay, Ben, you touched on this a little bit. I want to dig deeper, because Snowflake takes a layered approach, and of course, that's sensible. The layers are network, which we talked about, identity, access and encryption. And so with any cloud, as you guys mentioned, it's a shared responsibility model. So I want to break that down a bit, and let's start with the network. As a customer, I'm going to be responsible for setting up the DNS. How much public internet access am I going to have for other users and apps? So how should practitioners think about their end of the bargain on the network? What do they need to know?

At the network level, as I mentioned before, a new account is open, network-wise, to the world. And one of the first things I would do would be to set a network policy on the account to limit network access to that account. And of course, in many organizations, you would want to configure that with private link to your cloud environment. But that would be step two. The first step is simply to set a network policy to make sure that it's not open to the public.

Yeah, and that seems pretty straightforward. But let's talk about identity, because it feels like that's where it starts to get tricky. You've got to worry about setting up roles and managing users. You could even configure row- and column-based access, as I understand it. And I imagine access is where it really gets kind of confusing for a lot of people, especially when you're crossing domain identities. For example, role-based security. Let's land on that for a minute. I think you called it hierarchy hell in the book. So what should we think about in regards to identity?

So first of all, on hierarchy hell: in the book it says that you can use hierarchy, but you should avoid getting to a hierarchy hell.
Basically, we've seen that with several Snowflake customers, where the ability to set roles in a hierarchy model, to set a role that inherits privileges from another role that inherits privileges from other roles, may be, of course, used in a good way, but in some cases it leads to complexity and to access not being deterministic, or at least not obvious to the person who gives access, who's usually the data engineer. So whenever you start having a complex authorization model, I want to give Yoav access to a certain data set, and because things are complex, I also, by mistake, give him access to the salary information of the company. That's when things become tricky. If your roles are messy and complex, then it may lead to data exposure within the organization or outside the organization.

How do you find Snowflake's integrations? Like if I want to use Okta or I want to use CyberArk, how would you grade them on their ability to integrate with popular third-party platforms?

So I would say pretty high. Actually, we haven't encountered many customers who haven't configured any of these, nowadays pretty basic, security integrations, and it really, really helps in setting that good identity management foundation for the platform. So they're investing a lot in that area. We've been following them for a couple of years now and it's really coming along nicely.

All right, let's talk about encryption. I mean, that seems pretty straightforward. Correct me if I'm wrong. I think Snowflake auto-rotates the keys every 30 days. It really seems like your responsibility there is monitoring, making sure you're in compliance. You've got good log data or access to good log data. Is that right?

So this really depends. For the average company, I would say yes.
For some of the companies with higher security requirements or compliance requirements or both, sometimes there are issues, like companies that do not want to have the data stored in clear text in Snowflake, even when it's encrypted by the data warehouse encryption or the account encryption. Even if someone accidentally gets access to the table, they want them not to be able to pull the data in clear text, and then it gets slightly more complicated. You have different ways of tackling this, but for the average company, or companies who do not have such requirements, everything in Snowflake is encrypted in transit and at rest, and of course there are more advanced features for higher requirements.

Okay, I'm interested in what you guys think of some of the more vulnerable aspects that Snowflake customers should really be aware of. Imagine I'm saying, guys, let's run a pen test. Make sure, yeah, okay, make sure I have no open chest ones but really try to fool me. What would you attack? Where should I be extra cautious?

So I would start with where data resides. If you look at the Snowflake architecture, there's a separation between storage and compute, but that also means storage is accessible without going through the compute. That can create opportunities for hackers to go and try to find access where access shouldn't be had. That's where I would focus.

Great, I want to ask you about Virtual Private Snowflake. I mean, it seems to me, if I have sensitive data and I don't use Virtual Private Snowflake, I feel like I'm increasing my risk that a security incident at the shared cloud services layer could impact multiple customers. Is this a valid concern? How should we think about reducing that risk, and when should I use that higher level of security?

So I think first of all, to the best of my knowledge, and I'm not a Snowflake employee, Virtual Private Snowflake is used by a minority of the customers, a small minority of the customers.
There are other more popular ways within Snowflake, like private link, for example, to enhance your security and your account segregation. But I wouldn't say that simply because a platform is multi-tenant, it is vulnerable. Of course, in many cases, your security or compliance requirements require you to eliminate even this risk, but I would say that there are a lot of other platforms in different areas that are multi-tenant. And it's probably better than your average on-prem installation.

Okay, so I'll buy that.

I would say on that, that maybe a shared environment is a higher value target for hackers. So if you're on a shared environment with thousands of other customers, if I'm a hacker, I would go there, because then I get data for thousands of customers instead of trying to focus on just one target and getting data for just one company. I think that's the most significant advantage, and obviously Snowflake is investing a lot in making all of its environments very, very secure. And from our interactions with large Snowflake customers, we know that Snowflake is going above and beyond in making sure that these environments are secure.

Yeah, that's good, that's good news, because if I don't have to spend up, I could put the budget elsewhere. How do you guys think about Snowflake's recent moves? They're making a couple of big moves. They've recently added unstructured data; they used to have semi-structured data. They're going after data science and data lake functionality. Do those kinds of moves, and I guess they're two different things, change the way that a security pro should think about protecting their Snowflake environments?

I would say that Snowflake is moving fast with adding new functionality. Well, fast, but not too fast. They're releasing it in a controlled way.
I would say that new capabilities, of course, in some cases bring new attack vectors or new risks, and obviously securing different types of data may bring new challenges, but the basics, I think, remain the same. The basics of network, identity, authentication, authorization, auditing and monitoring, I would say, will be the same, and perhaps new features or capabilities will need to be used. And the largest issue is that data democratization is growing within organizations and more and more people are using your data cloud, and that also needs to be addressed.

Yeah. All right, finally, I want to talk a little bit about futures. You guys talked in your book about multi-cloud as a way to reduce your reliance on a single vendor, and of course it happens through M&A, and that's cool. We've talked a lot about multi-cloud, and we've been using this term that we coined called supercloud. It references an abstraction layer that exists on top of and floats across, if you will, multiple clouds, and it hides some of that underlying complexity. We feel like Snowflake is a good example of a company that's moving in that direction, building value on top of all that hyperscale infrastructure. So I wonder how you see Snowflake's moves in that direction impacting the way you think about DataSecOps.

So we definitely also see the trend of companies adopting more and more types of cloud and cloud technologies. They're in one cloud today, they want to move to a second one; almost every company that I talk to nowadays has a multi-cloud strategy. With respect to Snowflake, they basically have it figured out, because they are an overlay, like a supercloud, a super data cloud that is spread across any cloud, and you can basically pick and choose where you want to put your data for which use cases. That's really, really helpful, because then you don't have to manage the complexity of multiple solutions for multiple areas of the business.
We see this also in other areas, where companies are saying, hey, I prefer not to use a specific cloud technology for that purpose, but to make use of a vendor that can cover my needs across the clouds. That's definitely true on the security side, where they want one throat to choke, so to speak, and they want to control things in a central place. As Ben mentioned before, complexity is the enemy of security, and having those multi-cloud operations, from a security perspective, definitely adds complexity, which adds risk. So simplifying that is really, really helpful.

Hey, thank you for that, and thank you guys for coming on today. Why don't you give us a little bumper sticker on Satori? What do you guys do? Give us the quick commercial.

So we help companies secure access to their data on platforms like Snowflake and others. We build really innovative technology that decouples security controls from the actual data layer. If you think about where you can put controls to govern how people access data, you can put them inside the database. You can put them somewhere on the client. We've actually invented a technology that can do that in the middle. So you don't have to coalesce and mix your security concerns with your data. You don't have to go to your clients, users, endpoints and laptops and put technology there. It's a technology that sits in the middle, decouples that aspect of your DataSecOps operation, and really helps companies implement those security controls much faster, because it's detached from the rest of their operation.

Nice, leaning into that simplicity trend that you talked about. Okay guys, that's all the time we have today. I really want to thank Ben and Yoav for coming on theCUBE. It was really great to have you. We'd love to welcome you back at some point.

Thank you, Dave.

All right, remember, these episodes are all available as podcasts wherever you listen. All you've got to do is search "Breaking Analysis podcast". Check out ETR's website at ETR.ai.
We also publish a full report every week on wikibon.com and siliconangle.com. You can get in touch with me: email me at david.vellante@siliconangle.com, reach me @dvellante, or comment on our LinkedIn posts. This is Dave Vellante for theCUBE Insights, powered by ETR. Have a great week, stay safe, be well, and we'll see you next time.