It's theCUBE, covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica.

Welcome back to the Vertica Virtual Big Data Conference, BDC 2020. You know, it was supposed to be a physical event in Boston at the Encore, but Vertica pivoted to a digital event, and we're pleased that theCUBE could participate, because we've participated in every BDC since the inception. With us this year is Rich Gaston, Global Solutions Architect for Security, Risk and Governance at Micro Focus. Rich, thanks for coming on. Good to see you.

Hey, thank you very much for having me.

So, you've got a chewy title, man. You've got a lot of stuff, a lot of hairy things in there, but maybe you could talk about your role as an architect in those spaces.

Sure, absolutely. We handle a lot of different requests from Global 2000-type organizations that are trying to move various business processes, application systems, and databases into new realms, whether they're looking at opening up new business opportunities, sharing data with partners securely, or migrating to cloud applications as part of a move into a hybrid IT architecture. So we'll take those large organizations, with their existing installed base of technical platforms, data, and users, and try to chart a course to the future using Micro Focus technologies, but also partnering with other third parties out there in the ecosystem. We have solid relationships with the big cloud vendors and with a lot of the big database vendors. Vertica's our in-house solution for big data and analytics, and we were one of the first integrated data security solutions with Vertica. We've had great success out in the customer base as organizations have tried to add another layer of security around their data.
So what we try to emphasize is an enterprise-wide data security approach, where you're looking at data as it flows through the enterprise: from its inception, where it's created or ingested, all the way through the utilization of that data, and then on to other uses, where we might be doing shared analytics with third parties. How do we do that in a secure way that maintains regulatory compliance and also keeps our company safe against data breach?

A lot has changed since the early days of big data, certainly since the inception of Vertica. It used to be that everybody was rushing to figure big data out. You had a lot of skunkworks going on, and it was just, go get the data and figure it out. And then as organizations began to figure it out, they realized, wow, who's governing this stuff? A lot of shadow IT was going on, and then the CIO was called in to sort of rein that back in. And with everything else, fake news, the hacking of elections and so forth, the sense of heightened security has gone up dramatically. So I wonder if you could talk about the changes that have occurred in the last several years and how you guys are responding.

Yeah, it's a great question, and it's been an amazing journey, because I was walking down the street here in my hometown of San Francisco at Christmastime years ago, and I got a call from my bank, and they said, we want to inform you that your card has been breached in the hack at Target Corporation. They got your card, and they also got your PIN, so you're going to need to get a new card; we're going to cancel this one. Do you need some cash? I said, yeah, it's Christmastime, so I need to do some shopping. And so they worked with me to make sure that I could get that cash and then get the new card and the new PIN. And being a professional on the inside of the industry, I really questioned: how did they get the PIN? Tell me more about this.
And they said, well, we don't know the details, but I'm sure you'll find out. And in fact, we did find out a lot about that breach and what it did to Target: the roughly $250 million immediate impact, CIO gone, CEO gone. This was a big one in the industry, and it really woke a lot of people up to the different types of threats on the data we're facing with our largest organizations, not just financial data, but medical data and personal data of all kinds.

Flash forward to the Cambridge Analytica scandal, where Facebook is handing off data under a partnership agreement with a firm it thinks it can trust, and then that data is misused. And who's going to end up paying the cost of that? Well, it's going to be Facebook, to the tune of about five billion dollars, plus some other fines that'll come along and other costs they're facing.

So what we've seen over the course of the past several years has been an evolution. First, data breaches making the headlines, and our customers coming to us saying, help us neutralize the threat of breach, help us mitigate and manage this risk. What do we need to be doing? What are the best practices in the industry? Clearly what we're doing on perimeter security, application security, and platform security is not enough; we continue to have breaches, and we are the experts at that answer.

The fascinating follow-on piece has been the regulators jumping in, first in Europe, but now we see California enacting a law, which came into effect just this year, that is very stringent and has a lot of deep, far-reaching protections around the personal data of consumers. We look at jurisdictions like Australia, where fiduciary responsibility now goes to the board of directors. That's getting attention: for a regulated entity in Australia, if you're on the board of directors, you'd better have a plan for data security, and if there is a breach, you need to follow protocols or you personally will be liable.
And that is a sea change we're seeing out in the industry. So we're getting a lot of attention on both fronts: how do we neutralize the risk of breach, but also, how can we use software tools to maintain and support our regulatory compliance efforts? When we work with, say, the largest money center bank out of New York, I've watched their audit year after year, and it's gotten more and more stringent, more and more specific: tell me more about this aspect of data security, tell me more about encryption, tell me more about key management. The auditors are getting better, and we're supporting our customers in that journey: to provide better security for the data, and to provide a better operational environment where they can roll new services out with confidence that they're not going to get breached, and confidence that they're not going to have a regulatory compliance fine or a nightmare in the press. Those are the major drivers that help us and Vertica sell together into large organizations, where we say, let's add some defense in depth to your data. And that's really a key concept in the security field, defense in depth. We apply it to the data itself by changing the actual data element, "Rich Gaston"; I will change that name into ciphertext. And that then yields a whole bunch of benefits throughout the organization as we deal with the life cycle of that data.

Okay, so a couple of things I want to mention there. First of all, this is totally a board-level topic. Every board of directors should really have cyber and security as part of its agenda, and it does, for the reasons you mentioned. The other is that GDPR got it all started; I guess it was May 2018 that the penalties went into effect, and that just created a whole domino effect. You mentioned California enacting its own laws, which in some cases are even more stringent, and you're seeing this all over the world. So one of the questions I have is, how do you approach all this variability?
It seems to me you can't just take a narrow approach; you have to have an end-to-end perspective on governance and risk and security and the like. So are you able to do that, and if so, how so?

Absolutely. I think one of the key concerns in big data in particular has been that we have a schema, we have database tables, we have columns, and we have data, but we're not exactly sure what's in there. We have application developers who are being given sandbox space in our clusters, and what are they putting in there? So can we discover that data? We have the tools at Micro Focus to discover sensitive data within your data stores, but we can also protect that data, and then we'll track it. And what we really find is that when you protect, let's say, five billion rows of a customer database, we can now know what is being done with that data on a very fine-grained, granular basis. We can say that this business process has a justified need to see the data in the clear, so we're going to give them that authorization, and they can decrypt the data. SecureData, my product, knows about that, tracks it, and can report on it: at this date and time, Rich Gaston did the following thing to pull data in the clear. And that can then be used to support your regulatory compliance responses, and in an audit, to say who really has access to this data and what really is that data.

Then with GDPR, we're getting down into much more fine-grained decisions around who can get access to the data and who cannot, and organizations are scrambling. One of the funny conversations I had a couple of years ago, as GDPR came into place, was with a couple of customers who were taking the sort of brute-force approach of: we're going to move our analytics and all of our data to European data centers, because we believe that if we do this in the US, we're going to violate their law, but if we do it all in Europe, we'll be okay.
And that was simply a short-term way of thinking about it. You really can't be moving your data around the globe to try to satisfy a particular jurisdiction. You have to apply the controls and the policies, and put the software layers in place, to make sure that anywhere someone wants to get at that data, we have the ability to look at that transaction and say it is or is not authorized, and that we have a rock-solid way of approaching that for audit, for compliance, and for risk management. And once you do that, you really open up the organization to go back and use those tools the way they were meant to be used. We can use Vertica for AI, we can use Vertica for machine learning, and for all kinds of really cool use cases being done with IoT and the other kinds of cases we're seeing that require data being managed at scale, but with security.

And that's the challenge, I think, in the current era: how do we do this in an elegant way? How do we do it in a way that's future-proof when CCPA comes in? How can I lay this on as another layer of audit, responsibility, and control around my data, so that I can satisfy those regulators as well as the folks over in Europe and Singapore and China and Turkey and Australia? It goes on and on; each jurisdiction out there is now requiring audit, and like I mentioned, the audits are getting tougher. If you read the news, the GDPR example, I think, is classic. They told us in 2016 it's coming. They told us in 2018 it's here. And they're telling us in 2020, we're serious about this, here are the fines, and you'd better be aware that we're coming to audit, and when we audit you, we're going to be asking some tough questions. If you can't answer those in a timely manner, then you're going to be facing some serious consequences. And I think that's what's getting attention.

Yeah, and the whole big data thing started with Hadoop, and Hadoop is open, it's distributed, and it just created a real governance challenge.
I want to talk about your solutions in this space. Can you tell us more about Micro Focus Voltage? I want to understand what it is, then get into sort of how it works, and then I really want to understand how it's applied to Vertica.

Yeah, absolutely. That's a great question. First of all, we were the originators of format-preserving encryption. We developed some of the core basic research out of Stanford University that then became the company Voltage; that's still the brand name we apply, even though we're part of Micro Focus. So the lineage still goes back to Dr. Boneh down at Stanford, one of my buddies there, and he's still at it, doing amazing work in cryptography and keeping the industry, and the science of cryptography, moving forward. It's a very deep science, and we want it to be peer-reviewed. We want it to be attacked. We want it to be proven secure, so that we're not selling something to a major money center bank that is potentially risky because it's obscure and proprietary. So we have an open standard. For six years, we worked with the Department of Commerce to get our standard approved by NIST, the National Institute of Standards and Technology. They initially said, well, AES-256 is going to be fine. And we said, well, it's fine for certain use cases, but for your database, you don't want to change your schema and you don't want that increase in storage costs. What you want is format-preserving encryption, and what that does is turn my name, Rich, into a four-letter ciphertext that can be reversed. The mathematics of that are fascinating, really deep and amazing, but we make it very simple for the end customer, because we produce APIs. These application programming interfaces can be accessed by applications in C or Java, C#, and other languages, but they can also be accessed in a microservice manner via REST and web service APIs. And that's the core of our technical platform.
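To make the "format-preserving" idea concrete, here is a toy sketch, emphatically not Voltage's actual FF1 algorithm, just an illustration of the property: a small Feistel network over decimal strings, keyed with HMAC, so a nine-digit input encrypts to another nine-digit string and decrypts back. (The key, round count, and round function are all illustrative choices.)

```python
import hashlib
import hmac

def _round(key: bytes, data: str, i: int, width: int) -> int:
    # Pseudorandom round function: HMAC of (round number, half-block),
    # reduced to a `width`-digit integer.
    digest = hmac.new(key, f"{i}:{data}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % (10 ** width)

def fpe_encrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    # Feistel over a decimal string: output has the same length and
    # alphabet as the input (the "format-preserving" property).
    # Use an even round count so encrypt/decrypt split points match.
    half = len(digits) // 2
    left, right = digits[:half], digits[half:]
    for i in range(rounds):
        w = len(left)
        new_right = f"{(int(left) + _round(key, right, i, w)) % (10 ** w):0{w}d}"
        left, right = right, new_right
    return left + right

def fpe_decrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    # Run the Feistel rounds in reverse to recover the plaintext.
    half = len(digits) // 2
    left, right = digits[:half], digits[half:]
    for i in reversed(range(rounds)):
        w = len(right)
        prev_left = f"{(int(right) - _round(key, left, i, w)) % (10 ** w):0{w}d}"
        left, right = prev_left, left
    return left + right
```

The real FF1 standard (NIST SP 800-38G) builds the round function from AES and handles arbitrary alphabets and tweaks, but the shape is the same: a keyed permutation whose output stays inside the input's character set and length.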
We have an appliance-based approach. We take a SecureData appliance and put it on-prem; we'll make 50 of them if you're a big company like Verizon and you need to have these co-located around the globe. No problem. We can scale to the largest enterprise needs, but our typical customer will install several appliances and get going with a couple of environments, like QA and prod, to start getting encryption going inside their organization. Once the appliances are set up and installed, which takes just a couple of days of work for a typical technical staff, you're up and running and able to plug in the clients.

Now, what are the clients? Vertica is a huge one. Vertica is one of our most powerful client endpoints, because you're able to take that API and put it inside Vertica. It's all documented openly on the internet; you can go look at vertica.com/securedata, get all of our documentation, and understand how to use it very quickly. The APIs are super simple; they require three parameter inputs. It's a really basic approach to being able to protect and access data. And then it gets very deep from there, because you have data like credit card numbers, which are very different from a street address, and we want to take a different approach to each. You have data like birth dates, and we want to be able to do analytics on dates; we have deep approaches to running analytics on protected data, like dates, without having to put it in the clear.

So we maintain a lead in the industry as the innovator of the FF1 standard; FF1 is the standardized name for our format-preserving encryption. We license it to others in the industry per our NIST agreement, so we're the owner and operator of it, and others use our technology. We're the original founders of that, and we continue to lead the industry by adding capabilities on top of FF1 that really differentiate us from our competitors. Then you look at our API presence.
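The "three parameter inputs" shape described above might look something like the following sketch. The endpoint URL, JSON field names, and parameter meanings here are all assumptions for illustration, not the actual SecureData REST API; the point is just that a protect call needs roughly three things: the data, the format to preserve, and an identity/auth context.

```python
import json
from urllib import request

# Hypothetical appliance endpoint -- the real SecureData API will differ.
APPLIANCE_URL = "https://securedata.example.com/api/protect"  # assumed

def build_protect_request(value: str, fmt: str, identity: str) -> request.Request:
    """Build (but do not send) a protect call with the assumed three inputs:
    the plaintext value, the format to preserve, and the caller's identity."""
    payload = json.dumps({"data": value, "format": fmt, "identity": identity})
    return request.Request(
        APPLIANCE_URL,
        data=payload.encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

In a real client you would send this over TLS with `urllib.request.urlopen` (or an HTTP library) and read the ciphertext back from the response body.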
We can definitely run in Hadoop, but we also run on open systems, on mainframe, on mobile. Anywhere in the enterprise, and in the cloud; anywhere you want to be able to protect or access data, we're going to be there and able to support you.

Okay, so, I've talked to a lot of customers this week; let's say I'm running in Eon Mode, and I've got some workloads running in AWS and some on-prem. I'm going to take an appliance, or multiple appliances, and put it on-prem, but will that also secure my cloud workloads, as part of a sort of shared responsibility model, for example? How does that work?

That's absolutely correct; we're really flexible in that we can run on-prem or in the cloud, as far as our crypto engine goes. Key management is really hard stuff, cryptography is really hard stuff, and we take care of all that. We've baked that all in, and we can run it for you as a service, either in the cloud or on-prem on small VMs, so it's a really lightweight footprint for running your infrastructure. When I look at an organization like you just described, it's a classic example of where we fit, because we'll be able to protect that data. Let's say you're ingesting it from a third party or from an operational system. You have a website that collects customer data; someone has now registered as a new customer, and they're going to do e-commerce with you. We'll take that data and protect it right at the point of capture, and we can then flow it through the organization and decrypt it at will on any platform you have that you need us to operate on. So let's say you want to take that customer data from the operational transaction system, throw it into Eon, throw it into the cloud, and do analytics there on that data, and we may need some decryption. We can place SecureData wherever you want, to be able to service that use case.
In most cases, what you're doing is a simple, tiny, atomic key fetch across a protected tunnel, your typical TLS type of tunnel. And once that key is cached within our client, we maintain all that technology for you. You don't have to know about key management or caching; we're good at that, that's our job. And then you'll be able to make those API calls to access or protect the data, and apply the authorization and authentication controls that you need to service your security requirements. So you might have third parties having access to your Vertica clusters. That is a special need, and we have the ability to say employees can get X and the third party can get Y. And that's a really interesting use case we're seeing for shared analytics on the internet now.

Yeah, for sure. So you can set the policy how you want. I have to ask you: in a perfect world, I would encrypt everything, but part of the reason people don't is performance concerns. Can you talk about that? You touched on it, I think, just now with your sort of atomic key fetch, and I know Vertica, it's a Ferrari, et cetera, but anything that slows it down is going to be a concern; our customers are concerned about that. What are the performance implications of running encryption on Vertica?

Great question there as well. What we see is that we want to be able to apply scale where it's needed. If you look at the ingest platforms we find, Vertica is commonly connected up to something like Kafka, maybe StreamSets, maybe NiFi; there are a variety of different technologies that can route and pipe that data into Vertica at scale. SecureData is architected to go along with that architecture, at the node, at the executor, at the lowest operator level. And what I mean by that is that we don't have a bottleneck where everything has to go through one processor or one box or one channel to operate.
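The "fetch a key once over TLS, then cache it in the client" pattern described above can be sketched as follows. This is a generic client-side key cache, assuming a `fetch_fn` that stands in for the appliance round trip and an illustrative TTL; it is not the actual SecureData client, just the caching idea.

```python
import time

class KeyCache:
    """Client-side key cache: one atomic fetch per format, then reuse
    the cached key until its TTL expires (no further network round trips)."""

    def __init__(self, fetch_fn, ttl_seconds: float = 300.0):
        self._fetch = fetch_fn      # e.g. a call to the appliance over TLS
        self._ttl = ttl_seconds
        self._cache = {}            # format name -> (key bytes, expiry time)

    def key_for(self, fmt: str) -> bytes:
        hit = self._cache.get(fmt)
        if hit and hit[1] > time.monotonic():
            return hit[0]           # cache hit: served locally
        key = self._fetch(fmt)      # cache miss: one atomic key fetch
        self._cache[fmt] = (key, time.monotonic() + self._ttl)
        return key
```

Because every node caches its own keys independently, crypto operations after the first fetch are purely local CPU work, which is why this design scales with the cluster rather than against it.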
We don't put an interceptor in between your data coming and going; that's not our approach, because those approaches are fragile and slow. Instead, we typically focus on integrating our APIs natively within those pipeline processes that feed into Vertica. Within the Vertica ingestion process itself, you can simply apply our protection when you do the COPY command in Vertica. So for a really basic, simple use case that everybody in Vertica land is typically familiar with, copying data and putting it into Vertica, you simply apply "protect" as part of the ingestion. Say my first name is coming in as part of this ingestion; I'll simply put the protect keyword in the syntax, right in SQL. It's nothing other than an extension to SQL; very simple for the developer, easy to read, easy to write. And then you provide the parameters you need to say, oh, the name is protected with this kind of a format, to differentiate it from a credit card number or an alphanumeric string, for example.

Once you do that, you then have the ability to decrypt. On decrypt, let's look at a couple of different use cases. First, within Vertica, we might be doing SELECT statements, or all kinds of jobs that just operate at the SQL layer. Again, you just insert the word "access" into the Vertica SELECT string, along with the data you want to access; "access" is our word for decryption, that's our lingo. We will then, at the Vertica level, harness the power of its CPU, its RAM, its horsepower at the node, to operate on that decryption request at the operator level, if you will. So that gives us the speed and the ability to scale up. If you start with two nodes of Vertica, we're going to operate at X hundreds of thousands of transactions a second, depending on what you're doing, right?
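As a rough sketch of what "protect on COPY, access on SELECT" looks like in practice: the shipped Vertica integration exposes this through SQL functions, and the statement shapes below are approximations, the function names (`VoltageSecureProtect`, `VoltageSecureAccess`), the FILLER-column pattern, and the format name are taken as assumptions here, so treat the exact syntax as illustrative rather than authoritative.

```python
def protect_on_ingest(table: str, column: str, fmt: str) -> str:
    """Compose a hedged example of a Vertica COPY that protects a column
    as it is loaded (raw value read into a FILLER, protected value stored)."""
    return (
        f"COPY {table} (raw_{column} FILLER varchar(64), "
        f"{column} AS VoltageSecureProtect(raw_{column} "
        f"USING PARAMETERS format='{fmt}')) FROM STDIN;"
    )

def access_on_select(table: str, column: str, fmt: str) -> str:
    """Compose a hedged example of a SELECT that decrypts ("accesses")
    a protected column for an authorized reader."""
    return (
        f"SELECT VoltageSecureAccess({column} "
        f"USING PARAMETERS format='{fmt}') FROM {table};"
    )
```

Because both calls run as ordinary SQL functions inside Vertica's query plan, the crypto work is distributed across the same nodes and operators as the rest of the query, which is the "no interceptor, no bottleneck" point made above.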
Long strings are a little more intensive in terms of performance, but short strings like Social Security numbers are our sweet spot, so we operate at very high speed on those, and you won't notice the overhead with Vertica, per se, at the node level. When you scale Vertica up and you have 50 nodes and large clusters of Vertica resources, then we scale with you, and we're not a bottleneck at any particular point. Everybody's operating independently, but they're all copies of each other, all doing the same operation: fetch a key, do the work, go to sleep.

Yeah, you know, a lot of the customers have said to us this week that one of the reasons they like Vertica is that it's very mature; it's been around, and it's got a lot of functionality. And of course, look, security, I understand, is kind of table stakes, but it also can be a differentiator. The big enterprises that you sell to are asking for security assessments, SOC 2 reports, penetration testing, and I think I'm hearing that with the partnership here, you're sort of passing those with flying colors. Are you able to make security a differentiator, or is it just sort of, everybody's got to have good security? What are your thoughts on that?

Well, there's good security, and then there's great security. What I found with one of my money center bank customers, based here in San Francisco, was a concern around insider access to a large data store: the concern that a DBA, a database administrator who has privileges to everything, could potentially exfiltrate data out of the organization and, in one fell swoop, create havoc for them, because of the amount of data present in that data store and the sensitivity of that data. When you put Voltage encryption on top of Vertica, what you're doing is putting a layer in place that would prevent that kind of a breach.
So you're looking at insider threats, you're looking at external threats, and you're looking at being able to pass your audit with flying colors. The audits are getting tougher. When they say, tell me about your encryption, tell me about your authentication scheme, show me the access control list that says this person can or cannot get access to something, they're asking tougher questions. That's where SecureData can come in and give you that quick answer: it's encrypted at rest, it's encrypted and protected while it's in use, and we can show you exactly who's had access to that data, because it's tracked via a different layer, a different appliance.

And I would even draw an analogy. Many of our customers use a device called a hardware security module, an HSM. These are fairly expensive devices that were invented for military applications and adopted by banks, and now they're really spreading out, and people ask, do I need an HSM? Well, with SecureData, we certainly protect your crypto very well; we have very solid engineering, and I'll stand on that any day of the week. But your auditor is going to want to ask a checkbox question: do you have HSMs, yes or no? Because the auditor understands it's another layer of protection. It provides another tamper-evident layer of protection around your key management and your crypto, and we as professionals in the industry nod and say, that is worth it. It's an expensive option that you're going to add on, but your auditor's going to want it. If you're in financial services and dealing with PCI data, you're going to enjoy the checkbox that says, yes, I have HSMs, and not get into some arcane conversation of, well, no, but it's good enough. That's the kind of conversation we get into when folks say, Vertica has great security, Vertica's fantastic on security, why would I want SecureData as well? It's another layer of protection, and it's defense in depth for your data.
When you believe in that, when you take security really seriously, and when you're really paranoid, like a person like myself, then you're going to invest in those kinds of solutions that get you best-in-class results.

So I'm hearing a data-centric approach to security. Security experts will tell you, you've got to layer it. I often say we live in a new world: the queen used to just build a moat around the castle, but now she's leaving her castle in this world of distributed data. Rich, incredibly knowledgeable guest; I really appreciate you being on the front lines and sharing your knowledge with us about this important topic. So thanks for coming on theCUBE.

Hey, thank you very much.

You're welcome. And thanks for watching, everybody. This is Dave Vellante for theCUBE. We're providing wall-to-wall coverage of the virtual Vertica Big Data Conference, remotely, digitally. Thanks for watching. Keep right there; we'll be right back after this short break.