Hello everybody, and thank you for joining us today for the Virtual Vertica BDC 2020. Today's breakout session is entitled "Keep Data Private: Prepare and Analyze Without Unencrypting with Voltage SecureData for Vertica." I'm Paige Roberts, Open Source Relations Manager at Vertica, and I'll be your host for this session. Joining me is Rich Gaston, Global Solutions Architect, Security, Risk, and Governance at Voltage. Before we begin, I encourage you to submit your questions or comments during the virtual session. You don't have to wait until the end. Just type your question or comment as it occurs to you in the question box below the slides and then click Submit. There will be a Q&A session at the end of the presentation where we'll try to answer as many of your questions as we're able to get to during the time. Any questions that we don't address, we'll do our best to answer offline. If you want, you can visit the Vertica forum to post your questions there after the session. That's going to take the place of the developer lounge. Our engineering team is planning to join the forums to keep the conversation going. As a reminder, you can also maximize your screen by clicking the double arrow button in the lower right corner of the slide. That'll allow you to see the slides better. Before you ask, yes, this virtual session is being recorded and it will be available to view on demand this week. We'll send you a notification as soon as it's ready. All right, let's get started. Over to you, Rich.
Hey, thank you very much, Paige, and I appreciate the opportunity to discuss this topic with the audience. My name is Rich Gaston and I'm a Global Solutions Architect within the Micro Focus team.
I work on global data privacy and protection efforts for many different organizations looking to take that journey toward breach defense and regulatory compliance, on platforms ranging from mobile to mainframe and everything in between. Cloud, you name it, we're there in terms of our solution sets. Vertica is one of our major partners in this space and I'm very excited to talk with you today about our solutions on the Vertica platform. First, let's talk a little bit about what you're not going to learn today. On screen you'll see just part of the mathematics that goes into the format-preserving encryption algorithm. We are the originators, authors, and patent holders of that algorithm. It came out of research from Stanford University back in the 90s, and we are very proud to have taken it out to the market through the NIST standards process and to license it to others. So we're the originator and maintainer of both standards and a thought leader in the industry. We try to make this easy, and you don't have to learn any of this tough math. Behind this there are also many other layers of technology that are part of the SecureData platform, such as stateless key management. It's a really complex area and we make it very simple for you. We have very mature and powerful products in that space that really make your job quite easy when you want to implement our technology within Vertica. So today our goal is to make data protection easy for you, and for you to be able to understand the basics of Voltage SecureData. You're going to be learning how the Vertica UDX can help you get started quickly, and we're going to see some examples of how Vertica plus Voltage SecureData work together in our customer cases out in the field.
Let's take you through a quick introduction to Voltage SecureData, the business drivers, and what this is all about. First of all, we started off with breach defense.
We see that despite continued investments in perimeter and platform security, data breaches continue to occur. Voltage SecureData plus Vertica provides defense in depth for sensitive data, and that's a key concept that we're going to be referring to. In the security field, defense in depth is a standard approach to providing more layers of protection around sensitive assets such as your data, and that's exactly what SecureData is designed to do. Now that we've come through many of these breach examples and big-ticket items getting in the news around breaches and their impact on business, regulators have stepped up and regulatory compliance is now a hot topic in data privacy. Regulations such as GDPR came online in 2018 for the EU. CCPA came online just this year, a couple of months ago, for California and is the de facto standard for the United States now, as organizations are trying to look at the best practices for providing regulatory compliance around data privacy and protection. These give massive new rights to consumers but also obligations to organizations to protect that personal data. SecureData plus Vertica provides fine-grained authorization around sensitive data and we're going to show you exactly how that works within the Vertica platform. At the bottom you'll see some snippets of the news articles that just keep racking up. Our goal is to keep you out of the news and to keep your company safe, so that you can have the assurance that even if there is an unintentional or intentional breach of data out of the corporation, if it is protected by Voltage SecureData it'll be of no value to those hackers, and then you have no impact in terms of risk to the organization.
What do we mean by defense in depth? Well, let's take a look first at the encryption types and the benefits that they provide. We see our customers implementing all kinds of different protection mechanisms within the organization.
You could be looking at disk-level protection, file system protection, or protection on the files themselves. You could protect the entire database. You could protect your transmissions as they go from the client to the server via TLS or other protected tunnels. And then we look at field-level encryption, and that's what we're talking about today. That's all of the above protections at the perimeter level and at the platform level, plus we're giving you granular access control to your sensitive data. Our main message is: keep the data protected at the earliest possible point and only access it when you have a valid business need to do so. That's a really critical aspect as we see Vertica customers loading terabytes and petabytes of data into Vertica clusters and giving access to that data out to a wide variety of end users. We started off with organizations having four people in an office doing data science or analytics or data warehousing or whatever it's called within an organization, and that's now ballooned out to a new customer coming in and telling us, we're going to have a thousand people accessing it, plus service accounts accessing Vertica. We need to be able to provide fine-grained access control, understand what folks are doing with that sensitive data, and secure it with the best practices possible. Stated very simply, Voltage protects data at rest and in motion. The encryption of data facilitates compliance and it reduces your risk of breach. So if you take a look at what we mean by field level, we could take a name, and that name might not just be in US ASCII. Here we have a sort of Latin-1 extended example of Harold Potter, and we can take a look at the example protected data. Notice that we're taking a character set approach to protecting it, meaning I've got an alphanumeric option here for the format that I'm applying to that name.
That gives me a mix of alpha and numeric, plus I've got some of that Latin-1 extended alphabet in there as well, and that's really controllable by the end customer. They can have this be just US ASCII. They can have it be numbers for numbers. You can have a wide variety of different protection mechanisms, including ignoring some characters in the alphabet in case you want to maintain formatting. We've got all the bells and whistles that you would ever want to put on top of format-preserving encryption, and we continue to add more to that platform as we go forward. Taking a look at tax ID, there's an example of numbers for numbers. It's pretty basic, but it gives us the idea that we can very quickly and easily keep the data protected while maintaining the format. No schema changes are going to be required when you want to protect that data. Credit card number is a really popular example, and the same concept can be applied to tax ID. Often the last four digits of a tax ID will be used to verify someone's identity. That could be on an automated telephone system. It could be a customer service representative just trying to validate the identity of the customer, and we can keep those digits in the clear for that purpose while protecting the entire string from breach. Dates are another critical area of concern for a lot of medical use cases, but we're seeing date of birth being included in a lot of data privacy conversations, and we can protect dates with dates. They're going to be a valid date, and we have some really nifty tools to maintain offsets between dates. So again, we've got real depth of capability within our encryption. It's not just saying here's a one-size-fits-all approach. GPS location, customer ID, IP address, all of those kinds of data strings can be protected by Voltage SecureData within Vertica.
Let's take a look at the UDX basics. So what are we doing when we add Voltage to Vertica? Vertica stays as is in the center.
In fact, if you get the Vertica distribution you're getting the SecureData UDX on board. You just need to enable it and have a SecureData virtual appliance; that's the box there on the middle right. That's what we come in and add to the mix as we start to add those capabilities to Vertica. On the left-hand side you'll see that your users, your service accounts, and your analytics are still typically doing select, update, insert, delete types of functionality within Vertica, and they're going to come in through Vertica's access control layer. They're going to also access those services via SQL, and we simply extend SQL for Vertica. So when you add the UDX you get additional syntax that we can provide, and we're going to show you examples of that. You can also integrate that with concepts like views within Vertica, so that we can say: let's give a view of data that presents the data in the clear, using the UDX to decrypt that data, and let's give everybody else access to the raw data, which is protected. Third parties can be brought in, folks like contractors or folks that aren't vetted as closely as a security team might do for internal sensitive data access, and they can be given access to the Vertica cluster without risk of them breaching and going into some area they're not supposed to take a look at. Vertica has excellent control for access, down even to the column level, which is phenomenal and really provides you with world-class security around the Vertica solution itself. SecureData adds another layer of protection, like we're mentioning, so that we can have data protected in use and data protected at rest, and then we can have the ability to share that protected data throughout the organization. That's really where SecureData shines: the ability to protect that data on mainframe, on mobile, on open systems, and in the cloud. Everywhere you want to have that data move to and from Vertica, you can have SecureData integrated with those endpoints as well.
That's an additional solution on top of the SecureData plus Vertica solution that is bundled together today for sales purposes, but we can also have that conversation with you about those wider SecureData use cases and we'd be happy to talk to you about that. The SecureData virtual appliance is a lightweight appliance. It sits on something like eight cores, 16 gigs of RAM, and 100 or 200 gigs of disk. It's really a lightweight appliance, and you can have one or many. Most customers have four in production just for redundancy; they don't need them for scale. But we have some customers with 16 or more in production because they're running such high volumes of transaction load. They're running a lot of web service transactions and they're running Vertica as well, so we're gonna have those virtual appliances co-located around the globe, hooked up to all kinds of systems like syslog, LDAP, and load balancers. We've got a lot of capability within the appliance to fit into your enterprise IT landscape.
So let me get you directly into the meat of it: what does the UDX do? If you're technical and you know SQL, this is probably gonna be pretty straightforward to you. You'll see the COPY command used widely in Vertica to get data into Vertica. So let's try to protect that data when we're ingesting it. Let's grab it from maybe a CSV file and put it straight into Vertica, but protect it on the way, and that's what the UDX does. We have VoltageSecureProtect as an added syntax, like I mentioned, to the Vertica SQL, and that allows us to say we're gonna protect the customer first name using the parameters of hyper alphanumeric. That's our internal lingo for a format within SecureData that's a part of our API. The API requires very few inputs. The format is the one that you as a developer will be supplying, and you'll have different ones for maybe SSN, and you'll have different formats for street address, but you can reuse a lot of your formats across a lot of your PII and PHI data types.
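To make the protect-on-ingest idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the SecureData API and not the FF1 algorithm: `toy_protect`, `toy_access`, the `FORMATS` table, the format names, and the hard-coded demo key are all hypothetical stand-ins. The transform is a simple keyed character shift and is not secure. It only shows the shape of the idea: the caller supplies a value and a format name, the output keeps the length and layout of the input, and the protection happens while rows are being read, before anything is stored.

```python
import csv
import hashlib
import hmac
import io
import string

# Hypothetical demo key. A real deployment fetches keys from the appliance.
DEMO_KEY = b"demo-key-not-for-production"

# Hypothetical format names mapped to the alphabet each one transforms.
FORMATS = {
    "digits": string.digits,
    "alphanumeric": string.ascii_letters + string.digits,
}


def _shift_chars(value, fmt, direction):
    """Shift every in-alphabet character by a keyed, position-based amount."""
    alphabet = FORMATS[fmt]
    stream = hmac.new(DEMO_KEY, fmt.encode(), hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        if ch in alphabet:
            step = direction * stream[i % len(stream)]
            out.append(alphabet[(alphabet.index(ch) + step) % len(alphabet)])
        else:
            out.append(ch)  # dashes, spaces, etc. keep their place
    return "".join(out)


def toy_protect(value, fmt):
    """Toy, NOT secure, stand-in for a format-preserving protect call."""
    return _shift_chars(value, fmt, +1)


def toy_access(value, fmt):
    """Inverse of toy_protect: recovers the original string."""
    return _shift_chars(value, fmt, -1)


def load_protected(csv_text):
    """Protect sensitive columns while rows are read, before they are stored."""
    return [
        {
            "first_name": toy_protect(row["first_name"], "alphanumeric"),
            "ssn": toy_protect(row["ssn"], "digits"),
        }
        for row in csv.DictReader(io.StringIO(csv_text))
    ]


CSV_TEXT = "first_name,ssn\nHarold,123-45-6789\nMaria,987-65-4321\n"
rows = load_protected(CSV_TEXT)
for row in rows:
    print(row["first_name"], row["ssn"])
```

Because the protected SSN is still eleven characters with dashes in the same places, it fits the same column definition as the original, which is the "no schema changes" point made earlier. Reusing one format name across several columns mirrors the remark about reusing formats across PII and PHI data types.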
Protecting after ingest is also common. So I've got some data that's already been put into a staging area. Perhaps I've got a landing zone, a sandbox of some sort. Now I want to be able to move that into a different zone in Vertica, a different area of the schema, and I wanna have that data protected. We can do that with the UPDATE command, and again you'll notice VoltageSecureProtect. Nothing too wild there, basically the same syntax. Next, we're gonna query out protected data. How do I search once I've encrypted all my data? Well, actually there's a pretty nifty trick to do so. If you wanna query out protected data, take the search string, like the phone number in this example, and simply call VoltageSecureProtect on it. Now you'll have the ciphertext and you'll be able to search the stored ciphertext. Again, we're just format-preserving encrypting the data, so it's just a string, and we can always compare those strings using standard syntax in SQL. Using views to decrypt data is, again, a powerful concept in terms of how to make this work within the Vertica landscape when you have a lot of different groups of users. Views are very powerful for pointing a BI tool at the data. For instance, business intelligence tools such as Cognos, Tableau, et cetera might be accessing data from Vertica with simple queries. Well, let's point them to a view that does the hard work and uses the Vertica nodes and their horsepower of CPU and RAM to actually run that UDX and do the decryption of the data in use, temporarily in memory, and then throw that away so that it can't be breached. That's a nice way to keep your users active and working and going forward with their data access and data analytics while also keeping the data secure in the process. And then we might wanna export some data and push it out to someone in a clear text manner. We've got a third party that needs to take the tax ID along with some data to do some processing.
Well, all we need to do is call VoltageSecureAccess. Again, it's very similar to the protect call, you're providing the parameter again, and boom, we have decrypted the data and used, again, the Vertica resources of RAM and CPU and horsepower to do the work. All we're doing with the Voltage SecureData appliance is a real simple little key fetch across a protected tunnel. That's a tiny atomic transaction, it gets done very quickly, and you're good to go. This is it in terms of the UDX. You have a couple of calls and one parameter to pass. Everything else is config driven, and really you're up and running very quickly. We can even do demos and samples of this Vertica UDX using hosted appliances that we put up for pre-sales purposes. So if folks wanna get a demo going, we can take that UDX, configure it to point to our appliance sitting on the internet, and within a couple of minutes we're up and running with some simple use cases. Of course, for an on-prem deployment or a deployment in the cloud you'll want your own appliances and your own crypto district, so you have your own security, but it just shows that we can easily connect to any appliance and get this working in a matter of minutes.
Let's take a deeper look at the Voltage plus Vertica solution, and we'll describe some of the use cases and the path to success. First of all, your steps to implementing data-centric security in Vertica. I wanna note there on the left-hand side: identify sensitive data. How do we do this? I have one customer where they look at me and say, Rich, we know exactly what our sensitive data is. We developed the schema, it's our own app. We have a customer table. We don't need any help with this. We've got other customers that say, Rich, we have a very complex database environment with multiple databases, multiple schemas, thousands of tables, hundreds of thousands of columns. It's really, really complex. Help! And we don't know what people have been doing exactly with some of that data.
We've got various teams that share this resource. There we do have additional tools, and I wanted to give a shout out to another Micro Focus product, which is called Structured Data Manager. It's a great tool that helps you identify sensitive data, with some really amazing technology under the hood that can go into a Vertica repository, scan those tables, take a sample of rows or do a full table scan, and give you back some really good reports that say, we think this is sensitive, let's go confirm it and move forward with data protection. So if you need help on that, we've got the tools to do it. Once you identify that sensitive data, you're gonna wanna understand your data flows and your use cases. Take a look at what analytics you're doing today. What analytics do you wanna do on sensitive data in the future? Let's start designing our analytics to work with sensitive data, and there are some tips and tricks that we can provide to help you mitigate any kind of concerns around performance or around rewriting your SQL. As you've noted, you can just simply insert our SQL additions into your code and you're off and running. Next, you wanna install and configure the UDX and the SecureData software appliance. Well, the UDX is pretty darn simple. The documentation on Vertica is publicly available. You can see how that works and what you need to configure it. One file here and you're ready to go. So that's a pretty straightforward process. Then you grant some access to the UDX, and that's really up to the customer, because there are many different ways to handle access control in Vertica. We're gonna be flexible to fit within your model of access control when adding the UDX to your mix. Each customer's a little different there. So you might wanna talk with us a little bit about the best practices for your use cases, but in general, that's gonna be up and running in just a minute.
The SecureData software appliance, a hardened Linux appliance today, sits on-prem or in the cloud, and you can deploy that. I've seen it done in 15 minutes, but that was with a real techie who had the access to generate a cert, set the firewall, and do all the DNS entries. That's the basic blocking and tackling of a software appliance. You get that done. Corporations can take care of that in just a couple of weeks. It takes that long because they have to wait on other teams, but the software appliances are really fast to get stood up and they're very simple to administer with our web-based GUI. Then finally, you're gonna implement your UDX use cases. Once the software appliance is up and running, we can set authentication methods. We can set up the formats that you're gonna use in Vertica, and then those two start talking together. You should be going in dev and test in about half a day, and then you're running toward production in just a matter of days in most cases. We've got other customers that say, hey, this is gonna be a bigger migration project for us. We might wanna split this up into chunks. Let's do the real sensitive and scary data like tax ID first as our sort of toe-in-the-water approach, and then we'll come back and protect other data elements. That's one way to slice and dice and implement your solution in a planned manner. Another way is schema-based. Let's take a look at this section of the schema and implement protection on these data elements. Now let's take a look at a different schema and we'll repeat the process, so you can iteratively move forward with your deployment.
So what's the added value of Vertica plus Voltage? I wanna highlight this distinction because Vertica contains world-class security controls around their database.
I'm an old-time DBA from a different product, competing against Vertica in the past, and I'm really aware of the granular access controls that are provided within various platforms. Vertica would rank at the very top of the list in terms of being able to give me very tight control and a lot of different auth methods to protect the data in a lot of different use cases. So Vertica can handle a lot of your data protection needs right out of the box. Voltage SecureData, as we keep mentioning, adds that defense in depth, and it's gonna enable those enterprise-wide use cases as well. So first off, I mentioned NIST and the standard FF1, which is format-preserving encryption. We're the authors of it, we continue to maintain it, and we want to emphasize that customers really ought to be very, very careful to choose a NIST standard when implementing any kind of encryption within the organization. AES was one of the first hallmark, benchmark encryption algorithms, and in 2016 we were added to that mix as FF1, which you can see online. If you search NIST and Voltage Security, you'll see us right there as the author of the standard and all the processes that went along with that approval. We have centralized policy for key management, authentication, audit, and compliance. We can now see that Vertica selected or fetched a key to be able to protect some data at this date and time. We can track that and give you audit and compliance reporting against that data. You can move protected data into and out of Vertica. Whether we ingest via Kafka, ingest via NiFi and Kafka, or ingest on StreamSets, there are a variety of different ingestion methods and streaming methods that can get data into Vertica. We can integrate SecureData with all of those components. We're very well suited to integrate with any Hadoop technology or any big data technology, as we have APIs in a variety of languages, bitness, and platforms.
So we've got that all out of the box, ready to go for you if you need it. When you're moving data out of Vertica, you might move it into an open systems platform. You might move it to the cloud. We can also operate and do the decryption there. You're gonna get the same plain text back. And if you protect data over in the cloud and move it into Vertica, you're gonna be able to decrypt it in Vertica. That's our cross-platform promise. We've been delivering on that for many, many years, and we now have many, many endpoints that do that in production for the world's largest organizations. We're gonna preserve your data format and referential integrity. So if I protect my social security number today, I can protect another batch of data tomorrow and that same ciphertext will be generated. When I put that into Vertica, I can have absolute referential integrity on that data, which allows analytics to occur without even decrypting data in many cases. And we have decrypt access for authorized users only, with the ability to add LDAP authentication and authorization for UDX users. So you can really have a number of different approaches and flavors of how you implement Voltage within Vertica. But what you're getting is the additional confidence that we've got the data protected at rest. Even if I have a DBA that's not vetted, or someone new, or someone from a third party whose background I don't know, being provided access with a DBA-level privilege, they could SELECT * FROM all day long and they're gonna get ciphertext. They're gonna have nothing of any value. And if they wanna use the UDX to decrypt it, they're gonna be tracked and traced as to their utilization of that. So it allows us to have that control and an additional layer of security on your sensitive data. This may be required by regulatory agencies, and it seems that we're seeing compliance audits get more and more strict every year.
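The referential integrity point and the earlier search trick both rest on one property: protecting the same input with the same key and format always yields the same ciphertext. The Python sketch below illustrates that property under stated assumptions. It uses an in-memory SQLite database in place of Vertica, and `toy_protect` is a hypothetical, insecure keyed digit shift standing in for the real protect call; the table names, column names, and demo key are also made up for illustration.

```python
import hashlib
import hmac
import sqlite3
import string

# Hypothetical demo key. A real deployment fetches keys from the appliance.
DEMO_KEY = b"demo-key-not-for-production"


def toy_protect(value):
    """Toy, NOT secure, deterministic digits-for-digits transform."""
    stream = hmac.new(DEMO_KEY, b"digits", hashlib.sha256).digest()
    return "".join(
        str((int(ch) + stream[i % len(stream)]) % 10)
        if ch in string.digits
        else ch  # dashes pass through so the format is preserved
        for i, ch in enumerate(value)
    )


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements can call it by name.
conn.create_function("toy_protect", 1, toy_protect)
conn.executescript(
    """
    CREATE TABLE customers (ssn TEXT, region TEXT);
    CREATE TABLE claims (ssn TEXT, amount INTEGER);
    """
)

# Batch protected "today".
conn.executemany(
    "INSERT INTO customers VALUES (toy_protect(?), ?)",
    [("123-45-6789", "west"), ("987-65-4321", "east")],
)
# Separate batch protected "tomorrow".
conn.executemany(
    "INSERT INTO claims VALUES (toy_protect(?), ?)",
    [("123-45-6789", 100), ("123-45-6789", 250), ("987-65-4321", 40)],
)

# Join and aggregate on ciphertext only; nothing is decrypted.
totals = conn.execute(
    """
    SELECT c.region, SUM(k.amount)
    FROM customers c JOIN claims k ON c.ssn = k.ssn
    GROUP BY c.region
    ORDER BY c.region
    """
).fetchall()

# Search by protecting the search term, then comparing strings.
match = conn.execute(
    "SELECT region FROM customers WHERE ssn = toy_protect(?)",
    ("987-65-4321",),
).fetchall()

print(totals)
print(match)
```

The join works because both batches produced identical ciphertext for identical SSNs, so equality comparisons, joins, and group-bys behave as they would on the clear values. Only a user who needs the real SSN would have to call the decrypt path, and that call is the one that gets authorized and audited.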
GDPR was kind of funny because they said in 2016, hey, this is coming. They said in 2018, it's here. And now they're saying in 2020, hey, we're serious about this. And the fines are mounting. Let's give you some examples to help you understand that these regulations are real. The fines are real, and your reputational damage can be significant if you were to be in breach of a regulatory compliance requirement. We're finding so many different use cases now popping up around regional protection of data. I need to protect this data so that it cannot go offshore. I need to protect this data so that people from another region cannot see it. That's all the kind of capability that we have within SecureData that we can add to Vertica. We have that broad platform support, and I mentioned NiFi and Kafka. Those would be on the left-hand side as we start to ingest data from applications into Vertica. We can have landing zone approaches where we provide some automated scripting at an OS level to protect ETL batch transactions coming in. We can protect within the Vertica UDX, as I mentioned, with the COPY command directly using Vertica. Everything inside that dashed line is the Vertica plus Voltage SecureData combo that's sold together as a single package. Additionally, we'd love to talk with you about the stuff that's outside the dashed box, because we have dozens and dozens of endpoints that can protect and access data on many different platforms. This is where you really start to leverage some of the extensive power of SecureData to go across platforms, to handle your web-based apps, to handle apps in the cloud, and to handle all of this at scale with hundreds of thousands of transactions per second of format-preserving encryption.
That may not sound like much, but when you take a look at the algorithm and what we're doing on the mathematics side, when you look at everything that goes into that transaction, to me that's an amazing accomplishment that we're starting to reach those kinds of levels of scale. And Vertica scales horizontally, so the more nodes you add, the more power you get and the more throughput you're gonna get from Voltage SecureData.
I wanna highlight the next steps on how we can continue to move forward. Our SecureData team is available to you to talk about the landscape, your use cases, and your data. We really love the fact that we've got so many different organizations out there using SecureData in so many different and unique ways. We have vehicle manufacturers who are protecting not just the VIN, not just their customer data, but in fact they're protecting sensor data from the vehicles, which is sent over the network down to the home base every 15 minutes for every vehicle that's on the road, and every vehicle of this customer of ours since 2017 has included that capability. So now we're talking about an additional millions and millions of units coming online as those cars are sold and distributed and used by customers. That sensor data is critical to the customer and they cannot let that be exfiltrated in the clear. So they protect that data with SecureData, and we have a great track record of being able to meet a variety of different unique requirements, whether it's IoT, web-based apps, e-commerce, healthcare, all kinds of different industries. We would love to help move your conversations forward, and we do find that it's really a three-party discussion.
That's the customer, SecureData experts in some cases, and the Vertica team. We have great enablement within the Vertica team to be able to explain and present our SecureData solution to you, but we also have the ability to add other experts in to take that conversation into a broader perspective of how can I protect my data across all my platforms, not just in Vertica. I wanna give a shout out to our friends at the Vertica Academy. They're building out great demo and training facilities to help you learn more about these UDXs and how they're implemented. The Academy is a terrific reference and resource for your teams to learn more about the solution in a self-guided way, and then we'd love to have your feedback on that. How can we help you more? What are the topics you'd like to learn more about? How can we look to the future in protecting unstructured data? How can we look to the future of being able to protect data at scale? What are the requirements that we need to be meeting? Help us, through the learning processes and through feedback to the team, get better, and then we'll help you deliver more solutions out to those endpoints and protect that data so that we're not having data breaches and we're not having regulatory compliance concerns. And then lastly, learn more about the UDX. I mentioned that all of our content there is online and available to the public. So at vertica.com/securedata, you're gonna be able to walk through the basics of the UDX. You're gonna see how simple it is to set up, what the UDX syntax looks like, and how to grant access to it, and then you'll start to be able to figure out, hey, how could I start to put this into a POC in my own environment? Like I mentioned before, we have publicly available hosted appliances for demo purposes that we can make available to you if you wanna POC this.
Reach out to us, let's get a conversation going, and we'll get you the address and get you some instructions so we can have a quick enablement session. We really wanna make this accessible to you and help demystify the concept of encryption, because when you see it as a developer and you start to get your hands on it and put it to use, you can very quickly see, ah, I could use this in a variety of different cases, and I could use this to protect my data without impacting my analytics. Those are some of the really big concerns that folks have, and once we start to get through that learning process and playing around with it in a POC way, then we can start to really put it into practice in production and say with confidence, we're gonna move forward toward data encryption and have a very good result at the end of the day. This is one of the things I find with customers that's really interesting. Their biggest stress is not around the timeframe or the resources. It's really around, this is my data. I have been working on collecting this data and making it available in a very high quality way for many years. This is my job and I'm responsible for this data, and now you're telling me you're going to encrypt that data? It makes me nervous. And that's common. Everybody feels that. So we wanna have that conversation and that sort of trial and error process to say, hey, let's get your feet wet with it and see how you like it in a sandbox environment. Let's now take that into analytics and take a look at how we can make this go for a quick 1.0 release, and let's then take a look at future expansions to that, where we start adding Kafka on the ingest side. We start sending data off into other machine learning and analytics platforms that we might wanna utilize outside of Vertica for certain purposes in certain industries.
Let's take a look at those use cases together, and through that journey we can really chart a path toward the future where we can help you protect that data at rest and in use, and keep you safe from both the hackers and the regulators. That, I think, at the end of the day is really what it's all about in terms of protecting our data within Vertica. We're gonna have a couple of minutes for Q&A, and we would encourage you to ask any questions here. We'd love to follow up with you more about any questions you might have about Vertica plus Voltage SecureData. Thank you very much for your time today.