Here we go. Hello and welcome. My name is Shannon Kemp, and I'm the Chief Digital Officer of DATAVERSITY. We'd like to thank you for joining this DATAVERSITY webinar, Data Governance Trends and Best Practices to Implement Today, sponsored by Google Cloud. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A panel, or if you'd like to tweet, we encourage you to share your questions on your favorite social media platform using hashtag DATAVERSITY. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just note that Zoom defaults the chat to send only to the panelists; you may absolutely change that to everyone to network. You'll find the icons for the Q&A and chat panels in the bottom middle of your screen. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. Now let me introduce our speakers for today, Sam Lugani and Rene Kolga. Sam is the product lead for confidential computing at Google. Sam started his professional career as a software engineer at Cisco before venturing into product-centric roles at FireEye, Cinec, and later at Google. And Rene has over 15 years of cybersecurity experience in the areas of endpoint protection, insider threat, encryption, and vulnerability management. Rene is a member of the confidential computing product management team at Google Cloud. Prior to Google, he worked at Symantec, Citrix, Altress, and a number of security startups. And with that, I will give the floor to Sam and Rene to get today's webinar started. Hello, and welcome, Sam. Hello, hello. Super excited to be here. Sam, are you able to unmute yourself? Okay. Thanks. I'm super excited to be here.
Thanks, everyone, for joining in. Let me just share my screen and then we'll move through the presentation. Let's go. Okay, great. Well, hello everyone, good morning and good evening. We're here today to talk about data protection trends and best practices. My name is Sam Lugani, and again, I look after the confidential computing portfolio for Google. And with me is my awesome colleague, Rene Kolga. Rene, do you want to spend a minute introducing yourself? Yeah, thanks Sam. Great to be here. Rene Kolga, product manager on the confidential computing team with a focus on trusted execution environments. And I think we got a great introduction at the beginning as well. Thank you. Great, let's get started. We'll introduce data governance, and subsequently we've created some scenarios around questions we get from customers and how to solve them. We'll cover access controls, encryption, and secure collaboration, and close out with incident response. Now, data governance is necessary to ensure that data is safe, secure, and private, and it also helps with compliance, both internally and externally. It means setting up data policies that apply to how data is gathered, how it's stored, how it's processed, and eventually how it's disposed of. So data governance is really about deciding what kinds of data should be under governance and who can specifically access that data. And then there is the aspect of complying with external standards set by industry associations, government agencies, or other stakeholders you may have. Strong data governance allows more personnel to access more data, with the confidence that those personnel get access to the right data, and that this sharing, or democratization, of data does not negatively impact the organization. So all of this can actually improve security while not limiting collaboration.
Now, data governance has many facets, and today we'll talk about one of the most important pillars: data protection. We want to focus on some key questions that come up when our customers think about data protection. Organizations want to limit access to data, both internally and externally. They also want to protect their sensitive data and intellectual property. And at the same time, organizations don't want to limit their ability to collaborate on data with external parties. We're going to give you a blueprint for answering these questions for your organization. Now, a lot of companies are using cloud services to scale their business and to expand into new areas. When choosing a cloud provider, it is very important to ask the right questions and truly understand how your data is being stored and used in the cloud. Questions such as: Is your cloud provider processing data as per your requirements? Are you sure that this data is not being sold to third parties? Could this data be used for advertising? Is it being used for advertising? Is the cloud provider transparent about what data it collects and how it uses it? Is your data encrypted at rest and in transit by default? Do you get notifications when there is a data incident? And, in general, is your cloud provider following globally accepted standards? These are key questions to answer before picking the right cloud provider. Now, let's assume you've picked a cloud provider, and walk through a few scenarios. Before you bring your data to the cloud, you really want to plan the controls you will have over your data. Having solid identity and access management policies will let your organization, or your admins, authorize who can take action on specific resources. And that gives you more control and visibility to manage cloud resources in a more centralized way.
Now, this is going to be especially true for organizations with complex organizational structures, where you have hundreds of work groups or you're looking after multiple different projects. In that case, identity and access management can provide a unified view. Are you seeing the slides, Sam? Just checking. Okay. Yeah, so in that specific case, identity and access management can provide a view into your security policies across the entire organization, and with the auditing that comes with it, it also helps with your compliance needs. That way you can control who has access to which resources within your organization. Then, beyond internal access, what you also want control over is which resources can be accessed by the cloud provider. This brings us to things like access transparency and access approvals, and different vendors will have variations of these. Where access transparency and access approval come in is that they can expand the visibility and control you have over your cloud provider, with admin access logs and approval controls. Now, in the specific case of Google Cloud, customer data is not accessed for any reason other than to fulfill contractual obligations, and for that, a valid business justification is required for any access made by support staff. But beyond that, access transparency can still provide near real-time logs when Google Cloud admins access your content. And admin access is limited. With access approvals, you can limit this further, because access approvals lets you approve or dismiss requests for access by Google employees working to support your service. So access approval requests, when combined with access transparency logs, can be used to audit end to end, from the support ticket to the access request to the approval and then eventually to the actual access. So once you figure out the right access controls, the next scenario is wanting to know where your sensitive data resides.
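The IAM model Sam describes, bindings of members to roles on resources, can be sketched in a few lines. This is a toy illustration only; the names and structure are hypothetical and not the actual Google Cloud IAM API:

```python
# Toy sketch of an IAM-style policy check (illustrative only --
# not the real Google Cloud IAM API or its data model).

# A policy maps a resource to role bindings: role -> set of members.
POLICY = {
    "projects/demo/buckets/payroll": {
        "roles/storage.viewer": {"user:alice@example.com", "group:finance@example.com"},
        "roles/storage.admin": {"user:admin@example.com"},
    },
}

# Which roles grant which permissions (simplified).
ROLE_PERMISSIONS = {
    "roles/storage.viewer": {"storage.objects.get"},
    "roles/storage.admin": {"storage.objects.get", "storage.objects.delete"},
}

def has_permission(member: str, resource: str, permission: str) -> bool:
    """Return True if any role binding on the resource grants the permission."""
    bindings = POLICY.get(resource, {})
    for role, members in bindings.items():
        if member in members and permission in ROLE_PERMISSIONS.get(role, set()):
            return True
    return False

print(has_permission("user:alice@example.com", "projects/demo/buckets/payroll", "storage.objects.get"))     # True
print(has_permission("user:alice@example.com", "projects/demo/buckets/payroll", "storage.objects.delete"))  # False
```

The point of centralizing this mapping, as Sam notes, is that one place answers "who can do what on which resource," and every evaluation can be audited for compliance.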
Because to be able to protect this data, the first step in that journey is really to understand what data you want to bring to the cloud. And to do this, you have to get visibility into where this data resides and what level of protection specific data needs. There are tools that can help you with this, and one of them is cloud data loss prevention and classification. Now, Cloud DLP doesn't just help prevent exfiltration of data; it also helps to classify the data, and that in turn helps you identify what data is sensitive and what isn't. And that's not all DLP does. It also has tools that can help you de-risk sensitive data by removing unnecessary PII, like phone numbers, card numbers, et cetera, from the actual data before you process it. Now, let's assume you need to keep the raw data in its full fidelity; you can't anonymize or tokenize it for whatever reason. What do you do then? That's where encryption comes in. Encryption is a very powerful tool to protect data and limit access to it. However, with encryption, you have to make sure you protect your keys based on your organization's policies or the specific regulations you have to work with. So let's introduce a few options for encryption at rest. Various mandates call for hardware key protection, which means an HSM, or require that keys be separated from the data, which means an external key manager, or, in general, that keys be handled securely through key management systems. There are a lot of options you can pursue here. You can manage encryption keys through a cloud-hosted key management service that lets you manage both symmetric and asymmetric crypto keys for your cloud services, the same way you maybe do it on-prem today. You can generate, use, rotate, or even destroy keys within this centrally managed service.
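The key lifecycle just described, generate, rotate, destroy, can be sketched with a toy in-memory key manager. This is an illustration of the lifecycle only, not how a real KMS stores key material (a real service like Cloud KMS keeps material in hardened, audited infrastructure and never returns raw keys for HSM-backed keys):

```python
# Toy key management sketch: generate, rotate, and destroy symmetric
# key versions centrally. Illustrative only -- real KMS services keep
# key material server-side and expose encrypt/decrypt operations instead.
import secrets

class ToyKMS:
    def __init__(self):
        self._versions = {}  # key name -> list of key material (index = version)
        self._active = {}    # key name -> index of the primary version

    def create_key(self, name: str) -> int:
        self._versions[name] = [secrets.token_bytes(32)]  # 256-bit key
        self._active[name] = 0
        return 0

    def rotate(self, name: str) -> int:
        """Add a new primary version; older versions stay for decrypting old data."""
        self._versions[name].append(secrets.token_bytes(32))
        self._active[name] = len(self._versions[name]) - 1
        return self._active[name]

    def destroy_version(self, name: str, version: int) -> None:
        """Destroy key material, making data encrypted under it unrecoverable."""
        self._versions[name][version] = None

    def get_active_key(self, name: str) -> bytes:
        return self._versions[name][self._active[name]]

kms = ToyKMS()
kms.create_key("payroll-key")
kms.rotate("payroll-key")             # version 1 becomes primary
kms.destroy_version("payroll-key", 0) # retire the original material
```

Note the design point rotation illustrates: old versions are retained so previously encrypted data stays readable, while new writes use the new primary; destroying a version is the deliberate, irreversible step.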
You can use hardware key security with HSMs, or hardware security modules, in the case of GCP. In this specific case, you host encryption keys and perform crypto operations in our FIPS 140-2 Level 3 certified HSMs, and customers who have specific compliance regulations may be required to store their keys and perform crypto operations in a FIPS 140-2 Level 3 validated device. So if that is a requirement you have, then an HSM, whether it's through Google Cloud or external, can be used to fulfill that regulation. And then you can also have an external key manager, and various cloud services provide support for external key managers. In this case, you can encrypt data with encryption keys that are stored and managed in a third-party key management system deployed outside the vendor's infrastructure, in this case Google's infrastructure. So an external key manager really allows you to maintain separation between your data at rest and your encryption keys, while still leveraging the power of the cloud for computing and analytics, or whatever else you will do with it. Now, so far you've hopefully been able to figure out how your data is classified, how you need to store it, and what key management you need to have in place based on your internal regulations or your security preferences. So now that you've secured this data at rest, you have to run your VMs and use that data to get insights. You need to process that data. And so you need to make sure that your VMs are hardened against attacks, and whether it's a container or a VM that you're running, you want to make sure it is trustworthy. So on Google Cloud we have something called Shielded VMs. It's enabled by default on VMs and Kubernetes containers.
Now, these are virtual machines, just like other virtual machines, that are hardened by a set of security controls that make them more resistant to rootkits and bootkits. They help protect against threats such as malicious project insiders, malicious guest firmware, and kernel or user-mode vulnerabilities, and they really make sure that the workloads are trusted and verifiable. So in this case, Shielded VMs can help protect your virtual machines against rootkits and kernel-level malware, because we have secure and measured boot capabilities within them. Great. So now you've created this hardened box where you can process data. But you also want to make sure that in certain cases you achieve cryptographic isolation while data is being processed. There's encryption in transit, when data is moving from one place to another. There's encryption at rest, which we talked about with KMS, when your stored data needs to be encrypted. And then there's the piece that completes the encryption trifecta, the encryption lifecycle: encryption while data is being processed, or in use, and that is where confidential computing comes in. So confidential computing provides a verifiable way for organizations to process data in a confidential environment, which means the data is always encrypted within that confidential environment's boundaries. And this is really enforced by virtue of the fact that the memory of those environments is encrypted. Now, these Confidential VMs work and act very much like regular VMs. There's no change needed in the workloads per se, so there are no code changes you need to make to enable Confidential VMs, at least within Google Cloud's environment; different vendors will have different requirements around some of these things. It works like a regular VM, but the memory of the VM is encrypted.
So although it works like a regular VM, outside adversaries wouldn't be able to access that data, even while the data is in use. And when we say that data is encrypted, the next question we typically get is: where are the keys? The keys in this case are actually protected by hardware. In AMD systems, there's a special, designated secure processor that is responsible for generating the keys, and these keys are only shared with the memory controller. When the application is trying to access memory or execute instructions, the memory gets decrypted by the memory controller, placed in the cache, and then the CPU can perform those instructions. What it really means is that no software should be able to get these keys out of this encryption engine, which means they're not extractable. And so these are really ephemeral keys generated by hardware, and customers, or even Google, can't access these keys. Now, this demo shows how trivially easy it is to create a Confidential VM. We literally just have to click a checkbox when we want to create a confidential environment on a specific VM instance, and then we do all the heavy lifting in the background. We map this VM to the right hardware, enable the right capabilities, and as soon as you click to the next step, this Confidential VM is up and running. And then you can verify that this VM is running in a confidential environment. Great. So to talk about the next scenario, around secure multi-party collaboration, I want to invite my colleague, Rene Kolga. Yeah, thanks so much, Sam. Great, great information on topics like access controls, data classification, various key management mechanisms, and this hardened and confidential compute. So here I would like to add just a few additional scenarios.
Most of you know that with so much data out there these days, most of our organizations have sensitive data locked up and siloed, sitting there mostly due to its sensitivity. But how many amazing things could we have accomplished if we could unlock this data, in a privacy-preserving way of course, and collaborate securely across teams or even external organizations, right? So let's move on to the next slide, Sam. Here we could take the confidential computing that Sam just described to the next level to enable this type of multi-party collaboration. This would allow teams and organizations to unlock and free the power of their data through secure collaboration, without violating confidentiality. It enables scenarios where you can gain mutual value from aggregating sensitive data together, while retaining full control over it, as well as ensuring that only the right workloads have access to that data. Let's look at the next slide, where I show just a basic example. Here we could have multiple banks collaborating to identify fraudsters or detect some type of money-laundering activity. In this example, a workload author, maybe one of the collaborating banks or an independent third party, creates the workload, whether an ML model or some query, to aggregate data and detect suspicious activity. So with a trusted execution environment, or TEE, or secure enclave, or Confidential Space, as we call it here at Google, these banks can share their sensitive customer data with this kind of black box in the middle, but if and only if it's running on confidential computing, keeping memory encrypted, and running the right, previously agreed upon workload that all collaborators can trust. And in the end, all collaborating banks benefit, because often this type of complex fraudulent activity only becomes visible when multiple organizations, in this example multiple financial institutions, pool their data together.
Yeah, so now moving on to the demo. Before we get started: I just described banking fraud detection, and that certainly could be powerful, but let's take it closer to home with an example that would be relevant for everyone. What if you wanted to know whether it is you or your colleague who makes more money at the company you're working at right now, without revealing your actual salary? Wouldn't that be interesting? And this is exactly what you could do securely using this type of Confidential Space. So let's start with the demo. Imagine we have two colleagues: on the left we have Alice, and on the right we have her colleague, Bob. They agreed on some code, a query that would rank their salaries without revealing the actual numbers. It will just tell you who makes more money. So Alice, on the left, would generate an input file with her salary, say just over 100,000. And Bob, on the right, would do the same with a salary of, say, exactly 100,000. Next, they encrypt those files. Alice encrypts hers first, on the left. Bob does the same on the right. And then they upload those encrypted files to their corresponding storage buckets. In this example, Alice is the workload author; this could have been an independent third party as well. And Bob, of course, prior to collaborating, would audit this workload and even be able to reproduce it on his side to ensure it's legitimate. Alice has already built and uploaded the Docker container with the workload that will rank their salaries but not show the numbers. You can see its SHA-256 hash here on the left. This is the same hash that Bob will include in the attribute conditions that define his policy. It defines under which conditions Bob is OK releasing a key to his encrypted salary data.
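The workload Alice and Bob agreed on, output only who earns more and never the amounts, can be sketched in a few lines. This is a toy stand-in: in the real demo the inputs arrive encrypted and are decrypted only inside the attested TEE after each collaborator's key-release policy is satisfied.

```python
# Toy version of the demo workload: rank two salaries without
# revealing the numbers. In Confidential Space this logic would run
# inside an attested TEE; here it's plain Python for illustration.

def rank_salaries(inputs: dict) -> str:
    """Return only who earns more -- never the amounts themselves."""
    (name_a, a), (name_b, b) = inputs.items()
    if a == b:
        return "Both earn the same"
    return f"{name_a if a > b else name_b} earns more"

# Alice: just over 100,000; Bob: exactly 100,000 (as in the demo).
print(rank_salaries({"Alice": 100_001, "Bob": 100_000}))  # Alice earns more
```

The design point is that the workload's output is the *only* thing that leaves the enclave, which is exactly why auditing (and pinning) the workload code matters: the code defines what can possibly be revealed.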
In this example, his policy actually checks for a number of things, including the type of confidential computing technology required, like AMD SEV or AMD SEV-SNP, the version of the hardened OS where the workload container will be running, the authorized user, and this image digest, the hash of the workload itself. Alice also has a similar policy for her side, but since she's the workload author, she doesn't need to include this image digest, this hash. Instead, she just includes the workload location in the artifact repository. In our demo, Bob plays the operator, or admin, role and will run the workload that Alice built; that could be another party as well. And admins can only start and stop the workload; they can't access data in clear text or influence the workload in any way. So let's see. The workload is running. You can see at the bottom right now it's complete, and Bob will read the output. Of course, this output is accessible to all collaborators, but not to the operator or admin. So Bob reads the output, and let's see what it says. Oh, it looks like Alice makes more money. So it's time for Bob to ask for a raise from their boss. But now let's imagine that Alice goes rogue or becomes malicious. Instead of just wanting to learn where she ranks in salary among her colleagues, she wants to get their actual salaries. She is the creator of the workload, so she modifies the source code to include the actual salaries in the output file, rebuilds the Docker container, and pushes it to the repository. As a reminder, Bob's policy includes the hash of this workload under the attribute conditions. So now Bob will rerun the workload that Alice rebuilt and read the output. And you'll see that the output will have an error in it.
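The mechanics behind this rerun follow directly from hashing: Bob's policy pins an exact image digest, so any modification to the workload changes its hash and the key-release condition fails. A minimal sketch of that attribute-condition check (hypothetical function names; in reality the attestation verifier performs this comparison, not the collaborators' own code):

```python
# Sketch of an attribute-condition check: release Bob's key only if
# the measured workload digest matches the previously agreed hash.
# Illustrative only -- the real check is enforced by the attestation
# verifier before the key manager releases any key.
import hashlib

def image_digest(image_bytes: bytes) -> str:
    return hashlib.sha256(image_bytes).hexdigest()

def key_released(policy_digest: str, measured_image: bytes) -> bool:
    return image_digest(measured_image) == policy_digest

agreed = b"workload: rank salaries, output winner only"
policy_digest = image_digest(agreed)                 # pinned in Bob's policy

tampered = b"workload: rank salaries, output raw numbers"  # Alice goes rogue
print(key_released(policy_digest, agreed))    # True: key released
print(key_released(policy_digest, tampered))  # False: credential rejected
```

Because SHA-256 is collision-resistant, Alice cannot produce a modified container that still matches the pinned digest, which is why her rebuilt workload fails Bob's policy.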
And it says the given credential is rejected by the attribute condition, which means that Alice wasn't able to take advantage of her colleague Bob, because his policy included the hash of the previously agreed upon workload, and obviously when Alice modified it, the hash also changed. So this is just an example of how multiple parties who don't necessarily have to trust each other can share sensitive data and collaborate in a secure manner. So, yeah, hopefully this demo gave you a bit of a flavor of what is possible, where even the most sensitive data out there, your salary, your PII, your PHI, et cetera, can be shared securely and computed on across multiple parties. So basically, in summary, multi-party data collaboration using these confidential spaces provides you the ability to collaborate without blind trust. Everything is measured and verified before you share your data, providing you with data integrity and data confidentiality, as well as code integrity guarantees. Again, you retain complete data ownership, and you control how your data is used and which workloads are authorized to access it. And again, you don't need to trust the operator. Okay, let's move to the next slide. So the next topic, the next scenario: you may have done everything in your power to secure your data in the cloud, but stuff can still happen. What if you need to address a cloud security incident? Go to the next slide. And of course, we all know the cloud is here to stay, but unfortunately, so are the threats. I often like to say that the bad guys are just so good. Human ingenuity, for both good and evil, is limitless. And that's why we keep seeing these data theft issues due to misconfiguration, abuse of cloud resources, crypto miners, and other types of issues out there. Go to the next slide.
And obviously, protecting data requires an information security operation that combines manual and automated processes, an expert incident response team, as well as a multi-layered information security and privacy infrastructure. So at Google, we recommend having the following steps in your incident response program, and we have them listed on the slide. The first is identification. The main goals here are detection and reporting, and those include both automated and manual processes. Next is coordination. It involves triage and response team engagement. Resolution is probably the part everyone is most excited about. Let's get this incident over with, right? And here we cover the investigation itself, containment, which is super important, recovery, as well as communication. The fourth step in the incident response process is closure, and here it's all about capturing the lessons learned. And finally, we should not forget about continuous improvement, because if we haven't learned anything and improved our prevention, it's obviously a lost opportunity. You can learn much more about our recommended and documented incident response process by following the link at the bottom of this slide. Okay, so to summarize, today we covered quite a few topics, right? We went through data access and classification controls, protecting data at rest using various key management mechanisms, and protecting data while in use through this newer technology called confidential computing. Then we touched on how to unlock sensitive data through secure multi-party computation and collaboration. And finally, we touched on the incident response process. Most, if not all, of these controls are provided by the major cloud platforms for you to take advantage of. And hopefully, through this webinar, we could help raise your awareness and enable safer data processing in the cloud by taking advantage of all these available controls.
Sam, anything you would like to add here? No, great. Thanks for walking us through all these different things, Rene. Maybe we can just open it up for questions. So Rene and Sam, thank you so much for this great presentation. If you have questions for them, feel free to submit them in the Q&A portion of your screen. And just a reminder, I will send a follow-up email by end of day Thursday with links to the slides and links to the recording from today's session. Everyone's very quiet right now, so I'll give everyone a moment to type any questions into the Q&A portion. So Sam and Rene, when your customers implement Google Cloud, just to give everyone a moment here, what's the big aha moment that they have when implementing it, something they didn't think of that just makes their life so much easier? From my side, I would say the simplicity. Especially from the product management side, we always try to ensure that the most common use cases are taken care of out of the box. So literally, just like Sam was showing you how easy it is to enable confidential computing, literally one checkbox or one button, that's what we really strive to achieve: that the most common scenarios are completely pre-configured out of the box. So it's just lift and shift, a very seamless transition. Of course, there are more advanced scenarios, for which we have advanced configurations available as well. Sam, anything from your side? Yeah, great point, Rene. What I'll add to the simplicity Rene just mentioned is scalability. Often, things that are simple are really hard to scale, and things that need to scale are quite complex.
I think when you move to the cloud, especially to systems which are easy to use, the kind of scalability you get with cloud services, along with the simplicity, is something which is really unique to cloud environments. And that's something our customers appreciate a lot. That's fantastic. So, is the enablement of these controls the responsibility of technical teams, or is there an interface for information security personnel? I think it's a bit of both. There are going to be different technical teams within the organization based on what the organization deals with. For example, if it's an agency or a company that supports the government, there are multiple technical teams evaluating how data is stored, how it's classified, and what data needs to be protected, and they have to interface with the cloud security folks to make sure those controls are in place. In smaller companies, or companies that are growing fast and don't really have these established controls built up over a period of time, these new controls have to be established, and in that case, these teams may really look like the same team, or be part of the same team, and so there's less chance of some of this information being lost in transit. Definitely makes sense. Anything else you want to add there? I'll give everyone a moment to add any additional questions. Pretty quiet. Sam, Rene, is there anything you want to deep dive into with the little bit of time left? No, I think there's a lot of good information available on every cloud vendor's website. We can specifically speak to GCP: for all the information you saw today, there are a lot of support pages available, with use cases, demos, and presentations. So if you want to dive deeper into any one of these topics, there is good information available online for you to chew on. I love it.
Well, Sam and Rene, thank you so much for this fantastic presentation, and thanks to all of our attendees. Again, I will send a follow-up email by end of day Thursday with links to the slides and links to the recording. Thanks, everyone. Thank you. Thanks for joining, everyone. Have a great day.