Hello everyone, and thank you for joining our session on protecting open cloud infrastructure here at the Linux Foundation's Open Source Summit. My name is Krista Macomber. I'm a senior analyst with Evaluator Group, and I cover data protection and data management technologies. In our session today I'll work through some of the data exposure risks related to open infrastructure, including Linux systems for cloud environments, and provide some tips on how we see customers working to mitigate those risks. I'll wrap up by sharing some best practices we've seen, based on our engagements with customers, for data protection and compliance for the open infrastructure that supports cloud environments.

Before we get there, a little background on Evaluator Group. We are an analyst firm focused specifically on information storage, information protection, and information management technologies. We study products closely at a technical level, and we consult regularly with IT professionals to give them guidance on how to approach projects or pain points they might be facing. We also maintain a research library covering the product segments you see here on the slide.

With that, let's jump right in. We are talking about cloud infrastructure at a conference all about Linux, so I thought it would be a good idea to start with why this pairing makes sense: why is Linux popular in cloud environments? It clearly is popular; it's heavily used by public cloud providers such as Amazon and Google. One of the big reasons is that Linux is open source and a very flexible operating system. It's highly customizable, and it can support a wide range of use cases as well as a variety of target systems once we begin thinking about data protection. Linux also has a vibrant community of contributors with wide-ranging skill sets and requirements, so it serves smaller-scale environments as well as implementations that span internationally, which matters especially for cloud.

I've alluded to the fact that Linux has a very large base of developers. That also means the code base tends to be fairly stable, and that its capabilities, and its bugs, are continuously being fine-tuned and addressed. Another characteristic that makes Linux popular in cloud environments is that, compared to closed-source systems, it can be high performance while also being cost effective: it delivers the level of performance that's required while remaining affordable, in addition to being open source. And a final characteristic is that Linux is modular and scalable.
When we think about cloud environments and the exponential growth in traffic they see, that scalability means Linux can handle spikes in traffic without a corresponding spike in expenses, which is of course very attractive.

So Linux and other open source systems carry a number of unique advantages compared to closed or proprietary systems, and I'll talk through those on this slide, along with some of the risks in terms of data exposure. That will lead us into the meat of our conversation today.

Taking a step back: with open source code and open source systems, there are simply more eyes on the code on a regular basis, especially for a platform like Linux. Any problems or issues with the code can, in theory, be detected more quickly than in a proprietary system with a smaller base of developers working on it. An open source platform may have tens or hundreds of thousands of contributors, and more eyes tend to help detect problems faster. Another advantage is that once issues are found, patches, updates, and new versions of the software are not tied to a commercial product update or launch cadence, so they tend to reach users more quickly than a proprietary alternative would.

The flip side of that coin is that although vulnerabilities in the code can be detected more quickly, they also tend to be more exposed, and so does the plan for how those vulnerabilities will be corrected. That matters especially when we think about cyber attacks such as ransomware, the big one we all hear about regularly; we'll get into it in more detail shortly. Bad actors keep an eye out for these vulnerabilities, and for how they're planned to be addressed, and try to leverage that as a way in, for example to plant ransomware. So that is a real risk in using open source infrastructure, particularly from the standpoint of data loss.

Along the same lines, there are often many different open source components that go into these cloud environments. Because there are so many moving pieces, there is a risk that practices become a little lax, for lack of a better way to put it, when it comes to tracking all the components in use, tracking the vulnerabilities that crop up against them, and making sure those vulnerabilities are actually addressed.
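To make that concrete, here is a minimal sketch of tracking the open source components in an environment against a list of known-vulnerable versions. The inventory, the advisory list, and the component names are all hypothetical; a real environment would typically rely on a software bill of materials (SBOM) and an actual vulnerability feed rather than hand-maintained lists.

```python
# Minimal sketch of component/vulnerability tracking; all data here is
# hypothetical and stands in for a real SBOM and a real advisory feed.
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    name: str
    version: str

# Hypothetical inventory of open source components in use.
inventory = [
    Component("openssl", "1.1.1k"),
    Component("nginx", "1.20.1"),
    Component("log-helper", "2.3.0"),
]

# Hypothetical advisories: component name -> set of affected versions.
known_vulnerable = {
    "openssl": {"1.1.1k"},
    "log-helper": {"2.2.0", "2.3.0"},
}

def flag_vulnerable(components):
    """Return the components whose exact version appears in an advisory."""
    return [c for c in components
            if c.version in known_vulnerable.get(c.name, set())]

for component in flag_vulnerable(inventory):
    print(f"needs patching: {component.name} {component.version}")
```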
A final point from a risk standpoint: although a much larger base of developers typically works on open source code, that code is often not maintained by a paid employee; it's maintained by volunteers or individuals. And there are few, if any, repercussions if the software's vulnerabilities aren't addressed, or if it doesn't follow good guidelines and practices for responsibly collecting and retaining data. That becomes a risk from the data protection standpoint, and we'll talk more about it when we discuss compliance.

So we've just covered some of the major data exposure risks associated with open source infrastructure. In the next section I want to talk about why data protection is needed, as well as some of the basics of protecting data from disasters and meeting compliance requirements, which I just alluded to.

So, the fundamental question: why data protection? One of the big reasons is that data really is at the heart of business operations and business value these days. I don't think that's a surprise to anyone listening; we hear about it a lot. The tolerance for data loss or data exposure is minimal, so it's important that data is protected, and protected effectively.

One of the biggest things data needs to be protected from today is ransomware, which I've already mentioned. Not only is it all over the headlines, it's costing big money. The hackers are constantly getting smarter and more sophisticated, the attacks are getting more sophisticated, and they're occurring more frequently. We saw, for example, Colonial Pipeline, one of the largest fuel pipelines in the U.S., shut down by ransomware earlier this year. We see organized gangs, essentially businesses, having sprung up around ransomware. One of the highest-profile ransomware gangs, REvil, reemerged just earlier this month, so they're certainly not backing down. We tend to think about ransomware in terms of these high-profile headlines, but it can impact organizations of all sizes; even smaller companies, perhaps doing business with an MSP that was itself hit by ransomware, are becoming targets to a degree. It's a consideration for effectively all businesses these days.

Another reason data needs to be protected is internal threats, not just external ones like ransomware. Maybe there's a rogue admin who decides to delete an entire storage volume. It's important not only to control that admin's access and ability to do that, but also to have backup copies to recover from in the event that something like that happens.
There's also human error, such as accidental data deletion; if that happens, you want the data to be backed up so you can recover it. Then there are legal and compliance requirements, which I alluded to earlier. We're likely all familiar with the European Union's GDPR, and we've seen California's CCPA: pieces of legislation that stipulate how user data needs to be collected and stored, and the rights users have over the data collected and stored about them. We'll get into that in more detail, but meeting these requirements, and avoiding fines and other negative repercussions, requires some fairly fine-grained data protection capabilities.

Finally, I wanted to touch on the shared responsibility model, especially as data moves off-premises into a public cloud hosting environment. We see a lot of hybrid cloud models, with some infrastructure and applications on-prem and some off-prem, and we often see an assumption that the cloud service provider is responsible for data protection. That is actually not true: the responsibility still lies with the customer. In the event of ransomware, or a piece of legislation that must be complied with, it is still the customer's responsibility to make sure the data and the data protection implementation meet those needs.

Even with this case for data protection, we sometimes get asked by customers: my system is architected for high availability and maximum uptime, so why do I need data protection on top of that? On this next slide I break down the differences between high availability (HA) and disaster recovery (DR). I think about two fundamental differences. HA handles problems that occur while a system is running; DR handles problems after a system has failed. Put another way, HA addresses hardware failure and routine downtime, but it does not address data loss, which is addressed by disaster recovery, and it doesn't address more catastrophic events like a natural disaster or a ransomware attack.

Let's break that down. HA essentially means designing a system to eliminate any single points of failure. For example, we see a lot of redundancy in hardware components like power supplies: if something happens to one power supply, the other is up and running, ready to take over as instantaneously as possible. The system can detect the outage on the primary power supply and switch over to the redundant component, ideally instantaneously, or at least very close to it. We see software redundancy as well, which I did want to mention to this audience: clustering, for example, spreads an application across multiple systems so that if one system goes down, the application can stay running.
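As a rough illustration of that failure detection and failover idea, here is a minimal sketch of a health-check loop that redirects to a standby endpoint when the primary stops responding. The endpoints and polling interval are hypothetical; real clustering stacks handle membership, fencing, and failback far more rigorously.

```python
# Minimal sketch of HA-style failure detection and failover between two
# hypothetical application endpoints; not a substitute for a real cluster.
import time
import urllib.request

PRIMARY = "http://10.0.0.10:8080/health"   # hypothetical primary node
STANDBY = "http://10.0.0.11:8080/health"   # hypothetical redundant node

def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers its health check in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

active = PRIMARY
while True:
    if not is_healthy(active):
        # Failure detected on the active node: redirect to the other one.
        active = STANDBY if active == PRIMARY else PRIMARY
        print(f"failover: now routing to {active}")
    time.sleep(5)  # poll interval; real systems aim for near-instant detection
```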
The concept behind HA is always on, always available, or at least minimal interruption to service, with failovers that are as seamless as possible so operations can continue with effectively no outage.

HA can be, and often is, a component of DR, but true disaster recovery goes a step further. It's a strategy, a more holistic plan, for resuming normal operations after one of those catastrophic disasters I described: an earthquake takes out a data center, or a cyber attack takes down a number of critical systems and wipes out some backup data, for example. DR includes the technology itself, but it also looks at people and processes. In fact, one best practice we discuss with customers is to have a DR plan: what needs to happen if an IT system goes down, or there is a data loss event and a recovery operation needs to be run from a point-in-time copy of the environment. That latter part, the data recovery, is very important. With technologies like snapshots, backups, and replication, which we'll get into in a few moments, DR addresses the ability to recover the data in addition to the systems, and that is not something factored into HA.

On this next slide I wanted to break down a few more differences: disaster recovery alongside what we term backup, or operational, recovery. Backup and recovery plus disaster recovery together compose the majority of what we see in data protection implementations and strategies. Depending on the technology, the environment, and the requirements, DR and operational recovery typically leverage secondary copies of the production data for that data recovery.

The first sub-bullet here is snapshots as one of those secondary copies. A snapshot can be thought of as a picture of a file system taken at a specific point in time. Snapshots are primarily designed for short-term retention and typically live on the system itself; over time, older snapshots are overwritten by newer ones due to space requirements. A snapshot is used to restore the server to the exact state it was in at the point in time the snapshot was taken. Backups are a little different: a backup is a duplicate copy of the data that is created and stored in a different location than the original, or production, copy, so there is geographic isolation.
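Here is a minimal sketch of creating that kind of separate backup copy, assuming the production data lives in a local directory and the backup target is a separately mounted location (for example, storage at another site). The paths are hypothetical, and a real backup product would add cataloging, scheduling, encryption, and verification.

```python
# Minimal sketch of a backup (as opposed to a snapshot): a full copy of the
# data, written to a separately mounted location. Paths are hypothetical.
import tarfile
from datetime import datetime, timezone
from pathlib import Path

SOURCE = Path("/var/lib/app-data")        # production data (hypothetical)
OFFSITE = Path("/mnt/offsite-backups")    # e.g. a mount backed by another site

def take_backup(source: Path, target_dir: Path) -> Path:
    """Create a timestamped archive of `source` on the backup target."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = target_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(str(source), arcname=source.name)
    return archive

if __name__ == "__main__":
    OFFSITE.mkdir(parents=True, exist_ok=True)
    print(f"backup written to {take_backup(SOURCE, OFFSITE)}")
```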
That geographic isolation becomes important as an additional safeguard for DR and data recovery: if the production environment is compromised, you're not relying on just one system. You have isolation between two systems and a secondary copy living on a different system. Backups cannot be taken essentially instantaneously the way a snapshot can, in theory; depending on what you're backing up, they might take hours or days to complete, but they provide that additional safeguard.

So data copies, including backups, can be used for DR and/or operational recovery; what is the difference? A DR scenario is typically more of a crisis that directly impacts business uptime. Typically there is a failure in the primary data center and you need to recover to an alternate site. We didn't necessarily cover this when comparing DR and high availability, but in a DR scenario there's usually a secondary site, whether in the cloud or a secondary data center maintained by the customer, that the recovery operation or failover happens to. Operational recovery tends to be more routine and smaller in scale in its impact on business operations: things like an accidental file deletion or a file that was saved over incorrectly.

You might ask: why wouldn't I always want DR, with a quicker failover and less data loss? Well, there's a price differential. Disaster recovery can be much more expensive than operational or backup recovery, both because you need that secondary site and because of the technologies used. So customers tend to pick and choose where they implement true DR, and it tends to be reserved for critical applications. This is driven by what we call the RPO, the recovery point objective, which is essentially the amount of data loss you can tolerate, that is, the point in time you can recover to, and the RTO, the recovery time objective, which is how long it takes you to recover. The smaller your RPO and RTO, the more expensive the technology is going to be, because you're getting more limited data loss and more limited downtime.

A couple of other things to mention here: data archiving and long-term data retention. As backup copies age, they might still need to be retained, for example for compliance reasons, but their value from a recovery standpoint goes down because they're older, going back to those RPOs we just talked about. So part of a data protection strategy is typically to allow data to be archived, or tiered off, to storage that is less expensive but may not be as high performance. Those archived copies are still accessible if needed; it just takes longer to recover from them.
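As a simple illustration of that tiering idea, here is a minimal sketch that moves backup archives older than a cutoff to a cheaper archive tier. The paths and the 30-day cutoff are hypothetical; real retention policies are usually driven by compliance requirements and the backup catalog rather than file timestamps.

```python
# Minimal sketch of tiering aged backup copies to cheaper storage, assuming
# backups are plain archive files in a directory. Paths are illustrative.
import shutil
import time
from pathlib import Path

BACKUP_DIR = Path("/mnt/offsite-backups")   # recent, fast-to-restore copies
ARCHIVE_DIR = Path("/mnt/archive-tier")     # cheaper, slower tier
MAX_AGE_DAYS = 30

def tier_old_backups() -> None:
    """Move backup files older than MAX_AGE_DAYS to the archive tier."""
    cutoff = time.time() - MAX_AGE_DAYS * 86400
    ARCHIVE_DIR.mkdir(parents=True, exist_ok=True)
    for backup in BACKUP_DIR.glob("*.tar.gz"):
        if backup.stat().st_mtime < cutoff:
            shutil.move(str(backup), str(ARCHIVE_DIR / backup.name))
            print(f"archived {backup.name}")

if __name__ == "__main__":
    tier_old_backups()
```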
One other point: we often see data protection technologies used for data migration initiatives. Customers are moving their data around all the time these days, and they may well leverage their data protection solution to do it.

We've talked a lot about protection against natural disasters and cyber crime. Another element to cover is data privacy and security legislation that must be complied with, which tends to be industry specific and sometimes region specific. I alluded a few minutes ago to the General Data Protection Regulation, GDPR, in the European Union. It is one of the strictest regulations we've seen, so it's a good example, and it's also quite applicable globally because GDPR governs how data related to people in the EU is targeted or collected; essentially, it impacts any business that does business in the European Union.

What GDPR and some of these other regulations aim to do is give individuals insight into, and ultimately control over, how their personal data is collected, processed, and stored. We hear about the right to know and the right to be forgotten: the customer has the right to ask what data about them is being generated and stored, and the ability to ask that their personal information be anonymized or deleted if they so choose. Individuals also have to provide consent and be able to understand what data about them is being collected, so there needs to be transparency on the part of the business they're dealing with, including how that data is being used. There also needs to be a business case for collecting the data in the first place. There's a concept of data minimization, which means companies should collect only the minimal amount of data they need. And then data deletion: we talked about customer requests for their personal information to be deleted, but deletion also becomes important as data ages.

The accountability is on the business or organization that is controlling or processing the user's data to make sure all of these conditions are met, so that the data does not ultimately fall into the wrong hands. And it's broad: it spans the data life cycle from the initial point when a piece of data is gathered all the way through to when it is deleted.
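Here is a minimal sketch of handling a "right to be forgotten" request by scrubbing the personally identifiable fields from a stored record, assuming a simple dictionary-based record and a hypothetical list of PII fields; a real implementation has to cover every data store, including backup and archive copies.

```python
# Minimal sketch of honoring a deletion/anonymization request.
# The record structure and PII field list are hypothetical.
import hashlib

PII_FIELDS = {"name", "email", "phone", "address"}   # hypothetical PII fields

def anonymize(record: dict, salt: str = "example-salt") -> dict:
    """Replace PII fields with irreversible pseudonyms; keep non-PII fields."""
    scrubbed = {}
    for key, value in record.items():
        if key in PII_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()[:12]
            scrubbed[key] = f"anon-{digest}"
        else:
            scrubbed[key] = value
    return scrubbed

user = {"id": 1001, "name": "Ada Example", "email": "ada@example.com",
        "last_order": "2021-06-12"}
print(anonymize(user))   # id and order history remain; identity fields do not
```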
From an open source software development standpoint, a couple of things here are interesting. Some regulations have a stipulation to avoid known software vulnerabilities, which, when we think about open software, is practically unavoidable. But essentially, as long as the business can establish that it has an appropriate level of security in response to any known vulnerabilities, it is meeting its compliance requirements. It's really about avoiding exposure of personally identifiable information, PII as we call it; that is the ticket here, rather than the software code itself being publicly visible. That goes back to one other concept I wanted to mention, privacy by design: data privacy should be considered up front, and data protection is an important component of that.

We've gone through a lot of fundamentals. In this next section I want to double-click on some best practices and provide some guidance based on what we see customers doing, and the lessons learned from that.

When we think about best practices for data protection, I think the first thing is simply to be cognizant: just because you're using a Linux system does not mean you won't be the target of a ransomware attack, or otherwise have a need to retain and recover your data. Hopefully we've gone through a number of the reasons why this matters, but that mindfulness itself is a best practice.

Another best practice is to use a variety of technologies to meet RPO and RTO commitments without breaking the bank. We've already covered operational recovery and recovering from backups; I want to talk a little more about replication. Replication is often used for disaster recovery. What it does is copy data changes from the primary database to a secondary database, typically located on a different physical server, which can then serve as the failover location if a recovery is needed.

There are different types of replication. One is synchronous, which means the data is written to the primary, or production, database and to the secondary database, which we tend to call the replica, at the same time. An implementation like this might support very highly available, very low RPO, low RTO applications, because the data is written to both copies at once and the replica can be failed over to as quickly as possible. Asynchronous replication writes the data to the production database first, and then commits the data to be written to the replica. Compared to synchronous replication, asynchronous tends to be lower cost, and it has benefits when physical geographic distance is an issue; it's more cost effective and easier to work with from that standpoint.
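Here is a minimal sketch contrasting the two, using two in-memory dictionaries to stand in for the production database and its replica. It only illustrates the ordering of writes and acknowledgments; it is not how a real database engine implements replication.

```python
# Minimal sketch of synchronous vs. asynchronous replication ordering.
import queue
import threading

primary: dict = {}
replica: dict = {}
replication_queue: "queue.Queue[tuple[str, str]]" = queue.Queue()

def write_synchronous(key: str, value: str) -> None:
    """Acknowledge only after BOTH copies hold the write (lowest RPO, higher latency)."""
    primary[key] = value
    replica[key] = value          # in a real system this is a network round trip

def write_asynchronous(key: str, value: str) -> None:
    """Acknowledge after the primary write; the replica is updated later."""
    primary[key] = value
    replication_queue.put((key, value))   # applied to the replica in the background

def replication_worker() -> None:
    while True:
        key, value = replication_queue.get()
        replica[key] = value
        replication_queue.task_done()

threading.Thread(target=replication_worker, daemon=True).start()
write_synchronous("order-1001", "confirmed")
write_asynchronous("order-1002", "pending")
replication_queue.join()          # for the demo, wait for the async write to drain
print(primary, replica)
```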
One more technology I did want to cover is continuous data protection, CDP. What CDP does is automatically save a copy of every change made to the data, so it is very granular in terms of what can be recovered, but that granularity tends to get expensive. We also see what we call near-CDP technologies, which might, for example, provide access to files on a file share at intervals of perhaps 10 or 15 minutes. That provides a higher level of protection than a periodic backup in many cases, at a lower cost than a true CDP technology.

Another important consideration is how you recover; in other words, what granularity you have. Can you recover individual files? Can you recover directories? Or do you have to recover the entire virtual machine or container? Ideally you would want both: the ability to recover a few files as needed, but also to recover the entire application in the form of VMs and containers if that is what's needed. However, a lot of data protection tools are optimized around only one of these scenarios. Many are getting there, and there's been a lot of development going on, but it's important to understand the capabilities and any possible limitations of the particular solution in question so you know what recovery granularity will be possible.

I also wanted to introduce the concept of application consistency. This is important especially for large, complex applications, because it makes backups more reliable and provides a faster path to getting the application back online when there's a whole variety of dependencies required to bring it up. An application-consistent backup pauses application operations and allows any pending data writes to be flushed out to the storage disk and captured there. This is in contrast to a crash-consistent backup, which takes a snapshot of all the files at the exact same time; it allows the application to be brought back to that exact point in time, but there may be additional considerations, depending on the state the application was in at that moment, when getting the data back up and running.

I bring this up because there are APIs that can be built into the applications up front to allow that consistent copy to be created. Microsoft has its Volume Shadow Copy Service, VSS, and VMware has its vStorage APIs for Data Protection, VADP. These allow users to create consistent points in time for the state of the file system, which can then be copied or backed up; building this in up front lets it happen without a third-party application having to come in after the fact, which tends to be beneficial.
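To show the shape of the idea, here is a minimal sketch of an application-consistent backup flow, assuming a hypothetical application object that exposes pause, flush, and resume hooks (roughly the role VSS and VADP play on their respective platforms) and a hypothetical snapshot API on the storage side.

```python
# Minimal sketch of an application-consistent backup; the app and storage
# objects and their methods are hypothetical stand-ins.
import os
from contextlib import contextmanager

@contextmanager
def quiesced(app):
    """Pause application writes and flush pending data before a snapshot."""
    app.pause_writes()        # hypothetical hook: stop accepting new writes
    app.flush_to_disk()       # hypothetical hook: push buffered writes to storage
    os.sync()                 # ask the OS to flush dirty pages (Unix-only call)
    try:
        yield
    finally:
        app.resume_writes()   # hypothetical hook: resume normal operation

def application_consistent_backup(app, storage):
    with quiesced(app):
        return storage.create_snapshot()   # hypothetical snapshot API

# A crash-consistent backup, by contrast, would call storage.create_snapshot()
# with no quiescing at all, capturing whatever is on disk at that instant.
```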
A few other odds and ends on best practices across the next couple of slides, again related to data protection. The first is technology that can help detect problems in the backup process, because if a problem occurred with your backup and it is not a good backup to recover from, you're going to have trouble recovering the data to the point in time you need. Leveraging something like a checksum, a small block of data used to detect errors that may have occurred while the data was transmitted to storage, can be very helpful for verifying the integrity of a backup.

Encrypting data in flight and at rest is definitely a best practice; it helps keep bad actors from accessing the backup copies in the first place. Another item is access control. We see a lot of data protection software today using capabilities like multi-factor authentication and role-based access control to make sure only the correct user can access backup data, and that they have only the amount of control they specifically need to get their job done. Again, this helps prevent a hacker or bad actor from accessing the backup copies. We also see what's called two-person concurrence: certain actions, such as changing an immutability setting or deleting a file or a storage volume, might require authorization from two individuals so that one individual can't go in and wreak havoc. That's important when we think about internal bad actors as well as external hackers.

Logging and auditing can be very helpful for detecting security vulnerabilities and giving early indicators of breaches of the environment, such as a failed login attempt. I also wanted to touch on remote login: secure shell is very commonly used to manage Linux servers remotely, so make sure it's configured properly and sits behind a firewall; really, follow best practices from that standpoint.

A few other things to think about. Physical server security: make sure the environment cannot be physically accessed by a bad actor. Patching, patching, patching, including the Linux kernel; outdated patches are one of the ways ransomware attackers get in, so it's very important to keep these up to date. Avoid unnecessary software and services, keeping the environment to the basics of what's needed. Disable booting from external devices, which is another way a bad actor might penetrate the environment. Those are largely preventative measures against access to the environment; the next bullet, antivirus software, gets into detecting that an attack is occurring. We see data protection software building in anomaly detection to analyze the backup environment for things like large-scale encryption activity that might indicate a ransomware attack has occurred, and we see more server-side software being used to detect an intrusion as it's happening. These are complementary to each other, and it's a best practice to use both.
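As a rough illustration of that anomaly detection idea, here is a minimal sketch that flags a backup set where an unusually large share of files look like random data, which is what encrypted files look like. The path and thresholds are hypothetical, and note that compressed or already-encrypted archives also score high, so real products typically compare against the behavior of prior backups rather than using a fixed cutoff.

```python
# Minimal sketch of ransomware-oriented anomaly detection over a backup set.
import math
from pathlib import Path

def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 = indistinguishable from random)."""
    if not data:
        return 0.0
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def suspicious_share(backup_dir: Path, threshold: float = 7.5) -> float:
    """Fraction of files in the backup whose sampled content looks encrypted."""
    files = [p for p in backup_dir.rglob("*") if p.is_file()]
    if not files:
        return 0.0
    flagged = sum(1 for p in files if entropy(p.read_bytes()[:65536]) > threshold)
    return flagged / len(files)

share = suspicious_share(Path("/mnt/offsite-backups/latest"))   # hypothetical path
if share > 0.5:
    print(f"warning: {share:.0%} of files look encrypted; possible ransomware")
```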
And of course, vulnerability scanning, which we touched on a little earlier.

I did want to share a bit about compliance-specific practices before we shift to the summary section; a lot of these will look similar. One of the big ones is getting visibility into, and control over, the entire estate of backup copies. The customer has the responsibility for knowing what data is being created, where it's being stored, and how it's being stored, so that visibility is very important. Data protection software vendors are working hard to provide a single pane of glass, and making sure they can provide that visibility across your various environments is increasingly important these days. We've talked about the value of externally stored, encrypted backups, and I just alluded to security scanning, as well as auditing, logging, and patching. The last few bullets here are in a similar vein: a sound data protection implementation, following these best practices, is also going to help on the compliance side.

In summary, I wanted to leave you with a couple of slides. The first is some of the key things we see to look for in any data protection solution. The first is making sure the solution can support all the sources you need it to, both on and off premises: on-prem VMs, and any cloud applications you're using. Going back to that single pane of glass we just alluded to, that's increasingly important for data protection solutions. Frequent backups are important; they really are the insurance policy for being able to recover. Granular search and recovery also matters: like we talked about, you might want to recover one specific file, or a couple of VMs, but maybe not the entire VM estate, because what if some of your virtual machines are still infected with ransomware? Having that fine-grained capability becomes important. Extended data retention, which we discussed, might be required for compliance or other reasons, so that's something to look for. Then ease of use: data protection teams are spread very thin these days, managing very complex environments while dealing with cyber attacks, compliance requirements, and so on, so ease of use becomes important. In fact, many data protection solutions now build in self-service capabilities, for example so a developer can execute their own recovery easily without having to go through IT; that's something to keep in mind. Cost effectiveness is always table stakes in data protection; getting the most bang for the buck is of course important.

Air gapping and immutability we didn't touch on too much, but these are important preventative measures against ransomware. An air gap essentially means having a copy of the data that is not accessible by the host or by the production environment; it really is the final safeguard from a recovery standpoint, and we see the cloud being used for this today. Immutability basically means the data cannot be deleted or altered for a set period of time, which matters for ransomware because even if a hacker gains access to the backup environment, they are not able to tamper with the backup data.
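Here is a minimal sketch of writing a backup copy with an immutable retention period, assuming an S3-compatible object store with Object Lock enabled on the bucket. The bucket name, object key, file path, and 30-day retention are hypothetical.

```python
# Minimal sketch of writing an immutable backup copy to object storage.
# Assumes the bucket was created with Object Lock enabled; values are hypothetical.
from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client("s3")
retain_until = datetime.now(timezone.utc) + timedelta(days=30)

with open("/mnt/offsite-backups/app-data-20240101.tar.gz", "rb") as f:
    s3.put_object(
        Bucket="example-backup-bucket",          # hypothetical bucket
        Key="backups/app-data-20240101.tar.gz",
        Body=f,
        ObjectLockMode="COMPLIANCE",             # retention cannot be shortened or removed
        ObjectLockRetainUntilDate=retain_until,  # immutable until this date
    )
```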
DR testing: we heard a lot about this, especially in 2020 with the COVID-19 pandemic. Customers were switching to remote operations and trying to protect against the whole influx of ransomware attacks we saw happening, and they needed to make sure their DR plans were sound and that they would actually be able to recover, so testing became top of mind. Analytics and machine learning: I talked about backup software being developed to identify anomalous behavior in the backup environment, which can be important for identifying that a cyber attack has occurred. Data reuse, meaning being able to make the data accessible for test and dev or other business use cases, is important. And data mobility: we talked about this a few slides ago, but being able to move data across heterogeneous environments is becoming a requirement we're seeing today.

My last slide is a set of questions you might consider asking when looking at different data protection offerings and solutions. The first, going back to the previous slide: what data needs to be protected? Where is it located? Are all of the sources protected by the data protection solution? Then the storage target, the storage environment the backup copy will be sent to: there are a variety of options, and you want to make sure the proper storage targets are supported. The next bullet I find pretty interesting: people, skill sets, capital, resources. Do we have what is needed to be successful, and what is lacking? Data protection is evolving quickly with cloud, containers, and certainly open source, so making sure the organization has the tools and capabilities to be successful is important. Make sure the recovery points, recovery times, and locations you need are supported and available. Understand any high availability features that might be built in, like those we talked about, to complement the system architecture. Data encryption is important, and also how the encryption keys are managed and what control the customer has over them, because you want those keys protected appropriately. Auditing of backup validity: making sure those backups are good and there have been no issues with them. Automating and testing recovery processes. Data archiving, which we just talked about: make sure you understand what capabilities are available.
And then compliance oversight capabilities as well: having the ability to understand that your data, your backup copies, are in compliance across your environment is important. So with that, I know I've thrown quite a bit at you today. I believe we'll be taking some questions during this session, and I also wanted to leave you with my contact information, including my email address and social media handles, on the bottom left-hand side of the screen. We certainly want to be a resource to you, so please don't hesitate to reach out with any follow-up questions. With that, thank you all again for joining the session today. I really appreciate it, and I hope you enjoy the rest of the conference. Thank you again so much.