Hi, I'm Johan Tordsson. I'm CTO and co-founder of Elastisys. We are a Swedish CNCF member, training partner and service provider, and we are also the creators of a Kubernetes distribution. I also have a spare-time job as faculty in computer science, where I try to make microservices a bit more self-driving, because we want self-driving cars and we want simple-to-use IT. I have been researching and teaching distributed systems since the early days of web services, and I remember when Amazon Web Services introduced its first VM.
Today I will talk a bit about what we have learned at Elastisys from helping many companies that wanted to adopt cloud native technology and Kubernetes but still operate in a heavily regulated business sector. So the title of my talk is compliance beyond security. Let me talk a bit about that.
It is well understood that to comply with any regulation you need a solid information security program. In fact, there have been many talks at previous KubeCons on this theme, including quite a few stories about how to use service meshes to get end-to-end encryption between all microservices. I have nothing against service meshes, but there is more to compliance than that. Today I want to talk about certain aspects of compliance that go beyond traditional security or require you to take a different approach to security.
Traditional information security standards such as ISO 27001 and SOC 2 focus a lot on topics such as confidentiality, integrity and availability, the so-called CIA triad. More recent regulations such as the European General Data Protection Regulation, GDPR, and the California Consumer Privacy Act, which was much inspired by it, focus on a different aspect: data privacy and end users' rights to their data.
To illustrate some of the needs behind these regulations, let's do a thought experiment. Consider a scenario where you are running an online service and one of your users requests to unregister. How do you do that?
Well, removing a user is pretty straightforward: we delete the corresponding record from our database or data stores, or perhaps even just mark that particular user as removed. Because hey, after all, we hope that the user might change his or her mind and come back. However, if our user instead requests to be forgotten, this is a much trickier question. Returning to our database example, it is clearly not sufficient just to mark this particular user as removed without actually removing the record. But what more? What other kinds of data do we keep about our users? Where do we store it? How do we process it, and for what purposes? This is the topic I want to talk to you about today, with a focus on GDPR.
We see an international trend towards data privacy, most prominent in Europe with GDPR, a general regulation for the protection of user data. It regulates the protection of data, which is sort of classical information security, and the privacy of data, and it also limits how we can actually transfer this data. I will focus my talk on the last two aspects, data privacy and data transfers. While GDPR applies inside Europe, it is very important to remember that if you do business in the European Union, or even if you keep personal data about European citizens, GDPR applies to you. So this is not only a European concern.
I previously mentioned personal data. What then is personal data? Well, this is both rather well defined and rather vaguely defined in GDPR. Clearly, identifiers such as social security numbers or other numbers directly tied to individual persons are personal data. A name is sort of an attribute of a person, but it may not be unique. But if you combine a name with perhaps an age and the location where that particular person lives, you can identify that individual. And similarly, more abstract identifiers such as locations or even IP addresses have been deemed to be personal data.
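The point about combining attributes can be made concrete with a small sketch. The records and the `unique_fraction` helper below are made up for illustration. The idea is only that fields which do not identify anyone on their own may single out an individual once they are combined.

```python
from collections import Counter

# Hypothetical, made-up records: no single field is a direct identifier.
records = [
    {"name": "Anna", "age": 34, "city": "Lund"},
    {"name": "Anna", "age": 34, "city": "Kiruna"},
    {"name": "Anna", "age": 51, "city": "Lund"},
    {"name": "Erik", "age": 34, "city": "Lund"},
]


def unique_fraction(rows, fields):
    """Fraction of rows whose combination of `fields` occurs exactly once."""
    combos = Counter(tuple(row[f] for f in fields) for row in rows)
    unique = sum(1 for row in rows if combos[tuple(row[f] for f in fields)] == 1)
    return unique / len(rows)


# The more attributes are combined, the more rows become uniquely identifiable.
by_name = unique_fraction(records, ["name"])                       # 0.25
by_name_age = unique_fraction(records, ["name", "age"])            # 0.5
by_all = unique_fraction(records, ["name", "age", "city"])         # 1.0
```

In this toy data set a name alone singles out one person in four, while name, age and location together single out everyone, which is why such combinations are treated as personal data.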
So in summary, all data that explicitly or implicitly identifies an individual is personal data, and the regulation here is future-proof in the sense that data that is not really considered personal today can be considered personal in the future. So we should be very careful when collecting data about personal opinions, diseases, religion and similar. At any rate, if you process this type of information about users, directly or indirectly, then you need to comply with GDPR.
Let's talk about terminology a bit. In GDPR we talk about the data subject, that is, the person about whom you are collecting data. Then there is the data controller: if you are running an online service and collecting data, you are the data controller. And we also have data processors, which are basically all the various types of providers you use for processing the data that you collect about this person. Most prominently, this would be your cloud provider.
GDPR brings quite a few requirements for processing data. For example, you need a legal basis, one of six possible reasons, to collect this data. If you don't have a legal basis, such as consent, you are not allowed to collect data. This is the reason we see consent pop-ups on every website we visit: they are asking for your consent to store the data in accordance with GDPR. Another example in GDPR is data minimization: you are supposed to collect only the minimal amount of personal data you need in order to do what you collect the data for.
GDPR is a rather new regulation, but do mind that fines can be very severe. This can have an impact on both small and large organisations. What we see here to the right is the total amount of fines handed out for violating GDPR.
It starts slow and increases, and this is because in many European countries the data protection authorities took an educating stance in the beginning, basically informing companies that they were not really complying with GDPR, and only more recently started handing out fines. So this would be your friendly parking attendant telling you that your car is incorrectly parked in this new street, so please don't do it in the future or you will get fined.
Interestingly enough, if we look at the top fines handed out here, we see that these were handed to both European and non-European companies. So your organisation's place of registration doesn't matter here. And we can see two broad categories: some were fined because data had been used in an inappropriate way, and others were fined for not having sufficient protection around the data. So you need both a top-notch information security program and to actually use data in a proper way if you want to comply with GDPR.
So, two aspects of data privacy and GDPR compliance. The first one I like to call data obsolescence. Under GDPR we no longer have the full and eternal right to user data, and we need to plan accordingly. And then we also need to be careful about data transfer, most prominently the use of cloud computing services, as this is only allowed if these services operate under legislation from countries with sufficient privacy protection. We'll talk more about that.
Let's look at the first challenge for data obsolescence: observability. We've been told many times that observability is a great way to find out what's happening with your running system, so we keep metrics to see how things are running, what kind of utilization we have, and so on. We keep logs to be able to do detailed debugging later, and we may even use distributed tracing to pinpoint performance issues.
We're trying to ensure that our users are happy, that our applications run as they should, and that our servers and infrastructure are not overloaded. This is all good and fine. However, under GDPR we also need to be mindful of whether we are using observability in a manner that keeps our regulators happy, because collecting all this data needs to be done for a good purpose.
The other challenge to data obsolescence is availability. High availability in particular is important for any business. Recent news highlights the need for high availability, and in fact the ability to operate as normal despite major failures is important. This is highlighted in many standards such as ISO 27001, where such criteria are commonly described as business continuity. Even if disaster strikes there should be continuity, unlike what is illustrated by this recent boating incident.
To achieve high availability we commonly use standard practices such as replicating critical components, so that our systems can tolerate the failure of a single component. But overall this increases data fan-out, and personal data tends to be replicated in many places, perhaps even across geographic regions, availability zones or even cloud providers. We are spreading our data far and wide for availability reasons, but this has implications for privacy.
So in summary, let's revisit this cloud native trail map that many of you are familiar with. Personally, I've always been wondering why there are so many dragons in this figure, but for the sake of this discussion we note that there are a few dangers here in terms of data obsolescence. There are two main problematic categories, observability and database and storage, where we tend to collect more data than is allowed from a GDPR perspective, and we tend to keep it for longer and in more locations than is strictly required, and this may give us trouble.
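To make "kept for longer than is strictly required" a bit more tangible, here is a minimal sketch of a retention check over observability data. The categories, the retention periods and the `expired` helper are assumptions for illustration, not values that GDPR prescribes. The right period depends on the purpose the data was collected for.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per data category (assumed values).
RETENTION = {
    "debug_logs": timedelta(days=14),
    "metrics": timedelta(days=90),
    "traces": timedelta(days=7),
}


def expired(items, now):
    """Return the items that are older than their category's retention period."""
    return [
        item for item in items
        if now - item["created"] > RETENTION[item["category"]]
    ]


def utc(year, month, day):
    """Small helper to build timezone-aware timestamps for the example."""
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up inventory of stored observability data.
items = [
    {"id": "log-1", "category": "debug_logs", "created": utc(2021, 1, 1)},
    {"id": "log-2", "category": "debug_logs", "created": utc(2021, 1, 25)},
    {"id": "trace-1", "category": "traces", "created": utc(2021, 1, 20)},
    {"id": "metric-1", "category": "metrics", "created": utc(2020, 12, 1)},
]

to_delete = expired(items, now=utc(2021, 1, 31))
print([item["id"] for item in to_delete])  # ['log-1', 'trace-1']
```

In a real cluster the same idea is usually expressed as configuration in the logging, metrics and tracing backends and not as application code. The sketch only shows the rule being applied: each category has a purpose-bound lifetime, and anything older is a candidate for deletion.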
So in essence we are data hoarders. We collect data about our users throughout our systems, and we keep it for longer than we necessarily need and in more places, whereas GDPR is very strict that we need to minimize data: we are only allowed to collect data for the purpose of the actual collection and to store it only as long as needed. For example, if you are doing detailed logging to debug a login session in one of your microservices, it makes a lot of sense to store some logs to see what is going on. But frankly, storing these logs for much longer than needed for debugging is likely not in accordance with GDPR, because two weeks down the road you probably know whether that user managed to log in successfully or not, and you no longer have a valid reason to store the data from a GDPR perspective.
The solution I would like to propose here is really retention. Do collect the kind of data you need, but keep it only as long as is reasonable for the purpose you are collecting it for. So do rotate your metrics, your logs and your traces accordingly.
And then backups. Very interestingly, according to a recent French court ruling, backups are not really subject to data minimization and the right to be forgotten under GDPR. If your user asks to be forgotten, you don't need to erase every backup containing data about that user. However, if you do perform some sort of disaster recovery or restore from backup, you need to remember that you forgot about that particular user, otherwise you can end up in trouble.
The second point I want to make here is about data transfers. Why should we really care about that? Here to the right we see a map of the world with a few regions highlighted. This could be the locations of past and upcoming KubeCons, once we get the chance to meet again in person, but for this particular talk the map highlights where the largest cloud providers in the world are incorporated. GDPR states clearly that you, the data controller, are responsible not only for your own
GDPR compliance but also for that of your data processors, that is, your cloud providers. And what we also know is that many of the largest cloud providers are not European companies.
So why is this important? Well, up until this summer we had a legal framework for data transfers between Europe and the United States called the Privacy Shield, which allowed European companies to safely transfer data to the US and ensure that they complied with GDPR. This was the second incarnation of such a framework. Previously there was something called Safe Harbor, which turned out not to be safe according to court rulings and was invalidated. Later on we had the Privacy Shield, but the same story played out once again: this summer the European Court of Justice basically said that the Privacy Shield is no longer valid.
Without going into too much detail, this is a real clash of cultures. In the US there is a strong culture of the state having the right to citizens' data for security reasons, whereas in Europe there is a strong culture of you as an individual having the right to your own data. Similarly, not at the state level but at the company level, there is a big difference between your right to users' data and users' right to own their own data. This is a mind shift.
So what will happen in the future? Will we have another data transfer agreement? While searching for a solution I found this very neat idea of the Iron Fence, perhaps inspired by the Iron Curtain, which would be valid all the way until 2048 and under which transfers of data over the internet would no longer be possible. Joke aside, because this was published on April 1: given this clash of culture and law, it is very unlikely that there will be a new data transfer agreement between Europe and the United States.
So what are my options? Well, the European Court of Justice upheld so-called standard contractual clauses, or binding corporate rules, as one valid mechanism for data transfer. However, this places a lot
of burden on the data controller, that is you, to determine whether there is sufficient protection for data privacy with your processors, and honestly this does not look very promising. You may also try to get consent from your users for your type of processing, but this is very tricky, because you need to get consent for each individual type of processing. You may not simply ask your user to sign a general consent, because then you don't really minimize the use of the data.
Well, let's try to anonymize the data, pseudonymize the data or even encrypt the data before transferring it to some of the clouds in the US. This is doable, as we can do encryption at rest. However, as we cannot do encryption in use in a good manner, this very much limits the use of such external sub-processors. The road we really recommend is to try to limit the reliance on external cloud services and to run services containing sensitive data on European providers, including you as your own provider.
So how do you do it? Well, cloud native technology and Kubernetes to the rescue, to achieve cloud-agnostic compliance. With cloud native technology we can build our application stacks in such a way that we can run them in large hyperscaler clouds outside of Europe and also in our own data centers or on smaller European clouds, all without vendor lock-in.
Just to illustrate this concept: if you are running on, say, Amazon and using certain types of technologies, there are nowadays good alternatives in the cloud native space, such as using Kubernetes instead of some compute service or EKS, using the Knative project for serverless, and so on. I won't go through the full list, but in essence, the services that you need to construct your application and that you can get from a big hyperscaler cloud, you can create yourself using cloud native technologies. However, that will put a lot of burden on you to fulfill GDPR compliance and to ensure that your data is safe and
that you fulfill the privacy needs. In essence, just because you are using Kubernetes and cloud native technology does not mean that you are secure or that you are compliant.
In the time remaining I would like to discuss some of the subtle technical challenges when implementing GDPR or similar regulations using cloud native technologies. We reviewed GDPR and found that there are quite a few, let's call them more technical, articles hinting at various types of information security capabilities. Some of these reside at the physical level, others at the platform or Kubernetes level, and yet others are more about your applications or even the processes in your organization. But let's focus on what we can solve at the Kubernetes level here.
Looking at GDPR, Article 25 mandates that data protection must be by design and by default. Unfortunately, Kubernetes is not really secure by default, due to its desire to preserve this "wow, it just works" branding. What we need to do here is to bring in some additional technologies from the cloud native landscape and configure Kubernetes in certain ways to make it more secure. Just to exemplify, we can use something like Dex and OpenID Connect to ensure that we are really connecting to our clusters as individuals that can be traced back to specific persons. I would really like to know which person is logging in to this Kubernetes cluster, not just see some anonymous administrator account that tries to make changes. Similarly, we would like to use something like role-based access control to actually limit what this administrator is and is not allowed to do. And if you want even more fine-grained access control, you can look at something like OPA and its Gatekeeper project.
Let's look at Articles 33 and 34, which cover notification of a data breach. GDPR mandates that upon a data breach you really need to notify the data subject, and here we can use cloud native tools such as Falco for intrusion detection, which captures everything
that's going on in our cluster and gives us warnings about suspicious activity. I would recommend combining that with detailed logging, perhaps through Elasticsearch and Kibana, to store not only application logs but also the actual audit logs from the Kubernetes control plane, to understand what is going on. So if we get a suspicious Falco alert, we can look at our audit logs and see: was this only our internal development team that deployed a new function or a new version of some application that did something that Falco picked up, or is this really a sign of an ongoing attack?
With these examples I illustrate how we can use cloud native technologies to create an information security management system and thus fulfill some of the requirements for compliance. To illustrate how to combine these kinds of tools, we created what we call Compliant Kubernetes, where we basically use a lot of well-known cloud native technologies to implement the technical capabilities required to achieve compliance with GDPR or similar regulations. This is just one possible rendering; you may replace some of these technologies with similarly scoped products to achieve similar goals. If you want to take a look at how we did it, please check out our open source project. We would love your feedback and contributions.
To sum up, compliance regulations such as GDPR require strict handling of personal data, and failure to do so may result in crippling fines. We need sufficient information security for data protection, but we also need to minimize the data we collect, so no more data hoarding, and we need to implement the right to be forgotten, for example through tight data retention. Furthermore, the end of the Privacy Shield raises a barrier for transferring data from the European Union to the US. Fortunately, through cloud native technology we get the building blocks we need to build our own cloud-agnostic compliance toolkit. And a final shout-out: please check out Compliant Kubernetes, our own rendering of cloud native
technologies for compliance. If you like it, please give us a star or even contribute. Thank you for your attention. I'm now ready to take questions.