Good evening. Welcome to the AMA session on data leaks in SeatGuard. I'm Shruti. A bit about me: I work as a security manager at Appsecco. What I do at Appsecco is take care of our clients and help ensure they have a smooth journey with the work we do for them. Appsecco was founded in 2015, and we are an application security and cloud security specialist company. You can find us at appsecco.com. I will also be your moderator for today. In today's session we'll be discussing real-world data breaches, with examples and how to identify them, and we'll also be talking about prevention and data security best practices. So, about our speaker. Abhishek, say hi. Yeah, so Abhishek heads the security products team at Appsecco and is an accomplished security professional with over a decade of experience in information security. He's also a core team member responsible for technology development at Null Community. Today's session is going to be an interactive one. After Abhishek finishes his talk, we'll be opening the event for questions and interaction. You can ask all your questions in the Zoom Q&A and also on the YouTube chat. If you're on Zoom and you would like to talk to Abhishek, you can just raise your hand, and I will let you talk to him. We'll be wrapping up the session in around 50 minutes to one hour, but again, it depends on how interactive we make the session. So yeah, please do ask a lot of questions. Over to you, Abhishek.
Thanks, Shruti. Thanks for the introduction. In fact, I had an introductory slide which I think is going to be obsolete now. Okay, so hello everyone. Thank you for joining. I really hope that this session will be worth your time and attention. Thanks to Haske for organizing this and providing us with the audience. I think this is a great opportunity for us to talk about data security, the way we see it, and the kind of work that we are doing around data security. Oh, okay.
The obsolete, the deprecated slide. I think Shruti has already done the introduction. I just want to add that I'm pretty active on Twitter. So if you want to have a conversation beyond today's question and answer session, or you want to chat in general about data security, application security, or Kubernetes security, feel free to follow me or ping me on Twitter. As I think Shruti mentioned, I head the security products team at Appsecco. We do a lot of application security, cloud security, and more recently, Kubernetes security related work. So I'll be more than happy to talk about Kubernetes security or any of these areas with anyone interested. Let me set the expectation for today's session. The objective of this session is more to have a conversation on data leaks and data breaches, and to understand some of the fundamental reasons for these data breaches. Of course, if you look at it technically, you will see that there are a lot of different and unique technical issues that lead to various data breaches. But if you look carefully, you will see that there are some recurring fundamental issues that lead to these breaches. So as part of the session, I will start with some example data breaches that have happened in the past, more like stories really. I will avoid going into too much technical detail on these case studies or stories. The reason is that I would rather take up the technicalities, or very specific technical questions, as part of the question and answer session. So feel free to ask if you have any specific questions related to data security or concerns related to data breaches. I'll be more than happy to answer and discuss the technicalities of data security as part of the Q&A. So I'll keep the stories a bit short.
So the objective of the case studies is just to give you an overview of what is happening around the world, using examples. Most of the technical discussion we'll take up as part of the Q&A session after this talk. I'll try my best to give you very actionable answers as part of the Q&A. But I just want to set the expectation that I am not really an expert in data breach investigation. I look at data breaches more from the perspective of what really went wrong when a breach happened, more from a threat model perspective or a security architecture perspective, and identify how we can fix or mitigate those issues in a fundamental manner so that such mistakes can be avoided, right? So as part of the session, I will also talk about my understanding of some of the fundamental root causes that lead to data breaches. And of course, we are free to discuss them in detail as part of the Q&A. But before we start with the example case studies, let me just give you some assurance. If you are worried about a data breach, or if you run some platform or infrastructure on the internet and provide critical services, or you have been breached in the past, really don't feel too bad about it, right? Because if you look at this infographic, you will see there have been countless breaches in the past. In fact, every major company, and even governments, have faced some breach or the other. So there is nothing to feel bad about. What we are going to do is look into some of the fundamental root causes that lead to data breaches and discuss what we can do to mitigate them, or to avoid the same mistakes in the future. As part of today's session, I will talk about three unique case studies.
These are definitely not selected based on criticality or size of a breach, but mostly from the perspective of the different root causes. I selected these three breaches that happened in the past so that I can cover some of the most common root causes for breaches.
So let's start with the Capital One incident. Capital One is a bank based out of the U.S. I think they operate in some other countries as well. In the case of Capital One, there was a data breach in which almost 100 million records of customer PII were exposed, including financially sensitive information. Whatever I am telling you about the Capital One data breach is in the public domain, and my understanding of the breach is also based on the publicly available information. What happened in the Capital One breach was that the attacker was able to identify a web application which was not supposed to be exposed to external users outside, maybe, a trusted zone. Not only did the attacker find such an application, the attacker was also able to find a class of web application vulnerability called server-side request forgery, which allowed the attacker to make the application execute some HTTP request on the attacker's behalf and get a response. That is the nature of SSRF vulnerabilities. The attacker used this vulnerability to interact with the AWS metadata endpoint. This application was hosted on AWS, and by virtue of that, the attacker was able to get session tokens for the role associated with the instance. That access allowed the attacker to enumerate S3 buckets and download a lot of very critical data from the S3 buckets belonging to the same cloud account.
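To make the SSRF part of that story a bit more concrete, here is a minimal Python sketch of one common mitigation: refusing to fetch user-supplied URLs that resolve to internal or link-local addresses, such as the cloud metadata endpoint at 169.254.169.254. This is an illustration only and not how Capital One's application worked or was fixed. The function names `is_blocked_address` and `is_url_allowed` and the injectable `resolve` parameter are assumptions made up for this example.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_blocked_address(ip_text: str) -> bool:
    """Return True for addresses a server-side fetcher should not reach."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        # Anything we cannot parse is treated as unsafe.
        return True
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local      # covers 169.254.169.254 (metadata endpoint)
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def is_url_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    """Hypothetical pre-flight check before the server fetches `url`."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = resolve(parsed.hostname, parsed.port or 80)
    except OSError:
        return False
    # Each getaddrinfo entry ends with a sockaddr tuple; item 0 is the IP.
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and not any(is_blocked_address(a) for a in addresses)
```

A check like this is only one layer. It does not protect against DNS rebinding between the check and the actual request, so in practice it would sit alongside least-privilege instance roles and network-level restrictions.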
So that is a high-level technical overview, but if you really look at it, what happened was that there was a web application vulnerability which was exploited by the attacker, and then there was an inherent trust provided to the web application by the underlying platform. In this case, the platform was AWS and the trust was the AWS role attached to that instance. The attacker managed to take advantage of that trust and access data storage, which in this specific case was S3 buckets, right? So if you look at the fundamental root cause, it is some vulnerability plus trust abuse. We'll come to the other root causes as well. This led to a massive data breach of, as I mentioned, about 100 million records.
So let's move on to our next example. The next example is Equifax, another very interesting example which had a very high impact on the security industry. In the case of Equifax, the attacker, again, was able to discover an application which apparently was not even a critical application. This application was just lying around somewhere in the Equifax infrastructure, and it was vulnerable to a publicly known vulnerability in the Struts framework with which the application was built. As per the publicly available information, the attackers managed to use a publicly available exploit to take advantage of the vulnerability in the application and gain a foothold in the internal network of Equifax. I am not sure what activities they carried out once they got into the internal network. But usually, in a traditional enterprise architecture which depends on perimeter security, if an attacker manages to gain a foothold in the internal network, really bad things may happen because of the additional trust which conventional enterprise architectures place in the internal network, right?
And that is what happened in the case of Equifax. By leveraging a publicly known vulnerability and exploiting an application which was not that important, attackers managed to gain access to the internal network, and what followed was a massive data breach involving very high value, financially sensitive information.
Now, coming to our third example, the third story that I wanted to share. This is something which is directly related to many of us. Sometime back, I think earlier this year, a couple of months back, there was news of a BHIM app data breach. An Israeli company came up with information about a potential data breach of the BHIM app. On further reading, it was found that it was not really the BHIM app which was breached, but a third-party service partner, the CSC, the Indian government's CSC, the Common Service Centres or something like that. They were doing some activity for BHIM, and as part of that activity, they were collecting a lot of really sensitive PII, personally identifiable information, including scanned copies of identity cards like Aadhaar or other identity cards, from vendors and other users. In this case, what was found was that a large amount of this data, let me check what the size was. So, almost 7 million records of sensitive, personally identifiable information were discovered by this Israeli company in a misconfigured S3 bucket, which means anyone who discovered that S3 bucket hosted on the internet could access and download that information, right? Obviously, the exact root cause is not known, that is, how such sensitive information ended up in a publicly accessible S3 bucket. But again, if you forget about the impact and try to look into the root cause, you can see it has something to do with misconfiguration: some change in the infrastructure with an impact that the security team of whoever was responsible was not aware of, right?
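As a rough illustration of what "misconfigured" can mean for an S3 bucket, the sketch below scans a bucket policy document for statements that allow access to everyone. It is not based on the actual configuration in this incident, which is not public. It assumes the standard AWS policy JSON layout (`Statement`, `Effect`, `Principal`, `Condition`), and the function name `public_statements` is made up for this example. It deliberately ignores bucket ACLs and account-level Block Public Access settings, which also affect whether a bucket is really public.

```python
import json


def _has_wildcard(value) -> bool:
    """True if a principal value is "*" or a list containing "*"."""
    if value == "*":
        return True
    return isinstance(value, list) and "*" in value


def public_statements(policy_json: str) -> list:
    """Return Allow statements that grant access to any principal.

    Statements with a Condition block are skipped, since the condition
    may narrow access (for example to a VPC or an IP range).
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may carry a single statement object instead of a list.
        statements = [statements]

    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        is_wildcard = principal == "*" or (
            isinstance(principal, dict) and _has_wildcard(principal.get("AWS"))
        )
        if is_wildcard and not statement.get("Condition"):
            flagged.append(statement)
    return flagged
```

Running something like this against every bucket policy after each infrastructure change is one way a security team could notice an unintended exposure early.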
So, that was my final example. Let us move ahead. Now, if you look at the stories I just shared, you can see that technically they were pretty different. There were different reasons for the compromise in each case. In the case of Capital One, there was an SSRF vulnerability in a web application. In the case of Equifax, there was a Struts vulnerability in their web application, using which an initial foothold was gained in the corporate infrastructure. But in both cases, if you look into it, you will see the initial compromise happened probably because of the first point that I have on my slide: lack of visibility of online exposed assets. Many a time, especially today, there is no strong separation of perimeter. We are in a hybrid infrastructure where we have some part in our internal network and some part on the cloud. In such a time, it is very difficult to maintain visibility of everything that an organization has exposed on the internet, or for that matter on any untrusted network, right? So, in both the Equifax and Capital One case studies, you will see there was a lack of visibility. They were not aware that something was exposed which, if compromised, may have a significant impact. Also, if you look at the third example, the BHIM example, or even the AWS role in the Capital One example, you will see that misconfiguration of identity and access management is another very fundamental root cause, because of which there have been many data breaches in the past. Especially now, when the perimeter is kind of diluted and we are in the age of a combination of perimeter and cloud security, identity and access management is a very important component of infrastructure security, and especially of data security, right? So, it is very important to get it right.
Major cloud platforms provide great identity and access management primitives, but ultimately, it is the end user's responsibility to use these IAM primitives correctly and ensure that appropriate authentication and authorization is enforced for all access, especially to the critical assets. Finally, if you look at the Equifax case in particular, the initial breach happened because of vulnerable software, and that too with a publicly known vulnerability, right? Vulnerability and patch management is one of the most common security challenges, and it has been known for a very long time. But as people who are responsible for infrastructure security, many a time we do not get it right. We know that vulnerability and patch management is a problem, but for various reasons, we do not always get it right. So, I think these are some of the core issues which, if taken care of correctly, will probably reduce the risk of a data breach. Obviously, this is entirely my opinion. Based on some of the breaches that I have read about and some of the threat modeling and architecture reviews that I have done for our customers, these are some of the really critical things that one should get right in order to ensure data security. There are many other security controls that we should look into as well, but I think that is something we can definitely take up as part of our Q&A session, right? Like what are some of the root causes you think are responsible for data breaches, and what can be done. Now, what can be done about these data breaches? We have identified some of the root causes. What is the solution? I think there is no great answer, or even a good answer, for this. In fact, I would rather share what I would do in such a situation instead of trying to answer how to stop or reduce the risk of a data breach. So, let me start by sharing.
In my opinion, if I have to secure something, I will start by understanding two things. I will start by asking two questions: who are the attackers, and what can they attack? Both of these questions can be answered by doing a threat model for an application, a network, or a cloud infrastructure. Personally, I believe that having a threat model as a reference before you do any kind of security activity or take any security initiative is really critical. Without a threat model, we are usually very dependent on best practices and guidance which are publicly available, and many a time they are very generic in nature. They may not be the best fit for my own infrastructure. So, if I have to secure an infrastructure, I will definitely start with threat modeling, which will tell me what threats and risks really exist for or really affect my infrastructure. Only when I have that visibility will I go out and look for compliance frameworks that provide a list of security controls, as a checklist, which is applicable to me. While compliance is good as a baseline, I don't think compliance really assures you that an infrastructure is secure. But yeah, when I have the threat model, I will then look for security controls to mitigate the various threats. So that is what I will do. And I highly recommend that if you are responsible for infrastructure security or application security, you definitely look into creating basic threat models. If you do not have the time or resources to create an extensive threat model, then maybe you can look at something called the Rapid Risk Assessment methodology, created by Mozilla, as an alternative approach for identifying risks at your development or design phase.
It will also give you an overview of the threats and risks that affect an application at development time. Or, if you are making a major change in your application or at your infrastructure level, you can still use the Rapid Risk Assessment methodology to identify the impact of such a change. For example, take the CSC data breach affecting some of BHIM's users. I don't think they created an S3 bucket which was public by default, because for quite some time now AWS has not created public S3 buckets by default. You have to go and manually set some permissions in order to make a bucket publicly accessible. So what I really think happened is that at some point, because of some requirement, they had to expose it to some other service, maybe a system or some other entity in the organization. And the kind of change that they made to their S3 bucket, they made without understanding the impact, and that is what resulted in a data breach like that. Also, one of the things which I would consider for vulnerability and patch management: I really think a containerized approach to application delivery can, to a large extent, help mitigate issues with vulnerabilities and patch management. Because if you have a process of continuously building your container image, then you can ensure that each time the container image is built, it is built with a newer base image. For example, if you're building on an Ubuntu base image, then you can ensure that it is updated at the time of build, which helps ensure that known operating system level vulnerabilities are not present. Of course, this will not mitigate application level vulnerabilities, but a containerized approach to application delivery can, to a large extent, mitigate some operating system and runtime library level vulnerabilities.
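One way to keep an eye on that "continuously rebuilding" habit is to flag images that have not been rebuilt recently, since an old build usually means an old base image. The Python sketch below is a simplified illustration under stated assumptions: the image names, the 30-day threshold, and the function name `stale_images` are all hypothetical, and the build timestamps are assumed to come from wherever you track builds (for example a registry or CI system).

```python
from datetime import datetime, timedelta, timezone


def stale_images(build_times: dict, max_age_days: int = 30, now=None) -> list:
    """Return names of images whose last build is older than the cutoff.

    `build_times` maps an image name to a timezone-aware datetime of its
    most recent build. `now` can be injected to make the check testable.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return sorted(name for name, built in build_times.items() if built < cutoff)
```

For example, called with a dictionary of your service images and their last build times, this returns the ones that are due for a rebuild on a fresh base image. It says nothing about application level vulnerabilities, which still need their own scanning and patching.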