Okay, so hello everyone, welcome, and thank you for attending the talk today. My name is Sandesh Mysore Anand. I am a senior engineering manager on the security team at Razorpay. I'm joined today by Satyaki, a senior security engineer at Razorpay as well. We both work as part of the Razorpay security team, and we've been working on many different initiatives. Today we want to talk about how we built an automated software asset inventory to answer some key security questions. Very often in the security industry, we use gut feel and tribal knowledge to make decisions. But as we scaled, we figured that we needed a more mature approach, one which relies more on data than on tribal knowledge. So that's what we'll talk about today. Before we get started, though, just a quick recap of what we do. Razorpay, as probably all of you know, is a fintech company; we offer a lot of payments and banking suite solutions. From the perspective of this talk, what's important to know is that Razorpay really believes in a tech-first approach to solving even finance problems, which means we look at how we can solve problems using technology, whether they're internal or external problems. That culture really helps us build such solutions. And why is security important for Razorpay? Just look at the list: we power many of the largest companies and the most important startups in India. So obviously security is of primary importance to us, and whatever we can do to make our security process and program more efficient is really helpful. That's one of the motivations behind working on this project. Finally, the one aspect I want all of you to appreciate is that over the last two or three years we've grown in every measure possible, whether it's the number of engineers, the number of microservices, or the number of deployments. As you can see, we have been growing very, very fast, so keeping up with growth is important. What works for a small company with 15 engineers does not work for a company with, say, 500 engineers, and we've been growing at that scale very quickly. So whatever solutions we build, whatever program we build, it needs to cater to this kind of scale. What we're going to talk about today is definitely relevant to companies which are scaling quickly and need to scale their security program as well. Okay, so let's dive into the talk. Before we talk about the solution, the reason we built this inventory is basically to answer some common security questions, so I want to walk through a few questions which come up in any security program all the time. First, incident response. No matter what size company you are, you will have to deal with security incidents, and when you do, having a robust security incident response program is important. Let's take the example of Log4j: we had the Log4j, the Log4Shell, issue last December, and it happened so quickly that within a few hours it went from "hey, what is this new bug?" to getting attack traffic. And this happened to pretty much every company in the world.
And when that happens, it is very useful to have some information quickly at your fingertips. For instance, let's say you got an alert on a particular subdomain, and you know that this subdomain is under attack. Now you want to figure out very quickly: hey, which application teams should we reach out to? If you have a way of answering "what applications are deployed on this subdomain, who the owner is, and what kind of data it stores," then it makes your life much simpler: you know whom to contact, and you can gauge what the impact may be if things go wrong. Responding quickly to incidents is really important, and having solid data is very important to responding quickly. Now let's take another example. Companies like us are product companies, so our applications, our products, are really important from a security perspective. We have an appsec program where one of the things we do is routinely review our applications for security. When we do that, it's really useful to have all the security-relevant information about an application upfront, so we can decide how to review its security posture. It's very useful to know: what are the different infrastructure components an application uses? Does the application use any vulnerable third-party libraries? Is it onboarded onto our SDLC security tools? We use a bunch of tools like Semgrep, Dependabot, and others — does the application already use them, and is it using them right? Having that information available in an automated fashion makes a huge difference. Finally, metrics. The first example, incident response, is most useful to the security team and the incident team. The second is useful to the security team, but also to the engineering manager who wants to know how secure the application is. But very often, whether it's BU leadership or engineering leadership teams, they want a sense of where we are as a company. They want to know how many of our applications are considered high risk, or what percentage of our apps has not been tested from a security perspective in, say, the last six months. These are the kinds of data points we often present to management — or, in general, even to ourselves — to know the health of the program. And again, these questions are really important from a metrics perspective. So an up-to-date software asset inventory can help answer all these questions really quickly, and with correct, up-to-date data. So that's great — what's the problem? The problem is that in most companies I've seen — I used to be in consulting before I took this job, and I still see many different security programs — the favorite asset-inventory tracking tool of choice for many people is spreadsheets. Now, don't get me wrong, I love Excel; I feel like most problems in the world can be solved using Excel. But there are some limitations, especially in this case.
First, tracking assets in a spreadsheet needs constant manual intervention: you need someone to actually go and enter the information, so it's not very automation-friendly. Second, it's always out of date: in a company like ours, like I said, where we have dozens of deployments every day, any inventory you filled out manually yesterday is out of date a few minutes after you fill it out. Keeping an up-to-date inventory becomes very difficult when the only way you collect data is through spreadsheets. Third, it's prone to human error: when you have humans entering information, human errors happen. I'm not saying any system can be completely free of human error, but we've got to reduce it, and this approach doesn't. And finally, as your company and your program scale, you have so many spreadsheets that you essentially need a spreadsheet to track the spreadsheets — and that's when you know things have gone too far. So yeah, this is the problem. To be honest, this is exactly how we started as well. In 2019 and 2020, we definitely had a bunch of spreadsheets with a list of applications and everything in there. It worked for a bit, and at some point it just stopped scaling. That's when we felt we needed an intervention. Before I hand over to Satyaki, I just want to talk at a high level about how we thought about the solution. From the last two slides, it's very clear what the requirements were: we needed something automated, something up to date, and something on demand. When we were looking at different options, we found a tool from Lyft called Cartography. This is a solution that they have open-sourced, which essentially connects to different sources of data, dumps it all into a Neo4j database, and then lets us do stuff with it. This was a great starting point. We were very excited when we saw it; we initially played around with it on our laptops to see how it works, and then in the stage environment, and that gave us the confidence that we could probably make this work. Very quickly, we realized we wouldn't be able to use it out of the box and get all our answers; we had to make a bunch of changes. We also had to make it work for all our teams — so it's not just something that security consumes, but other teams consume it too. We needed to make sure it's usable for every team that needs to use it. So that's how we got started.
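To make "dump it all into a Neo4j database and do stuff with it" concrete, here is a minimal sketch of the incident-response lookup described earlier, using the official Neo4j Python driver. The node labels, relationship types, properties, and connection details are illustrative assumptions — the real schema depends on your Cartography version and any custom modules — not Razorpay's actual graph.

```python
# A minimal sketch: "which applications sit behind this subdomain, who owns
# them, and what data do they store?" against a Cartography-style Neo4j graph.
# Labels and relationships (:DNSRecord, :SERVES, :Application) are assumptions.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

INCIDENT_QUERY = """
MATCH (d:DNSRecord {name: $subdomain})<-[:SERVES]-(app:Application)
RETURN app.name AS application,
       app.owner AS owner,
       app.data_classification AS data_classification
"""

def apps_on_subdomain(subdomain: str) -> list[dict]:
    """Everything an incident responder needs for a first triage call."""
    with driver.session() as session:
        return [dict(r) for r in session.run(INCIDENT_QUERY, subdomain=subdomain)]

print(apps_on_subdomain("api.example.com"))
```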
Let me take a pause here and hand this over to Satyaki, who will tell you a little more about the nuts-and-bolts implementation of how we went from playing around with Cartography to having a full-fledged software asset inventory. Satyaki, over to you.

Thank you, Sandesh. Before I walk you all through the presentation, although Sandesh touched on it, let me try to explain why an asset inventory is required by taking an example you all might be familiar with. We all invest in a variety of things — mutual funds, stocks, crypto, and so on. How do we manage all these different kinds of investments, made on completely different platforms, and mitigate market risk? We cannot rely only on memory. Hence, we write down our assets so that we remember our investments every time we take a look. Similarly, an asset inventory helps the organization remember all the assets it has procured over the years — maybe that subdomain which was procured a few years ago and is now lying idle. An asset inventory basically prevents an asset from becoming a liability, and helps security protect Razorpay a bit better. Moving on to the next slide: though we gather most of our information in an automated fashion, there is some information which needs manual intervention — contact details, for example, or documentation links, tech specs, or ownership information. All this data is a little hard to get in an automated way, but entering it manually is also hard. So we made a product that supports the developers: the application manifest, a YAML file where developers can put in all the necessary details, as you can see on the slide. Now, no one likes to do extra work, and we predicted that no developer would fill out the manifest even though it hardly takes five minutes. Hence, we enforced it on all repositories: if you want to create a repository at Razorpay, you first have to create an application manifest.
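To illustrate, a manifest and the kind of CI gate that enforces it might look like the sketch below. The field names and required-field rules are assumptions for illustration, not Razorpay's actual manifest schema.

```python
# Sketch of an application manifest and a CI-style check that rejects a
# repository until the manifest is filled in. Field names are hypothetical.
import sys
import yaml  # PyYAML

EXAMPLE_MANIFEST = """
name: payments-service
owner_team: payments
owner_email: payments-oncall@example.com
slack_channel: "#payments-alerts"
data_classification: pci
docs: https://wiki.example.com/payments-service
"""

REQUIRED_FIELDS = {"name", "owner_team", "owner_email",
                   "slack_channel", "data_classification"}

def validate_manifest(text: str) -> list[str]:
    """Return a list of problems; an empty list means the manifest passes."""
    manifest = yaml.safe_load(text) or {}
    missing = REQUIRED_FIELDS - set(manifest)
    return [f"missing required field: {field}" for field in sorted(missing)]

if __name__ == "__main__":
    problems = validate_manifest(EXAMPLE_MANIFEST)
    if problems:              # non-zero exit fails the pipeline
        print("\n".join(problems))
        sys.exit(1)
    print("manifest OK")
```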
Moving on to the next slide, let's talk about the fun part: how did we manage to map together components of different services like AWS, GitHub, and Jira? To understand that, let me quickly walk you through how we deploy code at Razorpay. First, we store the code in GitHub. Second, GitHub Actions acts as CI and builds the code. Third, the code is stored as build images in Harbor. Then Spinnaker is used as CD and delivers the build to Kubernetes, and Kubernetes is where the deployment happens. Basically, what we did is scan all of the above-mentioned technologies and connect the dots. The entire process is automated, and engineers at Razorpay follow strict naming conventions, which helped us successfully map all the repositories back to their deployed containers. This mapping is necessary to derive critical insights. For example, if you want to know which IAM role is being used by a Kubernetes container of a certain GitHub repository, this is one way you can get that information in seconds, through our own automated asset inventory.

Moving on to the next slide: Lyft's original Cartography modules were great, and they helped us achieve our goal much faster than we thought, but they didn't work for all of our use cases. We had to build some intels and modify the existing ones. For example, we needed to connect Jira tickets back to applications, and Jira wasn't available, so we had to write an intel for Jira. Similarly, for GitHub, we needed more than what Cartography originally offered. Let me give you an example: we modified the GitHub intel to read Dependabot results for each and every repository, so that we could create SBOMs for every repository. This is how we modified intels and wrote our own to reach where we are today.

Moving on to the next slide: we used Neo4j as our data lake. Neo4j is a graph database, and exactly what we needed to gather insights. Let me show you how it works. Say you have to map an S3 bucket back to the repository it is connected to. First, you list all the S3 buckets and choose the one you want to search for. Then you enter the name of that S3 bucket and search again, which shortlists the specific bucket you're looking for. Now, S3 buckets are connected to AWS IAM roles, so you find out which IAM roles this bucket is connected to; doing that returns all of them. As you can see, IAM roles are in turn connected to kube manifest templates — every repository at Razorpay has a kube manifest template so that it can be deployed into Kubernetes. So we shortlist the kube manifests these IAM roles are attached to; doing that gives you the exact kube manifest templates this S3 bucket is connected to. Finally, the kube manifest templates are directly related to GitHub repositories, so when you link them back to GitHub repositories, you get the repositories this S3 bucket is originally connected to — colored in sky blue, as you can see. (A consolidated query along these lines is sketched at the end of this section.) Although Neo4j is extremely powerful and has a beautiful UI, it isn't for everyone, since it has a learning curve. Hence we push our insights to a different platform, Looker.

Moving on to the next slide: we chose Looker since Looker is Razorpay's BI tool of choice and most of the company knows how to use it, so there's no learning curve required. This specific dashboard gives us an overview of the company and is used by business owners, engineering leadership, and of course security. It provides key insights like the risk ranking of an application, which is determined by an in-house algorithm. The algorithm takes multiple parameters into account, rather than relying on tribal knowledge. That's how dynamic we have tried to make the entire process.

Moving on to the next slide: while the previous dashboard gave us an overview of where we stand as a company, this one gives in-depth information about any application you choose. The focus was not on fancy graphs but on answering specific questions, which helps the security team make decisions based on intelligence. Here, we were able to get information about all the applications in one place: for any application you type, you get all the information related to it. This makes life easier for pentesters and incident response teams — whenever there is a crucial vulnerability or a serious error, in no time you can directly reach out to the team owners or application owners and get the issue resolved.

Moving on to the next slide: before I hand the mic over to Sandesh, I want to mention a couple of factors that helped us get this project done. First, the deployment pipeline was automated from day one, which meant we were able to pull all the information in an automated fashion. Second, from the time we were a small company, Razorpay did a great job with naming conventions and other best practices, which made it very simple to map applications across multiple resources. And when we pushed the application manifest program, instead of getting pushback, we were actually supported by different teams and different BUs. If you are building an inventory in your company and these factors do not apply to you, you may have to modify the program to suit your needs.
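To recap the Neo4j walkthrough above: the whole S3-bucket-to-repository path can be collapsed into a single Cypher query. A sketch, with the caveat that these labels and relationship types are stand-ins rather than the exact schema shown in the demo:

```python
# Sketch: bucket -> IAM role -> kube manifest -> GitHub repo in one traversal.
# Verify labels/relationships against your own graph before relying on this.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

TRAVERSAL = """
MATCH (b:S3Bucket {name: $bucket})
      <-[:HAS_ACCESS_TO]-(role:AWSRole)
      <-[:USES_ROLE]-(manifest:KubeManifest)
      <-[:DEFINES]-(repo:GitHubRepository)
RETURN DISTINCT repo.name AS repository
"""

def repos_for_bucket(bucket: str) -> list[str]:
    """Which GitHub repositories are (transitively) wired to this bucket?"""
    with driver.session() as session:
        return [r["repository"] for r in session.run(TRAVERSAL, bucket=bucket)]

print(repos_for_bucket("payments-prod-artifacts"))
```

Run in the other direction (repository in, buckets out), the same pattern answers the least-privilege question Sandesh raises shortly.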
Saying this, I hand the mic over to Sandesh. Sandesh, you can take it from here.

Yeah, thanks a lot, Satyaki. Like Satyaki mentioned, there were a bunch of engineering practices which predate this project and which really helped us get there. The whole automation-first culture is very useful and very conducive to such projects. So if anyone listening wants to build a program like this, do keep this in mind: you may have to account for your company's cultural preferences and processes, modify whatever we are saying to suit them, and then go build the inventory. Okay, I think Satyaki did a great job of giving you the lowdown on all the different aspects of how we built this tool, so let's circle back to the original questions from the beginning of the talk. One key question we asked was: what AWS services does an application use? If you look at this quick video — a screen grab of our Looker dashboard — essentially, if you enter the name of an application here, it gives you a list of all the infrastructure components mapped to that application: the number of Kubernetes services deployed, the number of active pods, and so on. At a quick glance, you can figure out what kind of infrastructure this application uses. Very useful during security assessments, and very useful during incident management as well. Moving on: we spoke a little earlier about subdomains. In general, if you have a subdomain, or you have a bug bounty report where somebody reported a bug on a particular subdomain, you don't know what it maps back to. So we built a very simple interface which maps applications to infrastructure components. What you can see here is a list of all these subdomains mapped to different applications, and you can pick a particular name. In this case I'm showing Opfin, a company that we acquired a couple of years ago: I want to know all the Opfin subdomains and which applications they're mapped to. I can see that in a single search — it takes me less than 30 seconds to get all that information, and I also have the contact information for whom to reach out to. Another example: what applications have access to an AWS service? Say there's a critical AWS S3 bucket — or in general, just an S3 bucket — and you want to know which applications have access to it. This is a critical thing: when your application has write access to an S3 bucket, and that application gets breached, that essentially means your S3 bucket, your AWS surface, is effectively public. So in general, you want to follow the principle of least privilege and make sure a given application only has access to the buckets it requires. When you're doing an assessment, the first thing you do is look at what components an application has access to. And if that application has access to components it shouldn't have, then that's a red flag.
That's a signal for us to say: hey, let's go digging a little further and figure out what's going on. So as you can see, it does not always tell you exactly what's going on, but it gives you important nuggets of information which would otherwise take you hours or days to figure out — and it gives you all of that at the click of a button. The final question I want to show you how we answer is about SBOMs. SBOM is kind of the hot topic of the day, thanks to Log4j and a bunch of other open source issues. Essentially, what we did was connect our inventory to our SBOMs, so you can put the name of a library here. Actually, let me walk through this example a little more slowly. Within Razorpay, there are two different ways of doing logging: one is using Uber's Zap library, which gives you a logging framework, and the other is an internal logger that we wrote as part of a GoUtil package. Sometimes — for example, say the security team wants everybody to adopt secure logging; I'm just giving an example here — you want to figure out how good the adoption is. This gives you a really good idea, because you can add that dependency here, make it part of the SBOM query, and see how many applications are using it. So here I'm entering Zap and seeing who all are using Zap. Ah — we used "is" instead of "contains"; so we switch to "contains" and run that query again, and what you get is a list of all the applications which use Zap. So here is the list of 40 application teams you may have to convince: hey, why don't you move to the latest version of the logger we've written — here are all the advantages. So in addition to the security implications an SBOM gives you — for example, looking into your usage of insecure libraries — it also helps you drive adoption of particular practices. So yeah, these are some of the key questions we spoke about earlier in the presentation that we can now answer in an automated fashion, pretty much at the click of a button.
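As a sketch of the SBOM lookup just demoed — including the "contains, not is" detail — a graph query for library usage might look like this. The :DEPENDS_ON relationship and the property names are assumptions, not the exact schema behind the Looker dashboard:

```python
# Sketch: "which applications depend on this library?" from an SBOM graph.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

LIBRARY_USAGE = """
MATCH (repo:GitHubRepository)-[:DEPENDS_ON]->(dep:Dependency)
WHERE dep.name CONTAINS $library   // 'contains', not 'is', as in the demo
RETURN repo.name AS repository, dep.version AS version
ORDER BY repository
"""

def who_uses(library: str) -> list[dict]:
    with driver.session() as session:
        return [dict(r) for r in session.run(LIBRARY_USAGE, library=library)]

# The adoption question: who still uses Zap instead of the internal logger?
print(who_uses("zap"))
# The incident question, Log4Shell-style: who pulls in log4j-core?
print(who_uses("log4j-core"))
```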
We're very happy with where we are, but we also feel like we're just getting started. So what else can we do? We want to integrate what we have today with more observability tools — for example, link it with API inventories — and also risk-rank applications based on volume of traffic. Today we base risk on static parameters: what kind of data an application handles, whether it's internal versus external, and so on. But you can have two external applications which both handle similar kinds of data, where one is used by millions of people and the other is used a few times a week. The risk ranking for those two applications should not be the same. So we want to be a little more dynamic with our risk ranking, and that can happen if we integrate more traffic data into our inventory. Finally, we love Looker dashboards — like Satyaki mentioned, Looker is used across the company, so it's pretty easy for people to pick up and understand what's going on. It also makes life very easy for, say, an analytics team or an engineering team to go get answers to questions. But the one issue is that with a Looker dashboard, you first start with a question and then answer that question; it doesn't allow you to find new insights. Satyaki was showing you a demo of Neo4j — Neo4j is a really cool interface to just explore: enter the name of an S3 bucket, see what components are there, say "hmm, this is interesting," click on that, and use your security mindset to try and find new relationships, see what's going on, and see if there are any security issues. Having said that, Neo4j is super nerdy and very hard to use for most people; the learning curve is massive. So we want something between a static Looker dashboard, which only answers specific questions, and Neo4j, which is super nerdy and hard to use. We're trying to figure out if we can find a UI for the asset inventory that sits somewhere in the middle — that would be really cool as well. And that ties into the next point: software asset inventories are not useful just for security. In fact, I would say security is maybe one of four or five different functions for which an inventory may be useful. DevOps teams will find it very useful. From an observability perspective, it's very useful to have everything in one place — and also to measure productivity. Right now we're looking mostly at the app-to-infra mapping, but you can get so many insights from GitHub and from how our developers write code: how productive our developers are, what areas they need support in to improve productivity, how many errors we're getting in, say, the CI pipeline, and so on. Again, things not relevant to security, but very useful to the company. So our goal now is to invite collaborators from outside the security team and see how else the inventory can be used. Okay, so in summary, just to quickly tie together everything we spoke about: we cannot protect what we don't know exists. If you don't have a good idea of how many applications you have, how many subdomains you have, how they map to each other, and what infra components are used, then it's hard for security to protect them. We need to know what we own; we need to understand our attack surface. And as companies grow, it's very difficult to maintain such an inventory manually — you need automation to do that. And while there are many inventory tools within a single cloud provider, say AWS, you need something which talks to all of your different systems. That's where tools like Cartography can be a great starting point to connect different data sources. By no means is Cartography the only tool — it's the tool that we use; there are plenty of others, paid and free open source, which can be used to do similar work.
Next, in summary, a big lesson for us was to minimize reliance on manual data entry: make it easy to enter and update data. Nobody likes filling forms — we all know that. But how do we make it easy, how do we make it relevant, and how do we make sure it's up to date? That's something we learned through this process. Also, if you want to deliver insights and you want the org to use them, take your insights to a tool they already use. That way adoption is a lot faster; they see results in a familiar location, as opposed to a 50-page PDF report. Finally, make security decisions based on data and not just gut feel. This is why we started the whole project. When we were a smaller company, it was easy to make decisions on gut feel — say, if you have to pick five assessments to do. To take a step back: ideally, we would like to do all kinds of security activities on all our applications and all aspects of our infrastructure. But the reality is that every security team is limited by the time, budget, and bandwidth it has. The resources we have, we need to use well, and to do that we want to make such decisions based on data, not gut feel. That ensures our security posture is good and, at the same time, that we're using our resources well. So yeah, that's the summary of what we've done. You can reach out to me on Twitter — I'm @JubbaOnJeans — or you can reach out to Satyaki, who's on LinkedIn as well. If you have any questions, if you want to collaborate, or if you want to do this in your organization and have questions for us, please feel free to reach out.

Hey, thank you for the session. It was very, very insightful — a lot of learnings. We've been trying to do a couple of such things in my organization as well, so thank you for that. I was making some notes while listening, and I have a couple of questions — but audience, if you have any questions, type them in the chat and we'll take them up. Till then, I have a couple of questions for you guys. First question: when we were talking about appsec, specifically third-party code, you named a couple of open source tools — Dependabot and Semgrep. And you started your presentation with the term "scale," the scale at which Razorpay works. Now, these open source tools usually don't support scale to a great extent — if you have to run parallel scans with Semgrep, we don't know how it behaves. These tools, specifically non-commercial ones, often don't address scale at all. How do you deal with that?

Yeah, that's a really good question. To me, Semgrep is kind of the odd one out, because it actually scales really well. But in general, you're right — whether it's tools like Dependabot or others. Part of the reason we chose these two tools is that they integrate really well with our CI pipeline: with GitHub Actions, they're supported out of the box.
And with Semgrep, we started out using the free open source version, but we've since moved on to working with their commercial product as well, which helped us get some scale for static analysis. Dependabot has been an interesting journey. It helps that Dependabot is part of the GitHub ecosystem — it just works very, very well with GitHub. But I will say this: you're absolutely right about the effectiveness and about features that don't exist, so very often we have to do things to make these tools more usable. For example, with Dependabot, it's really hard for developers to look at results in GitHub itself — you don't get data at a high level. Say you're an engineering manager with ten different products under you: getting a summary of everything is literally very hard with Dependabot. So what we did was take another open source project, DefectDojo, fork it, and build something we call Bhadra internally. Maybe this is a topic for another talk, but DefectDojo does not scale — it just doesn't; it's built for fun projects for, like, five people. So we decomposed DefectDojo and scaled up the infrastructure. Now we pipe all the results from Dependabot — from all the tools we use — into Bhadra, effectively a scaled DefectDojo, so execs and managers have a single vulnerability management dashboard to check their status. And if developers want to look at the actual issue in code, they can go find it in GitHub, which Dependabot does a good job of, because it was built for developers, not for security teams or managers. So you're right: the advantage of open source tools is that they let you start very quickly and get something going, but not all of them are built for scale, and when we hit that, we don't really have an option but to try something more commercial. But I do think Semgrep is a bit of an exception, because it was built very recently, by people who had used all the other tools that failed to scale.

Great. My next question: vulnerability management links with asset inventory, because whenever we create an asset inventory, we tend to run checks and analysis on all these assets on a regular basis — or the moment we discover a new asset, we run some test cases on it. So how do you map your vulnerability management tool, or your asset inventory, to the respective teams and owners? That becomes a very important question: when you have a vulnerability, how do you know whom to reach out to? It might be linked to a GitHub repo, or it might be a subdomain with a default admin page, or a CMS, or a bug in a mobile application. These are all assets, right?
So how do you map it to a person, and how do you handle role-based access? I'm not sure about DefectDojo — I think it supports it — but how do you configure DefectDojo, or the Bhadra tool you've scaled up, so that specific vulnerabilities are visible only to a certain set of people, and not everyone in the org can see them?

Yeah, that's a good question. There are a few things here. First of all, this is a work in progress — I don't think we've hit the ideal state we want to be in; we're still building. At this point, again, this is a place where engineering culture really helps. The focus of this talk is on applications that we have built, not on COTS, third-party applications — that's a slightly different story and the process for it is very different. For example, if there's a bug in a Jira instance, it's not handled through the inventory; we use something else for it. But for applications we build, the repository where the code is stored — which is the microservice — is mapped one-to-one to a deployed application. Sometimes there's a one-to-two or one-to-three mapping, but by and large it's one-to-one. And like Satyaki explained earlier, we created the manifest, which is the only manual portion of this entire exercise. Every repository has a manifest file — basically a YAML file, something developers enjoy writing, as opposed to spreadsheets. We created a template and made sure every repository has this manifest, so every repository has owner information, the Slack channels where they coordinate, email lists, and so on. And since we can map repositories to subdomains to deployments, if I find a bug in an application — say we get a bug bounty report for a subdomain — I can map that back to a repo, and for the repo I have a manifest with the contact information. That's how, within a couple of clicks, I can find who is responsible. Then, we track production issues in Jira — we have a very clear demarcation between issues found in production and issues found in the SDLC, which is basically any branch before master. SDLC issues are tracked in Dojo, but production issues are tracked in Jira. There, of course, we can use the features Jira gives us to mask some of the details. We use a couple of interesting hacks for the RBAC part: the details of a ticket may not make it into Jira at all; they stay in a Google Drive, which has much better RBAC, where we can make sure it's exposed only to our team. So yeah, on the RBAC piece, I'd say there's room for improvement — the way we do it right now is pretty hacky, but it works. Especially for open vulnerabilities, we use a combination of private Slack channels and heavily restricted Google Docs for the details, and only the tracking information is available in Jira.
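Putting the pieces of that answer together — graph lookup plus manifest — the "whom do I page?" flow might look like this sketch. Both helpers are placeholders: the first stands in for a Neo4j traversal like the earlier ones, the second for fetching manifest.yaml via the GitHub contents API; the manifest fields follow the hypothetical schema sketched earlier.

```python
# Sketch: bug bounty report on a subdomain -> owning team's Slack channel.
import yaml

def repos_for_subdomain(subdomain: str) -> list[str]:
    # Placeholder for the graph traversal (subdomain -> application -> repo).
    return ["payments-service"]

def fetch_manifest(repo: str) -> str:
    # Placeholder for pulling manifest.yaml from the repo (GitHub API).
    return "owner_team: payments\nslack_channel: '#payments-alerts'\n"

def contacts_for_subdomain(subdomain: str) -> dict[str, dict]:
    contacts = {}
    for repo in repos_for_subdomain(subdomain):
        manifest = yaml.safe_load(fetch_manifest(repo))
        contacts[repo] = {"team": manifest.get("owner_team"),
                          "slack": manifest.get("slack_channel")}
    return contacts

print(contacts_for_subdomain("shop.example.com"))
```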
But we're also big proponents of learning from our own mistakes. So if there are tickets which are fixed, where everything is done and the issue doesn't exist anymore, then we really want to talk about them. We want to talk about them in trainings; we want to make them internally available for people to read and learn from. We try to give as much information as possible, as long as it's not an active vulnerability.

Very interesting to see Google Drive being used for RBAC. I would love to explore that option — not the hacky way, I mean.

Yeah, it's a hack. Not so much RBAC as making sure the details are in one place, where it's easy to manage who has access to them.

Sure. I think we have one question in the chat, from Saurabh. It has two parts. First: how many items are present in the system?

I'm not sure if you got the question — how many items? It probably means how many rows of information are in the entire system.

I don't know if I have the answer to that.

That's a really good question; let me take it. So we connect almost everything to everything. Like the last question we discussed — how do we map a vulnerability back to a person or a specific team — we have multiple systems, and if you talk about rows, it's in the millions. And that leads to the second question: is it connected as a tree or a graph? It's a graph. For that same reason, we have limited the amount of storage we use — we used a graph database for exactly that reason. There are multiple components: if you're asking about items, every Jira ticket is an item, every AWS S3 bucket is an item, every IAM role is an item, every GitHub repository is an item. So we have millions of such items, interconnected with each other via the graph. That's how we do it, that's how we save storage, and that's how we can find information really fast.

Yeah — it would be a cool exercise to get a precise number; that would be fun, to get the exact number of rows. That's a good question; we didn't think about getting the exact number. But the important part is that a lot of the magic and the heavy lifting is done by Cartography, so we didn't have to worry about how big the database is and all that. Eventually, all we had to do was write the intels, and Cartography did a lot of the magic. We focused on what we are good at, which is figuring out what security questions need to be answered. That's where we spent most of our time.
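For a sense of what "writing the intels" amounts to, here is a deliberately simplified sketch in the spirit of the Jira intel Satyaki mentioned: pull records from the source API, then MERGE nodes and relationships into the graph. It does not use Cartography's actual module interface — the Jira fetch is a plain REST call, and the application-name field is hypothetical.

```python
# Sketch of a custom intel: fetch Jira issues, upsert them into Neo4j, and
# link each ticket to the application it affects. Simplified; no paging or
# error handling, and not Cartography's real module structure.
import requests
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

INGEST = """
UNWIND $issues AS issue
MERGE (t:JiraTicket {key: issue.key})
SET t.summary = issue.summary, t.status = issue.status
WITH t, issue
MATCH (app:Application {name: issue.application})
MERGE (t)-[:AFFECTS]->(app)
"""

def sync_jira(base_url: str, auth: tuple) -> None:
    resp = requests.get(f"{base_url}/rest/api/2/search",
                        params={"jql": "project = SEC"}, auth=auth)
    resp.raise_for_status()
    issues = [{"key": i["key"],
               "summary": i["fields"]["summary"],
               "status": i["fields"]["status"]["name"],
               # hypothetical custom field naming the affected application
               "application": i["fields"].get("customfield_app")}
              for i in resp.json()["issues"]]
    with driver.session() as session:
        session.run(INGEST, issues=issues)
```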
Sure. We'll wait for more questions; till then, I have a couple more. You talked about APIs also being assets, and that being work in progress. Anything you can share on that right now?

Yeah. Like I said, we've moved to a microservices-first architecture, so most of our applications are microservices, and a lot of merchants integrate with them. There are some products which are web applications, but a lot of our highly scaled products are basically API integrations, so obviously API security is really important for us. The one area where I said we want to improve is this: when we figure out how risky a particular API endpoint is, we use various parameters — what kind of sensitive data does it use? Is it involved in making a financial transaction? Is it external-facing, or is it an endpoint used just between two microservices? And so on. Those things are pretty straightforward; that's already done. What we probably also need to consider are things like: what is the impact if it goes down? How many users use it at a given time? An API endpoint which sees a hundred RPS is way more valuable to us — it has a higher impact if it goes down, from an availability perspective — than something which sees one RPS and is used by some obscure merchant three times a month. That level of information we still don't have in our inventory. Once we get it, we can use it as well. We're working on it: we have observability systems which give us that data, but we don't have them linked to the inventory. When that happens, hopefully we can drive some more insights. Because in the end, I have a limited team, a limited budget, and limited time, and we want to spend the right resources on the right assets. Getting this data will let me spend time on the important assets and not treat every endpoint the same. I would have loved to make the same amount of investment in everything, but the reality is we can't.

I can totally understand that. Talking about third-party integrations — Sandesh or Satyaki, either of you can answer this — how exactly do you deal with them? For you specifically, being a payment gateway with multiple payment services, there will be thousands — I can't even come up with the number. How do you deal with a security flaw? You might give out a secure SDK, but there might be some configuration a third-party vendor misses. How do you deal with that, or how do you make it secure by default?

Yeah, that's a talk on its own — I don't know if I can give you a complete answer, but you're right. This is the kind of problem any modern B2B company faces — and we're really B2B2C, in the sense that in the end our products are also used by consumers. Whenever we create an integration or build SDKs, the hope is that we do whatever it takes to make it hard for people to make security mistakes in integrations — but it's impossible to make it completely foolproof; I think that's a given. And we do integrations on multiple sides: merchants integrate with us, but we also integrate with our banking partners, with acquirers, et cetera. The integrations are all over the place, and they're second nature to us, so I think we've developed some muscle memory around figuring out what works and what doesn't.
And we have checklists and such which help us there. But honestly, it's a constant cat-and-mouse game, so we have to keep upgrading what we do; without that, it's pretty hard to keep up. There's no silver bullet — it really depends on what kind of integration we're talking about. The idea is to have a secure SDLC where, depending on the type of application integration, you threat model it and then come up with security checks based on that. I'm making it seem like it all works seamlessly, in a perfect manner — it does not. It's messy and it's hard to scale. But like I said, we have a lot of experience doing this, and that knowledge helps us do these things in a secure manner.

Great, great. Audience, again, you can raise your hands or post your questions in the chat. Last question from my side, Sandesh and Satyaki: do you rely on any third-party tools, services, or vendors to figure out if there's something left over — some asset you have not identified?

Yeah — so you're talking about external things?

Maybe Shodan, or any other vendor who can provide OSINT information, or assets that are public.

Yeah. I don't want to go into too many specifics here, because things are in flux. There are certain things we do where we leverage OSINT assessments. One of them is that every time somebody joins us as a new pentester, or as part of the security assessment team, their first job is to do OSINT on Razorpay. Our reasoning is that when you join, you're still kind of an outsider, in the sense that you don't have the internal working knowledge, so you look at things with a fresh pair of eyes. Non-scalable, but a fun exercise, and good for getting an outside perspective. We also use some OSINT tools internally, which we run on a regular basis to get this data. And we're considering working with vendors who can help us get that information. In general, attack surface monitoring is really important to us. What works in our favor, again, is that we are on a single cloud provider — we work with AWS only, and from day one we've been entirely on public cloud. This means there is a smaller chance — not zero, but smaller — of shadow IT, where people are randomly deploying on something else. So that makes life a little easier, but it's definitely something we'll keep considering.