Okay, hello everyone. Today we're going to be talking about leveraging web application vulnerabilities to carry out reconnaissance and gather intelligence for investigative purposes. Just a quick introduction: I work with the Center for Internet and Society as a security researcher and also work on issues of policy. I'm very interested in exploring different ways in which user privacy and security can be affected by web technologies and services. So, with that out of the way, this is going to be a short primer on what intelligence gathering really is, because a lot of you may be developers and may not have the mindset with which an attacker looks at the products you create or the applications you make. For that purpose I'll be covering a very closely related topic, which is open source intelligence. What is open source intelligence? It's any information that an attacker can gather from public sources, and "public" is very loose in this case, and then use to paint a picture of an individual. Say you know what kind of products a certain person buys or what kind of websites a certain person visits: you can abstract that to be open source intelligence, given that you used open sources to find it. It is also often used to aid journalists in investigative processes. A lot of the time journalists will use information that is publicly available to further the journalistic end that they're trying to achieve. As an example of that, there's this quite well-known organization called Bellingcat, which does exactly this. If you can see on the screen, there's this article called Geolocation and a Philosopher's Stone in Kashmir. Recently there was a plane that went down in the Valley of Kashmir, and they used different images that were put up on social media.
So what you're seeing right now is a panoramic image that they've created from three different images. They were then able to map that to the actual location coordinates using Google Maps, because, if you can see, the three trees at the back on top match the ones there. That's to give you a glimpse into the mindset that this intelligence work requires. To go further on that topic, this is a project run by an NGO called TTC, and it's called Exposing the Invisible. They publish a bunch of methods on how you can use the internet and different web technologies to carry out investigations. This is just a glimpse from the actual guides that they've published themselves, but we can't go too much into that topic today, sadly. Another example is a tool that was created by one of the people who works with Bellingcat. It's called Who Posted What. It basically uses the GraphQL API to provide a more tweakable search feature for Facebook, because Facebook's own search is very limited in what it shows you, to make things easier for the regular end user. Okay, let's get into the actual talk now. The talk will mostly cover the concept of carrying out active reconnaissance and gathering intelligence by finding and exploiting flaws in, mainly, web applications. Since intelligence is quite a large topic, the later aspects of it, like the analysis that is required, would take a whole day to even just dive into: how different bits of information can be correlated and pieced together. But just to give you a picture of it, imagine that from a certain website, say Facebook, you get to know someone's last two digits of the phone number through the forgot password feature.
And if you have knowledge about the general area in which they live, like the city or state, you can start assuming the digits which form the starting blocks. So you will get knowledge about the first two. If you know what carrier they use, like Airtel or Vodafone, there are prefixes for those numbers. So you can slowly start uncovering the entire phone number just by knowing their online identities. But again, that piecing together and correlation is a very, very wide topic, so unfortunately we can't cover it today. Another thing I wanted to cover is that since we live in a hyper-privacy world now, after Cambridge Analytica, there is a trade-off between doing open source intelligence and the privacy which websites like Facebook want to offer to people, and I want to delve into that a little bit. And also best practices to avoid mistakes which will end up harming your users, and eventually your product as well: just making applications which are more privacy conscious. So this slide was called living off the land. One problem with resources like Who Posted What or StalkScan is that since they're publicly available, websites like Facebook will actually try to make sure that the service is not running. Maybe users have flagged concerns about privacy, which actually did happen. So Facebook has cut off GraphQL access for most of these services, and you can no longer do custom searches for different topics on Facebook. These tools are also generally very rigid, and you don't have a lot of flexibility: you're only given a certain set of parameters that may or may not be useful to you. So this is a problem with tools that are available online now, which you can go and use.
And that is sort of a foreword to why you should rely on yourself instead. So we have application vulnerabilities and intelligence gathering. This is a little figure that I created, which deviates from the regular open source intelligence methodology. Here it goes to target identification first, target identification being who, or what demographic of people, you're looking to find information or intelligence about. Then source identification would be what websites they frequent. Once you have a list of the sources where you can gather info from, you can start with the actual identification of flaws. Then comes exploitation, and the last bit would obviously be analysis. So part one is scoping. I'm sure most people are aware of services like, say, Zomato and Practo; there are services for just about everything in the current year we live in. That means you can book medical appointments from your phone, you can find people to repair your AC, or you can even rent a vehicle that will reach your doorstep today if you order it now. What that ends up creating is a situation where there is a lot of information, and it is generally not well protected. So there's a lot of information you can retrieve about people just from knowing what services they use and then finding flaws in those services. For instance, let's say someone targets your motorcycle rental account, and you've rented five different motorcycles over the course of the last year: they can know whether you like Bullets or street bikes or sport bikes.
And while that's obviously quite a silly thing to glean from having access to that information, you can also figure out people's location and which areas they frequent, because a lot of these bikes have GPS embedded in them just so the company doesn't end up with a stolen bike. So that can be a more serious aspect of it. Here are the two primary ways of using the method I'm proposing. There is speculative targeting, which is similar to a watering hole attack, which most people who've taken a course in security might know. The watering hole attack is basically... sorry. Right, the watering hole. Sorry, I made a false analogy there; that was for informed targeting, but I'll continue. A watering hole attack is where you set up a watering hole and wait for the user to come, because you know that these users are going to be frequenting these areas. Similarly, with informed targeting, you know that user X visits Facebook, Twitter, and maybe a bunch of other sites. Finding that out can also be done through open source intelligence. Speculative targeting is where you decide that you want to access information about, let's say, people who get their groceries online, and that will give you a very wide demographic. So, common flaws. The next slide covers the set of common flaws that usually persist in web applications and can allow for this sort of extrapolation to occur. Obviously, this is not an exhaustive list; it's just what I found from my experience of discovering these flaws. First, there are overly permissive APIs, which means there's really no checking whether you're authenticated to access a certain resource.
And then there's improper access controls and implicit user trust, which is closely related to the first. The third one is oversight in deprecating API functionality. What that means is, when you're updating applications, you will create a new API to serve the functions you want, but you will forget to take the previous one offline. Or you may have made a business choice that says, okay, there may be people with older phones who can't update to the newer version of the app, so let's keep the API online. That can sometimes mean that security bugs that have been patched can still be exploited. In some cases, it's as easy as noting that api_v3 is being called and just changing it to api_v1. And then there's the use of insecure account or resource identifiers. This means cryptographically insecure identifiers, as in, say, when they're sequential, like 123, 124, 125, and so on, but also identifiers which have a particular level of importance in one context and not necessarily in another. I'll cover that as well. To get into the first example, this is a bug that was found on PayU by Srikanth Lakshmanan. What it basically allows is this: you know the email address of someone who has ever used the PayU service in the past, and PayU is quite a large gateway, so if you have carried out an online transaction, there's a huge chance you have at some point used PayU to process your details. If that person used a credit or debit card for that transaction and ended up not unticking the save-my-card option, you could pull the last four digits of their card and the type of card. In this case, it's American Express. And you could pull that just by entering their email address.
And that itself may be a little problematic, because a card number is only 16 digits, and you may be able to get the rest by other means. American Express cards only start with certain digits, and if you know which bank issued the card, BIN numbers, which are publicly available, are also quite a good way to figure out what the first four to six digits are. In any case, that's not all. Once you get to this page, you have the option of paying now. While you're only able to see the last four digits, the full card is actually stored, so when you click on pay now, it will carry on to the transaction. You enter the wrong PIN, I think three times, and the card is blocked for the original user. So that becomes a problem there. How this actually happens is what I was mentioning earlier, the use of insecure account identifiers. An email is, as we know, not an insecure account identifier in most cases, because it usually isn't tied to your payment information in such a straightforward way. But then you have to consider that if someone can just enter the email address in the checkout flow, and there is no verification done, that is an example where email is probably not the only thing that should be used. And again, you didn't have to be signed up to PayU for this to happen; you just had to click remember my card. To actually remove your card from this page, you had to sign up for an account, and that was a caveat. So the problem here is that the website assumes trust just from knowledge: okay, this person clearly knows the email address, and that means they are who they claim to be. That is one clear fault there.
And the other one is that if something like an email is being used as an identifier to give you this information, then on top of that, it should also say, okay, we're going to send a confirmation to that email. But then that becomes a trade-off between convenience and security, which is what ails a lot of people. So the lessons, I guess, are what I just said: first, don't trade security for convenience, always offer security. And second, don't use context-agnostic, I would say, identifiers, since email can mean different things in different contexts. Okay, now this flaw is different from the last one and the one that will follow, because it's not entirely a web application security flaw, although it does start off as such. It's similar to the previous one on PayU, which allowed you to find the last four digits. This one requires a bit more work, but the payoff is also quite a bit better than what you could get with PayU. As you know, PayPal is a well-known, I think global, payment website. It allows you to send and receive money, carry out transactions, and whatnot. If you go to the forgot password page on PayPal and enter someone's email address, so the attacker would need to know the email address of whoever they're trying to attack, you'll be presented with the last two digits of the payment method and the payment method type. From there, what an attacker would do is call the customer support line, which will have you interact with the interactive voice response system. The IVR of PayPal is the automated voice, like the one you hear every time you call Airtel or whatnot. And so you have the chance to interact with that IVR.
And what you could do then is this: the IVR will ask you to enter the last four digits to be granted access to the account, or, I should say, to be able to talk to a customer support representative. Since you already know the last two digits, a side effect is that you need at most a hundred tries to get to the last four digits. And I actually tried it. It takes around 35 minutes, because each phone call takes a certain amount of time. In about 35 minutes, just from knowing their email address, you can find out how much money any given person has, who they're sending money to, and their last transactions, whether they received money or sent it out. This signifies how one piece of information can be used to glean so much more, and it allows for very easy financial profiling of whoever is being targeted. There are multiple contributing reasons for this issue to exist. One of them is the partial disclosure of the credit card itself, which is again a business decision that ends up affecting user security. The other contributing factors were that the IVR really had no way of limiting attempts, and maybe the last four digits of payment information shouldn't even have been used as a way to authenticate you to PayPal customer support. The lack of rate limiting, of course, contributes, because you just call, hang up, call again, and that's that. So the takeaways from this would be that context does have meaning. And if, apart from your web application, you have a side channel, say an SMS service which allows your users to change information about their account, the level of security you offer should be consistent.
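As a rough illustration of the missing attempt limit described above, the sketch below counts failed verification attempts per account rather than per phone call or web session, so that hanging up and calling again does not reset the count. It is a minimal in-memory sketch in Python, not a description of how PayPal's systems work; `AttemptLimiter`, `verify_last_four`, and the thresholds are hypothetical names and values chosen for illustration.

```python
import hmac
import time
from collections import defaultdict, deque


class AttemptLimiter:
    """Tracks failed verification attempts per account (hypothetical helper)."""

    def __init__(self, max_failures=5, window_seconds=3600):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # account_id -> timestamps of recent failed attempts
        self._failures = defaultdict(deque)

    def _prune(self, account_id, now):
        # Forget failures that are older than the sliding window.
        failures = self._failures[account_id]
        while failures and now - failures[0] >= self.window_seconds:
            failures.popleft()

    def is_locked(self, account_id, now=None):
        now = time.time() if now is None else now
        self._prune(account_id, now)
        return len(self._failures[account_id]) >= self.max_failures

    def record_failure(self, account_id, now=None):
        now = time.time() if now is None else now
        self._prune(account_id, now)
        self._failures[account_id].append(now)


def verify_last_four(limiter, account_id, supplied, actual, now=None):
    """Return True only if the account is not locked and the digits match."""
    if limiter.is_locked(account_id, now):
        return False  # refuse even a correct guess while locked
    if hmac.compare_digest(supplied, actual):
        return True
    limiter.record_failure(account_id, now)
    return False
```

Because the counter is keyed on the account, the same limiter could sit behind the website, the IVR, and any SMS channel, which is one way of keeping the level of security consistent across channels.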
And it's primarily around what is used for authenticating users that a lot of the pain occurs. For users, the advice would be, whenever possible, for whatever service you're signing up on, avoid providing information that can be correlated across services. Say Facebook has your number and so does Instagram: even though they're owned by the same company, they have different backends, which may have different vulnerabilities. For instance, if you forgot to turn off one setting on your Facebook account, someone who has your phone number in their contacts list can just use it to get to your Facebook account. There are some issues with that, but let's move forward. I'm just going to hurry over this slide. This was a bug affecting the American internet service provider Comcast. This is actually the Wi-Fi provisioning page for Comcast, which was meant to be used by legitimate customers doing the first-time setup for their router. The problem is that this link can be used over and over again. All an attacker had to do was submit an account ID and a partial home address, which I'll get to later, like a street or house number or even just the PIN code. That would allow them to access that particular user's wireless network SSID and password, which is quite bizarre, because this is a remote attack. You don't need to launch any actual attacks on the wireless network protocol itself. And there's more information for the attacker: even if they only know part of the home address, they get to know all of it, as well as the gateway IP address for the router. Right.
So, yeah, this is obviously a screenshot of someone else's Wi-Fi name and password that I was able to access. Sorry, that the journalist covering this was able to access through the portal. Again, the information an attacker would require for this is just the account ID and a very tiny bit of the home address. I was actually unable to figure out how little is needed: say your address is 123 Hackaway Lane, whether you would need to enter only "Lane" and would get to know the whole address. But from what I was able to test, putting 123 as the address would work for 123 Hackaway Lane. And the account ID for an internet service provider can be found on the bills or the emails they send you. Someone can shoulder surf and be like, okay, well, I have this person's account ID, or just gather it from another place where the context is not so damaging as in this case. Because in this case, if someone knows your account ID, they can repeatedly keep getting your wireless network SSID and password, unless of course you switch your router to something not provided by this ISP. Another issue is that user input wasn't properly verified. The website just assumed that since I entered 123, I meant to enter 123 and whatever comes next. That's sort of like an autocomplete, but in this case it allows something drastic to happen, because the attacker then actually gets the full address as well. The lesson learned from this, I guess, would be: don't toss your bills in the trash, because the hacker's gonna go find them and hack your router. And yeah, defense is what we're going to be covering now.
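The input-verification problem described for the provisioning page can be sketched in code. The example below is a minimal Python sketch under assumed names (`normalize_address`, `address_matches`, and `lookup_response` are hypothetical, not part of any real ISP portal). It accepts an address only when the whole normalized string matches the one on file, and it returns only a yes/no result instead of echoing the stored address back.

```python
import hmac
import re


def normalize_address(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(cleaned.split())


def address_matches(supplied, on_file):
    """Accept only a complete address, never a prefix or fragment of it."""
    a = normalize_address(supplied).encode()
    b = normalize_address(on_file).encode()
    return hmac.compare_digest(a, b)


def lookup_response(supplied, on_file):
    """Return a yes/no result only; the stored address is never sent back."""
    return {"verified": address_matches(supplied, on_file)}
```

With this kind of check, entering just 123 for 123 Hackaway Lane is rejected, and a failed attempt teaches the requester nothing new about the address.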
So, as a developer, when you're making a web application or service, you have to ask: where are all the points where an attacker can interact with any sort of user information, even information about themselves? That will help you create first a mind map, then an actual map, of everywhere any data transaction occurs. Questions you may need to ask yourself could be: does your application make use of identifiers which have different levels of sensitivity in different contexts? That can be an email ID or account ID and whatnot. It could also be a phone number, because I don't think phone numbers are quite private anymore in 2019. With applications like Truecaller, even if you don't have it on your phone, someone you know may have it on theirs, and then automatically your details are made public too, which is quite a bad thing. So you have to really analyze your application this way: okay, if I make this for 10 people now, that will be fine, because 10 people is a very small user base, but assume a month from now there are going to be 100,000 people using this application. How many of those 100,000 people will be vulnerable to something new, which wouldn't be the case if they weren't members of my application? Those are some questions you can ask yourself, along with addressing any scenario where these identifiers can end up putting users at risk. And of course there are multiple other issues which can allow for this. Take a flaw like IDOR, sorry, insecure direct object reference, where you call an object by, usually, a numeric identifier. That can also allow user information to be extrapolated.
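One hedged sketch of a defense against insecure direct object references combines two ideas from the talk: identifiers that are not sequential, and a check that the requester actually owns the object being requested. `BookingStore` and its methods are hypothetical, and the in-memory dictionary only stands in for a real database.

```python
import secrets


class BookingStore:
    """Toy in-memory store; a stand-in for a real database (hypothetical)."""

    def __init__(self):
        self._bookings = {}

    def create(self, owner_id, details):
        # Random, non-sequential identifier, so IDs cannot be enumerated
        # by counting up from a known value.
        booking_id = secrets.token_urlsafe(16)
        self._bookings[booking_id] = {"owner_id": owner_id, "details": details}
        return booking_id

    def get_for_user(self, booking_id, requester_id):
        """Return the booking details only if the requester owns the booking."""
        booking = self._bookings.get(booking_id)
        if booking is None or booking["owner_id"] != requester_id:
            # Same answer for "does not exist" and "not yours",
            # so the response does not confirm which IDs are valid.
            return None
        return booking["details"]
```

The random identifier only makes enumeration harder; the ownership check is what actually enforces access, so the two are meant to be used together.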
And ultimately, there are resources online like OWASP, the Open Web Application Security Project, which catalogue a lot of the flaws that usually end up causing information to be leaked or disclosed without authorization. Follow those practices, along with keeping in mind that, ultimately, you have to look out for your users. It's not going to be the case that users adapt, that they sign up using a fake email address and a fake phone number just because your website may potentially get hacked. So the onus falls on the developers again. Even so, users do have ways they can protect themselves, but a question a lot of people ask is: why would anyone want my data specifically anyway? I would say that's not really a question you should ask. It's similar to: I have nothing to hide, so why should I care about privacy? Someone may not be looking to attack you directly, or someone actually may be, because they may have information that this person has a very high bank balance. Or you may just end up as collateral damage, because people quite often target by demographic: if you fall under a certain set of categories, you are likely to be targeted. So that becomes important.
And then you have to really embrace privacy-enhancing and privacy-protecting technologies. It's not even technology; it's just keeping in your head: okay, if I'm going to be signing up on this website, what information am I really providing? Adopt the mindset that if I provide this website with certain information, and that website with certain information, it can be gleaned and correlated for intelligence purposes. That will ultimately help you not end up in a database breach somewhere on the internet. Good operational security goes a long way. Again, that depends on your threat model. If you're a person who's paranoid about privacy, then you will want to be very religious about protecting it. That may include getting a new phone number just to sign up on websites, but that's something someone who's very paranoid about security and privacy would do. It does depend on your threat model. If you have reason to believe you're going to get attacked, you may as well adopt it, or if you just want to err on the side of caution, you can do that as well. There were a few more slides, but I think this is not the final document that I had. In any case, that is pretty much it. Questions, if there are any? I'll be putting up the resources for this presentation on my GitHub. I think my GitHub is, okay, even my GitHub is limited from here, but you can follow me on Twitter. I will put up the resources for everything I cited in these slides, on how to dive in as a person looking to do offensive security and also, as a developer, how to build applications that are privacy conscious. And yep. Any questions?
So, security laws like GDPR only cover data security in a certain sense. If, as someone working in intelligence, you are able to extract all this kind of information from different ends, that may not really break a specific provision of the GDPR. Even in India, we have a draft for the data protection law, which I think is open for comments again, because they've brought out a new draft. So if you care about data security and privacy, I'd urge you to give comments on that. Sorry, that was an aside. Yeah. Any other questions? Yeah. Uh, thanks for giving us a detailed view on offensive security.