Hello, everyone. I'd like to welcome you to my talk today, Using Passive DNS for Gathering Business Intelligence, or how to snoop on what your competitors are up to. I'm going to start with a quick introduction to the talk. In this talk, we will look at passive DNS, what it is, and a little history; some of the services available to query passive DNS databases; how it can be used for gathering business intelligence; and finally, how you can mitigate business intelligence leakage. So a little bit about me. My name is Andy Dennis. I'm Director of Security at Modus Create, a DC area consultancy. We work across multiple fields: software engineering, cloud architecture, cyber security. If it's an area of technology, we've probably worked on it over the past 10 years. You can learn more about us at www.moduscreate.com. I have 20 years of experience working across software development, architecture, management, DevOps, cloud and cyber security. And in my spare time outside of work, I'm an organizer with BSides Connecticut. You can visit our website at bsidesct.org. We're hoping to hold the conference this November. Hopefully it will be a hybrid event, but depending on the COVID situation, we may have to revert to a virtual-only event this year. If you're interested in doing something in November, then please check our website out. I'm also the author of several books on IoT and the Raspberry Pi, and I recently co-authored a book on Docker, called Docker for Developers, with two of my co-workers, Richard Bullington-McGuire and Mike Schwartz. So please check that out if you're interested in the subject of containers. So the genesis of this talk, how did this come about? Really, the question is, how do our customers prevent snoopers gathering business intelligence? This question was a byproduct of conversations we had back in January 2020, while poking around in RiskIQ over some coffee, and then a brainstorming session over beers later in the evening.
We wanted to understand, based upon the data we saw in RiskIQ, how do we avoid leaking sensitive information about our company? And secondly, how do we plug these gaps? Is it policy, process, technology, or a combination of them? We were also interested in the best approach for helping customers keep new products under wraps before launch. How do they prevent competitors from poking around in tools like RiskIQ and finding out sensitive information about new products they're developing, or perhaps customers they're working with? And finally, we were curious: what are our competitors doing? And if we can see what they're doing, can they come and see what we're doing? Of course, the answer to that is yes. If we've left information available which is picked up by passive DNS tools, then they can see what we're doing. So let's just do a quick introduction to passive DNS. DNS, which stands for Domain Name System, maps domain names to IP addresses, and it acts like a phone book for the internet. Passive DNS is a mechanism to store DNS resolution data, which is essentially historical data, in a form that can then be queried. So you can go and reference past DNS entries within these databases and gather the information that you need. There's a variety of services out there that offer passive DNS, including RiskIQ and a few others that we will briefly touch on later in the talk. Passive DNS is useful for tracking down security incidents and rogue infrastructure. So for example, individuals in the team may have spun up servers, registered subdomains in Route 53 and pointed them at this infrastructure, and then it sits there for a few weeks and gets spun down. You then want to gather information around, okay, when did this happen? Why was this done? Going into a tool like RiskIQ and having a look at these subdomains, which we'll see shortly, can provide information around that.
That can be quite useful for tracking down when rogue infrastructure was spun up and then asking the important question of why. It's also great for tracking down misconfigurations and legacy, forgotten infrastructure. This is especially true when you have mergers and acquisitions. You sometimes end up with multiple different cloud systems with different services running in them, and being able to go through and have a look at the subdomains associated with the set of domains that the company you've merged with owned is a great way of tracking down some of this forgotten infrastructure that is perhaps sitting out there collecting dust and costing you money. We can use passive DNS to see when domain records have changed, for example, when the A record changed to a different IP address. It basically works by capturing DNS records and storing them in a database for reference. At its heart, it's a historical source of information. Talking of history, how did passive DNS become a thing? Prior to 2005, DNS history wasn't stored and thus was lost. Whether the intelligence services had some mechanism by which they collected it is, I guess, open to speculation. But the status quo at the time was that it was essentially lost, and this led Florian Weimer to invent passive DNS. I believe Florian Weimer is actually an engineer at Red Hat now. Development started in 2004, and passive DNS was first introduced to the community at the FIRST conference. You can actually go and read the original presentation at the link I've provided here if you're interested, and there are some more of the technical details in there. So if that's your thing, please take a look at these slides after the talk and go ahead and have a read, because you'll probably find it interesting. The paper introduces the idea of the DNS logger, and we'll just quickly touch on that and the technical details.
We don't really have time to go into a lot of detail, but I think a quick high-level overview is interesting. So passive DNS is based on the DNS logger architecture, which was proposed by Florian Weimer. This architecture consists of sensors which send data to the DNS logger host. The original sensor concepts were written in Perl, and I'm sure the source code's probably still out there on the internet if you want to go check it out. There's an analyzer module which inspects the data, and then a collector module which collects it, and then it's stored within a database. I believe the original one used Sleepycat's Berkeley DB, and I'm sure there are probably examples of that still out there if you're interested in checking them out. Then there's a WHOIS server connected to the database, and WHOIS clients. Clients can gather the information they're interested in by querying the database via the WHOIS server. So it's quite a simple mechanism on the surface. Since that original concept, there are now many companies out there offering passive DNS. That includes Cisco Umbrella, RiskIQ, which we'll look at shortly, SecurityTrails, and CIRCL Passive DNS. I believe that last one is an API and you have to get permission to query it, but effectively you can query the CIRCL Passive DNS API and gather the historical data that you're interested in if you have access to it. Spamhaus also offers a passive DNS service. Some of these offer community editions, or they offer free trials. So let's take a look at those free accounts. There are a variety of free and trial account services out there. This is by no means an exhaustive list. I'm sure there are tools that some of you use and prefer, or other products that I've missed off here which you wish I had included in the list.
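Going back to the DNS logger architecture for a moment, a minimal sketch of the collector and database idea might look like the following in Python. This is not Florian Weimer's original code (that used Perl sensors and Berkeley DB); the class names `Observation` and `PassiveDNSStore`, the method names, and the in-memory dictionary are all assumptions made purely for illustration. It shows the core idea: each observed answer is stored with first-seen and last-seen timestamps so that the history of a name can be queried later.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Observation:
    """One (name, type, value) tuple plus when it was first and last seen."""
    rrname: str
    rrtype: str
    rdata: str
    first_seen: datetime
    last_seen: datetime
    count: int = 1


class PassiveDNSStore:
    """Toy in-memory stand-in for the collector and database (hypothetical)."""

    def __init__(self):
        self._records = {}

    def record(self, rrname, rrtype, rdata, seen_at):
        """Called by a (hypothetical) sensor for each DNS answer it observes."""
        key = (rrname.lower().rstrip("."), rrtype.upper(), rdata)
        obs = self._records.get(key)
        if obs is None:
            self._records[key] = Observation(key[0], key[1], rdata, seen_at, seen_at)
        else:
            # Same answer seen again: widen the time window and bump the counter.
            obs.first_seen = min(obs.first_seen, seen_at)
            obs.last_seen = max(obs.last_seen, seen_at)
            obs.count += 1

    def history(self, rrname, rrtype="A"):
        """Return every value the name has resolved to, oldest first."""
        name = rrname.lower().rstrip(".")
        hits = [o for o in self._records.values()
                if o.rrname == name and o.rrtype == rrtype.upper()]
        return sorted(hits, key=lambda o: o.first_seen)

    def subdomains(self, domain):
        """Return every observed name underneath the given domain."""
        suffix = "." + domain.lower().rstrip(".")
        return sorted({o.rrname for o in self._records.values()
                       if o.rrname.endswith(suffix)})
```

Calling `history("www.example.com")` on a store like this is what lets you see when an A record moved to a different IP address, and `subdomains("example.com")` is the kind of lookup the rest of this talk relies on. The commercial services are of course far more involved than this.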
But if you do a Google search after this talk or during it, I'm sure you'll find something that I've missed which is perhaps even better than what's listed here. There's the RiskIQ community edition, which is great because it's free. Then we have SecurityTrails and Spamhaus, and most of you have probably heard of Spamhaus from their anti-spam work. They have a passive DNS service. I believe SecurityTrails and Spamhaus are trial accounts. I've not really explored them, but go sign up and kick the tires, and then you can see what you think. The RiskIQ community edition is effectively free. It just has restricted services, and you have to pay extra to use the other features that you might be interested in. You can, of course, also use archive.org, with its Wayback Machine, to browse historical snapshots of websites. So you could go back and look at a particular website at a particular point in time, if a snapshot was taken, and see what it looked like back then. So it's a nice complement to passive DNS. In the upcoming demo, we'll use the community edition of RiskIQ. If you want to sign up for an account to follow along, use the link in the first bullet point up here, community.riskiq.com, and then you can start to kick the tires and have a look while we talk about some examples. So, gathering information on competitors. Let's look at some of the privacy concerns that you should be worried about. What do subdomains tell us? Subdomains can actually tell us a lot about what a competitor's up to and what they're running as infrastructure or as products. A cursory look at DNS records in something like RiskIQ can reveal information such as what monitoring tools are in place, for example whether somebody's using Grafana or Prometheus; CMS products such as Drupal or WordPress; and on-prem hosted solutions, for example source control. Someone might be using GitLab on-prem, and you can gather that information from looking at the passive DNS records.
Then there are client demo sites hosted on subdomains. So for example, they may have built a demo for a particular customer that they're trying to sell a new feature to, and that information is effectively available from the subdomains listed in RiskIQ. There are also new products being tested on subdomains. This could be new products that they are selling themselves, or perhaps new products that they have purchased from someone else and are running, so that kind of information can be quite useful from a business perspective. You might be able to tell if they're planning on launching something new. If you're a malicious actor, seeing that someone's running a particular service or product on a subdomain, such as the monitoring tools or CMSs I mentioned before, can obviously be useful if you're planning on doing something malicious. And email services and mail servers are often listed in there too. So from an information security perspective, these are the types of areas you want to consider protecting, right? Because if a malicious actor can go in and see these, then they can target them. So apart from all the usual things, patching and so on and so forth, and making sure that you've got suitable security tooling in place, you might want to use some of the techniques mentioned later in here to obfuscate the fact that you're running that particular tool. When it comes to competitors, we're interested in what customers they have. How can you find out which sales deals, for example, a competitor won? We can actually use the information from the subdomains to discover which sales deals a competitor has likely won, and which companies they might be working with or might be pitching to. So for example, take this first URL here. We can see customer1.clients.example.com. This tells us that this particular target, example.com, likes to list the people that they have sold products to under this .clients.example.com subdomain.
So for example, they might have customer1, customer2, customer3, customer4, customer5 and so on. You can gather their client list fairly quickly by looking at the subdomains if they use this format. So that's an immediate piece of business information that's being leaked out to the public internet. We can also see perhaps what the sales team are planning on doing, or who they've been running demos for. In this example, we have potentialcustomer.sales.example.com. Once again, if example.com lists all of their demos under the .sales.example.com subdomain, then we can very quickly enumerate a list of the people that they're targeting. And we can use that information to perhaps try and get ourselves in the door there, or beat them on an RFP or similar. So these are the types of simple business information that suddenly become readily available when you start to poke around in these passive DNS records with a tool like RiskIQ. These are things to keep in mind, and we're definitely going to look at some examples in a moment of where a customer1.clients.example.com style name existed. And then we'll talk about some of the mitigation techniques. Next, competitors' secret products. Apart from who they're working with and who they're targeting to sell products to, perhaps we can gather information about what products they are planning to release in the near future, where they have leaked that information by accident onto the internet. So for example, let's say we're looking in RiskIQ at example.com, and we find that this particular target has decided that their staging and development servers are going to be listed with, in front of them, the particular product that is running on that server or set of infrastructure. In this particular example, we've got newiotdevice.development.example.com. We can imagine you might also have newiotdevice.staging.example.com or newiotdevice.uat.example.com.
But basically what I'm getting at here is that we can go in and query and see this newiotdevice, which could be a product name. For example, if you're building a new car, or developing a new fan, a new monitor, a laptop or something like that, then productname.staging.example.com or productname.development.example.com is really advertising that this thing exists. Now if your competitors see this, and they've been gathering other open source information, or perhaps not so open source information, they can put together a picture of the upcoming product and maybe even get the brand name in advance. So that's the kind of thing you might want to think about and keep under wraps, because you don't want to accidentally leak that information before you're ready to actually pull the trigger on going live and share the product name and the branding with the rest of the world. If you consistently use subdomains for new products, then competitors gathering BI can simply monitor passive DNS records, and over time they will start to get a picture: every time you're releasing a new product, it appears under this particular subdomain, perhaps mynewproduct.staging.example.com. They could use that to their advantage. It's also very easy for those gathering business intelligence to open a free account on RiskIQ or similar and gather this information without your knowledge. As some of you may have done while we've been talking, you can create a Gmail account, then create a RiskIQ account or a SecurityTrails account or one for one of the other services, and then use that account to go poking around a bunch of domains essentially anonymously, seeing what's running under them in the subdomain list and what was there historically. So with that in mind, you should probably think about some mitigation techniques, but before we get to those, I mentioned we're going to look at a real-world example.
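As a rough illustration of how little effort this kind of enumeration takes, here is a small Python sketch that pulls names out of a subdomain list when a target uses the name.group.domain pattern described above. The function name `leaked_names`, the `GROUPING_LABELS` set, and the sample hostnames are assumptions for illustration only; this is not the API of RiskIQ or any other service, and the input is simply a list of hostnames you already have. It works just as well pointed at your own domains as an audit of what you might be leaking.

```python
from collections import defaultdict

# Labels that often group customers, prospects, or products (an assumed list).
GROUPING_LABELS = {"clients", "sales", "staging", "development", "uat"}


def leaked_names(hostnames, domain, grouping_labels=GROUPING_LABELS):
    """Map each grouping label to the names found directly beneath it."""
    suffix = "." + domain.lower().rstrip(".")
    found = defaultdict(set)
    for host in hostnames:
        host = host.lower().rstrip(".")
        if not host.endswith(suffix):
            continue  # Not under the domain we are auditing.
        labels = host[: -len(suffix)].split(".")
        # Match the pattern <name>.<grouping label>.<domain>.
        if len(labels) >= 2 and labels[-1] in grouping_labels:
            found[labels[-1]].add(labels[-2])
    return {group: sorted(names) for group, names in found.items()}


if __name__ == "__main__":
    observed = [
        "customer1.clients.example.com",
        "customer2.clients.example.com",
        "potentialcustomer.sales.example.com",
        "newiotdevice.development.example.com",
        "www.example.com",
    ]
    for group, names in leaked_names(observed, "example.com").items():
        print(group, names)
```

If a run like this against your own domain prints anything you would not be happy to see in a competitor's notes, that is a good candidate for the mitigation techniques later in the talk.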
So we're actually going to look at a real-world example that was a bit of a scandal, because I think that just makes it a bit more interesting. So, RiskIQ: let's look at an example in order to explore the tool, and the one we're going to use is Cambridge Analytica. Some of you will be familiar with the Cambridge Analytica scandal from a few years back, and we're going to go poke around in RiskIQ, look at cambridgeanalytica.org, and see what we can find. So I'm just going to switch over to the web browser, and here we go. I'm logged into RiskIQ, and for those of you who are following along and creating an account, you can go in and run a query against cambridgeanalytica.org. Once you've logged in and run a query, you can see a bunch of tabs under here: resolutions, WHOIS, certificates, subdomains, trackers, components, and so on and so forth. We won't review them all, but you can have an explore. Depending on whether you have a free account or a paid account, you're going to get access to different services. I believe some of the services are available for a certain amount of time for free, just to give you a chance to take a look at them and see if you're interested before they encourage you to pay for the full version. What we're interested in is the subdomains. If we take a look through these, we can see some pretty interesting information right off the bat. We can see that, as I mentioned, some companies like to use this clients subdomain. So we see clients.cambridgeanalytica.org. Immediately, that says to me that perhaps we're going to find something underneath here, such as client1.clients.cambridgeanalytica.org. We can start to enumerate who they were working with, or perhaps who they were attempting to sell services to. Other technical information about their infrastructure is also available. It appears that they had cPanel. And when we look down here, we can see that perhaps Kubernetes was running.
I mentioned mail servers before, and we can see those. We can see an nginx ingress point to Kubernetes. And here's a really interesting one. We've got clients.cambridgeanalytica.org and we've got nyulangone. So this looks like it's potentially a client, or somebody that Cambridge Analytica was attempting to sell services to. In a moment, we're going to go back and look at some research, just to tie these two things together and see: was this right? Was this actually a client of Cambridge Analytica, or someone they attempted to sell something to, or was this just a coincidence? So let's just take a look and see if perhaps anything's still running at nyulangone.clients.cambridgeanalytica.org. And no, there's nothing there. But cambridgeanalytica.org is still up, which I was very surprised about. Looking at this, though, I'm not entirely sure who's managing the site. It does look a little suspect. I don't think this graphic here is necessarily something that the Cambridge Analytica team would have used as an advert. And the more we look down, you can see those broken image links, I don't think some of these hyperlinks work, and I'd be cautious before clicking on too much on here. So I'm not sure if someone's come along and purchased the domain name after they abandoned it and is now running something else behind it, but it's quite interesting to see that this is still up. So let's take a look at the slides again. I mentioned Cambridge Analytica, and we saw that subdomain there. Security researchers were able to probe active and passive DNS records back when Cambridge Analytica was in business, which I don't think they are anymore, and gather quite a bit of information about them, including these customers and sales targets. So we saw nyulangone.clients.cambridgeanalytica.org. In fact, if we do a little bit of research, we can see an NBC News article from back in 2018.
And within this article, they actually discuss how there was some consternation between NYU Langone and Cambridge Analytica. Whether they were technically clients or whether they were being pitched services, there were certainly some problems there, and it blew up in the press. So I think this is an example where your product name or your company name being associated with another company, perhaps something you wanted to keep under wraps, has ended up generating some negative press. That's a separate point which I'm going to touch on. It's closely related to this whole don't-use-your-client-name-as-a-subdomain idea, but there's also the other problem: do you have control over what other companies are doing when you're purchasing services from them and there's a scandal associated with them? How do you avoid getting caught up in it and ending up with NBC reporters knocking on your door? So that was just a quick real-world example, and you can do some more digging. Unfortunately, we don't have enough time to do that, but I think it's quite a fun exercise. So, mitigation techniques. What can we do to plug the leak? How can we avoid being a victim of someone stealing information about our upcoming products? How can we protect our customers by not accidentally leaking information about their upcoming products when we're working with them? How can we not leak our sales list, and so on and so forth? So, obfuscating subdomains. I like this little quote here at the top. My co-worker Richard had used this somewhere, and I borrowed it and rewrote it slightly, but he said: remember, resolutions from requests end up in public DNS even if the hosts they refer to are private or firewalled. There are a number of techniques we can use to ensure we don't leak information. The first, sort of obvious one is: don't use the client.sales.example.com or customer.clients.example.com naming formats.
As you saw with the Cambridge Analytica website, if you put your customers out there as nyulangone.clients.cambridgeanalytica.org, you're exposing them to the press if something goes wrong. I'm not sure Cambridge Analytica realized that at the time. We'll let the journalists do their jobs there, but I think obviously there was quite a bit of embarrassment. Don't use client.sales.example.com either, because really you're giving away who you're planning on trying to sell services to. So these two methods should be avoided. They seem so obvious when you're setting things up: oh well, let's just group all of our customers under clients.example.com. But of course you've then accidentally given away a load of really useful information to competitors. Obviously you can put sensitive things, such as your proof-of-concept websites for products and services, behind a VPN and use internal, meaningless domain names, for example cabbage.example.internal. Use private zones and avoid public DNS. If you keep everything locked down internally and give it obscure names that only people internally know about, then that can help immensely. If you do need it to be publicly accessible, you can obfuscate subdomains. In this particular example we've got chicken.egg.example.com. We could use a mapping scheme on the back end that only people internally know, and of course the customer itself if they had to go visit, say, a UAT site, where egg is the equivalent of clients and chicken is the equivalent of the client's name, and you could use any number of fowl to represent whichever customer you need to. In this case, chicken.egg.example.com could be customer.clients.mywebsite.com, and really only your team knows about that. Another thing you can use is an anagram generator to create subdomains. Thanks to my colleague Richard Bullington-McGuire for this one, whose quote I stole at the top there.
The anagram server at wordsmith.org/anagram is a great tool. You can basically put in a word and use that as a seed. So come up with an initial phrase, use that as the seed, and from there create a set of random-looking subdomains and sub-subdomains, which are then used to host infrastructure that is not production infrastructure but perhaps has to be available publicly for some reason or other, even if it's locked down behind passwords and a login mechanism. We've done this in the past ourselves when working with customers who had a product that needed a staging or a UAT website. We'll use a tool like this to come up with a set of random names that are really meaningless to anyone who's poking around. But on the back end there's a mapping, like the chicken.egg.example.com one I mentioned before, and that mapping is obviously very useful to the engineering team and to the business unit that's working with this particular product before it goes live, but anyone else poking around is just not going to know what it is. And you can use that for infrastructure as well. We saw cPanel, and we saw Kubernetes, and we saw mail and some other tools. It may make sense to use some random names and subdomains just to obfuscate things and make general reconnaissance of your site more difficult. Really what that means is that anyone poking around may take that link and go explore it, but if they're then blocked by whatever mechanism you're using for authentication and authorization, it's going to reduce the amount of information they're getting, especially if the login screen is particularly nondescript and doesn't give away what the actual service is that's running. It might be overkill for some of you, but it's something worth considering anyway. Then finally, train your team to think about security first when it comes to business intelligence.
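The mapping scheme behind names like chicken.egg.example.com could be sketched in Python along the following lines. To be clear about the assumptions: this sketch swaps the anagram generator for random lowercase labels from Python's standard `secrets` module, and the class name `SubdomainObfuscator` and its methods are hypothetical, not something from the talk or from any tool. The point is only to show the shape of the idea: meaningless public labels, with the real names held in a private mapping that only your team can see.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase


class SubdomainObfuscator:
    """Keeps a private two-way mapping between real names and opaque labels."""

    def __init__(self, label_length=8):
        self.label_length = label_length
        self._to_opaque = {}  # real name -> opaque label
        self._to_real = {}    # opaque label -> real name

    def opaque_label(self, real_name):
        """Return a stable, meaningless label for a customer or product name."""
        real_name = real_name.lower()
        if real_name in self._to_opaque:
            return self._to_opaque[real_name]
        while True:
            label = "".join(secrets.choice(ALPHABET)
                            for _ in range(self.label_length))
            # Retry on the unlikely chance of a collision or an accidental match.
            if label not in self._to_real and label != real_name:
                break
        self._to_opaque[real_name] = label
        self._to_real[label] = real_name
        return label

    def hostname(self, name, group, domain):
        """Build <opaque name>.<opaque group>.<domain> for public DNS."""
        return f"{self.opaque_label(name)}.{self.opaque_label(group)}.{domain}"

    def reveal(self, label):
        """Internal-only lookup from an opaque label back to the real name."""
        return self._to_real.get(label)
```

In practice the mapping would need to live somewhere durable and access-controlled, such as an internal wiki or a secrets store, rather than in memory, and whether you prefer random strings or memorable words like chicken and egg is a usability trade-off for your own team.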
This isn't just a job for the engineers. It's for the whole of your development team: the architects, the engineers and the product owners. So really it's about getting together, coming up with a mapping scheme, and thinking about a security-first approach of baking in the obfuscation of the product name or the client list, so that when you're building sites out you're not leaking that information. Whilst security by obscurity is not the best approach to the vast majority of things, as a link in the chain it can be quite useful when it's used in the right fashion. So in conclusion, what did we learn? Passive DNS is useful for security researchers. It can be a great tool, if you're using it in the right fashion, for understanding what's going on with your own infrastructure, and if you're interested I'd recommend going and reading a bit more about the use of passive DNS for security. It's also, unfortunately, useful for competitors gathering business intelligence on you, so keep that in mind. As we've talked about throughout the last few slides, and as we saw with the example from RiskIQ, that information is all available to anyone who sets up an account, so it's best to avoid putting it in there in the first place. Think about what you're leaking through your DNS entries: as I mentioned, the customer list and sales demos. And of course, also think about whether you're purchasing services from a third party and they're including your name within a structure like yourname.clients.example.com, because that could be a problem. Also consider the fact that for some of your customers there may be legal reasons why you're not allowed to list them, so avoid doing that and avoid getting into hot water on your part. And if third parties are putting your name in and there is some sort of scandal, you want to try and avoid that if you can.
Come up with the policy and process to avoid what we've mentioned before. That was really the last point I mentioned on the previous slide: it takes a whole team to build a product, and it takes the whole team to defend it. Security shouldn't just be one team's responsibility; it belongs to the whole of the group that's responsible for a particular product or service. And educate your team. Watching this talk would obviously be helpful, and there are lots of great resources out there, probably a lot of them much better than my talk, on why you should use obfuscated subdomain names for products and on other techniques you can use to help protect yourself from leaking information, but educating your team is a great first step. So with that said, the Q&A session will be starting momentarily over on Discord. I'd like to thank all of the folks from the Recon team once again for putting this on this year, because I understand it's been a great challenge with COVID, having to run things online and not being able to do things in person. So thank you, you're doing a stellar job. And for anyone looking for a job, please check out moduscreate.com. That's my final plug today. We've got loads of open positions and we're always recruiting engineers, security folks and DevOps, so please go check it out. And with that said, thanks once again, and I look forward to speaking to you on Discord. Bye bye.