I'm Luke Young. I'm an undergraduate student at a big university that had nothing to do with this research. I'm a security engineer originally from Minnesota, currently working in the Bay Area. Say that again? Is this better? Sorry about that. I'm also the founder of Hydrant Labs LLC, which has graciously funded this research. It's funny how you can do that when you're the only employee. In case you guys didn't catch that, that means unlike a lot of the speakers out here, I'm not hiring, unless you want to work for a 19-year-old kid for minimum wage. If you do, there's my email address. Also, if you have any questions about the research, or you'd like to send me legal threats, or both, that's my email, or you can snail mail things to the WHOIS info on the project domain, which will be listed at the end.
So as usual, we're going to start with a quick rundown of what I'll be talking about today. I'm going to talk about what a bit flip is and the history of their exploitation. After that, I'll get into bit squatting, which is a specific type of bit flip exploitation. Finally, we'll move into my research on bit flips via bit squatting, and we'll finish things up with a complete code release and a partial data dump, followed by Q&A.
So what's a bit flip? A bit flip occurs when a bit flips from a 1 to a 0 or a 0 to a 1. It's a pretty simple concept. It can happen for a variety of reasons: heat, electrical problems, radioactive contamination, and cosmic rays, among others. I'm not going to focus very much on what causes a bit flip. I'm going to focus instead on how we can actually exploit them. However, we will take a quick detour into the history of bit flips. In 2003, a paper from Princeton University was published titled Using Memory Errors to Attack a Virtual Machine. In the study, they literally took a 50 watt light bulb, put it over a memory module to intentionally induce bit flips, and then used those flips to escape the JVM.
Since then, a variety of research has been done into bit squatting, which I'll get into a bit later. However, it wasn't until 2014 that a paper from CMU was published that investigated the use of DRAM flushing to intentionally induce bit flips. Many of you have probably heard of this by its more common name, Rowhammer. In 2015, Google's Project Zero team showed that it was possible to exploit a bit flip to actually gain kernel privileges.
Now what is bit squatting? I keep saying this word. Bit squatting is a name coined by Artem Dinaburg. It refers to a specific exploitation of bit flips: purchasing domains that are one bit away from the legitimate domain, in the hopes that a bit flip will occur and the user's traffic will be directed to your domain instead of the original intended domain. So let's take an example. We take the domain cnn.com. You can see the binary representation of it right there. If the last zero in the first n flips from a zero to a one, you can see it changes to con.com. Normally these domains aren't actually registered, so the request will fail silently and the user won't ever actually notice something happened. The idea with a bit squat is to purchase these domains instead.
Now how do you actually come up with all the possible bit squats? You can't just flip every single bit. So if we take a legitimate domain name, let's say www.defcon.org, and you look at the binary representation of one of the letters, E for example, we can throw away any bit flips that occur in the first bit, because in seven bit ASCII it's always going to be zero. And in most cases we can throw away any flips that occur in the third bit, depending on where you're indexing from. That's because this bit represents the case of the character, and in domain names case is irrelevant. Unfortunately, this doesn't leave us with six possible bit flips.
That's because domain names only have certain valid characters, primarily A through Z, zero through nine, and dashes. In the case of E, this gives us five possible flips: U, M, A, G, and D. Now I mentioned there were some exceptions to that rule with case. Sometimes, if a flip occurs in a character, it changes to a different kind of character. So, for example, an N can flip to a dot or vice versa, or a slash can flip to an O. So a flip can occur and the result will still be a valid domain. For example, a subdomain can change, so that www.defcon.org becomes wwwndefcon.org. This was actually researched at DEF CON 21 by Jaeson Schultz from Cisco. He explored the possibility of bit flips in the new gTLDs such as .exchange and .cloud. So using this information, we can find there's actually 43 possible top level registerable bit squats of www.defcon.org. I've written a tool called BF Lookup that will generate a list of all possible bit squats of a given domain. It will be linked on the project site, which I'll show at the end.
Now, I want to quickly note some of the previous bit squatting work before my research. During DEF CON 19, Artem Dinaburg first coined the term bit squatting, and more recently Robert Stucke and Jaeson Schultz conducted more extensive research into bit squatting, which was actually the inspiration behind Project Bitflip.
What is Project Bitflip? Project Bitflip is my research into how you actually exploit a bit squat. It makes a lot of sense how a bit squat can occur and how a request potentially gets sent to the wrong person, but how do you actually exploit that to pivot somewhere? So let's take an example. Let's imagine an internal site, say a corporate wiki or continuous integration system, that includes jQuery from the jQuery CDN, code.jquery.com. Since your browser is loading this page for the first time, it doesn't know who code.jquery.com is, so it's going to initiate a request to the user's DNS resolver.
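Going back to the enumeration step for a moment, the filtering rules described above (ignore the always-zero top bit, ignore case changes, keep only characters that are valid in a hostname) can be sketched in a few lines of Python. This is a minimal illustration and not the BF Lookup tool; the names `VALID`, `char_flips`, and `bitsquats` are made up for this example, and the output is a raw candidate list that has not been checked against TLD rules or registrar availability.

```python
import string

# Characters the talk lists as valid in a domain name: a-z, 0-9 and dashes.
VALID = set(string.ascii_lowercase + string.digits + "-")


def char_flips(ch):
    """Return the valid hostname characters exactly one bit away from ch."""
    flips = []
    for bit in range(8):
        flipped = chr(ord(ch) ^ (1 << bit))
        # Case changes (bit 0x20), the top bit and punctuation all fall
        # outside VALID, so this single membership test drops them.
        if flipped in VALID:
            flips.append(flipped)
    return flips


def bitsquats(domain):
    """Return every candidate string one bit flip away from domain."""
    domain = domain.lower()
    candidates = set()
    for i, ch in enumerate(domain):
        # Note: "." (0x2e) is one bit away from "n" (0x6e), so dots are
        # handled here too and merge two labels into one.
        for flipped in char_flips(ch):
            candidates.add(domain[:i] + flipped + domain[i + 1:])
    return candidates


if __name__ == "__main__":
    print(sorted(char_flips("e")))
    print("con.com" in bitsquats("cnn.com"))
```

Some candidates are not usable as they stand: flipping the final dot removes the TLD, and a flip can leave a dash at the start or end of a label. A real tool would need to filter those out before counting registerable names.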
That resolver would be Google DNS, Comcast, OpenDNS, et cetera. Now, suddenly a bit flip occurs. code.jquery.com has become code.jquesy.com instead. This could have happened in memory on the device running the browser. It could have happened in the memory of the NIC. It could have happened in transit when a checksum was recalculated, or even in the memory of the DNS resolver itself. Now, the DNS resolver, if we assume this is a cold request, doesn't actually know who's in charge of jquesy.com, so it's going to ask the DNS root via an authoritative name server lookup. The DNS root is going to answer with ns1 and ns2.bitfl1p.com. Now, what is this domain suddenly? bitfl1p.com is the domain I purchased to act as the control domain for the entire site. And it has a one in it because someone else already owned bitflip.com.
So, these answers get sent back to the DNS resolver, which the DNS resolver is then going to use to send a query to the Project Bitflip server. It's going to ask for code.jquesy.com. Now, this is where things begin to deviate a bit from a standard DNS question and answer. We're actually going to send back two answers to the DNS resolver: one for code.jquery.com and one for code.jquesy.com. The reason we do this is that we don't actually know where in the process the bit flip has occurred, so we don't know what answer the DNS resolver is expecting. So we send both back and let the DNS resolver pick the correct one and ignore the other. The DNS resolver is then going to send this back to the browser, which will use it to issue an HTTP GET request. Now, if you were attentive a few slides back, you may have noticed those two different answers actually had different IP addresses. This allows us to determine which type of bit flip occurs most commonly.
That works because this HTTP GET request will be triggered for a different IP address depending on which flip was accepted by the DNS resolver. Now, this HTTP GET request is going to get sent to the Project Bitflip server. This is also where we get to deviate a bit from previous research. Instead of just answering this with a 404 or 200, we're going to be a little mean. We're going to send back a 301 Moved Permanently. This has the effect of permanently caching the bit flip in the browser's cache, so that subsequent page loads are directed to the Project Bitflip server even when a bit flip hasn't occurred. We're also going to redirect to a unique subdomain of bitfl1p.com. The reason for this is that the browser is not going to know who's in charge of that subdomain. So when it gets the redirect, it's going to have to issue another DNS question. It's going to send that to the DNS resolver, which will send it through to Project Bitflip. This allows me to directly tie a specific user's browser to their DNS resolver, because you may have noticed that up to this point there's no way to tell which browser initiated a specific query, since they all originate from the DNS resolver.
Following this, we have a pretty standard path. The answer gets back to the browser. The browser issues an HTTP GET for jquery.js. Project Bitflip receives it, answers with a 200, and sends back this tracking JavaScript. Wait a second. I asked for jquery.js, not tracking JavaScript. Too bad. I'm not jQuery. So instead we're going to send back this, which the browser, believing it originated from code.jquery.com, is going to faithfully execute in the context of your internal site, whether it be a continuous integration system, internal wiki, or basically anything with any important data on it. Or any site that a user could be tricked into entering credentials into, believing that they are on the original site.
Now, how do you actually build all of this?
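As a rough illustration of the redirect step described above, and not the lighttpd and PHP setup used for the project, here is a minimal Python sketch that answers every GET with a 301 pointing at a freshly generated subdomain. The control domain `control.example` and the names `redirect_location` and `RedirectHandler` are placeholders invented for this example.

```python
import uuid
from http.server import BaseHTTPRequestHandler

# Hypothetical control domain; the real project used its own domain.
CONTROL_DOMAIN = "control.example"


def redirect_location(path, control_domain=CONTROL_DOMAIN):
    """Build a redirect target on a subdomain nobody has resolved before."""
    token = uuid.uuid4().hex  # unique per request
    return f"http://{token}.{control_domain}{path}"


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET with a 301 to a unique subdomain."""

    def do_GET(self):
        # A 301 is cacheable by default, so the browser may keep following
        # this redirect on later page loads without asking again.
        self.send_response(301)
        self.send_header("Location", redirect_location(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()
```

Because the subdomain is unique, the browser has to resolve it, and the DNS query for that token arrives at the control domain's name server from the user's resolver. Matching the token in the DNS log with the token in the HTTP log is what ties a browser to a resolver. To try the handler locally, it can be passed to `http.server.HTTPServer` along with an address and port.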
The flow is great in theory, but actually answering all these queries is complicated. The second tool I'm releasing is BF DNS. It's a Golang DNS server specifically designed to answer bit squat DNS queries. Along with it comes BF www, which is the lighttpd configuration that I used to answer all these queries, along with a bunch of PHP scripts, gross, that are used to serve the actual tracking JavaScript.
I keep mentioning this tracking JavaScript. What does it actually track? Basically everything that you can track with JavaScript. It pulls the user's installed plugins, user agent, time zone, language, referrer, the document title, the screen size, the resolution, the current URL, the do-not-track cookie, and the installed fonts via Flash, and then it also pulls the local IP addresses on the system via WebRTC. Some of you guys may have seen this from the BeEF talk. With WebRTC, the Session Description Protocol actually contains all the local IP addresses on the system, so JavaScript has access to those even if they aren't on the route that was used to access the site. These are internal LAN IPs, in addition to external IPs if your computer is directly connected to the Internet. It also pulls the cookie names and a SHA-256 of their values. You could actually pull the cookie values. I don't want to get sued, so I just pulled the SHA-256 hash of them.
Now, we need somewhere to host all of this. So we need to select a host. We need somewhere that supports multiple IPv4 addresses so that we can answer each question with a different IP address. We also want IPv6 support so that we can evaluate IPv6 usage. And we want somewhere that bandwidth is really cheap in case a bandwidth spike occurs, for example if a big DNS resolver such as Google DNS or Comcast were to cache one of our results, because then it would be serving it to their millions of customers and that would result in a lot of traffic.
Finally, I wanted somewhere that was a smaller company in case there were letters or legal threats sent to them. Somewhere that would actually look out for their customers and wouldn't just say, go away, pick another host. I ended up settling on a host called RamNode: a small VPS, three terabytes of bandwidth a month. It cost about 15 bucks a month.
Finally, we need some domains other than the project domain. So I don't know how many of you know a college student, but we're really lazy. Whatever option requires the least amount of work is the option we're probably going to take. So rather than building a list of domains from a bunch of data sets like the Alexa 500 or other similar sets, I fired up a web proxy and browsed the internet for a day. At the end of the day, I looked at all the top sites that I'd hit and looked for ones that would have interesting data. We got the mic falling over. Sorry. Hopefully. Nope. I took care to only grab sites that would yield interesting data and tried to explicitly avoid any sites that would have data covered by something like HIPAA or PCI.
The first site I bought was googleusercontent.com. It serves images for Google sites. The main reason I purchased it was because it's a really, really long domain name, so there's a lot of opportunities for a bit flip to occur in memory, especially when browsers copy the domain name multiple times. In fact, here's all of the possible bit flips of it. Or rather, here's the 72 of them I was able to buy. There are 79 possible valid ones. Now, I hadn't actually set up a proper server at this point, so I pointed them at the VPS and ran Netcat on port 80. Those of you that are in IT have probably had what I like to call an oh shit moment. You run rm -rf on the wrong server. You accidentally shut down all but the server you wanted to. Something like that. This is one of those moments.
For those of you that can't read the tiny text up there, that's a request for mail-attachment.googleusercontent.com. As it turns out, googleusercontent.com serves all mail attachment downloads for Gmail and Google Apps. Oops. To make it even better, by their very nature, those links are valid without session cookies, meaning that for each misdirected request, I could go grab that attachment myself if I wanted to. So I decided to look a little bit more at the domain I bought and what else was served on it. It turns out it serves not just mail attachments, but the OAuth authentication for Google, Google Fonts, Google cache pages, and Google translated pages. I'm sure there's no valuable data in any of that.
Moving on from there, I decided to take a look at Amazon, specifically cloudfront.net. If you're familiar with CDNs, it serves a lot of really popular sites such as ESPN and amazon.com itself, among a whole bunch more. There were 43 possible bit squats, four of which were already registered, so I registered the rest of them. Continuing with the Amazon theme, I took a look at amazonaws.com. It serves pretty much all AWS services as subdomains of amazonaws.com, with the exclusion of CloudFront. This includes Amazon S3, Elastic Load Balancer, and EC2. Interestingly, this is one of the few domains I came across in my research where a lot of the bit flips were already owned, the other one being Akamai, who actually owned all of their bit flips. In this case, Amazon owned 33 out of the 38 possible bit flips. The rest were registered by someone else, except for one. Of course, I wasn't satisfied with a single bit flip, so I decided to buy subdomain bit squats, where the dot changes to an N, of Amazon S3, EC2, and Elastic Load Balancer. Moving on from there, I decided to take on doubleclick.net.
Those of you who are familiar with ad networks know that this serves Google's ad network, and it primarily serves ads over JavaScript, which makes it a great target. There were 45 possible bit squats, 19 of which were already registered, so I registered the other 26 of them. Moving on from there, I hit apple.com. It being a short domain name, a lot of the flips were actually valid sites and not owned as bit flips, so I was only able to grab one of them, AppleG.com. But continuing with the Apple theme, I took a look at icloud.com next. If any of you have an iOS or OS X device, as I'm sure a lot of you do, your device will check in with icloud.com regularly for backups, contacts, and a lot more. Additionally, most Apple accounts have an icloud.com email address, which is delivered to this domain.
Moving on from there, I decided to look at web dev. I don't know how many of you have ever heard of jQuery. jQuery is a compatibility layer. It makes IE suck less, among other things. It's used by over 70% of the top 10,000 sites, making it a really good target. I registered the 15 available ones of that. Continuing with the web dev theme, I hit disqus.com. It serves comments for about three quarters of a million sites. There's the 24 I purchased. And finally, the peak of my web dev phase: google-analytics.com. I'm guessing a lot of you know what this is. It serves analytics for pretty much every site ever. It's the most widely used website statistics service in the world. Interestingly, I wasn't the first person to have this idea. 53 of the bit squats were already registered by another party. However, I was able to grab the remaining 10, shown there. Hopefully you're beginning to see a theme here.
Moving on, sfdcstatic.com is the Salesforce CDN. There were 42 possible bit squats. I bought all of them. I'm going to pick up the pace a little bit here. aspnetcdn.com is Microsoft's AJAX CDN.
It serves a lot of Microsoft sites and a lot of jQuery plugins. Another 38 domains registered. googleapis.com is Google's JavaScript CDN. Another 22 out of 39 registered. gstatic.com. This one's a fun one. It's Google's static content hosting. It serves things like the Chrome Internet connectivity test. So when you plug your computer into the network and suddenly Chrome says, oh, you're online now, it's hitting gstatic.com. Additionally, it serves things like the Chromecast login page, along with a lot of other stuff. The other thing that makes this interesting is that this is one of the domains that was purchased by both Artem Dinaburg and Robert Stucke in their research, yet the bit squats were freely available, not purchased by any other entities or by Google itself. I was able to grab 19 out of the 30 possible bit squats. Finally, to finish things up: the Facebook CDN, the YouTube CDN, the Twitter CDN, and there we go.
Now, that was 337 domains. I know you all were keeping count. You're probably asking yourself, how did I pay for all of those? Coupons. I don't know how many of you have seen the TLC show Extreme Couponing. That was a documentary of my life for about a month. Eventually I actually ran out of coupons and I found 1&1. 1&1 has a nice advertisement on their site for 99 cent domain names. If you go and try to buy one, the first one is 99 cents, and then when you try to buy the second one, it charges you $11, which doesn't seem quite fair. But if you start the process in an incognito window and then log in halfway through, it's still 99 cents. So after doing that, at about five minutes per domain, I got this email. Dear Luke Young, you have exceeded the limit of our current special offer. Further orders placed under this offer will be canceled. Sincerely, security team. And that's when I stopped buying domain names.
The final statistics: I bought 89 domains from GoDaddy and 255 from 1&1, at an average cost of $1.62 per domain, coming to a total of $545.
Now, the next thing I realized is I was actually missing out on a lot of data. Most of this traffic was coming in over SSL. So I decided to buy SSL certificates too. Those of you that are familiar with SSL probably know that I need what's called a wildcard SSL certificate. This is because I don't know the subdomains that are going to be requested for each of these. You also probably know that wildcard SSL certificates can be expensive. A wildcard SSL certificate from DigiCert is $595. Of course, you get bulk discounts and other things. But if you just do the easy math there, that's over 200 grand, which I don't exactly have lying around for a fun little side project. Thankfully, I was able to find a solution. Some of you guys may have heard of them from other talks. StartSSL has a kind of unique model. You can pay a one-time fee and then get SSL certificates from them at no additional cost, except for EV certs. So at a cost of $60, I was able to get wildcard SSL certificates.
Now, because this is such a unique model, StartSSL has a very manual process. They don't have an API for requesting certificates or anything like that. That makes requesting the certificates really, really annoying for 337 domains, because you have to do domain verification on all of them. But it also makes it more likely that someone's going to notice what you're doing. In fact, 17 of the domains were flagged for manual review by StartSSL, who then approved all of them, by four different employees of the company, I might add. Here were the 103 certificates that I was issued. Now, those of you that are attentive might have noticed that I bought 337 domains, yet I only got 103 certificates. There's a good reason for that. I'll try to fix the mic real quick. On August 25th, I received this email.
My login certificate was revoked for, quote, abuse. Please contact us. Reaching out to StartSSL, I received the following quotes. I'm sorry, but for high profile names, only the name owner should be able to get certificates for it, and those resembling them closely never issued. Followed shortly by, quote, most certificates really shouldn't have been issued to start with. Oops. Here's the actual excerpt from the StartCom certificate policy, and I'm going to paraphrase here. StartCom will not issue certificates whose domain names might be misleading or have well-known brands or names that are part of requested host names, such as GoogleMe.com. So I'm guessing the fact that I owned domains with their example in it was probably not so great. I went back and forth with Eddy Nigg, the CTO of StartCom, letting him know about the project, and suggested revoking only the Google certificates. Here was the timeline on that.
So I don't know how many of you have ever actually been to a stand-up comedy show, but one of the golden rules is that you don't take your phone out during someone's set, mostly because you get made fun of by the comedian, but also because it's really rude. I was at a comedy show with some friends and my phone started vibrating like I was getting a phone call, and then it kept vibrating for five minutes straight. When the set ended, I took out my phone and I saw this. In the end, they revoked 81 certificates, with two emails for each revocation. Now, I'd like to note that I had full wildcard certificates for all these domains for two months before they were added to the certificate revocation list. Not that that even matters. Some of you may remember this from Heartbleed: a lot of browsers like Google Chrome don't even check the certificate revocation list, meaning that if I were malicious, I could continue to use these certificates at the risk of being sued by StartCom. I chose not to do that because I don't want to get sued.
However, they did leave 22 certificates unrevoked. When I inquired about why they didn't revoke these, I received the response that everything we haven't revoked so far was considered not so problematic and hence we left them to expire naturally. Now you'll notice all the domains that are remaining seem to not contain the trademarked name of whatever company it is. That's likely the reason they were left behind; however, I don't know that for a fact.
Now, moving on to the future. The EFF has just announced their free automated certificate authority called Let's Encrypt. Using something like that, it may be a lot easier to do exploitation, as it's a completely automated process and you don't have to worry about your domains getting flagged for manual review. They currently don't support wildcard certificates but are looking into it. You could also use a much larger provider like DigiCert at a much higher cost, depending on how much money you have to throw away at this.
Now I suppose those certificate revocations kind of beg the question, did anyone else even notice? Well, the first and only public case of someone noticing was by a third party, x8x.net. He noted that all the bit flips of gstatic.com were suddenly registered, along with some other domains, by, quote, the same individual with name servers at bitfl1p.com. So at least someone is having fun. I thought that was it. Until I went to a friend's house, went to check on the site, and saw this. I quickly logged into my server console and found that the server was alive and well. It turns out the error that I was seeing was with DNS resolution. I figured I had just broken my DNS server or it was handling packets wrong, so I hit the ever trusty Google DNS to check, and it worked. This is where things got a little bit weird. After a bit of further investigation and a few panicked calls to friends in other states, I verified that I wasn't actually crazy.
Comcast's DNS server was refusing to answer requests for bitfl1p.com. Further investigation revealed that it was only for A and quad-A records of the subdomains of bitfl1p.com, along with the root domain. Now here's the weirdest part. The requests were still getting forwarded to my server; however, my answers were not getting forwarded back to clients. This makes it really hard to pinpoint exactly when they put this block in place, and I still don't know to this day. I tried to reach out through the business class support line and never really got an answer.
A short while after noticing the Comcast incident, I got an email that my credit card payment for the server was declined for the month. A quick call to my bank and I was even more confused. They said they were approving the transaction and that I had adequate funds. I reached out to RamNode via a ticket that was escalated directly to the CEO, that's the advantage of picking a small company, who reached out to Stripe, their payment processor. He received a response that, quote, we have reason to believe that card has been associated with fraudulent activity. What makes this odd is that the card was only used with about five vendors. I can count them all on one hand, and only one of them, this server, was purchased through Stripe. What makes it really odd is that Stripe, not my bank, was refusing this transaction. I actually reached out to Stripe on my own, independent of RamNode, and they said, quote, we are indeed blocking at an RN due to a level of risk on this card that we're not willing to take. I know this is a very vague reason but for security purposes I'm limited in how much information I'm able to give out. I still don't have a solid answer on this one. Needless to say, I paid for the month on another card and then paid through the end of August ahead of time in case it were to ever occur again.
Now I want to take a second and have everyone look at this slide here and see if you can tell me what's wrong with this slide.
I put my bank on there. About two hours after the DEF CON CD slides came out, I started getting password resets. Whoever is doing this, I don't bank with them anymore. Nice try though.
Now, moving on to the part that you guys are all actually probably here for: the data. The first question people ask when I tell them about this is, is this actually even a problem? Is there even traffic to these bit flips? Let's start with a simple graph. This is the traffic I received during a section of the time the server was running. Now, some of you probably can't see that because it's a little small, so I'll give you a number. I received over a million DNS queries every 24 hours for over a month. Now you'll notice that the graph is kind of all over the place, and there's a lot of reasons for that. Different servers were caching my results for longer, and I actually transferred some of these domains away over time, so it's not the cleanest data set. Now, of those million DNS queries, about 4.8% of them resulted in corresponding TCP connections. Of those TCP connections, 85% of the initiated SSL connections completed the handshake and issued an HTTP request. Now those of you that were paying attention earlier might have noticed that I had 22 certificates out of the 337 domains. Those numbers just don't add up. The best answer I can give you on that is that a lot of people have misconfigured systems that are accepting SSL certificates that are for the wrong name. Because these are valid SSL certificates that I was serving, they're just not for the domain that the user requested.
Now, what about those HTTP access logs? The server handled about 2.4 million requests, and I was able to determine that repeat users, so users that had cached the result because of that 301 redirect, remained cached for an average of 4.33 requests.
So this means that after a flip occurred, wherever it may have occurred in the request process, you continued to try to access jquery.js from me for the next 4.3 times that you visited that site. I can also tell you all sorts of interesting things from this traffic. I can tell you the most common languages that users had on their systems. The graph on the left is the HTTP Accept-Language header and the graph on the right is the languages returned from the JavaScript. Interestingly, it seems like the JavaScript executed on Chinese machines much more than for any other language. Not really sure why that is. It's also interesting that most of the traffic was coming from the Chinese language set, and this actually doesn't match up with standard data sets of the most common Accept-Language headers. So for some reason, bit flips were occurring more commonly on Chinese computers.
I can tell you things like the most common screen resolution, or IPv6 adoption, which was abysmal, about 1.67 percent of queries. I do have to note that I wasn't actually forcing users to try IPv6, so this was just clients that preferred IPv6 over IPv4. I can tell you things like browser usage. You can see Wikimedia's, so Wikipedia's, breakdown of browsers for its traffic over February on the left, versus the traffic I received on the right. You'll see a significantly larger chunk is coming from Chrome, along with a larger proportion of IE. It could be because Chrome is doing more memory copies in each of its requests. It could just be because more of the users happen to be running Chrome. I can tell you things like OS usage. Same statistics, Wikimedia on the left and my traffic on the right. A large portion of traffic is coming from Windows 7 and Windows XP, with a much smaller portion coming from OS X and iOS. I can tell you things like cookies. I have acquired 240,000 cookie names and values.
Now these cookies could be anything from Google Analytics tracking data to actual session tokens for, say, your Gmail login. The top cookies were from Google Analytics, Baidu, and for some reason weather.com. So I have literally thousands of users' zip codes that they always look up weather for saved in my data set. I don't know why. I can tell you interesting things. Here's someone browsing amazon.com looking for an iPad, and their session cookies were sent happily to me, and who wouldn't want to buy an iPad? We can look at things like someone trying to log into OAuth and their session cookies being sent to me along with their OAuth token. Or things that are more interesting, like iPhones checking in, in this case just checking into the App Store, or in this case trying to download an actual app from me. Some of you may have seen some of the Hacking Team material. Their research was into serving malicious applications to users, and that could be utilized in something like this, where a malicious application is sent back instead of the application the user intended to download. As long as it's signed with a valid certificate, iOS will accept it and install it. Or we could just have iOS devices asking me for software updates. I actually did some research into this, and you could serve them different software updates instead.
I can also do things like pull the HTTP Referer value. Here were the top Google searches that I pulled from the Referer values. For some reason a lot of people really want to know about wood birthday gifts for their wife. I have no idea why. The other thing I pulled was local IP addresses. I was able to pull 158,000 IP addresses off varying systems. 12% of those were non-private IP addresses. This means that those systems were likely directly connected to the internet, or, for a lot of them, the addresses are IPv6 addresses, which by their very nature don't go through a NAT. I can tell you the most common local IP addresses. Super fascinating graph.
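For anyone wanting to reproduce the private versus non-private split on their own data, a minimal sketch using Python's standard `ipaddress` module is shown here. It assumes that "non-private" means whatever `is_private` reports as false, which may not be exactly how the figure in the talk was computed, and the function name `non_private_share` and the sample addresses are made up for the example.

```python
import ipaddress


def non_private_share(addresses):
    """Return the fraction of addresses that are not in a private range."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    if not parsed:
        return 0.0
    # is_private covers RFC 1918 ranges, loopback, link-local and similar
    # special-purpose blocks for both IPv4 and IPv6.
    public = [ip for ip in parsed if not ip.is_private]
    return len(public) / len(parsed)


if __name__ == "__main__":
    # Illustrative sample only; not data from the talk.
    sample = [
        "192.168.1.10",
        "10.0.0.5",
        "172.16.4.2",
        "8.8.8.8",
        "2607:f8b0:4005:805::200e",
    ]
    print(non_private_share(sample))
```

Note that an address being non-private in this sense does not guarantee it is globally routable; carrier-grade NAT space, for example, is neither private nor global according to this module.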
They're pretty much all 192.168. Primarily consumers. Where it gets more interesting is when you look at other traffic. For example, here's the SMTP traffic that my server received. You can't really see it out there. It's kind of small text. The numbers on the side are 20,000, 40,000 and 60,000. That's how many requests I was receiving each day. That's someone trying to send an e-mail, a bit flip occurs, and it gets sent to me instead. If any of you have ever worked on mail systems, you probably know that SMTP doesn't really work well with TLS. Pretty much everyone uses self-signed certificates. Even if you weren't using a self-signed certificate, I actually own the certificates for these sites as well. So it doesn't even make a difference. And you'll notice that big jump there at the end, up to about 60,000. That was June 19th, when Hotmail cached my entry for icloud.com and sent all e-mails that were intended to go to iCloud users to me instead. Oops. Continuing with the SMTP theme, I was looking through the traffic and I set up a query that would show me where the origin BGP route came from. And I found that 38% of my traffic was coming from AS13414, which is owned by Twitter. Primarily those two subnet ranges. And you'll notice that the pie chart there shows what the DNS requests coming from them were for. They're all for icloud.com. As it turns out, these are mainly MX and A record lookups, and they were resulting in SMTP connections. These DNS queries resulted in about 390 connection attempts per day. And based on this info, I'm guessing that what was happening was a bit flip was occurring and Twitter was trying to send password reset e-mails, promotional e-mails and other things to users with an iCloud e-mail address, and sending them to me instead. I reached out to Twitter, who said after some discussion it looks like we're trying to restrict outbound traffic from our network to bit flip domains.
This should address the specific problems you outlined without having to worry about the domains and who owns them. Now, how did I actually run all these queries? I wrote a tool called BF Splunk. It's source types for Splunk and a bunch of queries. It will be listed on the site at the end. I'd love to say Splunk sponsored this talk, but I never actually called the salesperson back. Oops. Now, how do you actually remediate this? Buy your bit flips. You probably already buy the typos of your domain, so buy your bit flips as well. It's really not that much more expensive and it can actually save your users a lot of problems. You don't have to actually answer them, you don't have to do anything, just own them so that someone else doesn't. The other thing you can do is use ECC memory and set up an RPZ (a DNS response policy zone) for internal flips. However, that doesn't protect your users in general, that just protects users within your network. Now, before I get into the data release, I want to talk about what I did with the domains. I reached out to all the companies whose domains I purchased for this talk and offered to transfer the domains to them for free. The first company to get back to me was Salesforce. I wanted to take a second to really give props to their team. They had a response with authorization to transfer the domains in under two hours, and the transfer process was initiated in under 24. The next company was Apple. The domains were transferred in about two days. Following them was Amazon AWS, with the domains transferred in a little over two weeks due to some scheduling issues on my end. Facebook followed them closely at a tad over two weeks. Microsoft's are being transferred right now. Now, those are the companies that accepted the transfers. Some companies, like Twitter, actually said that they didn't want the domains. Here was the exact quote.
We don't actually try to prevent bit flipping attacks by registering all the nearby domains, due to the fact these attacks are relatively rare and that we own a lot of domains, so it would be quite an undertaking. So we're not interested in acquiring the domains you have. Please just maintain possession of them until they expire, which they have. The next company was Google, which after three weeks had a very similar response, saying their domains team was not interested in purchasing them. Those 154 domains are now up for purchase. Now, of the companies that did actually transfer them, some didn't actually do it right. Here's the WHOIS info for two of the domains. They're still pointing the name servers to me even though they own the domain. So I'm still getting all the user traffic even though they own the domain. I'm not going to call out which companies they are, because I just noticed this about 10 minutes before the talk, so I haven't given them a heads up. Now, on to the data release. I'm happy to say I'm releasing complete logs in JSON format. So complete DNS logs for every single query I received. They contain the source, the port, the qname, the qtype, pretty much all that. The info is self-explanatory on the site. Along with those come anonymized web server logs, which contain things like the user agent, the Accept-Language header, the HTTP host they were intended for, and the method. They do not contain the URL or the URL query, and they contain a hash of the source address so you cannot tie it back to a particular user. In the same vein, I'm releasing anonymized SSL and SMTP logs, which both contain the same hashed source IP. And finally, here's my contact info along with the project website, which contains all these data dumps. At this point, I think I have a very short period of time for some questions if anyone has any. Did anyone offer a bounty or anything to help you cover the cost? No.
And I wouldn't have accepted it if they did, because it's already a gray area legally, and if you're accepting money for it or making a profit off of it, that can push you a bit further. I did get some care packages from companies though. Question for you. Did you do anything to look at double bit flips to get an estimate for just how frequently these happen? Is it one out of every billion queries, trillion queries, what order of magnitude? So you're referring to when two bit flips occur? Yeah. Using the ratio between two bit flips and one bit flip, you could figure out the ratio between one bit flip and zero. I did not. However, if you look at the data, in a lot of the queries I got where a bit flip had occurred, lots of other bit flips had occurred in the actual query. So the URL would be malformed as well, indicating probably that those users have really messed up memory. So I don't have statistics on that. It would be very interesting to look into that. Any other questions? Does enforcing name server bailiwick have any effect on your responses? Can you repeat the question? Does enforcing name server bailiwick, does that have any impact on your responses? I do not know. Any other questions? How did you figure out who to contact in these companies to get a quick response? How do you know who to email at Google or Microsoft about this, just being someone on the outside? So all of these contacts out to the companies were sent to the security@ alias or to whatever information they had listed if I just Googled the company name and security. A few of them did have to get pinged through personal connections to make sure they actually got handled. All right. Thank you very much.
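The "buy your bit flips" advice from the talk needs a list of candidate domains to start from. What follows is a minimal Python sketch of how such a list could be generated, using the rules described earlier in the talk: the high bit is never flipped because it is always zero in 7-bit ASCII, and case-only flips are dropped because domain names are case-insensitive. This is not the speaker's tool. The function name `bitsquat_candidates` is hypothetical, and leaving the dots untouched is a simplifying assumption, so flips that turn a dot into a letter (or a letter into a dot) are not covered.

```python
import string

# Characters allowed in a DNS hostname label: letters, digits, hyphen.
VALID = set(string.ascii_lowercase + string.digits + "-")


def bitsquat_candidates(domain: str) -> list[str]:
    """Return hostnames that differ from `domain` by one flipped bit.

    Illustrative sketch only: dots are kept fixed, and candidates in the
    TLD label are returned even though they may not be registrable.
    """
    domain = domain.lower()
    found = set()
    for i, ch in enumerate(domain):
        if ch == ".":
            continue  # assumption: label separators are left alone
        for bit in range(7):  # bit 7 is always 0 in 7-bit ASCII
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            # Skip invalid characters and case-only flips (same letter).
            if flipped not in VALID or flipped == ch:
                continue
            candidate = domain[:i] + flipped + domain[i + 1:]
            labels = candidate.split(".")
            # A label may not begin or end with a hyphen.
            if any(l.startswith("-") or l.endswith("-") for l in labels):
                continue
            found.add(candidate)
    return sorted(found)


if __name__ == "__main__":
    for name in bitsquat_candidates("cnn.com"):
        print(name)
```

The output for `cnn.com` includes `con.com`, the example from the talk. In practice the list would still need to be filtered against what can actually be registered under a given TLD.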