Presenting Jacob Thompson of Independent Security Evaluators. Give him a hand.
Okay, thanks for the introduction there. This talk is called C.R.E.A.M., for Cache Rules Evidently Ambiguous, Misunderstood. The premise here is that websites use HTTPS because their content is sensitive, like an online banking application, credit card statements, payroll information and so on. The reason they use HTTPS is that the data is too sensitive to transfer over an open network without encryption. And it follows from there that if it's too sensitive to be sent over the network without protection, then maybe it shouldn't be written to disk without encryption either, especially without the user's knowledge. In the past, many web browsers were cautious about persistently caching information just based on the fact that it came over an HTTPS connection, whether the headers said to cache it or not. And just to clarify, I'm not concerned about memory caching, only persistent caching to disk, as in you close the browser and it's still there.
So I actually was offline at one point and bored. I opened my copy of Firefox and, not having anything to read, I went to the disk cache. I was very surprised to see things there from my bank, like check images, account summaries and recent transactions. My impression had been that browsers did not cache this information. So after that we looked at 30 sites at ISE, with help from some of the other analysts there. We found that 21 of them were causing information to be persistently cached in the latest browsers. They were either sending no caching-related headers at all, or they were sending headers that were non-standard or obsolete and only worked in certain browsers. So the first thing I'm going to do is show you a couple of the pages where we found information cached to disk and what it was.
Then I'm going to look at some of the history as to why this has been so inconsistent, how somebody could get confused over whether this happens at all, and how to prevent it.
I'm sorry, but our speaker is a first-time speaker at DEF CON. He's never spoken here before, and he came up and he sincerely asked us, please do not interrupt my talk. So we're not going to interrupt his talk. We're just going to sit up here quietly and have a drink. And cheers to DEF CON.
Okay, guys. So the first thing I'm going to show you is some of the information we found in the disk cache. Then I'll go over some of the history: what browsers used to do, what they do now, what the standard says, and what people's impressions are when you actually go out on the internet to websites like Stack Overflow and look for this. Then I'm going to go over a couple of recommendations for how we think it could be more secure.
So, starting with some evidence here. ADP is a very popular payroll processing company. Does anybody here have their check done with ADP? Lots of you. We had someone at ISE who had a previous history with ADP. He logged in to their web interface and looked at a payroll statement. We found that it was cached, including things like the last four digits of his Social Security number and the last four digits of his bank account number, which might be used for authentication purposes on other sites. So you can see how this could possibly go wrong. ADP was sending some caching headers, but they were non-standard and obsolete, and today they are only interpreted by IE. So if you went to this site in Firefox or Chrome, this was left behind.
Another site was Argus, which processes pharmacy claims for health insurance companies. You may not have heard of Argus, but in Maryland our Blue Cross Blue Shield company uses Argus to handle their pharmacy claims.
We logged in to the health insurance site and went over to Argus to see the pharmacy claims, and sent without any caching headers at all were the name of the patient, what medications they were on, and what the dosage was. So that may not be the best thing to have sitting on your hard drive. And this was sent with, once again, no caching headers, so even IE would cache this.
Our final one, which might be a little more surprising, is Equifax, which does credit reports. After one of our analysts at ISE went to Equifax and accessed his credit report, it was cached, and this includes the obvious information such as credit score and name. But a credit report by definition also has a list of all the accounts that have been reported to the credit reporting agency. And if you've applied for new credit recently or checked your credit report, they often authenticate you with questions such as: it looks like you have a mortgage from three years ago, what's the payment? Things like that, which you could get from the credit report.
So here's a full list of the 21 sites we found that had some form of caching issue. Some of them are big names, banks, and others are not so big, but it's a pretty wide spectrum of different sites. And here are some of the types of data we found in the cache. Some of it is not so severe, like name; some is more concerning, like date of birth and the last four digits of the SSN. A private-label department store credit card had full account numbers, and they don't use expiration dates, so that's not good. VINs for your auto insurance, and so on.
So before I go into what all these non-standard headers are that these sites may have been using, if any, how do you just prevent this caching in all the browsers that are popular today? That's with these two headers: Cache-Control: no-store and Pragma: no-cache. Don't use them as meta tags. There is some historical precedent for being able to do that, but it's not reliable. Pragma: no-cache is the old non-standard header that goes all the way back to the mid-90s, when SSL was first introduced.
You need to pass that header because of a special case with Internet Explorer 8 and earlier when the server is speaking HTTP 1.0 as opposed to 1.1. That sounds like a little edge case, but I'll get to that in a little bit. For all other cases, including IE9 and later, Cache-Control: no-store is what's in the HTTP standard specifically for this purpose. The standard talks about preventing information from being revealed on backup tapes, which is the same concept.
So what are some of the headers we saw that don't work? First, Cache-Control: no-cache. That's in the standard, but it is about preventing a user from seeing information that is stale. It says to the browser: you have to go revalidate this before you use it out of your cache to serve another request. It has nothing to do with security, but despite that, when Microsoft first implemented support for it back in IE4, they decided to interpret it with the same meaning as Cache-Control: no-store. They stayed with that for a while, all the way through IE9. Then in IE10 they started following the standard. So this is something that is still changing up until today.
Pragma: no-cache is an obsolete header that predates HTTP 1.1, and to IE it has the meaning: if this is over SSL, don't write it to the disk cache at all. And it still works in IE.
Cache-Control: private we actually saw on a handful of these sites. It's not intended for web browsers at all. It is about caching proxy servers that are accessed by multiple users, and it says this information is specific to one user; you shouldn't use it to serve another request by a different user.
Cache-Control in meta http-equiv tags does not work. Pragma as a meta tag does work in IE, but there's some buggy behavior around it, and we have more detail about that in our white paper on the website. So meta Cache-Control does not work, at least for the purpose of preventing disk caching.
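As a rough illustration of the header advice above, here is a minimal Python sketch that checks a response's headers against the two recommended ones and flags the look-alike directives described as not preventing disk caching. The names `RECOMMENDED_HEADERS`, `directives` and `audit_headers` are hypothetical, and the findings only restate the claims made in the talk; this is not a complete model of any browser.

```python
# Header values recommended in the talk; the helper names are hypothetical.
RECOMMENDED_HEADERS = {
    "Cache-Control": "no-store",  # the standard directive against persistent storage
    "Pragma": "no-cache",         # for IE 8 and earlier when the server speaks HTTP 1.0
}


def directives(value):
    """Split a comma-separated header value into lowercase directive names."""
    names = set()
    for part in value.split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def audit_headers(headers):
    """Return a list of findings for one response's headers (empty means OK)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    cache_control = directives(lowered.get("cache-control", ""))
    pragma = directives(lowered.get("pragma", ""))

    findings = []
    if "no-store" not in cache_control:
        findings.append("missing Cache-Control: no-store")
        if "no-cache" in cache_control:
            findings.append("Cache-Control: no-cache only forces revalidation")
        if "private" in cache_control:
            findings.append("Cache-Control: private only restricts shared proxies")
    if "no-cache" not in pragma:
        findings.append("missing Pragma: no-cache")
    return findings
```

For example, `audit_headers({"Cache-Control": "no-cache, private"})` reports four findings, while `audit_headers(RECOMMENDED_HEADERS)` reports none.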
And finally, when the server is using HTTP 1.0, the Cache-Control: no-store header is ignored by IE8 and earlier. That seems a little weird. Why would a server send a header that its protocol version is too old to understand? Actually, until very recently Apache's mod_ssl SSL support would automatically downgrade a connection from version 1.1 to 1.0 if IE was the browser that requested it. This was to work around a bug in persistent connections in IE 5. It was still there until two years ago, when it was patched in the main branch of Apache. That change has still not percolated down to all the various Linux distributions, including the latest copy of CentOS. So there are many servers out there that still have that behavior of downgrading to 1.0, including the demo site that I'm about to show later. So that's a little weird behavior. You would never realize it unless you actually followed through, had a site send that header, and looked in your cache.
Adding to this confusion as to what works and doesn't work today, things were different in the past. The first browsers, like early versions of Netscape when SSL came out, didn't cache anything that came over HTTPS. Even some later browsers followed that, and even today Safari still works that way. If a server sends something over SSL, it's never cached, and the server has no way to mark something as non-sensitive to make an exception to that.
Firefox did briefly experiment in version 3 with allowing a server to mark certain things as not sensitive, and I call that an opt-in policy. They actually used the header Cache-Control: public as a hint to say: go ahead and cache this. I know it's over SSL, but this is just a CSS file, okay?
Then there are other browsers, older ones especially, that only allow the Pragma: no-cache header as a way to mark individual resources as not to be written to the disk cache. I call that non-standard opt-out, because the non-standard behavior works and the standard headers don't.
Then IE, as it came into newer versions, started to support Cache-Control: no-store, but the Pragma: no-cache support was still there. I call that generous opt-out: be generous in what you accept as a browser, a principle that is often applied to other parts of HTML rendering and so on. I have the three versions of IE listed separately here because they have individual variants on that policy, but the main idea is still there: old behavior, new behavior, both work.
And finally, in newer browsers such as Chrome and Firefox 4 and later, either the server sends Cache-Control: no-store or it gets cached, period, the end.
Because of these discrepancies between browsers, there's also a lot of confusion out there in the community about what they really do. If you go to a search engine like Google and search for either of these phrases, like "browsers do not cache SSL" or "browsers do not cache HTTPS", you will find results, some of them new, some of them old, telling you that web browsers don't cache things to disk if they came over SSL. Some of them are from Stack Overflow, blog posts, mailing lists, even a W3C mailing list. That may have been true when it was written, especially for the ones that say, well, except for IE, browsers don't cache SSL. But that's not true today, in Chrome and Firefox especially.
In fact, this quote below comes from the OWASP Application Security FAQ, somebody who should know better, right? It says that if a web page is delivered using SSL, no content can be cached. This may have been true with Firefox 2 and earlier, if those were the browsers you were looking at, but that is just not part of the standard. There's no specific behavior that all browsers follow as far as SSL.
So let's look at the browser developers, like Mozilla, who decided: let's change our caching policy; after all, that would increase performance.
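The five browser policies described above (never cache, opt-in, non-standard opt-out, generous opt-out, and strict standards-compliant opt-out) can be summarised as a small decision function. This is a simplified Python sketch under stated assumptions: the names `Policy` and `written_to_disk` are hypothetical, the per-version IE variants and the HTTP 1.0 special case are ignored, and treating no-store as overriding public under the opt-in policy is an assumption rather than something the talk states.

```python
from enum import Enum


class Policy(Enum):
    NEVER_CACHE = "never cache HTTPS (early Netscape, Safari)"
    OPT_IN = "opt-in via Cache-Control: public (Firefox 3)"
    NONSTANDARD_OPT_OUT = "opt-out via Pragma: no-cache only (older browsers)"
    GENEROUS_OPT_OUT = "opt-out via either header (IE)"
    STRICT_OPT_OUT = "opt-out via Cache-Control: no-store only (Chrome, Firefox 4+)"


def _directives(headers, name):
    """Return the lowercase directive names found in one header of the response."""
    for key, value in headers.items():
        if key.lower() == name:
            return {part.split("=", 1)[0].strip().lower()
                    for part in value.split(",") if part.strip()}
    return set()


def written_to_disk(policy, headers):
    """Would an HTTPS response with these headers be persistently cached?"""
    cache_control = _directives(headers, "cache-control")
    pragma = _directives(headers, "pragma")

    if policy is Policy.NEVER_CACHE:
        return False
    if policy is Policy.OPT_IN:
        # Assumption: an explicit no-store still wins over public.
        return "public" in cache_control and "no-store" not in cache_control
    if policy is Policy.NONSTANDARD_OPT_OUT:
        return "no-cache" not in pragma
    if policy is Policy.GENEROUS_OPT_OUT:
        return "no-store" not in cache_control and "no-cache" not in pragma
    # STRICT_OPT_OUT: only the standard directive prevents disk caching.
    return "no-store" not in cache_control
```

Under this model, a response carrying only Pragma: no-cache stays off disk under the generous policy but is written to disk under the strict one, which matches the ADP example earlier in the talk.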
Well, on the bug that was entered into Mozilla, where they decided to switch their caching policy from opt-in to strict standards-compliant opt-out, one of the comments said that among sites that don't use Cache-Control: no-store, the correlation between SSL and sensitive is very low. For those 21 sites, it doesn't work that way.
So where do we go from here? We have a lot of sites out there on the internet and a lot of browsers that are interpreting something very differently. Should browsers assume the website will take care of marking things as sensitive? The websites either use headers that they think mark it as sensitive, or they don't realize that they need to be doing it in the first place.
So what do we think should be done? First of all, the obvious thing is to fix web applications. After all, the HTTP standard does say this is the header to use if you want to prevent this caching. In the long run, cross-browser compatibility is about more than the latest HTML5 tag and the semantics of XMLHttpRequest. It's also about deeper meanings of the HTTP standard that could have security consequences, such as disk caching.
I have "fix browsers" as a maybe, because a browser vendor could reasonably say that the standard says you send this header if you want to prevent caching. However, at a minimum, browsers should interpret that Pragma: no-cache header. Despite being non-standard, it did have that meaning back in IE, and even in earlier versions of Netscape when they briefly experimented with an opt-out policy. If they're willing to do that, they could go a step further and switch from opt-out to opt-in. A browser is not required to cache anything, so they could go back to that Firefox 3 policy where nothing is cached unless the server says Cache-Control: public.
More importantly, maybe, the bad documentation out there that says browsers don't cache this, or use Pragma: no-cache, or use Cache-Control: no-cache, should be fixed.
Obviously we can't fix mailing list archives, but at least wikis and other things like that should be updated, and anyone doing security assessments of web applications should be aware of this issue.
Finally, and this is probably the most controversial and least likely to happen, maybe the HTTP standard should take this into account. In fact, if you look at RFC 2616 and search it for SSL, there's a grand total of one occurrence. And if you search it for HTTPS, there are no occurrences. So while that might make for a nice layered architecture, where the protocol's on top and encryption is underneath, you shouldn't be ignoring security consequences or assumptions that people are making, especially when there's this historical behavior among different web browsers.
And finally, I've actually put an HTTPS site out there that tries different combinations of caching headers, so that you can go back to your disk cache, look at it, and see what really happens in the browser you try. So before I bring up that site, just as an experiment, does anyone have a question? If so, go to that microphone there. If not, I'll go on to the demo site.
Okay? Yeah. Okay, so Safari was on that list as not caching at all, and the mobile version works the same. Chrome was on there as strict standards-compliant opt-out, and the Android browser works the same. So it's possible on an Android device that these things could be getting cached to disk, just like the desktop version. And it's a little harder to replace your browser on a mobile device, too.
There is a slightly older version of the slides on the DVD, but all the content is the same.
I can't hear the question; it's something about PDF. How does the fact that it's a PDF file affect the caching? Okay, so it's possible that when it's a PDF file, if they're using the Adobe plugin, depending on the implementation it may be caching it to a temporary file anyway.
But we tried it in Firefox, and sending Cache-Control: no-store does work on a PDF in Firefox, especially for the new built-in reader.
Yeah. Okay, so he's asking whether, as a web browser user, you can reconfigure it to go back to the old policy, or to just not cache HTTPS. In Internet Explorer, under Advanced, there's an option for "Do not save encrypted pages to disk". In IE 10 and 11 that is supposed to work, but in earlier versions there were problems with not being able to download files over HTTPS if that was enabled. In Firefox there's a hidden browser preference, which I have in our white paper on the website, that you can set the opposite way to go back to the no-cache SSL policy. For Chrome, we tried writing an extension but didn't get anywhere with the APIs they provided. And in Safari you have nothing to do, because it doesn't cache in the first place.
All right. Yeah. Possibly, but where do you store the key? Because you want to be able to exit the browser and go back in, and you've got to recover the key, and if you can recover the key then so can any other application, potentially. Or just use a memory cache. But that's valid for something like a mobile phone, where there's not as much memory.
Okay. We're ready for the demo site. All right. So I'm in Firefox. The first thing I'm going to do is clear the cache so that this is valid. All right, zero bytes. Next I'm going to visit our test page. And you can do this too if you want to. Once it loads here. It worked earlier. I even have an /etc/hosts entry. All right. So this is a main page with a little description of what the issue is and how to check for it. Then I have some iframes down here, linking to pages that have been configured on the server to send various combinations of these headers. And I have a small explanation of what they're supposed to do. So after visiting this page, I'm going to close the browser, just so we're guaranteed that it's in the disk cache. And then go back in.
And we'll go to the magic URL. All right. It shows all the disk cache entries. I've named the files on the demo site so that they tell you what headers each one is sending. So no headers dot HTML is not sending any cache-related headers. There's no Cache-Control or Pragma. Then there's Cache-Control: no-cache. I'm proving that this does not work. It might be influencing the decision of whether the entry is revalidated before it's reused, but it has nothing to do with disk caching.
Cache-Control: no-store as a meta tag. It's not there in the headers. It's in here, if you look at the hex dump. That doesn't work either. And it probably shouldn't work, because with meta tags there would be a weird condition where you can cache it and then parse it, or parse it and then cache it, and then you'd get some weird buggy behavior. In fact, if you look at IE, which does support the meta tag for Pragma headers, there is documentation of bugs telling you that if the page is over a certain size you should put the meta tag at the bottom of the page, or something like that. Crazy.
Okay. And Pragma: no-cache is not working here in Firefox. A couple more are Cache-Control: public and Cache-Control: private, and Cache-Control: private doesn't work either. So you can go to this demo site in various browsers, check the behavior, and prove it to yourself.
And in closing, any more questions? Somebody, yeah. Exactly. We haven't tested that. The best thing would be the justification in Firefox as to why they changed it between version 3 and 4. Maybe someone has done some statistical testing on it. Exactly. That's why they changed it in the first place. So as long as the site does the proper behavior, we'd be okay. But that's not happening right now. This is Firefox 22, and it hasn't changed since then. So, anything further? Okay. Yes.
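A test site like the one demonstrated above can be approximated locally with Python's standard library. This is a minimal sketch, not the speaker's actual demo site: the paths in `VARIANTS` and the function `make_server` are hypothetical names, and the server speaks plain HTTP. The browser behavior discussed in the talk concerns HTTPS, so a realistic test would also need the socket wrapped in TLS with a certificate, which is left out here.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical paths, each named after the header combination it sends.
VARIANTS = {
    "/no-headers.html": {},
    "/cache-control-no-cache.html": {"Cache-Control": "no-cache"},
    "/cache-control-no-store.html": {"Cache-Control": "no-store"},
    "/cache-control-public.html": {"Cache-Control": "public"},
    "/cache-control-private.html": {"Cache-Control": "private"},
    "/pragma-no-cache.html": {"Pragma": "no-cache"},
    # No caching header here; the directive appears only as a meta tag in the body.
    "/meta-no-store.html": {},
}

META_TAG = '<meta http-equiv="Cache-Control" content="no-store">'


class VariantHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path not in VARIANTS:
            self.send_error(404)
            return
        head = META_TAG if self.path == "/meta-no-store.html" else ""
        body = ("<html><head>%s</head><body>Test page %s</body></html>"
                % (head, self.path)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        for name, value in VARIANTS[self.path].items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def make_server(port=0):
    """Create the test server on localhost; port 0 picks any free port."""
    return HTTPServer(("127.0.0.1", port), VariantHandler)
```

Calling `make_server(8000).serve_forever()` starts it. Visiting each path, closing the browser, and then inspecting the disk cache follows the same procedure as in the demo.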