How many PDFs do you get every day? One, two, too many maybe. Today we have Ido from Check Point here, and he will tell us how bad PDF files really are.

Hello everyone, I'm Ido, and today I'm here to present Bad-PDF: how Windows credentials can be leaked by manipulating PDF files. This research was conducted following a publication about stealing Windows credentials by manipulating Microsoft Outlook. We wanted to show that the same can be done via PDFs. So the flow of this presentation will be to first talk about PDF in general and its file format, then talk about the attack flow of this vector, present the proof of concept, and finally talk about mitigation and the impact of this attack.

But first let's get this part out of the way: a bit about myself. I'm a security researcher at Check Point, and I previously served in the IAF's Air Intelligence Wing. My academic background includes a Bachelor of Science degree in Information Systems Engineering, and unfortunately I can't quite speak German, so we're going to have to talk in English.

So let's start by talking about PDF in general. PDF was first developed by Adobe in the 90s, with many versions being released since. Its intended purpose was to let you present various types of data (text, images, web page links, etc.) regardless of the environment the file is opened in.

Now, as for the file structure, a PDF file has four main components. The first is the header, which basically contains the version of the PDF file. For example, this header notes that the PDF version is 1.3. Second is the body, which contains all the objects that make up the PDF; we will talk about those objects next. As we can see here, we've got several different objects: page, contents, parent, and so on. Third is the xref, also known as the cross-reference table. This table contains the information needed to access the aforementioned objects. Again, this is an example xref.
Finally, we've got the trailer. The trailer basically contains the metadata of the PDF, as we can see here. And here we've got the four parts together, making up the PDF: the header, followed by the body, the xref table, and the trailer.

Now, I've mentioned objects, so let's go over them quickly. There are eight possible types of objects within a PDF file: anything from numbers and strings to arrays, streams, and even another object. But the one that interests us the most for this attack is the dictionary object, so let's expand on that one. A dictionary is an object representing a table, which contains pairs of other objects, keys and values, called entries. A page object is a dictionary representing a specific page within a PDF. It consists of several required and optional entries, which we'll see in a moment. Here we've got an example page object; as you can see, it has the Contents entry, the Parent entry, and many others.

Now, let's talk about the entries relevant to this research. First is the AA entry, also known as the action dictionary. This is an optional parent entry that defines actions to be performed in general. It usually comes with either the O or the C child entry. The O entry is a child entry that defines the actions to be performed when the page is opened. Similarly, the C entry defines the actions to be performed when the page is closed.

More relevant entries: the S entry describes the type of action to be performed. In this case we see GoToE, go-to-embedded. GoToE opens an external PDF file without notifying the user, and I think you can kind of guess where we're going with this. The F entry describes the location of the other PDF file. And finally, the D entry describes the location within that file to go to. So, as you can see, we've got an action dictionary with an O entry, where the F entry points to appendix.pdf, the D entry points to the contents of that page, and the S entry is GoToE.
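To make the four-part layout described above concrete, here is a minimal Python sketch that assembles a harmless one-page PDF from a header, a body, an xref table, and a trailer. The function name `build_minimal_pdf` and the three objects it writes are illustrative assumptions, not taken from the slides; the page object deliberately contains no action dictionary.

```python
def build_minimal_pdf() -> bytes:
    """Assemble header, body, xref table and trailer for an empty one-page PDF."""
    # 1. Header: states the PDF version.
    out = bytearray(b"%PDF-1.3\n")

    # 2. Body: every object here is a dictionary (<< key value ... >>).
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
    ]
    offsets = []
    for number, content in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % number + content + b"\nendobj\n"

    # 3. Cross-reference table: one fixed-width entry per object,
    #    preceded by the mandatory free entry for object 0.
    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    # 4. Trailer: object count, root object, and where the xref table starts.
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


if __name__ == "__main__":
    print(build_minimal_pdf().decode("ascii"))
```

Printing the result shows the same order as on the slides: header first, then the objects, then the xref table with one offset per object, then the trailer.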
So we've talked about PDF in general, how it is structured, and about the objects and entries relevant to our attack. So where do we go from here? Now we'll see how we can use this information to steal our victim's credentials.

Here you can see the proof of concept that our team developed. We've used an action dictionary with an O entry to get the exploit to execute upon opening the file. We then use the F, D and S entries to point the PDF reader to an arbitrary file on our SMB server. Injecting these five entries into a page object is all we have to do to weaponize a previously benign PDF file. So let's see what happens when this file is opened. As you can see, the exploit code now points to dummy.pdf on our malicious server. There are two things we should note here. First, the remote file does not have to be a PDF file. It doesn't even have to have an extension, as long as it actually exists on the remote server. And second, while this example abuses the GoToE (go-to-embedded) action, the GoToR (go-to-remote) action is just as vulnerable.

Now, when the victim opens our PDF, the action dictionary we've injected into it activates, causing the PDF reader to request the arbitrary file we've specified, in this case dummy.pdf, from our malicious server, in this case 4.4.4.4. This prompts the victim's Windows client to try to authenticate with our server using NTLM single sign-on. As part of that protocol, the victim's NTLM details (domain name, username, NTLM hash, and others) are sent to the remote SMB server. These details can easily be captured by the attacker, as you can see from this Wireshark example. In this example, a user named Alice attempted to authenticate with our SMB server, and we can clearly see the account name, the domain name, and the NTLM hash.
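From the defender's side, the entries described above give a simple pattern to look for. The following is a rough Python heuristic, not the detection logic of any particular product: it flags raw PDF bytes that contain a GoToE or GoToR action together with an F string whose target starts like a UNC path. The names `looks_suspicious` and `scan_file` are hypothetical, and the check assumes the entries are stored uncompressed as literal strings; content inside compressed object streams, hex strings, or file specification dictionaries would be missed.

```python
import re

# Action types the talk names as able to pull in an external file.
EXTERNAL_ACTION = re.compile(rb"/S\s*/(?:GoToE|GoToR)\b")

# An /F literal string whose target starts with two or more backslashes,
# i.e. a UNC-style path. Backslashes are escaped inside PDF strings, so
# the raw bytes may show four of them in a row.
UNC_FILE_TARGET = re.compile(rb"/F\s*\(\s*\\{2,}")


def looks_suspicious(pdf_bytes: bytes) -> bool:
    """Heuristic: external go-to action plus a UNC-style /F target."""
    return bool(EXTERNAL_ACTION.search(pdf_bytes)) and bool(
        UNC_FILE_TARGET.search(pdf_bytes)
    )


def scan_file(path: str) -> bool:
    """Read a file from disk and apply the heuristic to its raw bytes."""
    with open(path, "rb") as handle:
        return looks_suspicious(handle.read())
```

A page whose F entry points at a local name such as appendix.pdf is not flagged, because both conditions must hold. The two patterns are checked anywhere in the file rather than inside the same dictionary, so this sketch can produce false positives on large documents.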
We can then use publicly available tools to crack the victim's NTLM hash, thus obtaining the Windows credentials, which in turn can be leveraged to spread further into the target network.

So, to sum up what we've just seen: this is an exploit that requires very few lines of code. In fact, it could be written as a one-liner, but we've opted not to do so for the sake of readability. This means that even without any previous knowledge of PDF structure or vulnerabilities, an attacker can have a weaponized PDF file within minutes, or less if he uses one of the many public implementations of this vulnerability, as we'll see in a bit. The victim only has to open the malicious PDF for the NTLM hash, and by extension the Windows credentials, to be leaked to the attacker. And this is the biggest takeaway: it's all done without any security alert or evidence of the attack. There aren't any malicious processes running in the background, no changes to the registry, no residue, nothing.

Now, let's see an example of an attack utilizing this exploit. Here we've got an attacker running Kali Linux, and he's setting up Responder, which will basically be the SMB server for this purpose. The attacker then uses one of the public implementations of Bad-PDF to generate a malicious PDF file that points to his Responder service. And as you can see right here, this implementation uses the same proof-of-concept code that we just saw: again, the F, D and S entries within an action dictionary. The attacker then hosts the file on an HTTPS server, and we switch to the victim's machine, which will now open the file. Let's open it again for good measure. Switching back to the attacker's view, we can see that the NTLM hash was leaked several times, as the victim went to the trouble of opening the file multiple times. But in the end, we've got the leaked hash right here. So, we've successfully captured the victim's NTLM hash.
What do we do with it? Simple. Now we can crack it using one of the many publicly available tools. We can use John the Ripper, Hashcat, Ophcrack, or even search Google for an online cracking tool. The actual cracking process is as simple as pasting the captured hash into the GUI or the terminal and running the tool in its default configuration. As you can see here, in this example we've opted to use John the Ripper, which comes pre-installed in Kali Linux. Again, very simple to use: the only argument was the file containing the captured hash. As you can see, cracking the NTLM hash of user gpn19 gave us their password, badpdf. Now, you may have noticed that the total cracking time in this example was roughly 13 minutes. The time required to actually crack the password will vary with both the complexity of the password and the strength of the machine running the cracking process.

So, now that we've talked about how this exploit works, let's think of a scenario where it could be used. The most naive scenario that comes to mind is the most obvious one. An attacker uses an exploit, in this case Bad-PDF, entices a target to open the weaponized file, and steals the credentials. Having done that, and having gained an initial foothold within the network, the attacker then deploys a payload (a bot, a virus, a miner, and so on) or uses other exploits to spread laterally through the network.

Now, this is fine and everything, but we want a scenario a bit more unique to our exploit, so let's think of another one. Here the attack starts in a similar manner. The attacker entices a user, a victim, within the target network to open the malicious PDF file and steals the credentials. Except this time, the attacker uses the stolen credentials to access a file server and inject the exploit into a PDF that is viewable by every user within the network. That way, the attacker can harvest credentials from the entire network without being noticed. It's kind of scary.
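On the earlier point that cracking time depends on both password complexity and hardware, a back-of-the-envelope calculation shows why. The sketch below assumes a plain exhaustive search over every password of a fixed length, which is not necessarily how the tool in the demo proceeded, and the guess rate is an illustrative assumption rather than a measured figure.

```python
def worst_case_seconds(
    alphabet_size: int, length: int, guesses_per_second: float
) -> float:
    """Time needed to try every password of exactly `length` symbols."""
    keyspace = alphabet_size ** length  # number of candidate passwords
    return keyspace / guesses_per_second


if __name__ == "__main__":
    ASSUMED_RATE = 1_000_000  # guesses per second; illustrative only
    for length in (6, 8, 10):
        seconds = worst_case_seconds(26, length, ASSUMED_RATE)
        print(f"{length} lowercase letters: about {seconds / 3600:.1f} hours")
```

Each extra lowercase letter multiplies the search space by 26, while a machine that is ten times faster only divides the time by ten, which is why password length and character variety matter more than the attacker's hardware.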
Now, let's take a look at a couple of the public implementations of Bad-PDF. The first one, which I think we're familiar with, is a Metasploit module. This module can generate a malicious PDF from scratch, and it can also inject the malicious code into an existing PDF. Let's take a quick look at that. You can see it's got the pre-made PDF here and, again, the same PoC code. Other implementations can obviously be found on GitHub, and among those there's one that really stood out: it both generates a malicious PDF and runs the Responder service for the attacker, so he doesn't have to do anything other than insert his own IP. As we can see here, again, the same proof-of-concept code, and it also runs Responder.

Now, let's talk about the impact of this research. In general, Bad-PDF raised awareness of PDF vulnerabilities, getting covered by many outlets, such as Bleeping Computer, ZDNet, et cetera. We've also seen the rise of multiple tools implementing the results of our research, as we've just seen in both the Metasploit module and the GitHub repository. The badpdf hashtag was also trending around the time of the publication, and, finally, this vulnerability was assigned CVE-2018-4993. You can see example articles about the publication.

As for mitigation, there are several ways to protect yourself against this attack. On the network level, you can deploy IPS solutions, and I won't name-drop anything here, that will detect the weaponized PDFs, and you should be fine. On the OS level, you can apply Microsoft's optional security enhancement, which basically prevents NTLM single sign-on attempts to external resources. You can also patch your PDF reader. For example, Adobe published an update for Reader in May last year, which was supposed to protect against it, but in February this year it became apparent that this patch could be bypassed. So another CVE was assigned, and another patch was released.
Foxit Reader, another major PDF reader, is no longer vulnerable to this attack since version 2.1.

So let's wrap things up. We've seen how a simple, minimal manipulation of a PDF file can lead to a victim's credentials being leaked to the attacker. The attacker only needs to understand the structure of a PDF file to abuse this vulnerability, or, alternatively, he only needs to know how to search GitHub. In addition, an attack using this vector leaves no evidence, both during and after the exploitation. To make things worse, an attacker can leverage this attack vector to cause even more serious damage within the target network. So that's Bad-PDF. Thank you.

So thank you very much, Ido. If you have any questions, please raise your hands, and I will come over to you with the microphone so that everyone can hear your question. Okay, since no one has questions, again, thank you very much for your talk and the insights into Bad-PDF. And for the rest of you, have a nice day.