Hello, I'm Didier Stevens, and today we are looking at XPS files. Simply put, those are Microsoft's version of PDF files, and they have been used lately in phishing attacks. So let's take a look.

Here we have our XPS file, and it's actually a zip file, as you can see here. The XPS file format, XML Paper Specification by Microsoft, is mainly XML files inside a zip container. So with my tool zipdump we can take a look, and here you have all the files inside it.

This one here is [Content_Types].xml. That's for the Open Packaging Conventions, and we can take a look at this file because it can help us identify the document we are dealing with. So we select 1 and dump, and as you can see it's XML, and here you can see for example XPS document structure. So this tells us that it is an XPS document. It explains, for the different file extensions, what the type is. So for example fdoc, that we can find here, and fpage, that we can find here: fdoc is an XPS fixed document and fpage is an XPS fixed page.

The page here is in file 12. So let's take a look at file 12 and dump it, and here we already see our malicious URL used in the phishing attack. Now we could grep for this with http, but then we get nothing, and that's because the file is actually Unicode. What we have to do is tell zipdump that it is Unicode and have it converted to ASCII, like this, and then we can grep out the URLs, including the URLs used here as namespace identifiers. We can also get an idea of the text itself by looking for UnicodeString, like this, and then here you see the values: "This is a secured encrypted attached file. Open with your professional email credentials. Copyright 2018." So that is the content.

There's also a very fast way to extract URLs from these files, using my re-search tool. So we run zipdump and we take option uppercase D, which will dump the content of all the files inside the zip file. It will dump it to standard out.
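Stepping back to the manual inspection described so far (reading [Content_Types].xml, then decoding a Unicode page and looking for URLs and UnicodeString values), the same steps can be sketched with Python's standard library. This is not how zipdump works internally; it is only an illustration, and the function names `decode_part` and `inspect_xps` are made up for this example. It assumes the page parts end in `.fpage` and that Unicode parts start with a UTF-16 byte order mark, which may not hold for every XPS file.

```python
import re
import zipfile

# Loose patterns for illustration only; real-world URLs may need a stricter regex.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
UNICODE_STRING_RE = re.compile(r'UnicodeString="([^"]*)"')


def decode_part(data: bytes) -> str:
    """Decode one XML part; a UTF-16 byte order mark is taken as the hint."""
    if data[:2] in (b"\xff\xfe", b"\xfe\xff"):
        return data.decode("utf-16")
    # Assumption: anything without a UTF-16 BOM is treated as UTF-8.
    return data.decode("utf-8-sig", errors="replace")


def inspect_xps(source):
    """Return ([Content_Types].xml text, {page name: (urls, texts)}).

    `source` is a path or a file-like object holding the XPS (zip) container.
    """
    with zipfile.ZipFile(source) as container:
        names = container.namelist()
        content_types = None
        if "[Content_Types].xml" in names:
            content_types = decode_part(container.read("[Content_Types].xml"))
        pages = {}
        for name in names:
            if name.lower().endswith(".fpage"):
                text = decode_part(container.read(name))
                pages[name] = (
                    URL_RE.findall(text),
                    UNICODE_STRING_RE.findall(text),
                )
    return content_types, pages
```

Calling `inspect_xps("sample.xps")` (a hypothetical file name) would return the content types XML plus, for each page, the URLs and the visible text strings found in it.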
So we take the zip file and then we pipe this into my tool re-search, and we look for URLs like this. Now the only other thing that we have to do is use option e, to tell re-search that we want to convert the binary data that it receives into strings, because this zip file also contains images (JPEG, PNG) and also Unicode. So we can't just look for ASCII strings; we have to extract the strings using that option. And then here you get all the URLs. There are a lot of URLs used as an ID, and here you have the URLs that are used in the phishing attack.

To make it a bit more readable, you can use option u to get a unique list, and then you have no duplicates: every URL appears once. And if this is still too many URLs to look through, you can use one more option. Well, it's not actually an option, it's another regular expression. Instead of regular expression url, we will use regular expression url-domain. This will match URLs exactly like the url regular expression does, but it will only output the domains found inside those URLs, like this. And here you can see that those are all legitimate domains used inside this type of document, but this one is not.
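The pipeline described above (dump every file in the container, extract strings from the binary data, match URLs, deduplicate, then reduce to domains) can be approximated in a few lines of Python. This is a rough stand-in built on the standard library, not the implementation of zipdump or re-search; the function names, the minimum string length of 4, and the URL pattern are all assumptions made for the sketch.

```python
import re
import zipfile
from urllib.parse import urlsplit

# Runs of printable characters, as plain ASCII and as UTF-16 little-endian.
ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")
UTF16LE_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_strings(data: bytes):
    """Yield printable strings found in binary data (ASCII and UTF-16LE)."""
    for match in ASCII_RUN.finditer(data):
        yield match.group().decode("ascii")
    for match in UTF16LE_RUN.finditer(data):
        yield match.group().decode("utf-16-le")


def unique_urls(source):
    """Return each URL found in any file of the zip container, once, in order."""
    seen = {}
    with zipfile.ZipFile(source) as container:
        for name in container.namelist():
            for text in extract_strings(container.read(name)):
                for url in URL_RE.findall(text):
                    seen.setdefault(url, None)
    return list(seen)


def unique_domains(urls):
    """Reduce a list of URLs to the distinct host names they contain."""
    domains = {}
    for url in urls:
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # skip strings that only look like URLs
        if host:
            domains.setdefault(host, None)
    return list(domains)
```

With a hypothetical `sample.xps`, `unique_domains(unique_urls("sample.xps"))` gives a short list of host names to review, where the namespace hosts are expected and anything else deserves a closer look.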