Max is a security researcher at Lookout. He's been doing this for about 10 years. He has spent a lot of time on obfuscation, exploit development, and security research, he's a previous Black Hat speaker, and he's currently focused on mobile security research and working on his PhD. He'll be telling you about some of the internals of the Pegasus malware today. With that, I will turn it over to Max to take it away.
Thank you. Hi, everyone. My name is Max Bazaliy, and today we'll talk about the Pegasus internals. I'm from Kyiv, Ukraine. I currently work as a security researcher at Lookout, and for the last few years I've focused on jailbreak techniques, which is why I co-founded the Fried Apple team, where we work on various iOS jailbreaks, including for iOS 8 and 9.
So, Pegasus. Pegasus is high-quality espionage software that can be used for complete surveillance of a device. It does everything from stealing your personal data to remotely activating the microphone or camera on a device without any indication that it's happening. In order to work, Pegasus needs to jailbreak the device first, because the iOS sandbox prevents applications from spying on each other. That's why Pegasus relies on the Trident exploit chain to completely own a device and install persistence that is used on each device reboot. Here's a really terrifying list of target apps, including even the ones known as the most secure, like Telegram, WhatsApp, Viber, and I'm pretty sure you can find your favorite messenger in this list.
Before going into a deep technical analysis of the vulnerabilities used, I want to tell you the story of how we got a Pegasus sample. Please meet Ahmed Mansoor, who is mostly known for his work as a human rights defender. He's even a recipient of the Martin Ennals Award, sometimes called the Nobel Prize for human rights. On August 10th this year, Ahmed received a text message about someone imprisoned in a state prison.
And he received another text with similar content the next day. But he had previously been targeted with Hacking Team's software in 2012 and Gamma's FinFisher in 2011. So now, instead of clicking on the link, he contacted Citizen Lab, because he had worked with them before. He sent the link to Citizen Lab for analysis, and we, the Lookout research team, got the initial sample and the link from Citizen Lab. In this story, I will mostly focus on our technical part.
In order to work, Pegasus relies on the Trident exploit chain, and it uses three stages. In the first stage, it uses memory corruption to achieve remote code execution in the Safari context. Once it is on the device, it jumps to the second stage and uses two vulnerabilities to exploit the kernel: one to bypass kernel address space layout randomization (KASLR), and another to achieve kernel-level code execution. Finally, in the third stage, it installs the espionage software and uses a special trick to achieve on-device persistence. I will go through each stage in more detail.
The first stage, as I said, is a single-use spear-phishing URL that is invalidated after the first click. It contains obfuscated JavaScript, and the first thing it does is check the device type. Is it an iPhone? Is it an iPad? Is it 32-bit or 64-bit? Based on the device's processor type, a different version of the shellcode, which is stage two, is downloaded. Finally, it exploits a remote code execution vulnerability in WebKit to execute the shellcode.
So which vulnerability is used? CVE-2016-4657, remote code execution in WebKit. Basically, the vulnerability is a use-after-free that is achieved by using two bugs. In the sample that we got, it's not stable, because it relies on the WebKit garbage collector. The problem itself lives in MarkedArgumentBuffer, and it can be exploited through defineProperties.
defineProperties is a method that defines new properties or modifies existing ones directly on an object. It takes a few arguments: the object itself and a properties object, which holds the descriptors for the properties to be defined or modified. It has a pretty simple algorithm containing a few loops. In the very first loop, each property descriptor is checked for formatting and then appended to a descriptors vector. To make sure that the references to the property descriptors do not become stale, they need to be protected from being garbage collected. For this purpose, MarkedArgumentBuffer is used. We see it at the very end: markBuffer.append. So MarkedArgumentBuffer prevents objects from being deallocated.
After each property has been validated, defineProperties associates each of the user-supplied properties with the target object. And here is the problem, because while defineProperties runs, it's possible to call user-defined JavaScript methods. If garbage collection can be triggered in one of these JavaScript methods, it will deallocate any unmarked heap-backed object. I will go a little bit deeper into the details.
First of all, a few words about MarkedArgumentBuffer and the JavaScript garbage collector. The JavaScript garbage collector is responsible for deallocating objects from memory when they are no longer referenced. It runs at random intervals, based on the current memory pressure, the current device type, and so on. When the garbage collector checks whether an object should be deallocated, it walks through the stack and checks for references to the object. References to an object may also exist on the application heap, but in this case an alternate mechanism is used, called slowAppend. MarkedArgumentBuffer has an initial inline stack that holds eight values. That means when the ninth value is added to MarkedArgumentBuffer, the capacity will be expanded.
It will be moved from stack memory to heap memory. This is what slowAppend does: it moves the buffer from the stack to the heap. Now the objects are not automatically protected from garbage collection, and to make sure they are not deallocated, they need to be added to the heap's markListSet. This is what we see here. slowAppend tries to acquire the heap context, and if it can be acquired, it marks the objects by adding them to markListSet. And here's the problem: the heap context can be acquired only for a complex object. This means that primitive types like integers, Booleans, and so on are not heap-backed objects, and they will not be marked in markListSet. And there is a bug in slowAppend: it is called just once. So when the buffer is moved from stack memory to heap memory and one of the properties is a simple type, like an integer, it will not be automatically protected from garbage collection. And all the subsequent values will not be protected either, because of this bug in slowAppend. Here we see a picture illustrating it.
In reality, a reference to the JavaScript object still exists, but if any user-supplied method is called during the call to defineProperties, it can remove this reference, and the object will be deallocated.
To summarize all this information, here is how it can be exploited. We specify a props object which contains 12 descriptors, and the first nine of them are simple types, like zero to eight. This means that when p8, which is the ninth value, is added to the buffer, it will trigger slowAppend, and the buffer will be moved from the stack to the heap. The subsequent values, like length, which holds not_number, and the array, will not be marked in markListSet and not automatically protected from garbage collection.
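The buffer behavior described so far can be modeled in a few lines of Python. This is only a toy sketch, not WebKit code: the class `ToyMarkedArgumentBuffer`, the helpers `survives_gc` and `fill`, and the growth factor are assumptions made for illustration. Only the inline capacity of eight and the rule that marking needs a heap-backed value come from the talk.

```python
INLINE_CAPACITY = 8  # from the talk: eight values fit in the inline (stack) storage


class HeapObject:
    """Stand-in for a heap-backed JavaScript value such as an array."""

    def __init__(self, name):
        self.name = name


class ToyMarkedArgumentBuffer:
    """Hypothetical model of the buffer; not the real WebKit class."""

    def __init__(self, mark_list_set):
        self.values = []
        self.capacity = INLINE_CAPACITY
        self.on_heap = False      # False while values still live in inline storage
        self.registered = False   # True once the buffer is in the mark list set
        self._mark_list_set = mark_list_set

    def append(self, value):
        if len(self.values) < self.capacity:
            self.values.append(value)  # fast path: no marking logic runs
        else:
            self._slow_append(value)

    def _slow_append(self, value):
        self.capacity *= 4             # growth factor is an assumption
        self.on_heap = True            # storage moves from the stack to the heap
        self.values.append(value)
        if self.registered:
            return
        # Registration needs a heap context, which only a heap-backed value provides.
        if any(isinstance(v, HeapObject) for v in self.values):
            self._mark_list_set.add(self)
            self.registered = True


def survives_gc(obj, buffers, other_roots=()):
    """Return True if obj is still reachable in this toy model."""
    if any(obj is root for root in other_roots):
        return True
    for buf in buffers:
        protected = (not buf.on_heap) or buf.registered
        if protected and any(obj is v for v in buf.values):
            return True
    return False


def fill(values):
    """Append all values to a fresh toy buffer and return it."""
    buf = ToyMarkedArgumentBuffer(set())
    for v in values:
        buf.append(v)
    return buf


arr = HeapObject("arr")
buggy = fill(list(range(9)) + [arr])   # nine primitives first, then the array
safe = fill(list(range(8)) + [arr])    # the array itself is the ninth value

print(buggy.registered, survives_gc(arr, [buggy]))  # False False
print(safe.registered, survives_gc(arr, [safe]))    # True True
```

Passing `other_roots=[arr]` makes the array survive in both cases, which corresponds to the remaining reference that a user-defined method can later remove.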
What happens next? When defineProperties gets to the length property, it tries to convert not_number to a number, which forces the user-defined toString method to be called. The toString method removes the last reference to the array and forces a garbage collection cycle by allocating a large amount of memory, which leads to the object being deallocated by the garbage collector. The very next thing it does is reallocate a new object over the stale one. So this is how a specially crafted use-after-free was used in Safari to achieve remote code execution and execute the shellcode.
The shellcode lives in the second stage, which is a payload that contains the shellcode and compressed data. The most interesting part for us is the shellcode, because it's used for kernel exploitation from the Safari context. The compressed data is basically a loader that downloads and decrypts the next stage.
One of the vulnerabilities used is CVE-2016-4655, which is an info leak used to bypass kernel address space layout randomization. It exploits the fact that the OSNumber constructor and the OSUnserializeBinary method are missing bounds checks. That means an attacker can create an OSNumber object with a really high number of bits and, from within the application sandbox, read it back through io_registry_entry_get_property_bytes. Here's how it looks.
OSUnserializeBinary is a method that handles binary serialized data in the kernel. It converts the binary format to basic kernel data objects. It supports different container types (sets, dictionaries, arrays) and object types (strings, numbers), and the point of interest is OSNumber. As we see here, it takes two arguments, value and length, and there is no real check on the length. This means we can control the length that is passed to the object. And why is that a problem?
Because here is the constructor, OSNumber::init, and as we see, the length is passed here as newNumberOfBits and overwrites the size variable. The problem is that size is used in other methods, in this case OSNumber::numberOfBytes, which means the return value of numberOfBytes is now fully controlled by the attacker. That is really bad, because it's used next in io_registry_entry_get_property_bytes, which handles OSNumbers and uses numberOfBytes to calculate the OSNumber length. Unfortunately, it uses a stack-based buffer to parse and save the OSNumber value. What happens next? It copies memory from the kernel stack to the heap using the attacker-controlled length, which means we can specify how many bytes will be copied from the kernel stack and returned to userland.
This is what happens. First, we create a properties array that has a dictionary, which has an OSNumber with a large length, in our case 256. Next, we need to spawn a user client by calling io_service_open_extended, which will deserialize the OSNumber object and create it in the kernel. Now we need to read it by calling io_registry_entry_get_property, which copies 256 bytes of kernel stack memory. The kernel stack memory will contain kernel pointers, and from a kernel pointer we can determine the kernel base.
So now we have the kernel base, and we can move to the next vulnerability, which is CVE-2016-4656. It's a use-after-free used to achieve kernel-level code execution. It exploits the fact that the setAtIndex macro does not actually retain an object, and we can trigger it from within the application sandbox through OSUnserializeBinary. Again, OSUnserializeBinary is a function that parses and deserializes objects in the kernel. It supports different data types and different container types, and the interesting thing is that it supports kOSSerializeObject. That means we can create a reference to another object.
This will be really useful later, because while deserializing and parsing objects, OSUnserializeBinary saves object pointers to a special objects array, using setAtIndex for that. As we see, setAtIndex just saves the object pointer to the array at some index, without retaining it. That's bad, because the next piece of code, which casts an OSString to an OSSymbol, releases the object pointer. What does that mean? We still have an array that holds all the object pointers, the objects array, and we have just released one of the objects but still hold its pointer. If we can create a reference to that object, we can exploit a use-after-free. This is what happens, because kOSSerializeObject allows us to create a reference, and we will simply call retain on an already deserialized, already deallocated object.
This is how the exploit looks. First of all, we create an OSDictionary that contains a string which, due to the bug, will be deallocated. Now we need to reallocate it with an object we control that fits in the same memory spot. As the OSString class in memory is, in our case, 32 bytes, we need to allocate the same size. For this purpose, OSData is a perfect candidate, because we can control the OSData buffer, both the buffer size and the buffer content. So what we can do is create a fake OSString with a fake vtable, and this fake vtable will point to some gadgets in the kernel. The very last thing we need to do is trigger the use-after-free by adding a kOSSerializeObject entry. So once again: the OSString gets deserialized and deallocated, we allocate a new object, the OSData buffer, which points to the same memory spot, and we get a use-after-free.
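The sequence just described (a pointer saved without a retain, a release, a same-sized reallocation, then a back-reference) can be illustrated with a small Python model. Everything here is hypothetical: `ToyZone`, `deserialize`, and the `retain_on_store` switch are names invented for this sketch, and the allocator only mimics the idea that a freed 32-byte chunk is handed to the next 32-byte request. It is not kernel code.

```python
class ToyZone:
    """Toy allocator: a freed chunk is reused by the next request of the same size."""

    def __init__(self):
        self.memory = {}       # address -> contents
        self.free_lists = {}   # chunk size -> list of free addresses
        self._next_addr = 0x1000

    def alloc(self, size, contents):
        free = self.free_lists.setdefault(size, [])
        if free:
            addr = free.pop()          # reuse the most recently freed chunk
        else:
            addr = self._next_addr
            self._next_addr += size
        self.memory[addr] = contents
        return addr

    def free(self, addr, size):
        self.memory[addr] = "<freed>"
        self.free_lists.setdefault(size, []).append(addr)


def deserialize(retain_on_store):
    """Replay the five steps; return what the back-reference ends up seeing."""
    zone = ToyZone()
    refcount = {}
    objs_array = []

    # 1. A 32-byte string object is created with one reference.
    key = zone.alloc(32, "OSString key")
    refcount[key] = 1

    # 2. Its pointer is saved in the objects array, with or without a retain.
    objs_array.append(key)
    if retain_on_store:
        refcount[key] += 1

    # 3. The string-to-symbol conversion releases the string.
    refcount[key] -= 1
    if refcount[key] == 0:
        zone.free(key, 32)

    # 4. A same-sized data buffer with chosen contents is allocated.
    zone.alloc(32, "attacker-controlled bytes")

    # 5. A back-reference to index 0 dereferences the saved pointer.
    return zone.memory[objs_array[0]]


print(deserialize(retain_on_store=False))  # attacker-controlled bytes
print(deserialize(retain_on_store=True))   # OSString key
```

With the retain in place, the release in step 3 leaves one reference, so the chunk is never freed and the back-reference still sees the original string.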
After getting the use-after-free, Pegasus uses some kernel patches to disable security checks: it patches setuid to easily escalate privileges, bypasses AMFI checks by patching amfi_get_out_of_my_way, disables code signing enforcement by patching the cs_enforcement_disable variable, and finally remounts the system partition as read-write so it can execute a loader that will download and decrypt the next stage.
The next stage contains the real espionage software that is used to sniff all the SMS, all the calls, all the personal data. It has three groups. One is the process group, which has the main process, the sniffing services, the module that uses the SIP protocol to communicate with command and control, a process manager, and so on. The next interesting thing is the group of dylibs, because Pegasus relies on Cydia Substrate, the jailbreak framework, which it renames to libdata, and it uses Cydia Substrate to inject dynamic libraries into application processes. In our case, we have dynamic libraries for Viber, for WhatsApp, MIM, which are injected into the application's address space and install application hooks. And the last thing is the com.apple.itunesstored.2.csstore file, which is a JavaScript file that contains code and shellcode that can execute unsigned code. I will focus on it next.
So the bug exists in the jsc binary. jsc is a helper for JavaScriptCore, Apple's JavaScript engine, and the bug can lead to unsigned code execution. In combination with the rtbuddyd trick, it can be used to gain complete persistence on the device. It exploits a bad cast in the setEarlyValue method, and fortunately it can be triggered only from the jsc application context.
So what is the problem? It exploits an issue in the JavaScript binding setImpureGetterDelegate, which is backed in C++ by functionSetImpureGetterDelegate. This function takes a few arguments.
One is an ImpureGetter, and the second is a generic JSObject that will be set as the ImpureGetter's delegate. As we see on the next slide, we just parse the two arguments and call setDelegate. setDelegate calls set, which finally calls setEarlyValue. Here is the problem: there is no real check that the type of the object passed to setImpureGetterDelegate is really ImpureGetter. This means that if any other object type is passed, it will be improperly downcast to an ImpureGetter pointer. That's what happens here. So it's a bad cast with no real check on the object type, which means we can overwrite one of the object's fields.
Here is the same function, but now decompiled in IDA Pro. In our case, the ImpureGetter is the base variable here, and the delegate is this generic JSObject. We see that the pointer at base plus 16 can be overwritten with a pointer to the delegate. On the right we see the JSArrayBufferView class: if we pass a JSArrayBufferView as the first argument, its m_vector field will be overwritten with a pointer to the delegate, which is really bad because it can lead to read and write primitives.
To explain that, I'll use two DataViews. I will call them DataView one and DataView two, and call setImpureGetterDelegate on both, which results in the m_vector field of the first DataView being overwritten with a pointer to the second DataView. Now, by setting and reading values on the first DataView, we can overwrite object fields in the second. Why do we need that? We can map the second DataView over the entire process memory by overwriting the second DataView's array buffer offset to be zero, overwriting its length to be four gigabytes, in the case of a 32-bit process, and setting its type to a fast typed array. So basically the second DataView is now mapped over the entire process address space, and we can use its get and set methods to read and write anywhere in process memory.
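The two-view idea can be shown with a toy Python model in which a view's base pointer and length live inside the same byte array the view reads from. The header layout `(vector, length)`, the addresses, and the names `ToyView` and `demo` are assumptions for this sketch and do not reflect the real JavaScriptCore object layout. The single `pack_into` call in the middle stands in for the effect of the bad cast.

```python
import struct

HEADER = struct.Struct("<II")  # hypothetical view header: (vector, length)
U32 = struct.Struct("<I")


class ToyView:
    """A view whose base pointer and length are stored inside the same memory."""

    def __init__(self, mem, header_addr, vector, length):
        self.mem = mem
        self.header_addr = header_addr
        HEADER.pack_into(mem, header_addr, vector, length)

    def _check(self, offset):
        # Fields are re-read from memory on every access, so they can be corrupted.
        vector, length = HEADER.unpack_from(self.mem, self.header_addr)
        if offset < 0 or offset + U32.size > length:
            raise IndexError("offset outside of view")
        return vector + offset

    def get_u32(self, offset):
        return U32.unpack_from(self.mem, self._check(offset))[0]

    def set_u32(self, offset, value):
        U32.pack_into(self.mem, self._check(offset), value)


def demo():
    mem = bytearray(256)                    # stands in for process memory
    secret_addr = 0xC0
    U32.pack_into(mem, secret_addr, 0xDEADBEEF)

    view1 = ToyView(mem, header_addr=0x10, vector=0x40, length=8)
    view2 = ToyView(mem, header_addr=0x20, vector=0x48, length=8)

    # Effect of the bad cast: view1's vector field now points at view2's header.
    U32.pack_into(mem, view1.header_addr, view2.header_addr)

    # Writes through view1 land on view2's fields.
    view1.set_u32(0, 0)          # view2.vector = 0
    view1.set_u32(4, len(mem))   # view2.length = whole memory

    leaked = view2.get_u32(secret_addr)     # read outside the original bounds
    view2.set_u32(secret_addr, 0x41414141)  # write outside the original bounds
    return leaked, U32.unpack_from(mem, secret_addr)[0]


leaked, overwritten = demo()
print(hex(leaked), hex(overwritten))  # 0xdeadbeef 0x41414141
```

Before the header is corrupted, the same `get_u32(0xC0)` call on `view2` would raise `IndexError`, because the view is only eight bytes long.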
The same thing can be used to get an execution primitive as well. In this case, we call setImpureGetterDelegate twice, and instead of exposing the entire process memory, we just leak an object address. If we can leak an object address, we can create a function that has hundreds of empty try-catch constructions and force the JIT to compile it. In this process, a special read-write-executable memory segment is allocated. We can leak the address of this JIT segment, overwrite it with the shellcode, and execute it.
So this is how the bad cast can be used to re-exploit the kernel on each boot. It's used together with a persistence mechanism, which is rtbuddyd. The problem is that the system spawns the rtbuddyd service with a special --early-boot argument. This means that if we take any other binary signed by Apple and name it rtbuddyd, it will be spawned on boot. That's what Pegasus does. They take the jsc binary, which is signed by Apple, and name it rtbuddyd. Then they take the JavaScript that contains the exploit and make a symlink to it called --early-boot. As a result, when rtbuddyd is spawned with --early-boot, it actually runs jsc with the JavaScript exploit instead. So with this trick and the bad cast, Pegasus re-exploits the kernel on each device boot.
There are some tricks Pegasus uses to make it harder to reverse engineer, like one-time links: after you click on any of the links, they are invalidated and then redirect to Google or other sites. It re-encrypts all the payloads each time they are downloaded, on the fly. And of course, it tries to hide itself and look like a system component. It also blocks iOS system updates to make sure you cannot patch your device, clears all the evidence, such as Safari history and caches, and we found a self-destruct mechanism that can be triggered remotely or on a timeout.
In addition to this terrifying list of supported applications, it records any microphone usage, any camera usage, GPS location, and keychain passwords, even including the Wi-Fi and router ones. Why do they need the router password? I don't know.
Application hooking. How does it operate? As I mentioned earlier, it uses Cydia Substrate. With the help of Cydia Substrate, it preloads dynamic libraries into application processes and intercepts some critical functions. It uses Cynject to inject into already running processes. This is a high-level picture of how it looks. All the application-level and framework-level critical functions are intercepted by Pegasus, so Pegasus can control them, collect the data, send it to command and control, and so on.
To summarize, Pegasus is a remote jailbreak spotted in the wild. It's pretty scary because it doesn't require any user interaction. The last similar thing was about five years ago, when comex released his JailbreakMe 3. This year, Luca Todesco used one of the Trident vulnerabilities for his jailbreak.
I want to say special thanks to Citizen Lab for helping us obtain the Pegasus sample, to the whole Lookout research and response team, to the Divergent Security guys, and to all the individual researchers who were involved in the research. Here is a list of some useful links, including a 44-page PDF report with really deep details on the vulnerabilities used, even with the differences between the 32-bit and 64-bit versions. So if you're interested, please take a look. I think that's it. Do you guys have any questions?
Okay, please keep it brief. We only have a few minutes left for questions. If there are any questions, please go to the microphones in the hall. And we start with the Signal Angel from the internet.
Thank you. Is there any way to build your app so that it is protected from this exploit?
Yes, because Pegasus used some known jailbreak techniques.
It is possible to detect, for example, that the system partition has been remounted as read-write. That could be one of the indicators that some generic jailbreak is running on the device. Or you can check for the specific files that Pegasus uses, but it's better to check for jailbreak patches in general, the kernel patches.
Please try to stay a bit quiet. We are still in the middle of the Q&A. If you don't have to leave now, please stay seated until afterwards. And if you have to leave now, please do not talk. Microphone 3, please.
Hey, what's the user experience during this?
User experience? You mean when your device gets infected by Pegasus? Well, the scary thing is that there are no real indicators on the device that you got something. You click on a link. Your mobile browser opens and then just closes and crashes. That's it. No new applications show up among your visible applications, and so on. But in reality, it's running something like three new system services. They're not visible to the user.
Thank you. And please, another question from the internet.
Thank you. Do you have any idea how active this exploit is in the world?
Say it again, please.
Do you have any idea how active this exploit is in the world?
I'm sure it was a very targeted attack, because this exploit is pretty expensive. For example, Zerodium pays one and a half million for a remote jailbreak like this. So this exploit is very, very expensive, and I think these are very, very targeted attacks. It's hard to predict how many devices were infected by Pegasus. So far we know about the Mansoor one. So again, I think it's very, very targeted because it's very expensive.
Thank you for this answer. Microphone number five, please.
Hi. Do you have any more information on NSO, or the group that's behind it? Are they using any other software, and, again, how widespread is this in the world?
Yeah. So in this case, we focused mostly on the technical details of Pegasus itself.
But Citizen Lab did their own investigation of NSO, and NSO is like a cyber arms dealer. So please take a look at the Citizen Lab report on that. They have much more information.
Do we have a question from the internet? Am I overlooking anyone? No, then this is it. Thank you for your talk.